Addresses bug where the number of missings in a row is not calcu… #239

Merged
merged 1 commit into from Oct 21, 2019
4 changes: 2 additions & 2 deletions DESCRIPTION
@@ -1,7 +1,7 @@
Package: naniar
Type: Package
Title: Data Structures, Summaries, and Visualisations for Missing Data
Version: 0.4.2.9001
Version: 0.4.3.9000
Authors@R: c(
person("Nicholas", "Tierney",
role = c("aut", "cre"),
@@ -38,7 +38,7 @@ ByteCompile: TRUE
Suggests:
knitr,
rmarkdown,
testthat,
testthat (>= 2.1.0),
rpart,
rpart.plot,
covr,
6 changes: 6 additions & 0 deletions NEWS.md
@@ -1,3 +1,9 @@
# naniar 0.4.3.9000 (2019/10/21)

## Bug Fix

- Fix a bug where the number of missings in a row was not calculated properly - see [238](https://github.com/njtierney/naniar/issues/238) and [232](https://github.com/njtierney/naniar/issues/232). The solution uses `rowSums(is.na(x))`, which was 3 times faster.

# naniar 0.4.2.9001 (2019/04/17)

## Minor Changes
25 changes: 4 additions & 21 deletions R/prop-pct-var-case-miss-complete.R
@@ -95,30 +95,13 @@ pct_complete_var <- function(data){
#'
prop_miss_case <- function(data){
test_if_null(data)

test_if_dataframe(data)

temp <- data %>%
# which rows are complete?
stats::complete.cases() %>%
mean()

# Return 1 if temp is 1
# Prevent error when all the rows contain a NA and then mean is 1
# so (1 -1)*100 = 0, whereas function should return 1
if (temp == 1) {
return(1)
}

if (temp == 0) {
# Return 0 if temp is 0
# Prevent error when no row contains a NA and then mean is 0
# so (1 -0)*1 = 1, whereas function should return 0.
return(0)
}

return((1 - temp))
# How many missings in each row?
n_miss_in_rows <- rowSums(is.na(data))

# What is the proportion of rows with any missings?
mean(n_miss_in_rows > 0)
}

#' @export
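
For reviewers who want to check the new logic outside the package, here is a minimal base R sketch of the `rowSums(is.na(data))` approach. The name `prop_miss_case_sketch` and the three small data frames are hypothetical and exist only for illustration; the sketch omits the package's `test_if_null()` and `test_if_dataframe()` checks.

```r
# Hypothetical stand-in for the patched prop_miss_case(); base R only.
prop_miss_case_sketch <- function(data) {
  stopifnot(is.data.frame(data))
  # Count the missing values in each row.
  n_miss_in_rows <- rowSums(is.na(data))
  # Proportion of rows with at least one missing value.
  mean(n_miss_in_rows > 0)
}

# Illustrative inputs (not taken from the PR).
all_missing  <- data.frame(x = NA)
none_missing <- data.frame(x = 1:3, y = c("a", "b", "c"))
some_missing <- data.frame(x = c(1, NA, 3, 4), y = c(NA, NA, "c", "d"))

prop_miss_case_sketch(all_missing)   # 1: the only row has a missing value
prop_miss_case_sketch(none_missing)  # 0: no row has a missing value
prop_miss_case_sketch(some_missing)  # 0.5: rows 1 and 2 of 4 have missings
```

Because the result is a plain mean of a logical vector, the all-missing and none-missing cases need no special branches, which is what lets the patch delete the two early returns.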
20 changes: 20 additions & 0 deletions R/utils.R
@@ -217,3 +217,23 @@ quo_to_shade <- function(...){
class_glue <- function(x){
class(x) %>% glue::glue_collapse(sep = ", ", last = ", or ")
}

# Generate simple column names ("x1", "x2", ...) for each column of x
simple_names <- function(x){
  paste0("x", seq_len(ncol(x)))
}

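# Build a square (by default) tibble with NA on the diagonal, so that
# every row contains a missing value. `ncol` defaults to `nrow` so that
# diag_na(10) has no complete rows.
diag_na <- function(nrow = 5,
                    ncol = nrow){

  dna <- diag(x = NA,
              nrow = nrow,
              ncol = ncol)
  suppressMessages(
    tibble::as_tibble(dna,
                      .name_repair = "unique")) %>%
    set_names(paste0("x", seq_len(ncol(.))))
}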
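
To see why a diagonal-NA table makes a useful test fixture, the following base R sketch builds a comparable object and counts the missings per row. `diag_na_sketch` is a hypothetical analogue written for this note; it returns a plain `data.frame` rather than a tibble and assumes a square matrix by default.

```r
# Hypothetical base R analogue of diag_na(); returns a data.frame, not a tibble.
diag_na_sketch <- function(nrow = 5, ncol = nrow) {
  # A matrix with NA on the diagonal and non-missing values elsewhere.
  m <- diag(x = NA, nrow = nrow, ncol = ncol)
  df <- as.data.frame(m)
  names(df) <- paste0("x", seq_len(ncol(df)))
  df
}

d <- diag_na_sketch(4)

# Each row has exactly one missing value, so no row is complete.
n_miss_in_rows <- rowSums(is.na(d))
n_miss_in_rows
mean(n_miss_in_rows > 0)
```

Note that this only holds while the matrix is square or wider than it is tall: with more rows than columns, the extra rows contain no `NA` and would count as complete cases.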




53 changes: 53 additions & 0 deletions tests/testthat/test-prop-cases-not-zero.R
@@ -0,0 +1,53 @@
mdf <- data.frame(x = NA)

test_that("prop missing / complete are 0 or 1 where there is one variable", {
expect_equal(prop_miss_case(mdf), 1)
expect_equal(n_case_complete(mdf), 0)
expect_equal(prop_complete_case(mdf), 0)
})

df_diag_na <- diag_na(10)

test_that("prop missing / complete are 0 or 1 where no complete cases", {
expect_equal(prop_miss_case(df_diag_na), 1)
expect_equal(n_case_complete(df_diag_na), 0)
expect_equal(prop_complete_case(df_diag_na), 0)
})

# This tests against
bad_air_quality <- tibble::tribble(
~Ozone, ~Solar.R, ~Wind, ~Temp, ~Month, ~Day,
NA, 190, 7.4, 67, 5, 1,
36, NA, 8, 72, 5, 2,
12, 149, NA, 74, 5, 3,
18, 313, 11.5, NA, 5, 4,
NA, NA, 14.3, 56, NA, 5,
28, NA, 14.9, 66, 5, NA,
NA, 190, 7.4, 67, 5, 1,
36, NA, 8, 72, 5, 2,
12, 149, NA, 74, 5, 3,
18, 313, 11.5, NA, 5, 4,
NA, NA, 14.3, 56, NA, 5,
28, NA, 14.9, 66, 5, NA
)

library(dplyr)
library(tibble)

bad_na_df <- bad_air_quality %>%
summarise(n_missing = n_case_miss(.),
n_complete = n_case_complete(.),
prop_missing = prop_miss_case(.),
prop_complete = prop_complete_case(.))

# Columns are listed in the same order as the summarise() call above
expected_bad_na_df <- tibble(
n_missing = 12L,
n_complete = 0L,
prop_missing = 1,
prop_complete = 0
)

test_that("prop_miss_case returns same as mean_",{
expect_equal(bad_na_df, expected_bad_na_df)
})