unite(na.rm = TRUE) only removes character NAs #765

Closed
davidhunterwalsh opened this issue Sep 27, 2019 · 4 comments
Labels
bug strings 🎻

@davidhunterwalsh davidhunterwalsh commented Sep 27, 2019

With na.rm = TRUE, the NA values from all-NA columns still get concatenated as "NA" strings, as if na.rm = FALSE. If this is intended, it isn't documented.

library(tidyverse)

data <- tribble(
  ~Name,    ~Postalcode, ~Parent,  ~Parent2, ~Parent3, 
  "Paul",   "4732",      "Mother", NA,       NA,       
  "Edward", "9045",      NA,       NA,       NA, 
  "Mary",   "3476",      "Mother", NA,       NA,       
  NA,       NA,          NA,       NA,       NA,      
  NA,       "2468",      NA,       NA,       NA
)

# The NAs from both Parent2 and Parent3 are pasted in as strings, while the NAs
# from Parent are properly removed
data %>% unite(Parent_full, Parent:Parent3, sep = "|", na.rm = TRUE)
#> # A tibble: 5 x 3
#>   Name   Postalcode Parent_full 
#>   <chr>  <chr>      <chr>       
#> 1 Paul   4732       Mother|NA|NA
#> 2 Edward 9045       NA|NA       
#> 3 Mary   3476       Mother|NA|NA
#> 4 <NA>   <NA>       NA|NA       
#> 5 <NA>   2468       NA|NA

# Add a value anywhere in Parent3 and all its NAs get removed, but the NAs from
# Parent2 are still pasted into the middle
data[[2, "Parent3"]] <- "Uncle"
data %>% unite(Parent_full, Parent:Parent3, sep = "|", na.rm = TRUE)
#> # A tibble: 5 x 3
#>   Name   Postalcode Parent_full
#>   <chr>  <chr>      <chr>      
#> 1 Paul   4732       Mother|NA  
#> 2 Edward 9045       NA|Uncle   
#> 3 Mary   3476       Mother|NA  
#> 4 <NA>   <NA>       NA         
#> 5 <NA>   2468       NA

# Add a value to Parent2, and now there are no columns with all NAs, so no NAs
# are pasted in (also, concatenating all-missing values results in "" instead
# of an NA)
data[[1, "Parent2"]] <- "Aunt"
data %>% unite(Parent_full, Parent:Parent3, sep = "|", na.rm = TRUE)
#> # A tibble: 5 x 3
#>   Name   Postalcode Parent_full
#>   <chr>  <chr>      <chr>      
#> 1 Paul   4732       Mother|Aunt
#> 2 Edward 9045       Uncle      
#> 3 Mary   3476       Mother     
#> 4 <NA>   <NA>       ""         
#> 5 <NA>   2468       ""

Created on 2019-09-28 by the reprex package (v0.3.0)
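
A possible workaround, consistent with the last example above (where every united column is character and the NAs are dropped), is to coerce the columns to character before calling unite(). This is only a minimal sketch on a trimmed-down version of the data; the names `parent_cols` and `fixed` are illustrative and not part of the original report.

```r
library(tidyr)

data <- tibble(
  Name    = c("Paul", "Edward", NA),
  Parent  = c("Mother", NA, NA),
  Parent2 = c(NA, NA, NA),           # logical column: entirely NA
  Parent3 = c(NA, "Uncle", NA)
)

# Coerce the columns to be united to character, so every NA becomes
# NA_character_ regardless of the column's original type
parent_cols <- c("Parent", "Parent2", "Parent3")
data[parent_cols] <- lapply(data[parent_cols], as.character)

fixed <- data %>% unite(Parent_full, Parent:Parent3, sep = "|", na.rm = TRUE)
fixed$Parent_full
```

Rows where every value is missing still come out as "" rather than NA, as noted in the last example.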

@davidhunterwalsh davidhunterwalsh changed the title unite(na.rm = TRUE) fails to remove NAs if they come from a column that is all NAs unite(na.rm = TRUE) fails to remove NAs if they come from a column that is all NAs Sep 27, 2019
@davidhunterwalsh davidhunterwalsh commented Sep 27, 2019

@hadley hadley commented Nov 24, 2019

Minimal reprex:

library(tidyr)

df <- tibble(
  x = "x",
  lgl = NA,
  dbl = NA_real_,
  chr = NA_character_
)

df %>% unite(out, c("x", "lgl"), na.rm = TRUE) %>% .$out
#> [1] "x_NA"
df %>% unite(out, c("x", "dbl"), na.rm = TRUE) %>% .$out
#> [1] "x_NA"
df %>% unite(out, c("x", "chr"), na.rm = TRUE) %>% .$out
#> [1] "x"

Created on 2019-11-24 by the reprex package (v0.3.0)
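
For comparison, the type-independent behaviour one would expect from na.rm = TRUE can be sketched in base R: coerce every column to character first, then paste only the non-missing values in each row. The helper name `unite_na_rm` is hypothetical, and this is an illustration of the intended semantics rather than tidyr's implementation.

```r
# Hypothetical helper: paste the columns of `df` row-wise, skipping NAs of
# any type. Not tidyr code; a sketch of the expected behaviour only.
unite_na_rm <- function(df, sep = "_") {
  # as.character() maps NA, NA_real_, etc. to NA_character_
  cols <- lapply(df, as.character)
  vapply(seq_len(nrow(df)), function(i) {
    # Collect the i-th value of every column, then drop the missing ones
    vals <- vapply(cols, `[`, character(1), i)
    paste(vals[!is.na(vals)], collapse = sep)
  }, character(1))
}

df <- data.frame(
  x = "x",
  lgl = NA,
  dbl = NA_real_,
  chr = NA_character_,
  stringsAsFactors = FALSE
)

unite_na_rm(df[c("x", "lgl")])
unite_na_rm(df[c("x", "dbl")])
unite_na_rm(df[c("x", "chr")])
```

With this approach all three calls give the same result, because the NA check happens after the coercion to character.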

@hadley hadley changed the title from "unite(na.rm = TRUE) fails to remove NAs if they come from a column that is all NAs" to "unite(na.rm = TRUE) only removes character NAs" Nov 24, 2019
@hadley hadley added bug strings 🎻 labels Nov 24, 2019
@hadley hadley closed this as completed in cc1f210 Nov 24, 2019