Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

are_na() should not work on lists #558

Closed
njtierney opened this issue Jul 8, 2018 · 12 comments
Closed

are_na() should not work on lists #558

njtierney opened this issue Jul 8, 2018 · 12 comments
Labels

Comments

@njtierney
Copy link

@njtierney njtierney commented Jul 8, 2018

Hello!

At the moment in is.nan and is.infinite don't work on dataframes, so I wonder if it would be useful to create is_inf, are_inf, and is_nan and are_nan equivalents?

Happy to submit a PR for this if you think it would be in scope

is.infinite(airquality)
#> Error in is.infinite(airquality): default method not implemented for type 'list'
is.nan(airquality)
#> Error in is.nan(airquality): default method not implemented for type 'list'

purrr::map_df(airquality, is.infinite)
#> # A tibble: 153 x 6
#>    Ozone Solar.R Wind  Temp  Month Day  
#>    <lgl> <lgl>   <lgl> <lgl> <lgl> <lgl>
#>  1 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  2 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  3 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  4 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  5 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  6 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  7 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  8 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  9 FALSE FALSE   FALSE FALSE FALSE FALSE
#> 10 FALSE FALSE   FALSE FALSE FALSE FALSE
#> # ... with 143 more rows
purrr::map_df(airquality, is.nan)
#> # A tibble: 153 x 6
#>    Ozone Solar.R Wind  Temp  Month Day  
#>    <lgl> <lgl>   <lgl> <lgl> <lgl> <lgl>
#>  1 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  2 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  3 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  4 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  5 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  6 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  7 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  8 FALSE FALSE   FALSE FALSE FALSE FALSE
#>  9 FALSE FALSE   FALSE FALSE FALSE FALSE
#> 10 FALSE FALSE   FALSE FALSE FALSE FALSE
#> # ... with 143 more rows

Created on 2018-07-08 by the reprex package (v0.2.0).

@lionel-
Copy link
Member

@lionel- lionel- commented Jul 8, 2018

In general we consider it is not good UI to auto-map atomic vector functions over lists.

@lionel- lionel- closed this Jul 8, 2018
@njtierney
Copy link
Author

@njtierney njtierney commented Jul 8, 2018

Sorry if I wasn't clear, what I was hoping would be possible would be are_inf and is_inf in the same way that rlang currently has is_na and are_na functions - the map part was an illustration of the output they might produce.

@lionel-
Copy link
Member

@lionel- lionel- commented Jul 8, 2018

I am not sure I understand because are_na() works on atomic vectors, not on lists/dfs?

@njtierney
Copy link
Author

@njtierney njtierney commented Jul 8, 2018

But are_na does work on data.frames and vectors - which is great!

> rlang::are_na(head(airquality))
  Ozone Solar.R  Wind  Temp Month   Day
1 FALSE   FALSE FALSE FALSE FALSE FALSE
2 FALSE   FALSE FALSE FALSE FALSE FALSE
3 FALSE   FALSE FALSE FALSE FALSE FALSE
4 FALSE   FALSE FALSE FALSE FALSE FALSE
5  TRUE    TRUE FALSE FALSE FALSE FALSE
6 FALSE    TRUE FALSE FALSE FALSE FALSE
> rlang::are_na(head(airquality$Ozone))
[1] FALSE FALSE FALSE FALSE  TRUE FALSE

There is not consistent behaviour in Base R for is.infinite and is.nan - I would expect this to return the same output for a dataframe as does is.na. It does not:

> is.infinite(airquality)
Error in is.infinite(airquality) : 
  default method not implemented for type 'list'

It only works on a column, or a list

is.infinite(airquality$Ozone)
  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [10] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [19] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [28] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [46] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [55] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [64] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [73] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [82] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [91] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[100] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[109] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[118] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[127] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[136] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[145] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Same as for is.nan

> is.nan(airquality)
Error in is.nan(airquality) : 
  default method not implemented for type 'list'
> is.nan(airquality$Ozone)
  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [10] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [19] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [28] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [46] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [55] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [64] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [73] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [82] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [91] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[100] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[109] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[118] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[127] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[136] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[145] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Do you think rlang would support more consistency across these functions

  • is.infinite and is.nan ?
@lionel-
Copy link
Member

@lionel- lionel- commented Jul 8, 2018

But are_na does work on data.frames and vectors - which is great!

Well I think that's a bug ;)

@lionel- lionel- changed the title is_inf, are_inf, is_nan, and are_nan are_na() should not work on lists Jul 8, 2018
@lionel- lionel- reopened this Jul 8, 2018
@lionel-
Copy link
Member

@lionel- lionel- commented Jul 8, 2018

I think our current viewpoint is that it's sloppy typing for a function to work on both T and Coll<T>. But the connection between vector functions and df functions is something that we plan to explore as part of the vctrs and tidyr package, IIRC.

cc @hadley @jennybc

@hadley
Copy link
Member

@hadley hadley commented Jul 8, 2018

NULL is the equivalent of NA for a list, so I think are_na(list(1, NULL)) would return c(FALSE, TRUE). But we need to think this through. It certainly should always return a logical vector, not sometimes a matrix.

@njtierney
Copy link
Author

@njtierney njtierney commented Aug 13, 2018

I think it would be helpful to have a verb/function that does always return a data.frame / matrix:

  • is_na returns TRUE/FALSE (scalar)
  • are_na returns vector of TRUE/FALSE (vector)
  • are_na_<class> returns class <class>. e.g., are_na_df returns a data.frame of NA?

What do you think?

@hadley
Copy link
Member

@hadley hadley commented Aug 13, 2018

Yes, but I think that would be out of scope for rlang, and sounds more like a vctrs function to me.

@njtierney
Copy link
Author

@njtierney njtierney commented Aug 13, 2018

OK fair enough - should I add this to the existing issue on sum(is.na()) helpers?

@hadley
Copy link
Member

@hadley hadley commented Aug 13, 2018

Yes - I think the principle that we could now follow is that are_na(x) has type logical, but shape that matches x. I think that is a succinct description of the behaviour that you desire.

@hadley
Copy link
Member

@hadley hadley commented Aug 13, 2018

(And actually that's a nice description of a vectorised function - the shape of the output matches the shape of the input(s))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants