Allow map_*() to consult .null for length 0 elements #254

Closed
jennybc opened this Issue Oct 24, 2016 · 14 comments

3 participants
@jennybc
Member

jennybc commented Oct 24, 2016

It's baaack. See #231 and #110. When map_*() is used for indexing, we can now use .null to specify a value if the key is missing or has an explicit value of NULL. But now I am struggling with the case where the element is present and is not NULL, but has length zero.

library(purrr)
list(list("a" = 1L), list("a" = list(), "b" = 2L)) %>% 
  map_int("a", .null = NA_integer_)
#> Error: Result 2 is not a length 1 atomic vector

I have real examples of this phenomenon from the GitHub API.

cc @jeremystan

@jennybc jennybc changed the title from Allow map_*() to treat length 0 elements as NAs to Allow map_*() to consult .null for length 0 elements Oct 25, 2016

@lionel-

Member

lionel- commented Oct 26, 2016

Should this be another argument .empty?

@jennybc

Member

jennybc commented Oct 28, 2016

At least when JSON is the source of the list, it seems morally ok to handle via .null and not introduce another argument. That's what I took away from this conversation re: jsonlite, which is the source of my list() elements (jeroen/jsonlite#154).

@lionel-

Member

lionel- commented Oct 28, 2016

otoh NULL and empty vectors are really different things in R.

It's true that we already conflate void elements with NULL values, but that's mainly because void elements are generally treated as NULL in R due to its lisp roots.

@jennybc

Member

jennybc commented Oct 28, 2016

I see your point. But once you've committed to making an atomic vector of a certain type, I would guess that void, NULL, and empty almost always become NA. If they need to be handled differently, maybe people should do some work with map() first? I suppose that is what I could / should be doing above.
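The "do some work with map() first" route can be sketched in base R as follows. This is a minimal illustration, not purrr's implementation: the `x` list mirrors the example at the top of the issue with one extra element where the key is absent, and `lapply()` / `vapply()` stand in for `map()` / `map_int()`.

```r
x <- list(
  list(a = 1L),              # value present
  list(a = list(), b = 2L),  # present, not NULL, but length 0
  list(b = 3L)               # key absent, so [["a"]] yields NULL
)

# Step 1: extract "a" without committing to an output type.
raw <- lapply(x, function(el) el[["a"]])

# Step 2: decide explicitly what void, NULL, and empty become.
# Here all three have length 0 and are mapped to NA_integer_.
filled <- lapply(raw, function(v) if (length(v) == 0L) NA_integer_ else v)

# Step 3: only now simplify to an integer vector.
result <- vapply(filled, function(v) as.integer(v), integer(1))
result
```

Keeping step 2 separate is what allows the three cases to be treated differently when that matters; collapsing them into a single fallback is the shortcut this issue asks for.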

@hadley

Member

hadley commented Nov 21, 2016

Related to #199

@hadley hadley added the bug label Mar 3, 2017

@hadley hadley closed this in d24d1b5 Mar 4, 2017

@lionel-

Member

lionel- commented Mar 7, 2017

This feels wrong to me, conflating the notions of null and empty. Empty is a property of vectors, while null is a property of any R object (which are all option types thanks to null).

We have the same problem in rlang. is_empty() returns FALSE for NULL, but we don't have a predicate that returns TRUE for both empty vectors and NULL. I think that if the notions had to be conflated (which I don't think is a good idea), it should be that empty vectors can also be NULL rather than NULL objects can also be empty vectors, which this change implies.

It would still be useful to have a term that means "either NULL or empty". Maybe lengthless, since NULL has length 0. Or "blank"?

@lionel-

Member

lionel- commented Mar 7, 2017

To be more precise, I don't think it's a bad idea to treat NULLs and empty the same way, I do it all the time by checking the length, e.g. if (!length(x)) .... Just that we should promote a consistent vocabulary for these fundamental R properties.
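For illustration, the length check mentioned above can be wrapped in a small predicate. The name `is_lengthless()` is hypothetical, borrowed from the vocabulary floated earlier in the thread; it is not an rlang or purrr function.

```r
# Hypothetical predicate: TRUE for NULL and for any zero-length vector.
is_lengthless <- function(x) !length(x)

is_lengthless(NULL)        # NULL has length 0
is_lengthless(integer(0))  # empty atomic vector
is_lengthless(list())      # empty list
is_lengthless(NA)          # length 1, so not lengthless
is_lengthless("")          # an empty string is still a length-1 vector
```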

@hadley

Member

hadley commented Mar 7, 2017

We were partially doing this anyway and it makes life easier; but maybe we should change .null to .empty?

@lionel-

Member

lionel- commented Mar 7, 2017

yes I now see the rlang definition for is_empty() is in fact based on length (length(x) == 0), so it already returns TRUE for NULL. So maybe it wouldn't be too bad to consider that "empty" = empty + null?

@hadley

Member

hadley commented Mar 7, 2017

Yeah. I sometimes think of NULL as a vector with no type but length 0.

@jennybc

Member

jennybc commented Mar 7, 2017

You could attack this from another angle and name the argument .default, instead of .null or .empty. This conveys it is used whenever the input does not provide anything suitable but doesn't make specific assertions about why the input is unsuitable. There is precedent for default, both with and without the dot, in dplyr::recode(), dplyr::nth(), dplyr::lead(), dplyr::lag(), etc.
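A rough sketch of what a single `.default` argument conveys, assuming a hypothetical base R helper `extract_int()` (not a purrr function): the fallback applies whenever the input offers nothing usable, without naming the reason.

```r
# Hypothetical helper: pull `name` out of each element as an integer and
# fall back to `.default` when the value is absent, NULL, or length 0.
extract_int <- function(x, name, .default = NA_integer_) {
  vapply(x, function(el) {
    value <- el[[name]]
    if (length(value) == 0L) .default else as.integer(value)
  }, integer(1))
}

x <- list(
  list(a = 1L),              # usable value
  list(a = list(), b = 2L),  # empty
  list(b = 3L),              # absent
  list(a = NULL)             # explicit NULL
)

with_na   <- extract_int(x, "a")
with_zero <- extract_int(x, "a", .default = 0L)
```

The caller sees one knob and one outcome for all three unsuitable cases, which is the behaviour the name `.default` promises.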

@jennybc

Member

jennybc commented Mar 7, 2017

In fact, dplyr seems to have a function for this: dplyr:::default_missing().

@hadley

Member

hadley commented Mar 7, 2017

Another option would be .absent

@lionel-

Member

lionel- commented Mar 7, 2017

I like .default as well.
