Check recursive indexing and NULLs #173

Closed
hadley opened this issue Feb 3, 2016 · 17 comments

Comments

@hadley
Member

hadley commented Feb 3, 2016

list("a" = list())[[c("a", "b", "c")]]

Maybe this is an R bug? Should at least bring it up on R-devel, and consider a workaround (tryCatch()?) in as_function.character()
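A minimal sketch of what such a tryCatch() workaround might look like. The helper name index_or_null is hypothetical and is not the code used in as_function.character(); it only shows the idea of turning a failed recursive lookup into NULL.

```r
# Hypothetical helper (not part of purrr): recursive `[[` that yields NULL
# instead of signalling an error when some level of `path` does not exist.
index_or_null <- function(x, path) {
  tryCatch(
    x[[path]],
    error = function(e) NULL  # any indexing error is treated as "not found"
  )
}

# A path that runs past an empty list no longer aborts the caller.
index_or_null(list(a = list()), c("a", "b", "c"))

# A path that exists behaves like plain recursive indexing.
index_or_null(list(a = list(b = 1)), c("a", "b"))
```

Note that this swallows every error raised while indexing, so it cannot tell a missing element apart from a genuinely malformed index.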

@hadley
Member Author

hadley commented Apr 13, 2016

I think we need a new recursive indexing operator that behaves consistently with respect to missing values. [[ is not consistent because:

list(a = list())[[c("a", "b", "c")]]
#> Error in list(a = list())[[c("a", "b", "c")]]: no such index at level 2
list(a = list())[["a"]][["b"]][["c"]]
#> NULL

It would also be nice to be able to mix subsetting by name and position, so you could do (e.g.)

x <- list(a = list(list(), list(c = 3)))
x[[list("a", 1, "c")]]

And we also need to be able to specify a default value to use when the indexed element does not exist - this makes it easier to use with map_int() etc. This will make this function slightly dangerous because it will always succeed (i.e. there are no situations in which it will fail), but I think that is necessary for the most common use case.
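To make the proposal concrete, here is a rough pure-R sketch of an accessor that mixes names and positions and falls back to a default. The name index_or_default and its arguments are assumptions made for illustration, not the implementation discussed in this issue; it handles lists and atomic vectors only, and it treats a NULL result as missing, which is one possible design choice.

```r
# Hypothetical accessor: walk `index` (a list of names and/or positions)
# into `x`, returning `default` as soon as a level cannot be found.
index_or_default <- function(x, index, default = NULL) {
  for (i in index) {
    if (is.character(i)) {
      # Exact name match; match() gives NA when names(x) is NULL or lacks `i`.
      pos <- match(i, names(x))
    } else if (is.numeric(i) && !is.na(i) && i >= 1 && i <= length(x)) {
      pos <- as.integer(i)
    } else {
      pos <- NA_integer_
    }
    if (is.na(pos)) {
      return(default)
    }
    x <- x[[pos]]
  }
  # Assumption: a NULL found at the end of the path counts as missing.
  if (is.null(x)) default else x
}

x <- list(a = list(list(), list(c = 3)))
index_or_default(x, list("a", 2, "c"))                      # element exists
index_or_default(x, list("a", 1, "c"), default = NA_real_)  # falls back

# Regularising irregular records into a numeric vector.
records <- list(list(id = 1, meta = list(score = 10)), list(id = 2))
scores <- vapply(records, index_or_default, numeric(1),
                 index = list("meta", "score"), default = NA_real_)
```

Because every failed step returns the default, indexing past the end of an atomic vector (for example positions 1, 2, 3 into list(1:10)) also yields the default rather than an error.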

For efficiency, this will need to be implemented in C

cc @jennybc

@hadley
Member Author

hadley commented Apr 13, 2016

What should happen if you do recursive indexing on an object that is not recursive?

x <- list(1:10)
index(x, 1, 2, 3)

Should that be an error?

@lionel-
Member

lionel- commented Apr 13, 2016

Should that be an error?

That would be consistent with base R

letters[[1]][[2]]
#> Error in letters[[1]][[2]] : subscript out of bounds

@hadley
Member Author

hadley commented Apr 13, 2016

@lionel- I think the point here is to not be consistent with base R - we need a function that always returns a value so you can use it to regularise irregular data structures

@lionel-
Member

lionel- commented Apr 13, 2016

ok. Then IIUC it shouldn't be an error since recursive data structures always end with a non-recursive vector (or a NULL value).

We can return an NA or a NULL, but in either case we'll have the problem of distinguishing between actual NA/NULL values and implicitly filled ones. Maybe better to go with NULL then, since NAs are used for actual data, so we shouldn't have ambiguity for these.

@hadley
Member Author

hadley commented Apr 13, 2016

@lionel- the default value will be user selectable

@smbache
Member

smbache commented Apr 13, 2016

Often wish for something along the lines of Options: https://fsharpforfunandprofit.com/posts/the-option-type/

@lionel-
Member

lionel- commented Apr 13, 2016

R's NULL is pretty close, I guess it's only missing the ability to carry attributes. If it could, we could give a class or some other attribute to the filled-in NULLs to distinguish them from other NULLs. ggplot2 uses a structure(list(NULL), class = "waiver") for that purpose.

But this should probably be tackled in R core and not a package. Not sure it's possible to make NULL a proper SEXP type without breaking compatibility though (it's currently a singleton).
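As an illustration of the classed-sentinel idea, here is a small sketch in the spirit of the waiver object mentioned above. The names missing_marker, is_missing_marker and get_or_marker are hypothetical and chosen only for this example; they are not functions from ggplot2 or purrr.

```r
# Hypothetical sentinel: an empty list with a class attribute, so that
# "element not present" can be told apart from "element present but NULL".
missing_marker <- function() structure(list(), class = "missing_marker")
is_missing_marker <- function(x) inherits(x, "missing_marker")

# Look up `name` in `x`, returning the sentinel only when the name is absent.
get_or_marker <- function(x, name) {
  if (name %in% names(x)) x[[name]] else missing_marker()
}

rec <- list(a = NULL, b = 1)  # `a` is present and holds an actual NULL

is.null(get_or_marker(rec, "a"))            # present, value is NULL
is_missing_marker(get_or_marker(rec, "z"))  # absent, sentinel returned
```

The trade-off is that callers must check for the sentinel explicitly, whereas a plain NULL flows through existing code unchanged.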

@smbache
Member

smbache commented Apr 13, 2016

Yeah ... Similar but still not the same...

@lionel-
Member

lionel- commented Apr 13, 2016

No actually it's the same. I thought Fsharp none could be subtyped (hence my comment above) but I was mistaken. So all R data structures are actually option types.

@smbache
Member

smbache commented Apr 13, 2016

Take a look at the differences mentioned in the article. Some of them would apply still...

@lionel-
Member

lionel- commented Apr 13, 2016

I did read the article. If you'd like to continue this discussion let's do it by mail so we don't ping Hadley all the time ;)

@smbache
Member

smbache commented Apr 13, 2016

Not necessary :)

Point was more that NULL could be a valid value in the list and SOME (of whatever) would be returned (could then be NULL) and NONE would indicate that the element was non-existing. Just a thought..

@lionel-
Member

lionel- commented Apr 13, 2016

And what if none was an actual element of the list? We'd have the same problem. NULL is none, no need to have two of them ;)

@smbache
Member

smbache commented Apr 13, 2016

IYSS :)

@hadley
Member Author

hadley commented Apr 13, 2016

I think I have an implementation - I'm going to work on some tests and then I'll check it in, and @jennybc can play around with it and give me her feedback :) (for now you'll need to use

@hadley hadley closed this as completed Apr 13, 2016
@hadley hadley reopened this Apr 13, 2016
@hadley
Member Author

hadley commented Sep 6, 2016

I'm pretty sure this is solid now
