Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rbind_list/rbind_all should accept lists as inputs #749

Closed
hadley opened this issue Nov 5, 2014 · 7 comments
Closed

rbind_list/rbind_all should accept lists as inputs #749

hadley opened this issue Nov 5, 2014 · 7 comments

Comments

@hadley
Copy link
Member

@hadley hadley commented Nov 5, 2014

Not entirely sure how this should work, but often you get lists that are structure like a data frame (e.g. each element is a vector of the same length), but don't have the write attributes. It would be nice to be able to convert directly to one data frame, rather than having to convert each one first.

x <- list(list(a = 1, b = 2), list(a = 3, b = 4))
rbind_all(x)
@romainfrancois
Copy link
Member

@romainfrancois romainfrancois commented Nov 6, 2014

As a first approximation, we could decide on the R side that the lists are rbindable, make data frames and then just call the regular rbind_all, e.g.

> x %>% lapply( as.data.frame )  %>% rbind_all
Source: local data frame [2 x 2]

  a b
1 1 2
2 3 4

@romainfrancois
Copy link
Member

@romainfrancois romainfrancois commented Nov 6, 2014

The concern I guess is "is the data copied" when passed through as.data.frame. Apparently not:

> x %>% sapply( function(.) sapply(., pryr::address ) )
  [,1]             [,2]
a "0x7f9f1c381c58" "0x7f9f1c381cb8"
b "0x7f9f1c381c88" "0x7f9f1c381ce8"
>
> x %>% lapply(as.data.frame) %>% sapply( function(.) sapply(., pryr::address ) )
  [,1]             [,2]
a "0x7f9f1c381c58" "0x7f9f1c381cb8"
b "0x7f9f1c381c88" "0x7f9f1c381ce8"

A benefit however is that we can leverage as.data.frame recycling:

> x <- list(list(a = 1, b = 2), list(a = 3, b = 4:6))
> x %>% lapply( as.data.frame )  %>% rbind_all
Source: local data frame [4 x 2]

  a b
1 1 2
2 3 4
3 3 5
4 3 6

@romainfrancois
Copy link
Member

@romainfrancois romainfrancois commented Nov 6, 2014

@kevinushey perhaps that means we need as_data_frame which would be to as.data.frame what data_frame is to data.frame

@hadley
Copy link
Member Author

@hadley hadley commented Nov 20, 2014

Good point. I'll write an minimal as_data_frame(). I think an R implementation should be fine

@hadley hadley closed this in 8e65109 Nov 20, 2014
@hadley
Copy link
Member Author

@hadley hadley commented Nov 20, 2014

Speed difference is crazy! (and as.data.frame will only get slower as the number of columns increases)

> l2 <- replicate(26, sample(letters), simplify = FALSE)
> names(l2) <- letters
> microbenchmark::microbenchmark(
+   as_data_frame(l2),
+   as.data.frame(l2)
+ )
Unit: microseconds
              expr      min        lq    median       uq      max neval
 as_data_frame(l2)   48.952   62.2085   71.6335   88.539  170.927   100
 as.data.frame(l2) 3734.746 4062.8385 4378.8375 4979.390 8336.778   100

@lionel-
Copy link
Member

@lionel- lionel- commented Nov 22, 2014

looks awesome!

I have a question regarding coding style. When I started programming I followed Hadley's style but I had a hard time figuring out what to do with class predicates and coercion functions. I first named them is_class() and as_class() but then I checked the code of various rstudio and hadleyverse packages where the convention seems to be is.class() and as.class().

So, wouldn't dplyr's API be more consistent if that function was named as.data_frame()?

@lionel-
Copy link
Member

@lionel- lionel- commented Nov 22, 2014

I just posted an issue about this on r-pkgs

hadley/r-pkgs#150

krlmlr pushed a commit to krlmlr/dplyr that referenced this issue Mar 2, 2016
@lock lock bot locked as resolved and limited conversation to collaborators Jun 10, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants