bind_rows(): accept lists? #1104

jennybc opened this issue Apr 25, 2015 · 6 comments

@jennybc jennybc commented Apr 25, 2015

This tweet reminded me of this wish.

Sometimes I am disappointed that bind_rows() insists that its input already be data.frames. In that sense, it's not a drop-in replacement for data.table::rbindlist(). Any chance bind_rows() might accept a list of lists?

> my_list <- list(list(x = 1, y = 'a'), list(x = 2, y = 'b'))
> data.table::rbindlist(my_list) %>% str
Classes ‘data.table’ and 'data.frame':  2 obs. of  2 variables:
 $ x: num  1 2
 $ y: chr  "a" "b"
 - attr(*, ".internal.selfref")=<externalptr> 

> dplyr::bind_rows(my_list)
Error: object at index 1 not a data.frame
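For readers hitting the same error, one possible workaround is to convert each inner list to a data frame before binding. The sketch below uses base R only so that it runs without any packages; it is an illustration under that assumption, not something proposed in this thread, and the name `res` is just a placeholder.

```r
my_list <- list(list(x = 1, y = 'a'), list(x = 2, y = 'b'))

# Turn each inner list into a one-row data frame, then row-bind them.
# stringsAsFactors = FALSE keeps y as character rather than factor.
dfs <- lapply(my_list, as.data.frame, stringsAsFactors = FALSE)
res <- do.call(rbind, dfs)
str(res)
```

The same list of data frames could presumably be handed to `dplyr::bind_rows()` instead of `do.call(rbind, ...)`, since each element is now a data.frame.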

@romainfrancois romainfrancois commented Apr 25, 2015

I can probably relax that a bit, starting from here:

List rbind_all( StrictListOf<DataFrame, NULL_or_Is<DataFrame> > dots ){
    return rbind__impl(dots) ;
}

Maybe instead of a StrictListOf<DataFrame, NULL_or_Is<DataFrame> > I could use something like a StrictListOf<Bindable> where Bindable is allowed to be NULL, a data.frame or a list with some constraints, e.g. that all the components have equal lengths or something.

@hadley hadley commented Apr 25, 2015

I think it would be reasonable to accept:

  • NULL (ignore)
  • data frame (as currently)
  • a named list where each element is the same length

Maybe call it dataframeable or something like that? dataframeish?

(If we accept lists and null here, we should consider doing it for other verbs too, although maybe it's not so important because bind_rows() is usually done once early in the data import process)
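The three accepted cases listed above could be expressed as a predicate along these lines. This is only a sketch in base R: `is_dataframeish()` is a hypothetical helper named after the suggestion in this comment, not an existing dplyr function, and the exact rules (for example, rejecting empty lists) are assumptions.

```r
# Hypothetical helper, not part of dplyr: decides whether one input
# falls into any of the three accepted cases.
is_dataframeish <- function(x) {
  if (is.null(x)) return(TRUE)             # NULL: accepted, to be ignored
  if (is.data.frame(x)) return(TRUE)       # data frame: accepted as-is
  if (!is.list(x) || length(x) == 0) return(FALSE)
  nms <- names(x)
  has_names <- !is.null(nms) && all(!is.na(nms) & nms != "")
  same_length <- length(unique(lengths(x))) == 1
  has_names && same_length                 # named list, equal-length elements
}

is_dataframeish(NULL)
is_dataframeish(iris)
is_dataframeish(list(x = 1, y = "a"))
is_dataframeish(list(x = 1:2, y = "a"))    # element lengths differ
is_dataframeish(list(1, "a"))              # no names
```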

@lionel- lionel- commented Apr 25, 2015

But then how is bind_rows() going to tell the difference between a list to be taken as a data frame and a list of data frames, in case this list happens to be named? Wouldn't the data frames be taken as components of a list-column inside one dataframeable list, instead of a list of data frames to bind together?

Also linked to #992 if it gets merged.
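The ambiguity described in this comment can be reproduced in base R. The sketch below builds a named list of two data frames and shows that it satisfies both readings at once; the variable names are placeholders chosen for illustration.

```r
# Two ordinary data frames in a named list.
dfs <- list(a = data.frame(x = 1:2), b = data.frame(x = 3:4))

# Reading 1: a list of data frames to bind together.
reading_list_of_dfs <- all(vapply(dfs, is.data.frame, logical(1)))

# Reading 2: a named list whose elements all have the same length.
# length() of a data frame is its number of columns, so both are 1 here.
reading_dataframeish <- !is.null(names(dfs)) &&
  length(unique(lengths(dfs))) == 1

c(reading_list_of_dfs, reading_dataframeish)
```

Both checks pass, so a rule based only on "named list with equal-length elements" cannot separate the two cases without some tie-break, for example checking for data frames first.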

@Mullefa Mullefa commented Apr 30, 2015

+1 for accepting NULLs.

As an example, it would be convenient if both these cases returned the iris data set (similar functionality to rbind() in this respect):

bind_rows(iris, NULL)
bind_rows(list(iris, NULL))
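For comparison, this is the base R `rbind()` behaviour the comment refers to, written as a small check. It only uses base R; `res_direct` and `res_list` are placeholder names.

```r
# rbind() on data frames drops NULL arguments.
res_direct <- rbind(iris, NULL)

# The list form, spliced into the call with do.call().
res_list <- do.call(rbind, list(iris, NULL))

dim(res_direct)
dim(res_list)
```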

@romainfrancois romainfrancois self-assigned this Apr 30, 2015
Member Author

@jennybc jennybc commented Apr 30, 2015

Related wish for the new and improved bind_rows(): an ID variable. If you are row binding a list of data.frames or conformable lists, there's a high chance you want the names from the original list to come in as a variable in the result. This is one of the very best things about, e.g., plyr::ldply(), which I still resort to often in dplyr-ish projects.
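To make the wish concrete, here is a rough base R sketch of row binding with an ID variable. `bind_rows_with_id()` and its `id` argument are hypothetical names for illustration, and the default column name ".id" is an assumption; this is not how dplyr or plyr implement it.

```r
# Hypothetical helper: row-bind a named list of data frames and record
# each element's name in an extra leading column.
bind_rows_with_id <- function(dfs, id = ".id") {
  dfs <- Filter(Negate(is.null), dfs)      # drop NULL elements, keep names
  stopifnot(!is.null(names(dfs)))
  pieces <- Map(function(df, nm) {
    id_col <- data.frame(rep(nm, nrow(df)), stringsAsFactors = FALSE)
    names(id_col) <- id
    cbind(id_col, df)
  }, dfs, names(dfs))
  # unname() so list names are not matched against rbind()'s own arguments.
  out <- do.call(rbind, unname(pieces))
  rownames(out) <- NULL
  out
}

dfs <- list(a = data.frame(x = 1:2), b = NULL, c = data.frame(x = 3L))
res <- bind_rows_with_id(dfs)
res
```

Each row of the result carries the name of the list element it came from, which is the information lost by a plain row bind.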

@lionel- lionel- commented Apr 30, 2015

I wrote a PR for this, #825, which will need to be adapted to what I did in #992 (if it is still relevant).

@hadley hadley added this to the 0.5 milestone May 19, 2015
romainfrancois added a commit that referenced this issue Aug 15, 2015
romainfrancois added a commit that referenced this issue Sep 15, 2015
@lock lock bot locked as resolved and limited conversation to collaborators Jun 9, 2018