Add a cbind method for tbl_df? #85

Closed
gavinsimpson opened this issue May 27, 2016 · 12 comments

@gavinsimpson

gavinsimpson commented May 27, 2016

I recently needed to do the equivalent of cbind(foo = 1:3, bar), where bar was a tbl_df and I wanted foo to end up as the first column/variable in the resulting tbl_df.

The solutions suggested to me all relied on dplyr, either mutate() + select(..., everything()) or bind_cols(), which seems to defeat the purpose of pulling tibbles out of that package into this one.

Having this would be a nice usability boon for the pkg.
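
For reference, the two dplyr workarounds mentioned above might look like the following sketch. The contents of `bar` are made up for illustration; only `foo = 1:3` comes from the request.

```r
library(dplyr)

# Hypothetical stand-in for the tibble `bar` from the request
bar <- tibble::tibble(x = c("a", "b", "c"), y = c(1.5, 2.5, 3.5))

# Workaround 1: add the column, then move it to the front
out_mutate <- bar %>%
  mutate(foo = 1:3) %>%
  select(foo, everything())

# Workaround 2: bind a one-column tibble to the left of `bar`
out_bind <- bind_cols(tibble::tibble(foo = 1:3), bar)

# Both results stay tibbles and have `foo` as the first column
inherits(out_mutate, "tbl_df")
names(out_bind)
```

Both work, but both need dplyr loaded, which is the point of the request.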

@krlmlr krlmlr added the ready label Jun 13, 2016
@krlmlr
Member

krlmlr commented Jun 13, 2016

I've checked methods(class = "data.frame") in a new R session, and it does look like cbind() and rbind() are the only methods we want to have in tibble that are not overridden yet. Have I missed anything?
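
A rough way to reproduce this kind of check is sketched below. It uses `getS3method()` rather than reading the `methods()` listing by eye. The variable names are only illustrative, and the result of the `tbl_df` lookup depends on which packages are loaded in the session.

```r
# Base R provides cbind() and rbind() methods for data frames
has_cbind_df <- !is.null(getS3method("cbind", "data.frame", optional = TRUE))
has_rbind_df <- !is.null(getS3method("rbind", "data.frame", optional = TRUE))

# A tibble inherits from "data.frame", so it falls through to those methods
# unless a more specific method such as cbind.tbl_df has been registered
has_cbind_tbl <- !is.null(getS3method("cbind", "tbl_df", optional = TRUE))

c(cbind_df = has_cbind_df, rbind_df = has_rbind_df, cbind_tbl = has_cbind_tbl)
```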

@krlmlr krlmlr self-assigned this Jun 13, 2016
@krlmlr krlmlr added in progress and removed ready labels Jun 13, 2016
@krlmlr
Member

krlmlr commented Jun 13, 2016

97a0e10 contains a draft, but the cases where a data frame is bound to a non-data frame are especially tricky. There's cbind2(), but it's only triggered for S4 objects, so not currently for tibbles. I'm shelving this for now.

@krlmlr
Member

krlmlr commented Jun 14, 2016

Related: #34 for rbind().

@Deleetdk

Deleetdk commented Dec 2, 2016

I second this issue. My use case is this:

Browse[1]> orpha_ext_id_extractor(x$ExternalReferenceList)
# A tibble: 1 × 2
  ICD_10   OMIM
   <chr>  <chr>
1  Q77.3 607131
Browse[1]> d
# A tibble: 1 × 4
                                           name Orpha_id DisorderFlag name_variants
                                          <chr>    <chr>        <chr>        <list>
1 Multiple epiphyseal dysplasia, Al-Gazali type   166024          476     <chr [1]>
Browse[1]> cbind(d,
        orpha_ext_id_extractor(x$ExternalReferenceList)
  ) %>% str
'data.frame':	1 obs. of  6 variables:
 $ name         : chr "Multiple epiphyseal dysplasia, Al-Gazali type"
 $ Orpha_id     : chr "166024"
 $ DisorderFlag : chr "476"
 $ name_variants:List of 1
  ..$ Synonym: chr "Multiple epiphyseal dysplasia-macrocephaly-distinctive facies syndrome"
 $ ICD_10       : chr "Q77.3"
 $ OMIM         : chr "607131"

I need to work with list columns, so I want to use tibbles to sidestep some problems. It's a bit annoying if they keep getting implicitly converted back to regular data frames.
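
The behaviour can be reproduced on a small made-up example. This is only a sketch: `d` and `ext` are hypothetical stand-ins for the objects in the session above, and `dplyr::bind_cols()` is shown as the alternative that keeps the tibble class.

```r
library(tibble)

# Hypothetical one-row tibble with a list column, plus a second tibble to bind
d <- tibble(name = "Example disorder", name_variants = list(c("syn1", "syn2")))
ext <- tibble(ICD_10 = "Q77.3", OMIM = "607131")

# cbind() dispatches to the data.frame method, so the tibble class is lost
class(cbind(d, ext))

# bind_cols() returns a tibble and leaves the list column intact
kept <- dplyr::bind_cols(d, ext)
class(kept)
is.list(kept$name_variants)
```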

I took a stab at it. This is probably a terrible way to do it, and it may be buggy for other use cases. I left in my mistakes and failed attempts in case they are useful to someone else.

# Requires purrr (map_lgl), magrittr (%>%) and tibble (as_data_frame)
cbind.tbl_df = function(...) {
  # data.table related? (return a data.table if any argument is one)
  if (!identical(class(..1), "data.frame")) {
    for (x in list(...)) {
      if (inherits(x, "data.table"))
        return(data.table::data.table(...))
    }
  }

  # Attempt 1, the obvious one:
  # data_frame(...)
  # Error in lazyeval::lazy_dots(...) : Promise has already been forced
  # (I don't know what this means.)

  # Attempt 2:
  # list(...) %>% set_names("x" + seq_along(.)) %>% as_data_frame
  # Error: Each variable must be a 1d atomic vector or list.

  # Attempt 3, maybe a hack:
  # data.frame(..., check.names = FALSE) %>% as_data_frame
  # This expands the list column, yielding 3 rows instead of 1.

  # Collect all arguments into one long named list, then convert to a tibble
  l = list(...)

  # Record which arguments are data frames: only those get one list level
  # peeled off, otherwise list columns passed directly would be unpacked too
  df_vec = map_lgl(l, inherits, "data.frame")

  # Attempt 4, works but is complicated:
  # # peel one layer off the data frames
  # l2 = do.call(what = c, args = list(l[[which(df_vec)]]))
  #
  # # join with the other arguments without peeling off a layer
  # l3 = c(l[which(!df_vec)], l2)
  #
  # # finally, convert to a tibble
  # l3 %>% as_data_frame

  # Final version: loop over the arguments
  l4 = list()
  for (i in seq_along(l)) {
    if (df_vec[i]) {
      # data frame: peel, appending its columns
      l4 = c(l4, l[[i]])
    } else {
      # anything else: append as a single named element
      l4 = c(l4, l[i])
    }
  }

  l4 %>% as_data_frame
}

Test:

> cbind(v = 0, d = data_frame(a = 1), l = list(1:3))
# A tibble: 1 × 3
      v     a         l
  <dbl> <dbl>    <list>
1     0     1 <int [3]>

So we get the right output.

@krlmlr
Member

krlmlr commented Jan 26, 2017

#34 (comment)

@krlmlr
Member

krlmlr commented Apr 17, 2017

Unfortunately, a cbind method seems to require patching the base R function, like data.table does. Please use dplyr::bind_cols() instead.

#34 (comment)

@krlmlr krlmlr closed this as completed Apr 17, 2017
@gavinsimpson
Author

Can't bind_cols() (and bind_rows(), as in #34) be moved to tibble? It doesn't make much sense to me for functions that manipulate tibbles to reside in dplyr.

Does that need a separate issue (perhaps one that I have overlooked)?

@krlmlr
Member

krlmlr commented Apr 19, 2017

We should do this eventually, but it also seems to require moving considerable parts of the C++ code, especially for bind_rows(). That code seems better suited to tibble anyway.

@jeffreypullin

Would this make more sense now, given that moving tibble/dplyr onto vctrs means bind_rows()/bind_cols() should have relatively concise implementations?

(I ran into this issue today in a small package I'm developing)

Cheers!

@jennybc
Member

jennybc commented Aug 6, 2019

@jeffreypullin If it helps, there is now tibble::add_column(), which addresses the original gap that started this thread. A lot has changed over 3 years.
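
As a minimal sketch of how `add_column()` covers the original request, assuming a made-up `bar` (its columns are hypothetical), `.before = 1` places the new column first:

```r
library(tibble)

# Hypothetical stand-in for the tibble `bar` from the original request
bar <- tibble(x = c("a", "b", "c"), y = c(1.5, 2.5, 3.5))

# Insert `foo` as the first column; the result is still a tibble
out <- add_column(bar, foo = 1:3, .before = 1)

names(out)
inherits(out, "tbl_df")
```

`.before` and `.after` also accept a column name, so `.before = "x"` would be equivalent here.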

@jeffreypullin

Hi Jenny, thanks for that. It looks very useful. Apologies for not noticing it in the docs.

@github-actions
Contributor

github-actions bot commented Dec 8, 2020

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.

@github-actions github-actions bot locked and limited conversation to collaborators Dec 8, 2020