show dimensions of list columns with DT #3671

Closed
randomgambit opened this issue Jun 30, 2019 · 4 comments · Fixed by #4154

Comments

@randomgambit

Hello everyone,

I am all IN for a better integration of DT in the tidyverse ecosystem. DT is really great and I think some features of the tidyverse could be very useful in a DT setting.

In particular, I am a huge fan of purrr and list columns. Consider this.

> tibble(group = c(1,1,1,2,2,2),
+            val = list(list(1,2,3),list(1,2,3),list(1,2,3),
+                       list(1,2,3),list(1,2,3),list(1,2,3))) 
# A tibble: 6 x 2
  group val       
  <dbl> <list>    
1     1 <list [3]>
2     1 <list [3]>
3     1 <list [3]>
4     2 <list [3]>
5     2 <list [3]>
6     2 <list [3]>

As you can see, the printout clearly shows the dimension (or length) of each element of the list-column.
As far as I know, this is not possible with DT:

>  data.table(group = c(1,1,1,2,2,2),
+            val = list(list(1,2,3),list(1,2,3),list(1,2,3),
+                       list(1,2,3),list(1,2,3),list(1,2,3))) 
   group    val
1:     1 <list>
2:     1 <list>
3:     1 <list>
4:     2 <list>
5:     2 <list>
6:     2 <list>

This is a bit unfortunate. The idea here is to use furrr (which builds on purrr) to easily run computations in parallel across list columns. For example, with the data.table above stored as mydf:


> mydf[,  newval := future_map(val, ~length(.x))]
> mydf
   group    val newval
1:     1 <list>      3
2:     1 <list>      3
3:     1 <list>      3
4:     2 <list>      3
5:     2 <list>      3
6:     2 <list>      3

Being able to see the dimensions of the embedded list (or DT) is very important when one creates list columns holding whole data.frames and runs regressions on them. What do you think? Could DT's printing show output similar to the tibble's?
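
Until something like this exists in the print method, the summary can be built by hand. The following is only a sketch in base R: `describe_cell` is a hypothetical helper name, not part of data.table or tibble, and the label format merely imitates the tibble output shown above.

```r
# Hypothetical helper (not a data.table or tibble function):
# builds a short label describing one element of a list column.
describe_cell <- function(x) {
  if (is.data.frame(x)) {
    # data.frames (and data.tables, which inherit from data.frame)
    # are labelled with rows x columns
    sprintf("<%s [%d x %d]>", class(x)[1], nrow(x), ncol(x))
  } else {
    # everything else is labelled with its length
    sprintf("<%s [%d]>", class(x)[1], length(x))
  }
}

# A small list column with mixed contents
val <- list(list(1, 2, 3), data.frame(a = 1:2, b = 3:4), 1:5)

# One label per element, as a character vector
labels <- vapply(val, describe_cell, character(1))
print(labels)
```

The resulting character vector could be stored as an extra column next to `val`, at the cost of keeping it in sync by hand.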

Thanks!

@randomgambit
Author

Similarly, a tibble shows the dimensions of each embedded DT. Why not the other way around as well, with a DT showing the dimensions of what it contains?


seq(1:3) %>% map(., ~ data.table(group = .x,
                                 mycol = seq(1, .x))) %>%
  enframe()
# A tibble: 3 x 2
   name value           
  <int> <list>          
1     1 <df[,2] [1 x 2]>
2     2 <df[,2] [2 x 2]>
3     3 <df[,2] [3 x 2]>

@franknarf1
Contributor

Possibly related, regarding user-chosen printing rules: #3414, #1523 (comment) (and other items in that thread)

Side note: future_map(val, ~length(.x)) looks like a lot of overhead for lengths(val) from base R..?
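
A minimal sketch of that comparison in base R, using a made-up list `val` rather than the table from the issue:

```r
# Example list column contents (illustrative data, not from the issue)
val <- list(list(1, 2, 3), list(1, 2), list())

# lengths() is vectorised: it returns an integer vector with one
# entry per element, with no per-element function call
n_vectorised <- lengths(val)

# Element-wise equivalent, closer in spirit to map(val, ~length(.x))
n_elementwise <- vapply(val, length, integer(1))

print(n_vectorised)
print(identical(n_vectorised, n_elementwise))
```

Inside a data.table the same idea would read `mydf[, newval := lengths(val)]`, which also yields a plain integer column rather than a list column.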

@randomgambit
Author

@franknarf1 yes, that's right. The future_map call was just an example. In real life, I would run much more complex operations in parallel :)

@MichaelChirico
Member

#3414 will indeed be the start to this

@jangorecki changed the title from "feature request: show dimensions of list columns with DT" to "show dimensions of list columns with DT" on Jul 30, 2019
@mattdowle added this to the 1.12.9 milestone on Jan 3, 2020
@jangorecki jangorecki modified the milestones: 1.12.11, 1.12.9 May 26, 2020