map_dfr() column binds vectors #472

Open
garrettgman opened this issue Mar 9, 2018 · 7 comments
Labels
feature map 🗺️

@garrettgman (Member) commented Mar 9, 2018

map_dfr() does the same thing as map_dfc() when given a list of vectors: it column binds the vectors into a data frame.

list_of_vecs <- list(a = c(1, 1, 1, 1), b = c(2, 2, 2, 2), c = c(3, 3, 3, 3))
list_of_vecs %>% map_dfr(~.x)
## # A tibble: 4 x 3
##       a     b     c
##   <dbl> <dbl> <dbl>
## 1     1     2     3
## 2     1     2     3
## 3     1     2     3
## 4     1     2     3

I'd expect map_dfr() to row bind the vectors into a data frame with one long column, or to throw an error.
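
Until the behaviour changes, one possible workaround for the "one long column" result is to wrap each vector in a small tibble before row binding. This is only a sketch: the column names `name` and `value` are illustrative assumptions, not anything purrr prescribes, and it assumes purrr, tibble and dplyr are installed.

```r
library(purrr)
library(tibble)

list_of_vecs <- list(a = c(1, 1, 1, 1), b = c(2, 2, 2, 2), c = c(3, 3, 3, 3))

# Turn each vector into a two-column tibble so that map_dfr()'s row binding
# has real data frames to stack. `.y` is the element name, `.x` the vector.
# The column names `name` and `value` are hypothetical choices.
long <- imap_dfr(list_of_vecs, ~ tibble(name = .y, value = .x))

dim(long)
```

Because every element is already a data frame, the row binding no longer has to guess how to interpret a bare vector.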

@jennybc (Member) commented Mar 9, 2018

Past thread re: difficulty of row binding: #179

@hadley added the feature and map 🗺️ labels on May 5, 2018

@lionel- (Member) commented Nov 29, 2018

> I'd expect map_dfr() to row bind the vectors into a data frame with one long column, or to throw an error.

Perhaps taking each vector as a row would be more natural? They'd need consistent internal names.

Right now they are taken as columns because bind_rows() and bind_cols() have somewhat sloppy semantics and interpret lists as data frames. This will need to wait on the new vctrs-based tools.
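
The "lists are interpreted as data frames" behaviour can be seen without purrr at all. A minimal sketch, assuming a dplyr version in which `bind_rows()` still treats a named list of vectors as if it were a single data frame:

```r
library(dplyr)

vecs <- list(a = c(1, 1), b = c(2, 2))

# A named list of equal-length vectors is read as the columns of one
# data frame, so nothing is actually stacked row-wise here.
out <- bind_rows(vecs)

dim(out)
names(out)
```

The result has one column per list element, which is the same shape `bind_cols()` would give for this input.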

@lionel- added this to the vctrs milestone on Nov 29, 2018

@jennybc (Member) commented Nov 29, 2018

tibble is also in a holding pattern on a related matter. I might add a discussion of this to my existing slot in Monday's group meeting.

@CottonRockwood commented Feb 7, 2019

My expectation would be that map_dfr() would row bind the results if they are vectors, while map_dfc() would combine the vectors as columns.

@DavisVaughan mentioned this issue on Jun 18, 2019 (2 tasks)

@hadley (Member) commented Jan 24, 2020

Should switch to vec_rbind() and vec_cbind() and see what breaks.
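
For reference, a small sketch of how the two vctrs functions treat plain vectors, as I understand the vctrs documentation: `vec_rbind()` takes each vector as one row, and `vec_cbind()` takes each vector as one column. The input values are made up for illustration.

```r
library(vctrs)

# Each vector becomes a single row; consistent inner names let the
# values line up across rows.
rows <- vec_rbind(c(x = 1, y = 2), c(x = 3, y = 4))

# Each vector becomes a single column.
cols <- vec_cbind(a = c(1, 1), b = c(2, 2))

dim(rows)
dim(cols)
```

This matches the "each vector as a row" reading suggested earlier in the thread, rather than the "one long column" reading in the original report.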

@leungi commented Apr 13, 2020

Faced similar issues when trying to get a tidy tibble from a vector of file paths off fs::dir_ls().

Ended up with purrr::map() + tibble::enframe() + tidyr::unnest() combo.

> library(dplyr)
> 
> file_ls <- fs::dir_ls("./raw_data/Guidelines/",
 regexp = "*.md"
 )
> 
> file_ls %>%
 .[1:2] %>%
 purrr::map_dfr(readr::read_file) %>%
 bind_rows(.id = "doc_id")
# A tibble: 1 x 3
  doc_id `./raw_data/Guidelines/ `./raw_data/Guidelines/
  <chr>  <chr>                                   <chr>                                 
  1      **Coiled~                **Operation~
> 
> file_ls %>%
 .[1:2] %>%
 purrr::map(readr::read_file) %>% 
 tibble::enframe() %>%
 tidyr::unnest()
# A tibble: 2 x 2
  name                                       value                                     
  <chr>                                      <chr>                                     
  ./raw_data/Guidelines/                     **Coiled~
  ./raw_data/Guidelines/                     **Operation~

@iago-pssjd commented Apr 24, 2020

I have a problem that I believe is related to this issue. Let me know if it is. I start with:

mtcars %>%
  split(.$cyl) %>%
  map(~ lm(mpg ~ wt, data = .x)) %>%
  map(summary) %>%
  map_dbl("r.squared")

Let me test a variation. I am using the CRAN version of dplyr, so I still use group_nest() instead of nest_by(); later it will probably be easier to do than this:

mtcars %>%
  group_nest(cyl) %>%
  mutate(model = purrr::map(.data$data, ~lm(mpg ~ wt, data = .x)),
         smodl = purrr::map(.data$model, summary),
         ramod = purrr::map_dbl(.data$smodl, "r.squared"),
         aramod = purrr::map_dbl(.data$smodl, "adj.r.squared"))

So my question is: why should one have to call map_dbl() twice inside mutate() when it should be possible to call map_dfc() or map_dfr() once inside bind_cols()?

What follows is my attempt to do that. First, I define 3 functions and rewrite the previous code in terms of them:

rlm.sq <- function(slmod){
  slmod$r.squared
}
arlm.sq <- function(slmod){
  slmod$adj.r.squared
}
frlm.sq <- function(slmod){
  data.frame(r = slmod$r.squared, a =  slmod$adj.r.squared)
}
mtcars %>%
  group_nest(cyl) %>%
  mutate(model = purrr::map(.data$data, ~lm(mpg ~ wt, data = .x)),
         smodl = purrr::map(.data$model, summary),
         ramod = purrr::map_dbl(.data$smodl, ~rlm.sq(.x)),
         aramod = purrr::map_dbl(.data$smodl, ~arlm.sq(.x)))
# next, a test
mtcars %>%
  group_nest(cyl) %>%
  mutate(model = purrr::map(.data$data, ~lm(mpg ~ wt, data = .x)),
         smodl = purrr::map(.data$model, summary)) %>%
  slice(1) %>%
  pull(smodl) %>%
  extract2(1) %>%
  frlm.sq()

It works, but when I try

mtcars %>%
  group_nest(cyl) %>%
  mutate(model = purrr::map(.data$data, ~lm(mpg ~ wt, data = .x)),
         smodl = purrr::map(.data$model, summary)) %>%
  bind_cols(purrr::map_dfr(.data$smodl, ~frlm.sq(.x)))

I do not get anything. Maybe I should use map_dfc() instead of map_dfr()?

mtcars %>%
  group_nest(cyl) %>%
  mutate(model = purrr::map(.data$data, ~lm(mpg ~ wt, data = .x)),
         smodl = purrr::map(.data$model, summary)) %>%
  bind_cols(purrr::map_dfc(.data$smodl, ~frlm.sq(.x)))

Same result.

Actually, in my real code I use neither lm() nor frlm.sq(); instead of .data$smodl I have a data frame, which I summarise to 3 scalar variables in addition to the grouping variables.

Thank you!
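
A likely reason the bind_cols() attempts above return nothing useful is that the `.data` pronoun is only available inside data-masking verbs such as mutate(), and bind_cols() is not one of them. One way to get both statistics from a single helper call is to store the one-row data frames in a list column and then unnest it. This is a sketch under assumptions: the column name `stats` is hypothetical, and tidyr is an extra dependency not used in the comment above.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Helper from the comment above: one-row data frame per model summary.
frlm.sq <- function(slmod) {
  data.frame(r = slmod$r.squared, a = slmod$adj.r.squared)
}

res <- mtcars %>%
  group_nest(cyl) %>%
  mutate(
    model = map(data, ~ lm(mpg ~ wt, data = .x)),
    smodl = map(model, summary),
    # `stats` is a hypothetical name for the list column of data frames
    stats = map(smodl, frlm.sq)
  ) %>%
  # Spread the one-row data frames into ordinary columns `r` and `a`
  unnest(stats)

names(res)
```

Each group keeps its single row, and `r` and `a` sit next to the nested columns without any call to map_dfr() or map_dfc().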

7 participants