Different output when using col_names and col_types w/ ignored col between dev and CRAN versions #1215

Closed
niheaven opened this issue Jun 2, 2021 · 2 comments

Comments

@niheaven

niheaven commented Jun 2, 2021

After updating to the dev version of readr, the behavior changes when I read a file that has no column names and supply names for only some columns while skipping the others.

  • With CRAN 1.4.0
sessioninfo::package_info("readr", dependencies = FALSE)
#>  package * version date       lib source        
#>  readr     1.4.0   2020-10-05 [1] CRAN (R 4.1.0)
#> 
#> [1] E:/Documents/R/win-library/4.1
#> [2] D:/Scoop/apps/r/4.1.0/library
x <- tibble::tribble(~x, ~y, ~z,
                     1, 2, 3,
                     4, 5, 6,
                     7, 8, 9)
readr::write_csv(x, "test.csv", col_names = FALSE)
readr::read_csv("test.csv",
                col_names = c("a", "b"),
                col_types = "-ii")
#> # A tibble: 3 x 2
#>       a     b
#>   <int> <int>
#> 1     2     3
#> 2     5     6
#> 3     8     9
  • With dev
sessioninfo::package_info("readr", dependencies = FALSE)
#>  package * version    date       lib source                          
#>  readr     1.9.9.9000 2021-06-02 [1] Github (tidyverse/readr@df90fb9)
#> 
#> [1] E:/Documents/R/win-library/4.1
#> [2] D:/Scoop/apps/r/4.1.0/library
x <- tibble::tribble(~x, ~y, ~z,
                     1, 2, 3,
                     4, 5, 6,
                     7, 8, 9)
readr::write_csv(x, "test.csv", col_names = FALSE)
readr::read_csv("test.csv",
                col_names = c("a", "b"),
                col_types = "-ii")
#> # A tibble: 3 x 2
#>       b    X3
#>   <int> <int>
#> 1     2     3
#> 2     5     6
#> 3     8     9
readr::read_csv("test.csv",
                col_names = c("unused", "a", "b"),
                col_types = "-ii")
#> # A tibble: 3 x 2
#>       a     b
#>   <int> <int>
#> 1     2     3
#> 2     5     6
#> 3     8     9

As you can see, the dev version reads the correct columns but assigns the wrong names. With only a few columns, naming the skipped ones as well is a tolerable workaround. But when a file has many columns and only a few of them are needed, I want to name just the needed ones, so the 1.4.0 behavior is preferable.

I tried the new col_select parameter, but could not find a way to make it behave like the old one.
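Until the names line up again, one possible stopgap builds on the second dev call above, where naming the skipped column too gave the expected result. The sketch below is a hypothetical helper, not part of readr: `pad_col_names()` and its `prefix` argument are invented for illustration. It assumes col_types is given as a compact string and that "-" and "_" are the only skip markers, and it pads the wanted names with throwaway names for every skipped position.

```r
# Hypothetical helper (not part of readr): expand a short vector of wanted
# names into a full-length col_names vector that matches a compact col_types
# string, giving every skipped column ("-" or "_") a throwaway name.
pad_col_names <- function(col_types, wanted, prefix = "skip_") {
  # One character per column, e.g. "-ii" -> "-", "i", "i"
  types <- strsplit(col_types, "", fixed = TRUE)[[1]]
  skipped <- types %in% c("-", "_")

  if (sum(!skipped) != length(wanted)) {
    stop("`wanted` must have one name per non-skipped column in `col_types`.")
  }

  out <- character(length(types))
  out[skipped] <- paste0(prefix, which(skipped))  # placeholder names
  out[!skipped] <- wanted                         # real names, in order
  out
}

# Usage, assuming test.csv from the reprex above exists:
# readr::read_csv("test.csv",
#                 col_names = pad_col_names("-ii", c("a", "b")),
#                 col_types = "-ii")
```

For "-ii" and c("a", "b") the helper returns c("skip_1", "a", "b"), which has the same shape as the c("unused", "a", "b") call above, so only the needed names have to be written out by hand even when many columns are skipped.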

@jimhester
Collaborator

tidyverse/vroom#293 is tracking this issue; unfortunately, it isn't entirely trivial to fix.

@niheaven
Author

niheaven commented Jun 2, 2021

Thanks, so now I'll mark my code with this notable change 😭
