Add warning about parsing errors (like readr does) #247

Closed
matthiasgomolka opened this issue Jul 3, 2020 · 2 comments
Labels
feature: a feature request or enhancement

Comments

@matthiasgomolka

I noticed that vroom does not emit a warning when parsing according to a column specification fails. This seems very dangerous, since the user would have to check manually whether the parsing went fine. For large data (where I would like to use vroom instead of readr), this is likely to be infeasible. Here is an example:

library(readr)
library(vroom)

# readr + correct specification
(read_csv(
    "A,B\n1,2020-01-01",
    col_types = cols(A = col_integer(), B = col_date())
))
#> # A tibble: 1 x 2
#>       A B         
#>   <int> <date>    
#> 1     1 2020-01-01


# readr + incorrect specification
(read_csv(
    "A,B\n1,2020-01-01",
    col_types = cols(B = col_integer(), A = col_date())
))
#> Warning: 2 parsing failures.
#> row col               expected actual         file
#>   1   A date like              1      literal data
#>   1   B no trailing characters -01-01 literal data
#> # A tibble: 1 x 2
#>   A              B
#>   <date>     <int>
#> 1 NA            NA
# readr warns about parsing errors


# vroom + correct specification
(vroom(
    "A,B\n1,2020-01-01",
    col_types = cols(A = col_integer(), B = col_date())
))
#> # A tibble: 1 x 2
#>       A B         
#>   <int> <date>    
#> 1     1 2020-01-01


# vroom + incorrect specification
(vroom(
    "A,B\n1,2020-01-01",
    col_types = cols(B = col_integer(), A = col_date())
))
#> # A tibble: 1 x 2
#>   A              B
#>   <date>     <int>
#> 1 NA            NA
# vroom does not warn about parsing errors

Created on 2020-07-03 by the reprex package (v0.3.0)

Would it be possible to more or less port the respective code from readr to vroom to get the same safety? Alternatively, vroom could fall back to character, as data.table does when a column can't be parsed as specified.
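
For reference, here is a minimal sketch of what the manual check mentioned above could look like in base R. It assumes the data was first read with every column as character. `count_parse_failures()` is a hypothetical helper, not a vroom or readr function, and base R's `as.Date()` and `as.integer()` only stand in for the real column parsers.

```r
# Hypothetical helper (not part of vroom or readr): count values that were
# present as text but became NA after conversion to the requested type.
count_parse_failures <- function(raw, parsed) {
  sum(!is.na(raw) & is.na(parsed))
}

# Raw text, as it would arrive if every column were read as character.
raw <- data.frame(A = "1", B = "2020-01-01", stringsAsFactors = FALSE)

# Apply the swapped (incorrect) specification from the example above.
parsed_A <- as.Date(raw$A, format = "%Y-%m-%d")   # "1" is not a date
parsed_B <- suppressWarnings(as.integer(raw$B))   # "2020-01-01" is not an integer

failures <- c(
  A = count_parse_failures(raw$A, parsed_A),
  B = count_parse_failures(raw$B, parsed_B)
)
print(failures)
```

A non-zero count flags a column whose specification does not match the data. The check needs the data both as text and as parsed values, which is exactly what makes it impractical for large files.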

@jimhester
Collaborator

Yes, this is a known limitation of the current implementation. It is not trivial to fix, however, because the parsing is done lazily. Adding something like readr's warnings is planned as future work.
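
To illustrate why lazy parsing complicates warnings, here is a toy sketch in plain R. It is only an illustration under simplifying assumptions, not how vroom is implemented (vroom's lazy columns are built on ALTREP in compiled code), and `make_lazy_int()` is a hypothetical name. The point is that no value has been parsed when the read call returns, so any warning can only surface later, when the column is first accessed.

```r
# Hypothetical illustration only; not vroom's actual implementation.
# Returns a function that parses the raw text on first access and caches it.
make_lazy_int <- function(raw) {
  parsed <- NULL
  function() {
    if (is.null(parsed)) {
      parsed <<- suppressWarnings(as.integer(raw))
      n_bad <- sum(!is.na(raw) & is.na(parsed))
      if (n_bad > 0) {
        warning(n_bad, " parsing failure(s)", call. = FALSE)
      }
    }
    parsed
  }
}

# "Reading" the column: nothing is parsed yet, so there is nothing to warn about.
col_b <- make_lazy_int(c("2020-01-01", "7"))

# The failure only becomes visible here, on first access.
values <- col_b()
```

In this sketch the warning fires far from the call that specified the column types, which is the kind of mismatch a readr-style report has to work around.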

@jimhester added the feature label (a feature request or enhancement) on Aug 3, 2020
@jimhester
Collaborator

Fixed by #284
