Error in vroom_materialize #384

Closed
OfekShilon opened this issue Nov 22, 2021 · 5 comments

@OfekShilon

> csv.url <- "https://www.stats.govt.nz/assets/Uploads/International-trade/International-trade-June-2021-quarter/Download-data/overseas-trade-indexes-June-2021-quarter-provisional-csv.csv"
> read.csv(csv.url)  # succeeds
> data.table::fread(csv.url) # succeeds
> vroom::vroom(csv.url)
...
#Error: (converted from warning) One or more parsing issues, see `problems()` for details                                                          
> vroom::problems()
#Error in vroom_materialize(x, replace = FALSE) : 
#  argument "x" is missing, with no default

This happens with many (all?) of the other files in this sample collection. Adding an explicit delim="," doesn't make a difference.

@jimhester
Collaborator

You need to pass problems() the result object, e.g.

url <- "https://www.stats.govt.nz/assets/Uploads/International-trade/International-trade-June-2021-quarter/Download-data/overseas-trade-indexes-June-2021-quarter-provisional-csv.csv"

library(vroom)
x <- vroom(url)
#> Warning: One or more parsing issues, see `problems()` for details
#> Rows: 96770 Columns: 13
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (9): Series_reference, STATUS, UNITS, Subject, Group, Series_title_1, Se...
#> dbl (3): Period, Data_value, MAGNTUDE
#> lgl (1): Series_title_5
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

problems(x)
#> # A tibble: 250 × 5
#>      row   col expected           actual                                 file 
#>    <int> <int> <chr>              <chr>                                  <chr>
#>  1 35227    13 1/0/T/F/TRUE/FALSE Percentage change from previous period ""   
#>  2 35228    13 1/0/T/F/TRUE/FALSE Percentage change from previous period ""   
#>  3 35229    13 1/0/T/F/TRUE/FALSE Percentage change from previous period ""   
#>  4 35230    13 1/0/T/F/TRUE/FALSE Percentage change from previous period ""   
#>  5 35231    13 1/0/T/F/TRUE/FALSE Percentage change from previous period ""   
#>  6 35232    13 1/0/T/F/TRUE/FALSE Percentage change from previous period ""   
#>  7 35233    13 1/0/T/F/TRUE/FALSE Percentage change from previous period ""   
#>  8 35234    13 1/0/T/F/TRUE/FALSE Percentage change from previous period ""   
#>  9 35235    13 1/0/T/F/TRUE/FALSE Percentage change from previous period ""   
#> 10 35236    13 1/0/T/F/TRUE/FALSE Percentage change from previous period ""   
#> # … with 240 more rows

Created on 2021-11-22 by the reprex package (v2.0.1)

@jimhester
Collaborator

And if you want to fix the column type guessing, try vroom(url, guess_max = 1000), which guesses the column types correctly for this example.

@OfekShilon
Author

OfekShilon commented Nov 22, 2021

Thanks @jimhester! Why does explicit delim="," fail here?

@jimhester
Collaborator

The problem is coming from guessing the column types, not the delimiter guessing.
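
Since the guessed type is what fails, one possible workaround is to declare the type of the affected column instead of relying on the guess. The following is only a minimal sketch under stated assumptions: the temporary file and its three columns are a hypothetical miniature of the real data, not the actual file from the URL, and Series_title_5 is the only column given an explicit type.

```r
library(vroom)

# Hypothetical miniature of the file: Series_title_5 is empty in the early
# rows and only contains text further down.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "Period,Data_value,Series_title_5",
  "2021.03,1.5,",
  "2021.06,2.5,",
  "2021.09,3.5,Percentage change from previous period"
), path)

# Declare the type of the troublesome column; the other columns are still guessed.
x <- vroom(
  path,
  delim = ",",
  col_types = cols(Series_title_5 = col_character()),
  show_col_types = FALSE
)

# Pass the result object to problems(); a tibble with zero rows means
# every value parsed as its declared or guessed type.
problems(x)
```

The same col_types argument should apply when reading from the URL directly; declaring a column as character avoids depending on which rows happen to be sampled for guessing.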

@sbearrows
Contributor

I'm closing this issue, but we agree that the prompt for using problems() could be more straightforward, and there is an open issue in readr to address this:
tidyverse/readr#1322
