"One or more parsing issues" warning but problems() prints nothing #1376

Closed
zackw opened this issue Mar 3, 2022 · 4 comments

Comments

zackw commented Mar 3, 2022

Consider the following CSV file. (This is cut down from an actual grade sheet exported from Canvas, with all the real data replaced by fake values, preserving structure.)

Name,ID,Assignment 1,Assignment 2,Assignment 3
,,Manual Posting,Manual Posting,
Points Possible,,25,25,40
Able,able@example.com,12,24,39
Baker,baker@example.com,25,25,37
Charlie,charlie@example.com,22,10,37

The line with "Manual Posting" in the first two assignment columns is junk -- I have no idea what it means or why it's even there -- but naturally enough it makes read_csv use character vectors for those columns. That's not what I want so I give a column type spec. No problem there.

The bug is that when I give a column type spec, I get a "One or more parsing issues" warning but problems() prints nothing:

> read_csv("grades.csv", col_types="ccddd")
# A tibble: 5 x 5
  Name            ID                `Assignment 1` `Assignment 2` `Assignment 3`
  <chr>           <chr>                      <dbl>          <dbl>          <dbl>
1 NA              NA                            NA             NA             NA
2 Points Possible NA                            25             25             40
3 Able            able@example.com              12             24             39
4 Baker           baker@example.com             25             25             37
5 Charlie         charlie@example.~             22             10             37
Warning message:
One or more parsing issues, see `problems()` for details 
> problems()
>

zackw commented Mar 3, 2022

p.s. I imagine that the complaint that's not getting passed through is something like this:

> parse_double(c("", "Manual Posting", "12", "24"))
Warning: 1 parsing failure.
row col expected         actual
  2  -- a double Manual Posting

[1] NA NA 12 24
attr(,"problems")
# A tibble: 1 x 4
    row   col expected actual        
  <int> <int> <chr>    <chr>         
1     2    NA a double Manual Posting

It would be nice to be able to squelch this specifically. parse_double takes an argument with a list of strings to treat as NA with no complaint, but I don't see any way to pass that down from a cols() specification. I can supply that list directly to read_csv, but then it applies to all columns, which may be too broad. Maybe col_* could pass na=/locale=/trim_ws= arguments down to the corresponding parse_* function?
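
Until something like that exists, one possible workaround is to read the affected columns as character and then call parse_double() on them individually with a column-specific na= list. This is only a sketch, not an official readr recommendation: it assumes readr 2.0 or later (for the I() literal-data input), and the names csv_text, junk and grades are made up for illustration.

```r
library(readr)

# A cut-down copy of the grade sheet, passed as literal data via I().
csv_text <- "Name,ID,Assignment 1,Assignment 2,Assignment 3
,,Manual Posting,Manual Posting,
Points Possible,,25,25,40
Able,able@example.com,12,24,39
"

# Step 1: read every column as character, so no numeric parsing
# (and therefore no parsing warning) happens at this stage.
grades <- read_csv(I(csv_text), col_types = cols(.default = col_character()))

# Step 2: parse only the affected columns, treating the junk marker as NA.
# `junk` is a hypothetical name; the na= argument of parse_double() is real.
junk <- c("", "NA", "Manual Posting")
for (col in c("Assignment 1", "Assignment 2")) {
  grades[[col]] <- parse_double(grades[[col]], na = junk)
}

# Step 3: parse the remaining numeric column with the default NA strings,
# so an unexpected "Manual Posting" there would still be reported.
grades[["Assignment 3"]] <- parse_double(grades[["Assignment 3"]])
```

The trade-off is that the column types are no longer declared in one place, but the extra NA strings apply only to the columns that need them.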

@sbearrows
Contributor

It looks like you are using problems() incorrectly. The documentation for problems says it requires a data frame to check for problems. But we agree that the output from read_csv() is a little confusing so we have an open issue here #1322 to update this warning message.

Column-specific NA options are also something we have considered, but they are not currently at the top of our to-do list. Maybe in the future!

zackw commented Apr 7, 2022

The documentation says that if you don't supply an argument to problems() it defaults to using .Last.value, so the demo I provided in the original bug report should have worked for you. Moreover, if I do supply an argument to problems() that doesn't change anything.

> d <- read_csv("grades.csv", col_types="ccddd")

Warning message:                   
One or more parsing issues, see `problems()` for details 
> problems(d)
>

Please reopen.

Member

jennybc commented Apr 7, 2022

I cannot reproduce this:

> library(readr)
> read_csv("grades.csv", col_types="ccddd")
# A tibble: 5 × 5                                                                                                             
  Name            ID                  `Assignment 1` `Assignment 2` `Assignment 3`
  <chr>           <chr>                        <dbl>          <dbl>          <dbl>
1 NA              NA                              NA             NA             NA
2 Points Possible NA                              25             25             40
3 Able            able@example.com                12             24             39
4 Baker           baker@example.com               25             25             37
5 Charlie         charlie@example.com             22             10             37
Warning message:
One or more parsing issues, see `problems()` for details 
> problems()
# A tibble: 2 × 5
    row   col expected actual         file                             
  <int> <int> <chr>    <chr>          <chr>                            
1     2     3 a double Manual Posting /Users/jenny/rrr/readr/grades.csv
2     2     4 a double Manual Posting /Users/jenny/rrr/readr/grades.csv

> d <- read_csv("grades.csv", col_types="ccddd")
Warning message:                                                                                                              
One or more parsing issues, see `problems()` for details 
> problems(d)
# A tibble: 2 × 5
    row   col expected actual         file                             
  <int> <int> <chr>    <chr>          <chr>                            
1     2     3 a double Manual Posting /Users/jenny/rrr/readr/grades.csv
2     2     4 a double Manual Posting /Users/jenny/rrr/readr/grades.csv
