Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have read_* functions parse non-traditional logicals #818

Closed
StevenMMortimer opened this issue Mar 16, 2018 · 9 comments
Closed

Have read_* functions parse non-traditional logicals #818

StevenMMortimer opened this issue Mar 16, 2018 · 9 comments
Labels
Milestone

Comments

@StevenMMortimer
Copy link

@StevenMMortimer StevenMMortimer commented Mar 16, 2018

Functions like read_delim() and read_csv() do not recognize columns with logical data coded as "0/1" or "true/false", but it does recognize "T/F". This behavior is different than parse_logical() which accepts all of the examples listed above. Is it possible to have these functions parse similar to parse_logical()?

read_delim() Behavior

lines <- "a,b
A,true
B,true
C,false
"
read_delim(lines, ",")
# A tibble: 3 x 2
  a     b    
  <chr> <chr>
1 A     true 
2 B     true 
3 C     false

Desired Behavior

lines <- "a,b
A,true
B,true
C,false
"
read_delim(lines, ",")
# A tibble: 3 x 2
  a     b    
  <chr> <lgl>
1 A     TRUE 
2 B     TRUE 
3 C     FALSE

parse_logical() Behavior

readr::parse_logical(c("T", "F", "true", "false", 0, 1))
[1]  TRUE FALSE  TRUE FALSE FALSE  TRUE
@jimhester
Copy link
Member

@jimhester jimhester commented Mar 16, 2018

You need to specify the columns are logical types when you read the data.

library(readr)
lines <- "a,b
A,true
B,true
C,false
"
read_delim(lines, ",", col_type = "cl")
#> # A tibble: 3 x 2
#>   a     b    
#>   <chr> <lgl>
#> 1 A     TRUE 
#> 2 B     TRUE 
#> 3 C     FALSE

Created on 2018-03-16 by the reprex package (v0.2.0).

@jimhester jimhester closed this Mar 16, 2018
@StevenMMortimer
Copy link
Author

@StevenMMortimer StevenMMortimer commented Mar 16, 2018

Thanks, @jimhester, but I think you mistook my feature request for a question. I know about read_delim(lines, ",", col_type = "cl") (that's how I created the desired behavior block of code). I want to know why functions like read_delim are not smart enough to recognize non-traditional logicals on their own; however parse_logical() can. My feature request/enhancement is to replicate the parse_logical() behavior inside of read_delim() so it is more robust to different types of logicals.

@jimhester
Copy link
Member

@jimhester jimhester commented Mar 16, 2018

The guesser currently does this

return x == "T" || x == "F" || x == "TRUE" || x == "FALSE";

But the parser does an insensitive compare

if (strncasecmp(string.first, "true", 4) == 0) {

So yes we could change the guesser to do the same. If you would like to open a PR to do this it would be great, if not we will get to it eventually, but it may take some time.

@jimhester jimhester reopened this Mar 16, 2018
@StevenMMortimer
Copy link
Author

@StevenMMortimer StevenMMortimer commented Mar 16, 2018

Thanks for taking a second look, @jimhester. No rush on implementing.

@jennybc
Copy link
Member

@jennybc jennybc commented Mar 16, 2018

Here's how this is done in readxl, in case there is interest in doing same. It delegates the comparison back to R:

https://github.com/tidyverse/readxl/blob/ed499fb299934b2c3355e239644d118498edbdab/src/utils.h#L116-L126

@jimhester
Copy link
Member

@jimhester jimhester commented Mar 16, 2018

Thanks @jennybc, using Rf_StringTrue seems like a good idea

@jennybc
Copy link
Member

@jennybc jennybc commented Mar 16, 2018

@jimhester It was probably your idea 😁

@jimhester
Copy link
Member

@jimhester jimhester commented Mar 16, 2018

Oh yeah (tidyverse/readxl#277 (comment)), still thank you for reminding me about it and using it for this!

@jimhester jimhester added this to the 1.2.0 milestone Nov 15, 2018
@jimhester jimhester closed this in 4a0cada Nov 16, 2018
@lock
Copy link

@lock lock bot commented May 15, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators May 15, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants