Tabs are not trimmed when trim_ws = TRUE #767

Closed
mirkhosro opened this issue Dec 20, 2017 · 7 comments
Labels
bug

Comments

@mirkhosro mirkhosro commented Dec 20, 2017

When read_csv reads a CSV file whose fields contain trailing tabs, the trim_ws argument has no effect: the tab characters are not stripped.
Take this CSV file, for example:

X,Y
x1,y1
x2	,y2
x3,y3	
x4,y4

There is a tab after x2 and after y3 (not visible). The code below demonstrates the issue:

> data <- read_csv("test.csv", trim_ws = TRUE, col_types = "cc")
> data[, 1]
# A tibble: 4 x 1
       X
   <chr>
1     x1
2 "x2\t"
3     x3
4     x4
> data[, 2]
# A tibble: 4 x 1
       Y
   <chr>
1     y1
2     y2
3 "y3\t"
4     y4

As you can see, the tab characters are still there and have not been trimmed.
I'm running the code using R 3.3.3 and readr 1.1.1 on macOS High Sierra.
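
Until the parser trims tabs itself, one possible workaround is to strip them after reading. The sketch below is an assumption rather than anything proposed in this thread: it uses base R's trimws(), and the vectors x and y are hypothetical stand-ins for the two columns shown above, so it runs without readr.

```r
# Hypothetical stand-ins for the X and Y columns returned by read_csv() above.
x <- c("x1", "x2\t", "x3", "x4")
y <- c("y1", "y2", "y3\t", "y4")

# trimws() removes leading and trailing spaces, tabs, newlines
# and carriage returns from each element.
x_clean <- trimws(x)
y_clean <- trimws(y)

print(x_clean)
print(y_clean)
```

On the tibble itself, the same idea would presumably look like data$X <- trimws(data$X), applied to each character column.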

@mirkhosro mirkhosro changed the title Tabs are not trimmed when `trim_ws = TRUE` Tabs are not trimmed when trim_ws = TRUE Dec 20, 2017
Member

@batpigandme batpigandme commented Dec 20, 2017

Could you please turn this into a reprex so we can be sure we're troubleshooting the exact same thing? Thanks

Author

@mirkhosro mirkhosro commented Dec 20, 2017

Here it is. I hope I've done it correctly.

library(readr)
csv_file <- "X,Y\nx1\t,y1\nx2,y2\t\nx3 ,y3 \n"
data <- read_csv(csv_file, trim_ws = TRUE, col_types = "cc")
data[, 1]
#> # A tibble: 3 x 1
#>        X
#>    <chr>
#> 1 "x1\t"
#> 2     x2
#> 3     x3
data[, 2]
#> # A tibble: 3 x 1
#>        Y
#>    <chr>
#> 1     y1
#> 2 "y2\t"
#> 3     y3
Author

@mirkhosro mirkhosro commented Feb 17, 2018

(bump) This bug still seems to be present, and the issue is still open. Are you able to reproduce it using the reprex above?

@jimhester jimhester added the bug label May 4, 2018
@jimhester jimhester closed this in 3533e79 Nov 14, 2018
Author

@mirkhosro mirkhosro commented Nov 14, 2018

Thanks for the fix, @jimhester. But it seems to handle only the space and tab characters, not whitespace in general. Wouldn't it be better to capture whitespace using the \s regex?
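
To make the distinction concrete, here is a small base R sketch. It is not readr's implementation; the helper names trim_space_tab and trim_all_space are hypothetical, and the POSIX class [[:space:]] stands in for the \s shorthand.

```r
# Hypothetical helper: trims only spaces and tabs at either end.
trim_space_tab <- function(x) gsub("^[ \t]+|[ \t]+$", "", x)

# Hypothetical helper: trims any whitespace character at either end
# (space, tab, newline, carriage return, form feed, vertical tab).
trim_all_space <- function(x) gsub("^[[:space:]]+|[[:space:]]+$", "", x)

fields <- c("a \t", "b\r", "c\f")

narrow <- trim_space_tab(fields)  # leaves the trailing \r and \f in place
broad  <- trim_all_space(fields)  # removes them as well

print(narrow)
print(broad)
```

The narrow version leaves a trailing carriage return or form feed untouched, which is the gap this comment is asking about.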

Member

@jimhester jimhester commented Nov 14, 2018

No.

Author

@mirkhosro mirkhosro commented Nov 14, 2018

Thanks for your extensive explanation :)

@lock lock bot commented May 13, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators May 13, 2019