delim = "" should generate clear error #557

cboettig · 2016-11-29T06:32:04Z

Consider this minimal example with a classic CO2 dataset:

The base R version works fine:

co2 <- read.delim("ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_mm_mlo.txt",
                  sep = "", comment = "#", 
                  col.names = c("year", "month", "decimal_date", "average", "interpolated", "trend", "days"),
                  na.strings = c("-1", "-99.99"))
co2 %>% head()

The readr equivalent does not:

co2 <- read_delim("ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_mm_mlo.txt", trim_ws = TRUE,
                  delim = "", comment = "#", 
                  col_names = c("year", "month", "decimal_date", "average", "interpolated", "trend", "days"),
                  col_types = c("iiddddi"),
                  na = c("-1", "-99.99"))
co2 %>% head()

The problem seems to be that the file is whitespace-delimited. read.delim interprets sep = "" somewhat surprisingly (but conveniently in this case) as "any number of spaces"; read_delim does not.
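To illustrate the base R behaviour described above, here is a minimal sketch that runs without network access. The inline text and its values are made up for illustration and only mimic the layout of the CO2 file (comment lines, then columns padded with varying numbers of spaces).

```r
# Hypothetical sample: '#' comment lines followed by columns separated by
# runs of spaces. The numbers are placeholders, not real CO2 measurements.
txt <- "# header comment
# another comment
1958   3    1958.2      1.5
1958   4    1958.3      2.5"

# sep = "" makes read.table split on one or more whitespace characters,
# so the uneven padding between columns does not matter.
df <- read.table(text = txt, sep = "", comment.char = "#",
                 col.names = c("year", "month", "decimal_date", "average"))
print(df)
```

With a single-character delimiter such as sep = " ", each space in a run would count as its own separator, which is why the whitespace interpretation of sep = "" is convenient here.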

I haven't figured out a way to parse this file with readr functions, though I could be missing something obvious. An option like delim_whitespace (as in pandas) or, perhaps better, the ability to use regular expressions as delimiters would help.

(A bit unrelated, but it might also be convenient for the comment symbol to permit regex patterns?)


lukas-rokka · 2016-11-30T20:27:22Z

Use `readr::read_table()` for whitespace-separated columns.

cboettig · 2016-11-30T20:34:41Z

@lukas-rokka Thanks, that's great. Unfortunately it looks like read_table lacks an argument for comment symbol though -- is there a good reason for this or could it be added?

(of course one could use skip but having to count comment lines is obviously not ideal).
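One way to avoid counting comment lines by hand is to compute the skip value first. The sketch below assumes that all comment lines sit at the top of the file and start with "#"; the temporary file, its contents, and the column names are hypothetical. The readr call is guarded so the sketch also runs where readr is not installed.

```r
# Hypothetical sample file: leading '#' comment lines, then
# whitespace-separated data (placeholder values).
path <- tempfile(fileext = ".txt")
writeLines(c("# comment one",
             "# comment two",
             "1958   3   1.5",
             "1958   4   2.5"), path)

# Count the leading comment lines instead of hard-coding skip.
lines <- readLines(path)
is_comment <- startsWith(lines, "#")
n_skip <- if (all(is_comment)) length(lines) else which(!is_comment)[1] - 1

# Pass the computed count to read_table via its skip argument.
if (requireNamespace("readr", quietly = TRUE)) {
  co2 <- readr::read_table(path, skip = n_skip,
                           col_names = c("year", "month", "average"))
  print(co2)
}
```

This does not handle comment lines that appear in the middle of the data, so it is a workaround rather than a substitute for a comment argument.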

cboettig · 2016-12-06T23:54:07Z

proposed fix in PR #563

Fixes #557

hadley changed the title ~~unexpected behavior of delim in read_delim~~ delim = "" should generate clear error Dec 22, 2016

hadley added the feature (a feature request or enhancement) and read 📖 labels Dec 22, 2016

jimhester added a commit that referenced this issue Feb 3, 2017

read_delim() now signals an error if given an empty delimiter

579b9c4

Fixes #557

hadley mentioned this issue Feb 8, 2017

read.table compatibility #607

Closed

jimhester mentioned this issue Feb 10, 2017

read_delim() now signals an error if given an empty delimiter #613

Merged

jimhester closed this as completed in #613 Feb 15, 2017

jimhester added a commit that referenced this issue Feb 15, 2017

read_delim() now signals an error if given an empty delimiter (#613)

32eb778

Fixes #557

lock bot locked and limited conversation to collaborators Sep 24, 2018
