Closed
Consider this minimal example with a classic CO2 dataset:
The base R version works fine:
library(magrittr)  # provides %>%
co2 <- read.delim("ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_mm_mlo.txt",
sep = "", comment = "#",
col.names = c("year", "month", "decimal_date", "average", "interpolated", "trend", "days"),
na.strings = c("-1", "-99.99"))
co2 %>% head()
The readr function, not so much:
library(readr)
co2 <- read_delim("ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_mm_mlo.txt", trim_ws = TRUE,
delim = "", comment = "#",
col_names = c("year", "month", "decimal_date", "average", "interpolated", "trend", "days"),
col_types = c("iiddddi"),
na = c("-1", "-99.99"))
co2 %>% head()
The problem seems to be that the file is whitespace-delimited. read.delim interprets sep = "" somewhat surprisingly (but conveniently in this case) as "one or more whitespace characters"; read_delim does not.
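To make the difference concrete, here is a small base R sketch. The rows are made up for illustration and only mimic a whitespace-aligned layout; they are not copied from the NOAA file. It compares sep = "" with a strict single-space separator, which is roughly how a literal one-character delimiter behaves:

```r
# Illustrative rows in a whitespace-aligned layout (made up for this sketch).
txt <- c(
  "# a comment line",
  "1958   3    1958.208      315.71",
  "1958   4    1958.292      317.45"
)

# sep = "": any run of whitespace counts as a single separator.
ws <- read.table(text = txt, sep = "", comment.char = "#")

# sep = " ": every individual space is a separator, so each run of
# spaces produces a series of empty (NA) columns.
strict <- read.table(text = txt, sep = " ", comment.char = "#")

ncol(ws)      # one column per value
ncol(strict)  # many more columns, mostly NA
```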
I haven't figured out a way to parse this file with readr functions, though I could be missing something obvious. It seems like an option along the lines of delim_whitespace (as in pandas) would help, or perhaps better, the ability to use regular expressions as delimiters?
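In the meantime, one possible workaround is to normalize the whitespace before parsing. The sketch below is only an illustration: normalize_ws is a hypothetical helper (not part of readr), the sample rows are made up, and base R's read.table with sep = " " stands in for a parser that only accepts a single-character delimiter.

```r
# Hypothetical helper (not part of readr): collapse whitespace runs so that
# a strict single-character delimiter can be used afterwards.
normalize_ws <- function(lines, comment = "#") {
  lines <- trimws(lines)                       # drop leading/trailing whitespace
  lines <- lines[nzchar(lines)]                # drop blank lines
  lines <- lines[!startsWith(lines, comment)]  # drop comment lines
  gsub("[[:space:]]+", " ", lines)             # collapse each run to one space
}

# Made-up sample rows, including a missing-value code.
raw <- c(
  "# a comment line",
  "  1958   3    1958.208      315.71",
  "  1958   4    1958.292      -99.99"
)
clean <- normalize_ws(raw)

# Write the normalized lines to a temporary file and parse them with a
# strict single-space separator.
tmp <- tempfile(fileext = ".txt")
writeLines(clean, tmp)
parsed <- read.table(tmp, sep = " ", na.strings = c("-1", "-99.99"),
                     col.names = c("year", "month", "decimal_date", "average"))
```

Assuming readr accepts the normalized file the same way, the last step could presumably be swapped for read_delim(tmp, delim = " ", ...) with the col_names, col_types and na arguments from the example above; I have not verified that here.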
(A bit unrelated, but it might also be convenient for the comment symbol to permit regex patterns?)