parsing failure- read_fwf includes a column of NAs when told to ignore the column? #322

Closed
ajdamico opened this Issue Nov 27, 2015 · 3 comments

@ajdamico
Contributor

ajdamico commented Nov 27, 2015

Sorry if I'm misreading something. Thanks!

library(readr)
fwf_sample <- system.file("extdata/fwf-sample.txt", package = "readr")
cat(read_lines(fwf_sample))

# works: all three columns are parsed as doubles
read_fwf(fwf_sample, fwf_widths(c(2, 5, 3)), col_types = "ddd")

# unexpected: the result includes a column of NAs, as if the middle column's
# data were still being read even though col_types says to skip it?
read_fwf(fwf_sample, fwf_widths(c(2, 5, 3)), col_types = "d-d")
@mlaviolet

mlaviolet commented Dec 30, 2015

I ran into the same problem: I got parsing errors while trying to read selected fields from a large BRFSS survey data file, and it took a couple of hours to realize what was happening. The help page for read_fwf now says "The width of the last column will be silently extended to the next line break", so readr reads the entire remainder of the line after what is supposed to be the last field. I don't understand the rationale for this change; either the old behavior needs to be restored or an argument should be added to say "I'm done, go to the next line."
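
To make the quoted help text concrete, here is a minimal base-R sketch of the difference between strict widths and a last field that is extended to the line break. It is only an illustration under stated assumptions: the sample `line` is made up, and the slicing is done with `substring()`, not with readr's actual parser.

```r
# Illustration only: base R, not readr's implementation.
# `line` is a hypothetical record with trailing data after the last declared field.
line <- "12ABCDE345trailing junk"
widths <- c(2, 5, 3)

ends <- cumsum(widths)        # end position of each field
starts <- ends - widths + 1   # start position of each field

# Strict widths: every field stops at its declared end.
strict <- substring(line, starts, ends)

# Last field extended to the end of the line, as the help text describes.
extended_ends <- ends
extended_ends[length(extended_ends)] <- nchar(line)
extended <- substring(line, starts, extended_ends)

print(strict)
print(extended)

# Parsing the extended last field as a number fails and yields NA.
print(suppressWarnings(as.numeric(extended[3])))
```

With strict widths the third field is just its three declared characters; with the extension it swallows the rest of the line, which no longer parses as a number.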

@ghaarsma

Contributor

ghaarsma commented Jan 19, 2016

A temporary workaround is to create the col_positions first and then remove the column names that you don't need.

library(readr)
fwf_sample <- system.file("extdata/fwf-sample.txt", package = "readr")
cat(read_lines(fwf_sample))
col_positions <- fwf_widths(c(2, 5, 3))
col_types <- "d-d"

# keep only the names of columns that are not skipped ("_" or "-") in col_types
skipped <- strsplit(col_types, "")[[1]] %in% c("_", "-")
col_positions$col_names <- col_positions$col_names[!skipped]

read_fwf(fwf_sample, col_positions = col_positions, col_types = col_types)

Note that other read_fwf issues (see #300) may be related.

@hadley

Member

hadley commented Jun 2, 2016

Deleted the irrelevant comments. Possibly related to #371

@hadley hadley modified the milestone: 0.3.0 Jul 13, 2016

@hadley hadley closed this in 44492d1 Jul 13, 2016

@lock lock bot locked and limited conversation to collaborators Sep 25, 2018
