Implement comment skipping for read_fwf() #334

yportier · 2015-12-12T17:02:42Z

I have fixed-width text files with headers whose length (the number of lines to skip) varies greatly from file to file. I could read a file once to work out how many lines to skip and then reload it with the skip value set, but that feels clumsy; it would be nice to do it in one go. Header and comment lines are often identifiable by their first few characters (in my case, each header line starts with the letter H).
It would also be nice to be able to save the skipped lines (the whole header) in a variable or a file at the same time, since one may need to parse them further to collect information.
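
To make the request concrete, here is a rough, language-agnostic sketch in Python of the behaviour being asked for: one pass over the lines that separates the leading header from the data and keeps the header for later parsing. It is not readr code. The helper names `split_header` and `parse_fixed_width`, the sample lines, and the column widths are all made up for illustration.

```python
from itertools import takewhile

def split_header(lines, prefix="H"):
    """Split lines into (header, data).

    The header is the leading run of lines that start with `prefix`;
    everything from the first non-matching line onward is data.
    """
    lines = list(lines)
    header = list(takewhile(lambda line: line.startswith(prefix), lines))
    return header, lines[len(header):]

def parse_fixed_width(line, widths):
    """Cut one line into whitespace-stripped fields of the given widths."""
    fields, start = [], 0
    for width in widths:
        fields.append(line[start:start + width].strip())
        start += width
    return fields

# Hypothetical sample: two header lines, then two data rows (widths 6 and 3).
sample = [
    "H survey: example",
    "H units: metres",
    "John  023",
    "Helen 147",
]

header, data = split_header(sample)
rows = [parse_fixed_width(line, [6, 3]) for line in data]
```

Here `header` holds the two H lines for further parsing, and `rows` holds the parsed data, including the Helen row, which starts with H but comes after the first data row.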

jennybc · 2015-12-12T17:23:47Z

I was about to say that the new-ish argument comment should work for the header skipping (as long as data lines don't start with H). See #68, now closed/fixed. But now I see that comment is not (yet?) available for read_fwf().

yportier · 2015-12-14T09:02:47Z

Something like comment could definitely work, yes.
It may be worth noting, though, that in the case of a header it is not necessary to test every single line: once the first row of data has been encountered, there is no more header.
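
The difference between the two behaviours can be sketched as follows. This is plain Python for illustration only, not how readr implements comment handling; the function names `drop_comment_lines` and `drop_leading_header` and the sample lines are hypothetical.

```python
from itertools import dropwhile

def drop_comment_lines(lines, prefix):
    """Comment-style skipping: test every line, drop any that starts with prefix."""
    return [line for line in lines if not line.startswith(prefix)]

def drop_leading_header(lines, prefix):
    """Header-style skipping: stop testing at the first line that lacks prefix."""
    return list(dropwhile(lambda line: line.startswith(prefix), lines))

lines = ["H created 2015", "H columns: name, id", "Anna  001", "Hugo  002"]

print(drop_comment_lines(lines, "H"))   # ['Anna  001']
print(drop_leading_header(lines, "H"))  # ['Anna  001', 'Hugo  002']
```

The header-style version keeps the Hugo row, which the comment-style version would silently drop because it happens to start with H.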

dholstius · 2016-05-19T14:44:46Z

Two suggestions for interested parties (not just Hadley!)

1. Assign comments to the returned object using 'comment()<-'.
2. Design an API for a more general 2-part approach. Maybe 'meta=list(prefix="#", parser=as.character, simplify=FALSE)' would accomplish the above? Support for emerging CSV metadata conventions could be implemented as extensions.

hadley · 2016-06-02T11:42:26Z

@holstius that's unfortunately rather difficult to do without losing the performance benefits of the way that readr is structured.

hadley changed the title ~~Feature Request: conditional skip~~ Add comment argument to read_fwf() Jun 2, 2016

hadley changed the title ~~Add comment argument to read_fwf()~~ Implement comment skipping for read_fwf() Jun 2, 2016

hadley added the feature and ready labels Jun 2, 2016

hadley modified the milestone: 0.3.0 Jul 13, 2016

jimhester added a commit to jimhester/readr that referenced this issue Jul 14, 2016

Allow skipping lines in fwf based on a comment string

1b11e8c

I also modified the fixed width example file to be a little more substantial than the previous examples. Fixes tidyverse#334

jimhester mentioned this issue Jul 14, 2016

Allow skipping lines in fwf based on a comment string #476

Merged

jimhester self-assigned this Jul 14, 2016

jimhester added in progress and removed ready labels Jul 14, 2016

jimhester closed this as completed in #476 Jul 14, 2016

jimhester removed the in progress label Jul 14, 2016

lock bot locked and limited conversation to collaborators Sep 25, 2018
