Fixed width files #57

Open
aetiologicCanada opened this issue Feb 1, 2019 · 4 comments

Comments

@aetiologicCanada

This is an enhancement request, but I can't see how to designate it as such.

disk.frame looks to be wonderfully valuable. Many thanks in advance.

It would be helpful if the CSV-reading capability could be extended to fixed-width files, as these files (often logs and the like) are typically massive.

readr::read_fwf() is a nice implementation of fixed-width input and might serve as a model for something comparable in this package.

Many thanks

@xiaodaigh
Collaborator

Sounds useful. The problem with all of these is that the functions don't naturally allow for chunk-by-chunk reading. I have made a feature request to the chunked package, which is the only package I know of that does chunk-by-chunk reading.
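
For illustration, here is a minimal sketch of what chunk-by-chunk reading of a fixed-width file could look like in base R. The helper `read_fwf_chunks()` and its arguments are hypothetical (they are not part of disk.frame, readr, or chunked), the column positions are assumed to be known up front, and all columns come back as character, so type conversion is left to the caller.

```r
# Hypothetical helper (not part of any package): read a fixed-width file in
# chunks of `chunk_size` lines through an open connection and hand each
# parsed chunk to `callback`.
read_fwf_chunks <- function(path, start, end, col_names, chunk_size,
                            callback, skip = 0) {
  con <- file(path, open = "r")
  on.exit(close(con))
  if (skip > 0) invisible(readLines(con, n = skip))  # discard header lines
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break  # end of file reached
    # Cut every line at the given character positions, one column at a time
    cols <- Map(function(s, e) trimws(substr(lines, s, e)), start, end)
    names(cols) <- col_names
    callback(as.data.frame(cols, stringsAsFactors = FALSE))
  }
  invisible(NULL)
}

# Usage: write a small fixed-width file (no header) and count rows per chunk
tmp <- tempfile(fileext = ".fwf")
writeLines(sprintf("%-3d%4d", as.integer(cars$speed), as.integer(cars$dist)), tmp)

rows_per_chunk <- integer(0)
read_fwf_chunks(
  tmp,
  start = c(1, 4), end = c(3, 7), col_names = c("speed", "dist"),
  chunk_size = 20,
  callback = function(chunk) {
    rows_per_chunk <<- c(rows_per_chunk, nrow(chunk))
  }
)
```

Because the connection stays open between calls to `readLines()`, each iteration continues where the previous one stopped, so only one chunk is held in memory at a time.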

@xiaodaigh
Collaborator

@aetiologicCanada can you share a self-contained example of a fwf file and how to read it with readr?

I tried

data(cars)
library(gdata)
write.fwf(cars, "test.fwf")
f = file("test.fwf")
readr::read_fwf("test.fwf", n_max=1)

it doesn't seem to work

@aetiologicCanada
Author

aetiologicCanada commented Feb 5, 2019

# Write the built-in cars data to a fixed-width file (header on line 1)
data(cars)
library(gdata)
f = here::here("test.fwf")
gdata::write.fwf(cars, f)

# Skip the header line, then read two columns by character position
junk <- readr::read_fwf(f, skip = 1, readr::fwf_positions(
  start = c(1,4),
  end   = c(2,6),
  col_names = c("A", "B")
))

@xiaodaigh
Collaborator

Maybe log an issue with readr so they can provide a read_fwf_chunked function similar to readr::read_csv_chunked. Once that exists, we can use disk.frame::add_chunk to easily create a disk.frame.
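
As a rough sketch of the second half of that idea: assuming each chunk has already been parsed into a data frame by some chunked fixed-width reader (which does not exist in readr as of this thread), the chunks could be appended with `disk.frame::add_chunk`. Here the chunks are simulated by splitting the built-in `cars` data, and the folder name is illustrative only.

```r
library(disk.frame)

# Stand-in for chunks produced by a chunked fixed-width reader (assumption):
# split the built-in cars data into pieces of at most 20 rows.
chunks <- split(cars, ceiling(seq_len(nrow(cars)) / 20))

# Create an empty disk.frame in a temporary folder and append each chunk
df_path <- file.path(tempdir(), "tmp_fwf_chunks.df")
diskf <- disk.frame(df_path)
for (chunk in chunks) {
  add_chunk(diskf, chunk)
}

nchunks(diskf)  # number of chunk files written to df_path
```

Each call to `add_chunk` writes one chunk file into the disk.frame folder, so the full data set never needs to be in memory at once.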
