Add fwf_cols function #616
This adds a helper function
# 3. Paired vectors of start and end positions
read_fwf(fwf_sample, fwf_positions(c(1, 30), c(10, 42), c("name", "ssn")))
# 4. Named list of start and end positions
read_fwf(fwf_sample, fwf_cols(list(name = c(1, 10), ssn = c(30, 42))))
I was thinking about whether a wrapper around a data frame would be useful and almost included a version of it, but decided against it. My thinking was that there are two main ways you could get the column specifications: (1) the column specifications are in a data frame with two columns (variable, widths) or three columns (variable, start, end), or (2) the user is entering them by hand.
If the column specifications are already in a data frame (I'll call it
fwf_positions(cols$start, cols$end, cols$varname)
To me, that's still pretty clear, and not too much typing.
The second case is entering it by hand (when it's not too many columns). In that case, having the columns as argument names and the widths or (start, end) as values seems most natural.
# with widths
fwf_cols(foo = 1, bar = 5)
# with (start, end) tuples
fwf_cols(foo = c(1, 4), bar = c(5, 10))
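To make the relationship between the two hand-entered forms concrete, here is a minimal base R sketch of how named widths could map to (start, end) positions, assuming adjacent columns with no gaps and 1-based, inclusive positions. `widths_to_positions()` is a hypothetical helper written for illustration; it is not part of readr and may not match how this PR handles widths internally.

```r
# Hypothetical helper (not part of readr): turn named column widths into
# 1-based, inclusive (start, end) positions, assuming columns are adjacent.
widths_to_positions <- function(...) {
  widths <- c(...)              # named numeric vector, e.g. c(foo = 1, bar = 5)
  end <- cumsum(widths)         # last character of each column
  start <- end - widths + 1     # first character of each column
  data.frame(
    col_name = names(widths),
    start = unname(start),
    end = unname(end),
    stringsAsFactors = FALSE
  )
}

pos <- widths_to_positions(foo = 1, bar = 5)
print(pos)
```

Under these assumptions, `foo` covers position 1 only and `bar` covers positions 2 through 6. A column that should be ignored can be given a placeholder name and dropped afterwards.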
This came up when I was helping a student read a fixed-width file. I foolishly didn't RTFM before writing code, and assumed that the format was something like what I just wrote. When we got an error and actually read the documentation, I was too lazy to change the code and used
What if we allowed
Then you'd have:
read_fwf(fwf_sample, tibble(name = c(1, 10), ssn = c(30, 42)))
read_fwf(fwf_sample, tibble(name = 10, skip = 20, ssn = 12))
I don't know. It seems more natural and easy to document that
If a user is able to write the following, it's about as concise as the code above, and I'd say as readable.
read_fwf(fwf_sample, fwf_cols(name = c(1, 10), ssn = c(30, 42)))
read_fwf(fwf_sample, fwf_cols(name = 10, skip = 20, ssn = 12))
And the following would still work:
x <- tribble(
  ~col_name, ~start, ~end,
  "name", 1, 10,
  "ssn", 30, 42
)
read_fwf(fwf_sample, x)
Now I have it so that