read_table corrupts last column names #166
When reading a simple whitespace-separated file with read_table(), the parsed column name for the final column is corrupted:
    > read_table("test_nonmem2.txt")
    Source: local data frame [2 x 3]
      MAID TIME AMT\n 1.000
    1    1    0         2.5
    2    1    0         0.0
(Note: a cut/paste error means the headers in my above example weren't aligned)
The above (with aligned column starts) still causes a column misread:
    > read_table("test.txt", col_names=TRUE)
      MAID TIME AMT\n1.0000
    1    1    0        2.5
    2    1    0        0.0
But if I pad the column names they are read correctly:
    > read_table("test.txt", col_names=TRUE)
      MAIDxxxxxx TIMExxxxxx AMTxxxxxxx
    1          1          0        2.5
    2          1          0        0.0
So the length of each column name must match the data width for read_table()?
This is related to issue #121, because read_table() works about 20x faster than read.table() on NONMEM files, with the only problem being the parsing of the header for column names.
That's what I've done for now, but I would hazard that the probability that the column names are ever the same width as the data in a FWF is very very low, and the times when the current read_table header parsing works as a user expects would be rare.
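For anyone hitting the same problem, here is a minimal sketch of one possible workaround of this kind: read the header line separately, split it on whitespace, and pass the names to read_table() explicitly. It assumes the header is a single line of whitespace-separated names with no spaces inside a name. The file contents below are made up for illustration and are not the actual NONMEM file from this report.

```r
library(readr)

# Hypothetical file modelled on the example in this issue: short header
# names sitting above wider data columns.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "MAID        TIME        AMT",
  "1.0000E+00  0.0000E+00  2.5000E+00",
  "1.0000E+00  0.0000E+00  0.0000E+00"
), path)

# Read only the header line and split it on runs of whitespace.
header <- readLines(path, n = 1)
col_names <- strsplit(trimws(header), "\\s+")[[1]]

# Skip the header line in the file and supply the names ourselves, so the
# header text never takes part in column-boundary detection.
dat <- read_table(path, col_names = col_names, skip = 1)
```

The design choice is simply to keep header parsing and data parsing apart, so a mismatch between name width and data width cannot leak into the column names.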
Would there be any benefit to parsing the header slightly differently by default for fwf? Or as an option e.g.
Or am I an outlier? :)