readxl skip and column designation not working if there are blanks #101
I have an MS Excel sheet containing around 16,000 rows and 16 columns. There is some unneeded text above the column headers, so I need to skip the first three rows when reading the data in (the 4th row contains the column headers, and the 5th row is the start of the data).
However, I noticed that using skip=3 resulted in the column headers not being read in, and everything started on the 5th row.
Also, the 2nd and 3rd columns of my data contain text (names), but in one file the first 1500 cells in these columns are blank because the names are missing. These columns are misidentified as numeric unless I specify col_types. The 4th and subsequent columns contain data for all rows. I tried adding an initial column with the sequence number, but the result was the same.
I think both behaviors are due to the missing data. Would it be possible to correct this by going to the first cell that has data in each column and determining the column type from that cell and the ones below it? (I guess if the data is a mix of numeric and text, the column type would default to text, which would mean evaluating multiple cells.)
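The heuristic proposed above can be sketched roughly as follows. This is a minimal illustration in Python, not readxl's actual implementation: the function name `guess_column_type` and the type labels ("logical", "numeric", "text", "blank") are assumptions made for the example. Blank cells are skipped, and a column with mixed types falls back to text.

```python
from typing import Any, Sequence


def guess_column_type(cells: Sequence[Any]) -> str:
    """Guess a column type from its non-blank cells only (hypothetical helper)."""
    seen = set()
    for value in cells:
        # A blank cell carries no type information, so skip it.
        if value is None or (isinstance(value, str) and value.strip() == ""):
            continue
        # bool must be checked before int, because bool is a subclass of int.
        if isinstance(value, bool):
            seen.add("logical")
        elif isinstance(value, (int, float)):
            seen.add("numeric")
        else:
            seen.add("text")
    if not seen:
        return "blank"  # nothing but blanks: no basis for a guess
    if len(seen) == 1:
        return next(iter(seen))
    return "text"  # mixed types default to text


# A names column whose first 1500 cells are blank is still recognised as text.
names = [None] * 1500 + ["Smith", "Jones"]
print(guess_column_type(names))  # text
```

The cost of this approach is that every cell in the column may need to be examined, which is the trade-off the parenthetical above hints at.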
It would be helpful if the automated column type assignment worked even when some data is missing. Manually specifying the types for 16 columns is not much of a problem, but I often have many more columns than this.
It would be great if there were a fix for this. Otherwise the package works great, and I found it much faster than the Java-based ones.
I have experienced a similar glitch.
Concrete examples of the above would be helpful to work against.
How to provide a readxl reprex
We're in a much better position to address your issue if you can provide a reprex (reproducible example). Provide as much of this as you can:
How to provide the xls/xlsx file? In order of preference: