Improve Reading and Writing of Multi-Index Columns #3571
Comments
@pblelloch I linked the issues that are open about this. It's really a matter of someone just having time to do it.
possibly related #3323.
One issue is that it's not clear, when you're reading a CSV file, whether you just want tuples or you want a MultiIndex (which is why I linked to #3323). (This is ignoring passing additional parameters like
pblelloch commented May 11, 2013
I would like to be able to round trip, so that if I write a CSV file with multi-indexed columns and read that back in, I get the same multi-indexed columns. In addition, it would be nice if the indices (if that's the correct word) were written to different rows, so that when I read this into something like Excel it looks good. What's not clear to me is where to write the names of the index in the case where both your row and column indices have names. Currently the row index names are written to the first row, but that doesn't leave space to write the last column index name. I'm not sure what the answer is to that :(
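For reference, here is a minimal sketch of how the naming question plays out in recent pandas releases. This is an assumption about current behaviour, not necessarily what was settled in this thread. As far as I know, `to_csv` puts each column level name in the first cell of its header row and writes the row index name on a row of its own below the header rows. The level names `first` and `second` and the index name `idx` are made up for the example.

```python
import io

import pandas as pd

# Hypothetical frame: two column levels and a named row index.
columns = pd.MultiIndex.from_tuples(
    [("a", "x"), ("a", "y"), ("b", "x")], names=["first", "second"]
)
index = pd.Index([0, 1], name="idx")
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=index, columns=columns)

# Write to a string so the layout can be inspected without touching disk.
csv_text = df.to_csv()
print(csv_text)

# Read it back: two header rows for the columns, first column as the row index.
result = pd.read_csv(io.StringIO(csv_text), header=[0, 1], index_col=0)
print(result.columns.names, result.index.name)
```

Printing `csv_text` shows where the names ended up, which is also roughly how the file would look when opened in a spreadsheet.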
Yes, this would be a problem for back compat. header=[0,1] is very clear; a single row of tuples is not, but I think it should automatically make a MultiIndex (or maybe there should be an option for that).
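To make the ambiguity concrete, here is a rough sketch of what a reader has to do by hand when the header is a single row of stringified tuples, which is how the column labels are written according to the original report. The CSV text is made up for illustration, and using `ast.literal_eval` to parse the labels is my own workaround, not something pandas does for you.

```python
import ast
import io

import pandas as pd

# Hypothetical file: one header row whose cells are tuple reprs.
csv_text = '''"('a', 'x')","('a', 'y')","('b', 'x')"
1,2,3
4,5,6
'''

flat = pd.read_csv(io.StringIO(csv_text))
# At this point the column labels are plain strings such as "('a', 'x')".

# Parse each label back into a tuple and build the MultiIndex explicitly.
tuples = [ast.literal_eval(label) for label in flat.columns]
multi = flat.copy()
multi.columns = pd.MultiIndex.from_tuples(tuples)
```

Nothing in the file itself says whether the reader wants `flat` or `multi`, which is why an explicit argument such as `header=[0,1]` is easier to reason about.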
I would prefer to clobber the tuple as not-a-MultiIndex and just make it one whenever there are tuples (across the board), but it's a back-compat killer like you said, and I have no feeling for how common it is to use tuples without using MultiIndexes. something like
jreback referenced this issue May 11, 2013
Merged: ENH: allow to_csv to write multi-index columns, read_csv to read with header=list arg #3575
closed via #3575
pblelloch commented May 10, 2013
link to #1651 (to_csv) and #3141 (read_csv)
Currently (0.11), the read_csv and to_csv methods handle multi-index row labels fine, but don't do as well with multi-index column labels. For the column index, the to_csv method writes the labels out as tuples into the first row of the CSV file. These read back in as tuples. It would be better if to_csv wrote each level of the multi-index as its own row of the CSV file, so that you could then specify a list of rows for the header in read_csv to reconstruct the multi-index column header. I'm thinking of something like "header=[0,1]" to read the first two rows of the CSV file as a two-level multi-index column header. What's not clear to me is where you read/write the names of the indices.
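A minimal sketch of the round trip being asked for, written against recent pandas releases, where `read_csv` accepts a list for `header`. The frame contents are made up, and `index_col=0` is my assumption about how the default row index should be read back.

```python
import io

import pandas as pd

# Hypothetical frame with a two-level column index and the default row index.
columns = pd.MultiIndex.from_tuples([("a", "x"), ("a", "y"), ("b", "x")])
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=columns)

# Write to an in-memory buffer; each column level goes to its own header row.
buf = io.StringIO()
df.to_csv(buf)
buf.seek(0)

# header=[0, 1] tells the reader that rows 0 and 1 together form the column header.
result = pd.read_csv(buf, header=[0, 1], index_col=0)
```

With the header rows listed explicitly, `result.columns` comes back as a two-level MultiIndex rather than a flat index of tuples.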