Problem with xlsx with merged cells and multiline header #320
Comments
Hi @mcarans, There is some magic behind this issue because Openpyxl 3.0.3 gives this list of merged ranges:
while "Current" and "Second projection" are definitely merged, they are definitely not on this list, although "First projection" is there as "W10:AL10". I'm closing it for now; could you please:
? |
@roll I got a different result from openpyxl. I saved the link I gave in the OP as test.xlsx then:
|
Is it openpyxl 3.0.3?
Tue, 19 May 2020, 10:59 Mike <notifications@github.com>:
… @roll <https://github.com/roll> I got a different result from openpyxl. I
saved the link I gave in the OP as test.xlsx then:
```
from openpyxl import load_workbook

wb = load_workbook('/home/mcarans/Downloads/test.xlsx')
sheet_ranges = wb['IPC']
print(sheet_ranges.merged_cells.ranges)
```
[<CellRange A10:E11>, <CellRange G10:V10>, <CellRange W10:AL10>, <CellRange AM10:BB10>, <CellRange G11:H11>, <CellRange K11:L11>, <CellRange M11:N11>, <CellRange O11:P11>, <CellRange Q11:R11>, <CellRange S11:T11>, <CellRange U11:V11>, <CellRange W11:X11>, <CellRange AA11:AB11>, <CellRange AC11:AD11>, <CellRange AE11:AF11>, <CellRange AG11:AH11>, <CellRange AI11:AJ11>, <CellRange AK11:AL11>, <CellRange AM11:AN11>, <CellRange AQ11:AR11>, <CellRange AS11:AT11>, <CellRange AU11:AV11>, <CellRange AW11:AX11>, <CellRange AY11:AZ11>, <CellRange BA11:BB11>, <CellRange A13:D13>]
Please can you check.
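One way to cross-check a list like the one above is to ask which merged range, if any, covers a given header cell. The following is only a rough sketch in plain Python; it does not use openpyxl, the helper names (`column_index`, `parse_range`, `covering_range`) are made up for illustration, and the ranges are a small subset copied from the output above.

```python
import re


def column_index(letters):
    """Convert Excel column letters to a 1-based index (A -> 1, AA -> 27)."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def parse_range(ref):
    """Split 'W10:AL10' into (first_col, first_row, last_col, last_row), all 1-based."""
    match = re.fullmatch(r"([A-Z]+)(\d+):([A-Z]+)(\d+)", ref)
    if match is None:
        raise ValueError(f"not a cell range: {ref!r}")
    c1, r1, c2, r2 = match.groups()
    return column_index(c1), int(r1), column_index(c2), int(r2)


def covering_range(cell, ranges):
    """Return the first range that contains `cell` (e.g. 'K10'), or None."""
    match = re.fullmatch(r"([A-Z]+)(\d+)", cell)
    if match is None:
        raise ValueError(f"not a cell reference: {cell!r}")
    col, row = column_index(match.group(1)), int(match.group(2))
    for ref in ranges:
        c1, r1, c2, r2 = parse_range(ref)
        if c1 <= col <= c2 and r1 <= row <= r2:
            return ref
    return None


# Subset of the merged ranges printed above (assumed to be representative).
ranges = ["A10:E11", "G10:V10", "W10:AL10", "AM10:BB10", "K11:L11"]

for cell in ("K10", "K11", "F10", "AN10"):
    print(cell, "->", covering_range(cell, ranges))
```

With this list, K10 falls inside G10:V10 and K11 inside K11:L11, which is consistent with column K sitting under a merged top-level header.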
|
@roll Yes I can confirm it is 3.0.3. |
@mcarans |
@roll |
@mcarans Could you please try |
Super, that looks good, thx @roll ! |
Overview
When trying to read the file below, tabulator reads the headers on rows 10-12 incorrectly:
http://mapipcissprd.us-east-1.elasticbeanstalk.com/api/public/population-tracking-tool/data/2017,2020/?page=1&limit=1&condition=A&export=true&country=AF
I used options headers=[10,12], fill_merged_cells=True:
stream = Stream('http://mapipcissprd.us-east-1.elasticbeanstalk.com/api/public/population-tracking-tool/data/2017,2020/?page=1&limit=1&condition=A&export=true&country=AF', headers=[10,12], fill_merged_cells=True, format='xlsx')
I also tried fill_merged_cells=False.
An example of what's missing can be seen in column K of the spreadsheet: the header for that column should be "Current Phase 1 #" but instead is just "Phase 1 #".
Pandas is able to read it using:
df = pandas.read_excel(url, header=[9, 10, 11])
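To make the expected result concrete, here is a minimal sketch of how filling merged cells and then joining the header rows column by column would yield "Current Phase 1 #". It is not tabulator's implementation; the functions `fill_merged` and `join_headers` are hypothetical, and the toy grid is only loosely modelled on rows 10-12 of the spreadsheet.

```python
def fill_merged(rows, merged_ranges):
    """Copy each merged range's top-left value into every cell of the range.

    `merged_ranges` holds zero-based, inclusive
    (first_row, first_col, last_row, last_col) tuples.
    """
    filled = [list(row) for row in rows]
    for first_row, first_col, last_row, last_col in merged_ranges:
        value = filled[first_row][first_col]
        for r in range(first_row, last_row + 1):
            for c in range(first_col, last_col + 1):
                filled[r][c] = value
    return filled


def join_headers(header_rows, joiner=" "):
    """Join the non-empty values of each column, top to bottom."""
    headers = []
    for column in zip(*header_rows):
        parts = []
        for value in column:
            text = "" if value is None else str(value).strip()
            # Skip blanks and immediate repeats (a vertically merged cell,
            # once filled, repeats its value on consecutive rows).
            if text and (not parts or parts[-1] != text):
                parts.append(text)
        headers.append(joiner.join(parts))
    return headers


# Toy three-row header (assumed layout, not the real sheet contents).
rows = [
    ["Country", "Current", None, None, None],
    [None, "Phase 1", None, "Phase 2", None],
    [None, "#", "%", "#", "%"],
]
merged = [
    (0, 0, 1, 0),  # "Country" spans two rows
    (0, 1, 0, 4),  # "Current" spans four columns
    (1, 1, 1, 2),  # "Phase 1" spans two columns
    (1, 3, 1, 4),  # "Phase 2" spans two columns
]

headers = join_headers(fill_merged(rows, merged))
print(headers)
```

If the top-level merged range is not filled before joining, the second column collapses to "Phase 1 #", which matches the symptom described above.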
Please preserve this line to notify @roll (lead of this repository)