tablib no longer able to import Google Docs exported xlsx #465
Comments
Isn't this the same issue as #226? That was supposed to be fixed in tablib 1.1.0.
#226 certainly looks related, but the fix seems to be specifically for csv files, not xlsx, so it doesn't look to be the same. I'm not really familiar with tablib internals though, so perhaps I'm missing something. The test.xlsx is attached to my original post, so you shouldn't need a Google account to download the test case.
Oh right, I'll have a look soon.
That might be a Google Docs peculiarity. If I save the same file with LibreOffice as .xlsx, I'm getting two values (
Yes, I agree it's very likely something changed in Google Docs. I tried commit 71ff737 and it fixes the test case, but I think there is still a problem if the first row has fewer columns than a subsequent row. See the attached test case.
still gives me an InvalidDimensions error.
Sure, but your latest example looks clearly wrong to me. tablib generally expects at least a full row of headers, so I don't think we can support all use cases.
Hmmh, I don't know. I quite often have a fixed format table and then use a cell to the right of a row to make random comments. For example:
It'd be sad if tablib can't load such a file.
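One possible workaround for files like this is to pad every row to the width of the widest row before handing the data to tablib. The following is a minimal sketch under that assumption, not something tablib does itself; `pad_rows` and the sample data are hypothetical names made up for illustration.

```python
def pad_rows(rows, fill=""):
    """Return a copy of rows in which every row has the width of the widest row.

    Hypothetical helper: short rows are extended on the right with `fill`.
    """
    width = max((len(row) for row in rows), default=0)
    return [list(row) + [fill] * (width - len(row)) for row in rows]


# Made-up table: two header cells, plus a free-form comment to the right of one row.
table = [
    ["name", "qty"],
    ["apple", 3, "check price"],
    ["pear", 5],
]

padded = pad_rows(table)
for row in padded:
    print(row)
```

After padding, all rows (including the header row) have three cells, so something like `tablib.Dataset(*padded[1:], headers=padded[0])` should no longer trip over mismatched widths. The trade-off is that the extra column gets an empty header.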
tablib used to be able to read xlsx files exported from Google Docs just fine. Now it fails with InvalidDimensions if not all columns in the spreadsheet are populated with values. For example, download this spreadsheet as xlsx (also attached below) and then try to read it like this:
python3 -c "import tablib; tablib.Databook().load(open('test.xlsx', 'rb').read(), 'xlsx')"
and it'll result in an InvalidDimensions exception. The problem seems to be that the first row has values in 2 columns, but the second row has an empty cell in the second column. I'm guessing something changed in the way Google exports the xlsx files?
test.xlsx
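To make the suspected failure mode concrete, here is a small sketch that flags rows whose width differs from the first row's. It assumes (this is a guess, not confirmed in the thread) that the exported file simply omits trailing empty cells, so the second row arrives with one value instead of two. `ragged_rows` and the sample data are hypothetical and only illustrate the kind of dimension mismatch that InvalidDimensions reports.

```python
def ragged_rows(rows):
    """Return (index, width) for each row whose width differs from the first row's.

    Hypothetical helper for illustration; it does not use tablib.
    """
    if not rows:
        return []
    expected = len(rows[0])
    return [(i, len(row)) for i, row in enumerate(rows) if len(row) != expected]


# Assumed shape of the data as read from the exported sheet:
# row 0 has two values, row 1 is missing its (empty) second cell.
rows = [["a", "b"], ["c"]]

print(ragged_rows(rows))  # [(1, 1)]
```

An empty result would mean every row has the same width as the first one, which is the shape a rectangular dataset needs.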