[bug-57342] Excel compatible Zip64 implementation #154
For more information see https://github.com/rzymek/opczip
Can one of the admins verify this patch?
src/ooxml/java/org/apache/poi/xssf/streaming/OpcOutputStream.java
Thanks - merged with https://svn.apache.org/repos/asf/poi/trunk@1861196
…f Rzymkowski. This closes #154 git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1861196 13f79535-47bb-0310-9956-ffa450edef68
@rzymek Could you give me an example? I am trying to generate a large Excel file with 37000 rows and 2500 columns, and the file is still corrupted using Apache POI 4.1.2.
@rzymek My current code fails: `import java.io.File; import org.apache.commons.compress.archivers.zip.Zip64Mode; public class TestExcel { }`
Zip64Mode.AsNeeded is more correct - Zip64Mode.Always may mean you use ZIP64 mode when you don't need it.
Are you getting a "corrupted file" error from Excel, OpenOffice, or something else? OpenOffice Calc has a limit of 1024 columns (Excel's limit is 16k columns). Other than that, the code looks OK. Zip64Mode needs to be Always in this case to enable ZIP64 handling compatible with Excel.
@rzymek Thanks for clarifying - do you know what effect setting Zip64Mode.Always has if you create a small spreadsheet? Will this file cause problems for Excel?
As far as I checked, Zip64Mode.Always does not cause problems with Excel, even for small files. When it comes to Excel and big files (XML over 4GB), ZIP64 must be declared in the zip entry header, before the actual zip entry contents.
Thanks @rzymek - we might want to make Zip64Mode.Always the default. That needs some experimentation before we'd make the change, though.
Exactly. I think the custom ZIP64 implementation should sit as an option for a few versions (it's only enabled when Zip64Mode is set to Always).
@rzymek I had tested with LibreOffice. Now I have tested with MS Excel and the problem is solved. Is it a limitation of LibreOffice, as you told us? Thanks a lot!
https://bz.apache.org/bugzilla/show_bug.cgi?id=57342
I did an in-depth analysis of this issue. It turns out the problem is not with the OOXML data generated by POI. The problem has to do with the ZIP format, specifically with the ZIP64 extension. That's why everything is OK up until sheet1.xml reaches over 4GB (uncompressed).
I have all the details written up in a blog post: https://rzymek.github.io/post/excel-zip64/
Short story: Excel will want to repair the file if the uncompressed size of a zip entry exceeds 4GB and the entry's Local File Header (LFH) does not specify zip spec version 4.5.
This pull request uses a custom (Excel-compatible) Zip64 implementation when Zip64Mode is set to Always.
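
To make the LFH condition above concrete, here is a minimal sketch that reads the "version needed to extract" field from the first Local File Header of a zip and applies the rule from the short story. It uses only the JDK. `LfhVersionCheck` and its methods are hypothetical names for illustration and are not part of POI or opczip. Treating "exceeds 4GB" as "does not fit in the 32-bit size field" (`0xFFFFFFFF`) is an assumption on my part.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration; not part of POI or opczip.
final class LfhVersionCheck {
    // Local File Header signature: bytes 'P','K',3,4 read as a little-endian int.
    static final int LFH_SIGNATURE = 0x04034b50;
    // "Version needed to extract" value that corresponds to zip spec 4.5 (ZIP64).
    static final int ZIP64_VERSION = 45;
    // Assumption: sizes at or above this value do not fit the 32-bit LFH size fields.
    static final long ZIP64_THRESHOLD = 0xFFFFFFFFL;

    // Returns the "version needed to extract" field of the first LFH in the archive.
    static int versionNeeded(byte[] zip) {
        if (zip.length < 6) {
            throw new IllegalArgumentException("too short for a Local File Header");
        }
        ByteBuffer buf = ByteBuffer.wrap(zip).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt(0) != LFH_SIGNATURE) {
            throw new IllegalArgumentException("no Local File Header at offset 0");
        }
        // The 2-byte version field follows the 4-byte signature.
        return buf.getShort(4) & 0xFFFF;
    }

    // True when the entry is large enough to need ZIP64 but its LFH declares a version below 4.5.
    static boolean likelyToTriggerRepair(byte[] zip, long uncompressedSize) {
        return uncompressedSize >= ZIP64_THRESHOLD && versionNeeded(zip) < ZIP64_VERSION;
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny archive in memory with the JDK's own zip writer and inspect its first LFH.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            zos.putNextEntry(new ZipEntry("xl/worksheets/sheet1.xml"));
            zos.write("<worksheet/>".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        System.out.println("version needed to extract: " + versionNeeded(out.toByteArray()));
    }
}
```

The check only looks at the first entry's header, which is enough to show the idea. A real diagnostic would walk every entry in the archive.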