Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Turn on ZIP64 support #228

Closed
evanmiller opened this issue May 15, 2019 · 10 comments
Closed

Turn on ZIP64 support #228

evanmiller opened this issue May 15, 2019 · 10 comments

Comments

@evanmiller
Copy link
Contributor

I get an error when creating an especially large XLSX file (see #227). The underlying error is ZIP_BADZIPFILE, triggered at

https://github.com/jmcnamara/libxlsxwriter/blob/master/third_party/minizip/zip.c#L1731

I believe this can be fixed by changing the last argument to zipOpenNewFileInZip4_64 be 1 instead of 0 here:

https://github.com/jmcnamara/libxlsxwriter/blob/master/src/packager.c#L1063-L1069

I.e. that turns on Zip64 support which permits 4GB+ uncompressed file sizes to be written.

@jmcnamara
Copy link
Owner

Agreed. That could/should be added. It is supported in the Python version of the library: https://xlsxwriter.readthedocs.io/workbook.html#workbook-use-zip64

@jmcnamara
Copy link
Owner

I've added this to master.

From the docs:

/**
 * @brief Allow ZIP64 extensions when creating the xlsx file zip container.
 *
 * @param workbook Pointer to a lxw_workbook instance.
 *
 * Use ZIP64 extensions when writing the xlsx file zip container to allow
 * files greater than 4 GB.
 *
 * @code
 *     workbook_use_zip64(workbook);
 * @endcode
 */

And from a sample file generated with the ZIP64 extensions:

$ zipdetails zip64.xlsx

0000 LOCAL HEADER #1       04034B50
0004 Extract Zip Spec      2D '4.5'
0005 Extract OS            00 'MS-DOS'
0006 General Purpose Flag  0000
     [Bits 1-2]            0 'Normal Compression'
0008 Compression Method    0008 'Deflated'
000A Last Mod Time         00210000 'Tue Jan  1 00:00:00 1980'
000E CRC                   3A495D61
0012 Compressed Length     0000014F
0016 Uncompressed Length   0000048F
001A Filename Length       0013
001C Extra Length          0014
001E Filename              '[Content_Types].xml'
0031 Extra ID #0001        0001 'ZIP64'
0033   Length              0010
0035 PAYLOAD

@evanmiller
Copy link
Contributor Author

Would use_zip64 make more sense as a field in lxw_workbook_options? That would seem to provide a more consistent API than a one-off function like this.

@jmcnamara
Copy link
Owner

Would use_zip64 make more sense as a field in lxw_workbook_options

That was my first thought but the Python version uses a function/method (use_zip64()) so I went with that.

The lxw_workbook_options are mainly for options that need to be defined at construction time or are very common. The zip64 option isn't in either of those categories and doesn't seem to be very commonly required. You are only the second person to request or mention it across the Perl/Python/C versions. So I'll leave it as a function for now, but I'll think about it up until the release.

@evanmiller
Copy link
Contributor Author

Well, I have been using libxlsxwriter in shipping code for a number of years, and Zip64 wasn't something I knew I needed until recently. It took some fairly deep debugging to figure out that its absence was triggering write failures. If it had been an advertised option sooner, I certainly would have enabled it. So I suspect that many clients of libxlsxwriter will choose to enable Zip64 once they are made aware of it.

One advantage of using an option instead of a function is that in the future, you will be able to make Zip64 the default (opt-out) without changing the API. The new API is opt-in only.

@jmcnamara
Copy link
Owner

Okay. I'm persuaded. I'll enable it in the lxw_workbook_options.

@jmcnamara jmcnamara reopened this Jun 8, 2019
jmcnamara added a commit that referenced this issue Jun 8, 2019
@jmcnamara
Copy link
Owner

I've refactored the use_zip64 option into the constructor. It can now be called as follows:

int main() {

    lxw_workbook_options options = {.constant_memory = LXW_FALSE,
                                    .tmpdir = NULL,
                                    .use_zip64 = LXW_TRUE};

    lxw_workbook  *workbook  = workbook_new_opt("zip64.xlsx", &options);
    lxw_worksheet *worksheet = workbook_add_worksheet(workbook, NULL);

    ...

When you get a chance try it out and see what you think. The docs have been updates as well.

@jmcnamara
Copy link
Owner

When you get a chance try it out and see what you think.

After I fix the failing test cases on Linux.

jmcnamara added a commit that referenced this issue Jun 8, 2019
jmcnamara added a commit that referenced this issue Jun 8, 2019
@jmcnamara
Copy link
Owner

Okay. Master is fixed again.

@evanmiller
Copy link
Contributor Author

The new code is working as expected. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants