Append/Insert files into an existing Zip file? #397

Closed
cdancy opened this issue Feb 2, 2024 · 14 comments

cdancy commented Feb 2, 2024

Hey All,

Does this library support appending/inserting files into an existing zip? I've got a use case where disk space is limited, so rather than downloading X files and then zipping them up, I'd like to download each one into an in-memory buffer and add that data as an entry in a zip as I iterate over a list. Is that possible here? I saw you can append with tar but didn't see anything for zips.

Thanks in Advance,
Chris
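
For the simpler case of building a new archive (not appending to an existing one), the workflow described above can be sketched with Go's standard `archive/zip` alone. This is a minimal illustration, not this library's API: the `entry` type and the `buildZip` and `entryNames` helpers are hypothetical names, and the byte slices stand in for downloaded data.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"log"
)

// entry is a hypothetical stand-in for one downloaded file held in memory.
type entry struct {
	Name string
	Data []byte
}

// buildZip streams each in-memory payload into a new zip archive, one entry
// at a time, and returns the archive bytes. No temporary files are created.
func buildZip(entries []entry) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, e := range entries {
		w, err := zw.Create(e.Name) // Create compresses with the Deflate method
		if err != nil {
			return nil, err
		}
		if _, err := w.Write(e.Data); err != nil {
			return nil, err
		}
	}
	// Close writes the central directory; the archive is invalid without it.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// entryNames lists the entry names of an in-memory archive.
func entryNames(archive []byte) ([]string, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return nil, err
	}
	var names []string
	for _, f := range zr.File {
		names = append(names, f.Name)
	}
	return names, nil
}

func main() {
	archive, err := buildZip([]entry{
		{Name: "a.txt", Data: []byte("first download")},
		{Name: "b.txt", Data: []byte("second download")},
	})
	if err != nil {
		log.Fatal(err)
	}
	names, err := entryNames(archive)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(names)
}
```

`zip.NewWriter` accepts any `io.Writer`, so the `bytes.Buffer` could be swapped for an `*os.File` if only the finished archive should land on disk.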

Owner

mholt commented Feb 12, 2024

I don't think it's possible with the zip format; even the zip command line tool apparently extracts the archive temporarily and re-creates it: https://wpguru.co.uk/2018/12/how-to-add-files-to-an-existing-zip-archive-on-macos-and-linux/
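
As a rough illustration of the "re-create the archive" approach, here is a sketch using only Go's standard `archive/zip`. It is not how the zip command line tool is implemented, and `addEntry` and `entryNames` are hypothetical helper names. It relies on `(*zip.Writer).Copy` (Go 1.17 and later), which copies existing entries in raw form without decompressing them.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"log"
)

// addEntry returns a new archive containing every entry of old plus one new
// entry. Existing entries are copied in their raw, still-compressed form.
// An empty old archive is treated as "start from scratch".
func addEntry(old []byte, name string, data []byte) ([]byte, error) {
	var out bytes.Buffer
	zw := zip.NewWriter(&out)
	if len(old) > 0 {
		zr, err := zip.NewReader(bytes.NewReader(old), int64(len(old)))
		if err != nil {
			return nil, err
		}
		for _, f := range zr.File {
			if err := zw.Copy(f); err != nil { // requires Go 1.17+
				return nil, err
			}
		}
	}
	w, err := zw.Create(name)
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return out.Bytes(), nil
}

// entryNames lists the entry names of an in-memory archive.
func entryNames(archive []byte) ([]string, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return nil, err
	}
	var names []string
	for _, f := range zr.File {
		names = append(names, f.Name)
	}
	return names, nil
}

func main() {
	first, err := addEntry(nil, "one.txt", []byte("one"))
	if err != nil {
		log.Fatal(err)
	}
	second, err := addEntry(first, "two.txt", []byte("two"))
	if err != nil {
		log.Fatal(err)
	}
	names, err := entryNames(second)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(names)
}
```

The trade-off is that every addition rewrites the whole archive, so the cost grows with archive size even though nothing is recompressed.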

Tar is appendable because the format is basically just like delimiter-separated files, so you can easily add to it.
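
To make the tar point concrete, here is a sketch of an in-place append with Go's standard `archive/tar`. It assumes the existing file ends with exactly the 1024-byte end-of-archive marker (two zero blocks) and nothing after it, which holds for archives written by Go's `tar.Writer` but not necessarily for every tool. The helper names (`writeEntry`, `appendToTar`, `listTar`, `demo`) are hypothetical.

```go
package main

import (
	"archive/tar"
	"fmt"
	"io"
	"log"
	"os"
	"path/filepath"
)

// writeEntry writes one regular file entry (header plus data) to tw.
func writeEntry(tw *tar.Writer, name string, data []byte) error {
	hdr := &tar.Header{Typeflag: tar.TypeReg, Name: name, Mode: 0o644, Size: int64(len(data))}
	if err := tw.WriteHeader(hdr); err != nil {
		return err
	}
	_, err := tw.Write(data)
	return err
}

// appendToTar adds one entry to an existing tar file in place. It assumes the
// file ends with exactly two 512-byte zero blocks and nothing after them.
func appendToTar(path, name string, data []byte) error {
	f, err := os.OpenFile(path, os.O_RDWR, 0)
	if err != nil {
		return err
	}
	defer f.Close()
	// Seek back over the end-of-archive marker so it gets overwritten.
	if _, err := f.Seek(-1024, io.SeekEnd); err != nil {
		return err
	}
	tw := tar.NewWriter(f)
	if err := writeEntry(tw, name, data); err != nil {
		return err
	}
	// Close writes a new end-of-archive marker after the appended entry.
	if err := tw.Close(); err != nil {
		return err
	}
	return f.Sync()
}

// listTar returns the entry names found in the tar file at path.
func listTar(path string) ([]string, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var names []string
	tr := tar.NewReader(f)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

// demo creates a one-entry archive in a temp dir, appends a second entry,
// and returns the resulting entry names.
func demo() ([]string, error) {
	dir, err := os.MkdirTemp("", "tar-append")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "demo.tar")
	f, err := os.Create(path)
	if err != nil {
		return nil, err
	}
	tw := tar.NewWriter(f)
	if err := writeEntry(tw, "first.txt", []byte("first")); err != nil {
		f.Close()
		return nil, err
	}
	if err := tw.Close(); err != nil {
		f.Close()
		return nil, err
	}
	if err := f.Close(); err != nil {
		return nil, err
	}
	if err := appendToTar(path, "second.txt", []byte("second")); err != nil {
		return nil, err
	}
	return listTar(path)
}

func main() {
	names, err := demo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(names)
}
```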

Closing, but feel free to continue discussion if needed.

mholt closed this as not planned on Feb 12, 2024
Author

cdancy commented Feb 13, 2024

@mholt I found this library, which does exactly what I need. I also found some GitHub issues where its developer requested that this be included in the Go standard library, but the core devs pushed back, saying there wasn't enough need for it, though they could look at putting it in if the community really wanted it.

https://github.com/STARRY-S/zip

Owner

mholt commented Feb 13, 2024

Ah wow, I didn't know that was possible since even the official zip tooling doesn't support it.

Looks complicated. I can see why it's not obvious...

Let me see if I can get that library working with this one.

mholt reopened this on Feb 13, 2024
Owner

mholt commented Feb 13, 2024

Hmmm... I cannot get it to create a zip file that isn't corrupted (according to unzip -vl test.zip). Were you able to get it to work?

Owner

mholt commented Feb 13, 2024

D'oh -- just kidding. I forgot to close the Updater.

I have a commit that works with a single test I did (:sweat_smile:) so you can use it if you want!
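
The forgotten Close matters because a zip archive keeps its central directory at the end of the file, and readers locate entries through it. The sketch below shows the same effect with the standard library's `zip.Writer`, on the assumption that the third-party Updater finalizes the archive in a similar way when closed. `writeZip` and `readable` are hypothetical helper names.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"log"
)

// writeZip writes a single entry and then either closes the writer properly
// or only flushes its buffer, leaving the archive without a central directory.
func writeZip(closeWriter bool) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, err := zw.Create("hello.txt")
	if err != nil {
		return nil, err
	}
	if _, err := w.Write([]byte("hello")); err != nil {
		return nil, err
	}
	if closeWriter {
		// Close finishes the entry and writes the central directory.
		if err := zw.Close(); err != nil {
			return nil, err
		}
	} else {
		// Flush pushes buffered bytes out but does not finalize the archive.
		if err := zw.Flush(); err != nil {
			return nil, err
		}
	}
	return buf.Bytes(), nil
}

// readable reports whether data can be opened as a zip archive.
func readable(data []byte) bool {
	_, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	return err == nil
}

func main() {
	closed, err := writeZip(true)
	if err != nil {
		log.Fatal(err)
	}
	unclosed, err := writeZip(false)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("closed writer readable:", readable(closed))
	fmt.Println("unclosed writer readable:", readable(unclosed))
}
```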

mholt closed this as completed in 43a073e on Feb 13, 2024
Author

cdancy commented Feb 13, 2024

@mholt nice! Let me know how it goes and if you're able to optimize things one way or another or "make the impl better" whatever that means in this context :) Once you've got something I can give it a go here. We've basically got a highly parallel process which keeps open the "zip updater", and as files are finished, we're appending them to the zip and then shipping that off. I don't think that other library is maintained or active anymore so if you can whip something up I'd be more than happy to take yours ;)

Owner

mholt commented Feb 13, 2024

I mean, that lib works as far as I can tell, and I don't know of another pure Go solution. You can try the latest commit I just pushed and see if it works for you. But if you're doing custom concurrency stuff, it may be best to just use that lib directly...

Author

cdancy commented Feb 13, 2024

@mholt I'll give it a go and see how things fare. We haven't rolled out to production yet and so are still coding things up.

Author

cdancy commented Feb 15, 2024

@mholt with this new addition, and the way you implemented it, is it possible to use a different compressor to get a smaller zip? We're not especially concerned about how long it takes to build the zip; we mostly want the zip to be as small as possible. The files inside are all textual.

Owner

mholt commented Feb 15, 2024

@cdancy Set the Compression field of your Zip struct: https://pkg.go.dev/github.com/mholt/archiver/v4#Zip.Compression (e.g. zip.Deflate)

Author

cdancy commented Feb 16, 2024

@mholt I'm having no luck :( No matter what I use, the resulting zip is significantly larger than what I get with the starry-zip library. The same 4 files produce 57K there versus 764K here. Maybe I'm doing something wrong?

	// Have to create the archive with zip.NewWriter first; otherwise the library complained that the zip was not valid.
	zipWriter := zip.NewWriter(zipFile)
	zipWriter.SetComment("Hello, World!")
	zipWriter.Close()

	zipper := archiver.Zip{
		Compression: flate.BestCompression,
	}

	err = zipper.Insert(context.Background(), zipFile, files)
	require.NoError(t, err)

Owner

mholt commented Feb 16, 2024

@cdancy I think you might need to use zip.Deflate instead of flate.BestCompression. The latter is a compression level rather than a zip method ID, so it probably isn't recognized, and maybe it gets treated as "store" instead of "compress".
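
The difference between a zip method and a flate level can be shown with the standard library alone. In `archive/zip`, `zip.Store` is 0 and `zip.Deflate` is 8, while `flate.BestCompression` is the level 9. The sketch below keeps the method as Deflate and selects the level by registering a custom compressor on the writer. It does not show how this library's Insert handles its Compression field, and `sampleText`, `compressText`, and `firstEntry` are hypothetical helper names.

```go
package main

import (
	"archive/zip"
	"bytes"
	"compress/flate"
	"errors"
	"fmt"
	"io"
	"log"
	"strings"
)

// sampleText returns highly repetitive text so compression is easy to see.
func sampleText() string {
	return strings.Repeat("the quick brown fox jumps over the lazy dog\n", 500)
}

// compressText writes text as a single Deflate entry, compressed at the given
// flate level, and returns the archive bytes.
func compressText(text string, level int) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	// Method stays zip.Deflate; only the level used for that method changes.
	zw.RegisterCompressor(zip.Deflate, func(out io.Writer) (io.WriteCloser, error) {
		return flate.NewWriter(out, level)
	})
	w, err := zw.CreateHeader(&zip.FileHeader{Name: "file.txt", Method: zip.Deflate})
	if err != nil {
		return nil, err
	}
	if _, err := io.WriteString(w, text); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// firstEntry returns the method ID and compressed size of the first entry.
func firstEntry(archive []byte) (uint16, uint64, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return 0, 0, err
	}
	if len(zr.File) == 0 {
		return 0, 0, errors.New("archive has no entries")
	}
	return zr.File[0].Method, zr.File[0].CompressedSize64, nil
}

func main() {
	fmt.Println("zip.Store:", zip.Store, "zip.Deflate:", zip.Deflate,
		"flate.BestCompression:", flate.BestCompression)
	text := sampleText()
	archive, err := compressText(text, flate.BestCompression)
	if err != nil {
		log.Fatal(err)
	}
	method, size, err := firstEntry(archive)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("method:", method, "original bytes:", len(text), "compressed bytes:", size)
}
```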

Author

cdancy commented Feb 16, 2024

@mholt Yeah, I tried that as well but still nothing. When I open the zip written by the starry golang lib I see Defl:N compression used by default, but no matter what I do here I only ever see Stored. I'm on macOS, so I'm not sure if that plays into things at all.

L105342MUS:kadiv14103107114 dancc$ unzip -vl example.zip
Archive:  example.zip
Hello, World!
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
  687922  Stored   687922   0% 12-31-1979 00:00 2f6acde1  file1.txt
  354646  Stored   354646   0% 12-31-1979 00:00 9150eeac  file2.txt
   77921  Stored    77921   0% 12-31-1979 00:00 be9df062  file3.txt
  390349  Stored   390349   0% 12-31-1979 00:00 c6890ae4  file4.txt
--------          -------  ---                            -------
 1510838          1510838   0%                            4 files
L105342MUS:kadiv14103107114 dancc$ unzip -vl example-1.zip
Archive:  example-1.zip
Hello, World!
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
   77921  Defl:N     5752  93% 00-00-1980 00:00 be9df062  file1.txt
  390349  Defl:N    28739  93% 00-00-1980 00:00 c6890ae4  file2.txt
  354646  Defl:N    18367  95% 00-00-1980 00:00 9150eeac  file3.txt
  687922  Defl:N    40838  94% 00-00-1980 00:00 2f6acde1  file4.txt
--------          -------  ---                            -------
 1510838            93696  94%                            4 files

I don't know ...

Owner

mholt commented Feb 16, 2024

I'm not sure if reusing the zip file after the zip writer wrote to it is a good idea. What if you have a fresh open file for the insert?
