unzip speed too low! #120

Closed
1178209138 opened this issue Oct 31, 2022 · 14 comments

Comments

@1178209138

1178209138 commented Oct 31, 2022

I use the extract() function to unzip a .zip file (the .zip file is 119 MB; the original folder is 1.46 GB and contains a large number of files). At the beginning the speed reaches 39 MB/s, but it slows down over time. By the time about 700 MB of files have been extracted, the speed is only 0.1 MB/s.
In the end it takes about one hour to finish unzipping.
By the way, it takes only 10 seconds to finish the job when I use rar.
zipper::Unzipper unzipper(zip_file_path);
unzipper.extract(tmp_folder.GetString());
unzipper.close();

@Lecrapouille
Collaborator

@1178209138 I stopped maintaining this repo, especially the master branch (extract). This project does not manage threads, and some local buffers (an array[1024], a vector::resize(8k)) are created at each iteration. So yes, this is not an optimized lib. Can you share your files (maybe privately)? That would help with benchmarking (otherwise, the output of ls -la listing them with their sizes would let me try to generate them from /dev/random). Did you use this lib for zipping? Did that take hours?
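
To illustrate the per-iteration allocation mentioned above, here is a minimal sketch, not zipper's actual code: a chunked stream copy where the buffer is allocated once outside the loop and reused. The helper name `copy_stream` and the 8192-byte default are assumptions for illustration, and nothing in this thread establishes that this allocation pattern explains the slowdown reported here.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of zipper: copies `in` to `out` in
// fixed-size chunks and returns the number of bytes copied.
inline std::size_t copy_stream(std::istream& in, std::ostream& out,
                               std::size_t chunk_size = 8192)
{
    // Allocated once, then reused by every iteration of the loop below.
    std::vector<char> buffer(chunk_size);
    std::size_t total = 0;

    while (in)
    {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = in.gcount(); // may be a short final read
        if (got <= 0)
            break;
        out.write(buffer.data(), got);
        total += static_cast<std::size_t>(got);
    }
    return total;
}
```

The alternative being contrasted is declaring or resizing the buffer inside the loop body, which repeats the allocation for every chunk.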

@Lecrapouille
Collaborator

Lecrapouille commented Oct 31, 2022

I quickly checked zipping then unzipping a folder of mp4 files with subtitles (The Expanse S4: ~5 GB) and compared the time with the Linux zip command: the time is quite similar (~3 min for zipping, ~1 min for unzipping). Our lib makes my desktop freeze at times when unzipping.

Of course, a naive OpenMP loop over the list of files to compress segfaults immediately. I don't know whether minizip and zlib-ng are multithreaded.

@1178209138
Author

1178209138 commented Nov 1, 2022

How can I share my files with you? E-mail? The file is too big to upload here.
In addition, I use this lib on Windows. I haven't done any zipping yet.

@1178209138
Author

1178209138 commented Nov 1, 2022

@Lecrapouille OK, I'll send it soon.
Update: it's a file named "hy_yj_zg_sc.zip", about 119 MB. Sent.

@Lecrapouille
Collaborator

Lecrapouille commented Nov 1, 2022

@1178209138 yep, there is a bottleneck somewhere. Here is my benchmark:

Zipper lib:
Zip "The Expanse S4":    206947.200 ms => 3 Minutes 26 Seconds (=-) / 1.2
Unzip "The Expanse S4":  116280.597 ms => 1 Minutes 56 Seconds (--) / 3.4
Zip hy_yj_zg_sc:          16780.517 ms => 16 Seconds           (==) * 1.1
Unzip hy_yj_zg_sc:        94578.933 ms => 1 Minutes 34 Seconds (--) / 8.2

Linux zip tool:
Zip "The Expanse S4":    171211.721 ms => 2 Minutes 51 Seconds (=+)
Unzip "The Expanse S4":   33556.513 ms => 33 Seconds           (++)
Zip hy_yj_zg_sc:          18668.122 ms => 18 Seconds           (==)
Unzip hy_yj_zg_sc:        11414.059 ms => 11 Seconds           (++)

My config: Ryzen 1800x, SSD, 16 GB RAM, only one core is running (Zipper lib and Linux zip tool)
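
The figures in the table can be reproduced with a little arithmetic. The sketch below assumes (this is an inference from the numbers, not something stated in the thread) that the human-readable times truncate to whole seconds and that the trailing `/ x` and `* x` columns give the Zipper time relative to the Linux zip tool, truncated to one decimal. `human_duration` and `ratio` are hypothetical helper names.

```cpp
#include <string>

// Hypothetical helper: formats milliseconds in the style used in the table,
// e.g. "1 Minutes 34 Seconds" or "16 Seconds". Fractions of a second are
// truncated, which is what the table's figures suggest.
inline std::string human_duration(double ms)
{
    const long long total_seconds = static_cast<long long>(ms / 1000.0);
    const long long minutes = total_seconds / 60;
    const long long seconds = total_seconds % 60;

    std::string text;
    if (minutes > 0)
        text += std::to_string(minutes) + " Minutes ";
    text += std::to_string(seconds) + " Seconds";
    return text;
}

// Hypothetical helper: how many times larger `slower_ms` is than `faster_ms`.
inline double ratio(double slower_ms, double faster_ms)
{
    return slower_ms / faster_ms;
}
```

For example, `ratio(94578.933, 11414.059)` is about 8.29, which lines up with the `/ 8.2` shown for unzipping hy_yj_zg_sc, and `ratio(116280.597, 33556.513)` is about 3.47, which lines up with the `/ 3.4` for "The Expanse S4".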

@1178209138
Author

@Lecrapouille 1 minute 34 seconds seems acceptable. So is this problem caused by the Windows system?

@Lecrapouille
Collaborator

@1178209138 I added my config in the previous message. Ideas:

  • Do you have an SSD or an HDD? This can impact performance.
  • Have you compiled the v2.x.y branch (the most up to date in terms of fixes, but I'm not sure it compiles on Windows)?
  • Can you rerun on Linux?
  • Is Rar multithreaded? (You can check the CPU load to see whether one core or all cores are at 100%.)
  • "Finally, it takes about one hour to finish unzipping": how many times (approximately) have you needed to unzip? Really hours? Or is it just a manner of speaking?

@1178209138
Author

1178209138 commented Nov 2, 2022

@Lecrapouille
SSD. I use the Rar tool to unzip, and it takes just 10 seconds.
I compiled the master code. So you are using the v2.x.y branch code?
I don't have a Linux environment right now.
Rar is not multithreaded; both Rar and zipper show the same CPU load, about 12%.
I only unzipped once. Precisely, it takes about 59 minutes (measured with the Visual Studio 2019 debug tool).
By the way, I use zipper's debug version (compiled as debug x64). Does it matter?

My config: Intel i5-10500, SSD, 16 GB RAM, only one core is running (Zipper lib and Rar zip tool)

@Lecrapouille
Collaborator

Lecrapouille commented Nov 2, 2022

@1178209138

  • Your config is quite similar to mine: neither the CPU nor the SSD is the bottleneck.
  • Are you serious: 1 hour with the same zip you sent me!!! This library is not the bottleneck, otherwise people would have complained before you :)
  • "using Visual Studio 2019 debug tool to count": this seems to me to be the key to your issue. You have to do some careful investigation: how did you compile this lib (compilation flags ...)? Analyze your Windows setup: is an antivirus blocking you? Is Windows treating your application as untrusted? ... Do you see the same behavior 1/ when you run your application (same binary) on a different Windows desktop, 2/ when it is compiled on a different Windows desktop (different binary)? ... I'm not a Windows user, so I cannot help you much on this point.
  • "So you are using v2.x.y branch code": yes, this branch https://github.com/sebastiandev/zipper/tree/v2.x.y, which continues in this repo https://github.com/Lecrapouille/zipper, but with the drawback that my Makefile does not support Windows.
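
One way to take the debugger out of the measurement is to time the call from inside the program with a monotonic clock. This is a minimal sketch, assuming a hypothetical helper named `elapsed_ms`; the callable passed in would wrap the `unzipper.extract(...)` call from the original report, which is left out here so the snippet compiles without the library.

```cpp
#include <chrono>

// Hypothetical helper: runs `work` once and returns the wall-clock time it
// took, in milliseconds. steady_clock is monotonic, so the result is not
// affected by system clock adjustments.
template <typename Work>
double elapsed_ms(Work&& work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Intended use (the extract call is taken from the report above):
//   double ms = elapsed_ms([&] { unzipper.extract(tmp_folder.GetString()); });
```

Running the same binary outside the IDE and printing this value gives a figure that can be compared directly with the millisecond timings in the benchmark above.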

@1178209138
Author

1178209138 commented Nov 3, 2022

@Lecrapouille I compiled with the -debug -x64 flags; I'll test again with the release version of this lib.
Tested: the performance of the release version is similar to that of the debug version.
Maybe the main code has some bugs. But the v2.x.y branch doesn't support CMake, so it's hard to compile on Windows.

@Lecrapouille
Collaborator

@1178209138 the v2.x.y branch does not fix the bottleneck compared to master; it only contains behavior fixes. The compilation itself is not complicated: you just have to download and compile the third-party libraries in the external folder, as the 2 shell scripts do. I'll have to think about how to compile for Windows, since I do not use Windows.

@Lecrapouille
Collaborator

@1178209138 have you solved your issue?

@1178209138
Author

1178209138 commented Nov 14, 2022

@Lecrapouille I have not compiled the v2.x.y branch yet, because I have been busy with work recently.
So, did you optimize unzip in your new repo? When I have time, I'll try it out.

@Lecrapouille
Collaborator

I'm closing this ticket since I'm no longer maintaining this repo.
