Scanning issue very large zip files #71

Open
dpritipalsingh opened this issue Dec 21, 2021 · 6 comments

Comments

@dpritipalsingh

I would like to report an issue on Windows with scanning very large zip files.
The problem appears with zip files of around 2-4 GB that contain a large number of files and/or a very large jar file.
The tool scans for a couple of hours (on this one zip file alone), after which I abort manually.
Once the zip is extracted, the tool has no problem scanning the jar file(s).
All the zips are Oracle-related software.
As a workaround I use the -e (exclude) option to exclude these very large zip files.
I just wanted to let you know about this issue.
Perhaps you could add an option to not scan zip files and/or to skip very large zip files.
I appreciate all the good work you have done.

@yunzheng
Member

Hi, thanks for reporting! Do you know your Python version, or are you running the binaries?

There is an issue with Python < 3.7 where file objects opened from inside a ZipFile cannot seek, so the workaround was to read the whole file into memory. If that applies here, I assume it could be part of the problem.
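
To illustrate the workaround described above, here is a minimal sketch of opening a zip entry that is itself a zip archive. It is not the tool's actual code: the helper name `open_nested_zip` is hypothetical, and it assumes the nested entry is opened through the standard `zipfile` module.

```python
import io
import sys
import zipfile


def open_nested_zip(outer: zipfile.ZipFile, name: str) -> zipfile.ZipFile:
    """Open the entry `name` of `outer` as a zip archive (hypothetical helper)."""
    if sys.version_info >= (3, 7):
        # From Python 3.7 on, entries opened for reading support seek(),
        # so the nested archive can be read without copying it first.
        # The outer archive must stay open while the result is in use.
        return zipfile.ZipFile(outer.open(name))
    # Older interpreters: the entry is not seekable, so buffer all of it
    # in memory. For a multi-gigabyte entry this is expensive.
    return zipfile.ZipFile(io.BytesIO(outer.read(name)))
```

Note that even on Python 3.7+, seeking backwards inside a compressed entry requires decompressing it again from the start, so reading a large nested archive this way may still be slow.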

@dpritipalsingh
Author

dpritipalsingh commented Dec 22, 2021

I'm using the pre-compiled binary for Windows, version 1.2.0, and don't have Python installed. I would really appreciate an additional scan option to skip zip files of a specified size and larger (e.g. >1GB) in order to improve the scan speed.
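
A size-based skip like the one requested could look roughly like the sketch below. This is only an illustration under assumptions: the function name `iter_scannable` and the parameter `max_archive_bytes` are hypothetical and do not correspond to any existing option of the tool.

```python
import os
from typing import Iterator


def iter_scannable(root: str, max_archive_bytes: int = 1 << 30) -> Iterator[str]:
    """Yield file paths under `root`, skipping .zip files above the size limit.

    Hypothetical helper; the default limit is 1 GiB.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if filename.lower().endswith(".zip"):
                try:
                    size = os.path.getsize(path)
                except OSError:
                    # File vanished or is unreadable: skip it.
                    continue
                if size > max_archive_bytes:
                    # Too large: leave it out of the scan.
                    continue
            yield path
```

Only `.zip` files are subject to the limit here, so extracted jar files of any size are still yielded for scanning.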

@yunzheng
Member

> I'm using the pre-compiled binary for Windows, version 1.2.0, and don't have Python installed. I would really appreciate an additional scan option to skip zip files of a specified size and larger (e.g. >1GB) in order to improve the scan speed.

That is interesting. I will check whether we can add such a feature, but I would also like to know what causes the slowdown. Is it possible to share one of the zip files?

@dpritipalsingh
Author

dpritipalsingh commented Dec 22, 2021

Unfortunately, I cannot share these files. I'm running a debug scan with -vv at the moment to see if it logs anything useful (capturing output with 2>&1), if that would help. The problem is that when it scans very large zip files, the scan speed degrades significantly, depending on the size of the zip file and probably its contents. After 2-4 hours I just abort manually because it takes far too long for a single zip file.

@dpritipalsingh
Author

dpritipalsingh commented Dec 23, 2021

The Oracle WebLogic 12.2.1.4.0 zip (containing 1 jar) that causes issues can be downloaded from the Oracle URL described here (login required): https://github.com/oracle/docker-images/blob/main/OracleFMWInfrastructure/dockerfiles/12.2.1.4/fmw_12.2.1.4.0_infrastructure_Disk1_1of1.zip.download
Additionally, scanning a 4.5 GB zip file containing Oracle 12 installation files and jars (and not containing the file mentioned above) took 10 hours to complete successfully.

@yunzheng
Member

Thanks, I got the download link.
