New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Excessive disk read: Fix location of central directory #1514
Excessive disk read: Fix location of central directory #1514
Conversation
…tral Directory in EOCD and where the OECD was found. This prevents large reads when a zip archive is preceded by other data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks nice! I have a couple of suggestions for clearer naming and comments, but nothing serious.
Use the size of the Central Directory and the offset of the EOCD to calculate the real position. This trick doesn't work for Zip64 as easily as we are not scanning backwards to find the PK\x06\x06 entry. Interestingly, it is never checked so it could be trying to parse bad files. Updated based on review
Thanks for the feedback. Taken it on board and updated. As noted, I can't just fix zip64 formats as I'd have to implement the scanning backwards like a standard EOCD is found but for zip64. If we assume a small comment it should be doable but it's really outside what I want to do because I should be getting back to the project that is using libarchive. |
seems like an unrelated issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
@@ -3484,6 +3492,8 @@ read_zip64_eocd(struct archive_read *a, struct zip *zip, const char *p) | |||
|
|||
/* Save the central directory offset for later use. */ | |||
zip->central_directory_offset = archive_le64dec(p + 48); | |||
/* TODO: Needs scanning backwards to find the eocd64 instead of assuming */ | |||
zip->central_directory_offset_adjusted = zip->central_directory_offset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for introducing those variables; this is a lot easier to read now.
In archive_read_support_format_zip.c, it is noted that the slurping of the central directory starts from where it has been told it should be but just reads forward until it finds it.
I have worked on a fix that passes the tests that calculates the actual location of the CentralDirectory and stores it as a separate member of Zip. I was hoping to store the actual location in the existing member but I discovered that I was going to have to rework how it calculated the offset of each individual file as well.
This was discussed in this issue but I should have created a pull request to begin with.