Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue 20239 - chameleon zip-file should be rejected by std.zip #7198

Merged
merged 1 commit into from Oct 7, 2019
Merged

Fix issue 20239 - chameleon zip-file should be rejected by std.zip #7198

merged 1 commit into from Oct 7, 2019

Conversation

ghost
Copy link

@ghost ghost commented Sep 25, 2019

The lines around 640 where wrongly indented. I corrected them, although this is not directly related to the bug fix. Hope, this doesn't disturb.

@ghost ghost requested a review from CyberShadow as a code owner September 25, 2019 19:46
@dlang-bot
Copy link
Contributor

Thanks for your pull request and interest in making D better, @crocopaw! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Auto-close Bugzilla Severity Description
20239 normal chameleon zip-file should be rejected by std.zip

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + phobos#7198"

Copy link
Member

@CyberShadow CyberShadow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding and reading a directory from the end seems like an optimization attempt for slow disk devices. But, std.zip already has the data in a void[] buffer. In this case, wouldn't it be simpler and more correct to simply walk all archive members from the beginning to the end?

std/zip.d Outdated Show resolved Hide resolved
std/zip.d Outdated Show resolved Hide resolved
std/zip.d Outdated Show resolved Hide resolved
@ghost
Copy link
Author

ghost commented Sep 26, 2019

Finding and reading a directory from the end seems like an optimization attempt for slow disk devices. But, std.zip already has the data in a void[] buffer. In this case, wouldn't it be simpler and more correct to simply walk all archive members from the beginning to the end?

That was my first idea too. But when I dived more deeply into the data format of zip files it was soon clear, that this is impossible. The chunks inside a zip file do not need to be alligned (nor sorted). There can be other stuff before (think of self extracting zip files) and between (e.g. removed files from the archive, where only the entry in the end record has been deleted). Even overlapping is not prohibitted by the specs. This gives a lot of possibilities for malware. Avoiding being trapped by malware makes it necessary to start at the end of central directory record.

Unfortunately zip allowes for a comment (of apriori unknown size) after that record, which again gives leeway for malware (see bug report). I think, best one can do is to scan these 16k of bytes. While the old approach stopped when it found one correct entry, the new version looks if there is another correct entry and rejects the whole zip file in this case. Having more than one correct entry is a strong indicator of having to do with malware, because in normal zip files, there should never be comments containing x05/x06 bytes. And the chances, that the data before the intended record form a valid second entry is almost zero too, but in rare circumstances possible.

And one more reason to start at the end: Although the whole file is kept in memory at the moment, I think it might be reasonable to improve this later by replacing the 'void[] buffer' with a random-access-range. This will speed things up in some circumstances (think of a large zip-archive, where only few small files need to be retrieved and therefore only part of the archive needs to be read from file).

Copy link
Member

@CyberShadow CyberShadow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed explanation.

Please address the constness nit. LGTM otherwise.

@ghost
Copy link
Author

ghost commented Sep 26, 2019

I moved the problematic enum back into the code. I'll cope with those magic constants in a separate PR.

@ghost
Copy link
Author

ghost commented Sep 27, 2019

See PR #7201 for a solution of the problem with the enum.

@ghost
Copy link
Author

ghost commented Oct 6, 2019

Resolved merge conflicts with PR #7201.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants