Explanation
I've recently played around with stacked filters for streams and I've realized that you can build enormous zip bombs by nesting FlateDecode filters like in this file: bomb.pdf
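To illustrate why nesting is so effective, here is a minimal sketch using only Python's `zlib` module. It is not how bomb.pdf was built; `nested_deflate` is a hypothetical helper, and the 10 MB payload is an arbitrary choice. The output of one deflate pass over zero bytes is itself highly repetitive, so a second pass shrinks it again, which is what stacking several FlateDecode entries in a stream's `/Filter` array exploits.

```python
import zlib


def nested_deflate(payload: bytes, layers: int) -> list[bytes]:
    """Compress `payload` repeatedly and return the result of every layer.

    Hypothetical helper for illustration only; index 0 is the raw payload,
    index i is the payload after i deflate passes.
    """
    stages = [payload]
    for _ in range(layers):
        stages.append(zlib.compress(stages[-1], 9))
    return stages


# 10 MB of zero bytes, wrapped in two deflate layers.
stages = nested_deflate(b"\x00" * 10_000_000, layers=2)
for depth, blob in enumerate(stages):
    print(f"after {depth} layer(s): {len(blob)} bytes")
```

Each extra layer multiplies the expansion factor, so a file of a few kilobytes can describe a payload far larger than any machine's memory.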
The file isn't actually a valid PDF file, and most PDF parsers I tested recognize that without decompressing the whole content first. pypdf, however, tries to decompress the whole stream before parsing it. This normally works, but the file I provided unpacks to over 1 PB of zero bytes.
A fix for this would be to decompress in a streaming fashion, since zlib lets you process decompressed data before the whole stream has been decompressed.
I'm pretty sure this would require significant changes to the decompression logic, but it could also be seen as a security flaw, since parsing a small untrusted PDF file could lead to a denial of service (DoS).
I'm not really sure what policy you have for these kinds of issues, but I wanted to report it in case someone wants to fix it.
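As a rough sketch of that idea (not pypdf's actual code), the standard library's `zlib.decompressobj()` can inflate a stream in bounded pieces via the `max_length` argument of `decompress()`. The function name `bounded_inflate`, the `limit` parameter, and the chunk size are assumptions made for illustration.

```python
import zlib


def bounded_inflate(data: bytes, limit: int, chunk_size: int = 64 * 1024) -> bytes:
    """Inflate `data` incrementally, aborting once output exceeds `limit` bytes.

    Hypothetical helper: a real parser would hand each piece to the consumer
    instead of collecting everything in memory.
    """
    decomp = zlib.decompressobj()
    out = bytearray()
    buf = data
    while buf and not decomp.eof:
        # Request at most one byte beyond the remaining budget so that an
        # oversized stream is detected without inflating it any further.
        # The value is always >= 1; max_length=0 would mean "unlimited".
        budget = min(chunk_size, limit - len(out) + 1)
        out += decomp.decompress(buf, budget)
        if len(out) > limit:
            raise ValueError(f"decompressed data exceeds {limit} bytes")
        # Input that was not consumed because of max_length is kept here.
        buf = decomp.unconsumed_tail
    return bytes(out)


small = zlib.compress(b"hello world")
print(bounded_inflate(small, limit=1024))

bomb = zlib.compress(b"\x00" * 10_000_000)
try:
    bounded_inflate(bomb, limit=1_000_000)
except ValueError as exc:
    print("rejected:", exc)
```

With stacked filters, the same bound would have to apply to every layer, or the layers could be chained so that each one pulls pieces from the previous one on demand.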