Not decrypting streams in encrypted pdf #159

Closed
RudolfVonKrugstein opened this issue Jan 16, 2023 · 6 comments

@RudolfVonKrugstein

Hi,

I am opening an encrypted pdf and it errors when decoding a stream (FlateDecode).

While debugging the code, I noticed that it tries to decode the stream "as-is", without decrypting it first. This of course fails right away at the stream header.
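To illustrate why undecrypted data fails at the header: FlateDecode data is normally zlib-wrapped, and a zlib stream (RFC 1950) begins with two bytes that must pass a simple check. The sketch below is not code from the pdf crate; `looks_like_zlib_header` is a hypothetical helper that only shows the kind of check still-encrypted bytes will usually fail.

```rust
/// Returns true if the first two bytes form a plausible zlib header
/// (RFC 1950): compression method 8 (deflate), window size field <= 7,
/// and the 16-bit value CMF*256 + FLG divisible by 31.
/// Hypothetical helper for illustration, not part of the pdf crate.
fn looks_like_zlib_header(data: &[u8]) -> bool {
    match data {
        [cmf, flg, ..] => {
            let (cmf, flg) = (*cmf, *flg);
            let method_ok = (cmf & 0x0f) == 8;
            let window_ok = (cmf >> 4) <= 7;
            let check_ok = ((u16::from(cmf) << 8) | u16::from(flg)) % 31 == 0;
            method_ok && window_ok && check_ok
        }
        // Fewer than two bytes cannot hold a header.
        _ => false,
    }
}
```

Encrypted bytes are effectively random, so most of the time they do not pass this test and the decoder gives up before reading any compressed data.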

I am working my way through the code, but I have not yet found where the decision to decrypt is made, where the password comes from, and so on.
Maybe someone with more experience has an idea where to change that?

@s3bk
Contributor

s3bk commented Jan 16, 2023

Hello,

the decryption happens here: https://github.com/pdf-rs/pdf/blob/master/pdf/src/file.rs#L75

@RudolfVonKrugstein
Author

Ok, as far as I understand the following is happening:

In _parse_with_lexer_ctx, objects are found and decrypted, for example strings here.

But for streams, the data is not actually read; only the range within the file where the stream is located is stored (that happens in parse_stream_object, which is called here and defined here).

Then, in Storage::decode the data at the range stored for the stream is taken from the file and decoded in decode. But it is not decrypted there. Decryption at that point is also not easy, because the Context with the key is not available (as far as I can see).

Is my analysis correct?

If so, I wonder how to fix that.

  • Option A: Instead of only storing the range for the stream, store the decrypted stream.
  • Option B: Make the key available at the point where the stream is decoded (in FlateDecode for example).
  • Option C: I am totally wrong (not unlikely).

@s3bk What do you think, how should this be done? I am willing to try to implement this but I would like your opinion so I am not running in the wrong direction.

@s3bk
Contributor

s3bk commented Jan 16, 2023

From my understanding there is only one key per file, which is why decrypting happens in storage.
Why is it not decrypted?

https://github.com/pdf-rs/pdf/blob/master/pdf/src/object/stream.rs#L84 should call
https://github.com/pdf-rs/pdf/blob/master/pdf/src/file.rs#L129 which calls
https://github.com/pdf-rs/pdf/blob/master/pdf/src/file.rs#L70
and decrypts it.
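The call chain described above can be sketched roughly as follows. This is a toy model, not the pdf crate's API: `ToyStorage`, `ToyStream`, `stream_data` and `toy_filter` are all hypothetical names, and a single-byte XOR stands in for the real cipher. It only shows the intended order: the stream keeps a range, the storage (which owns the per-file key) resolves and decrypts it, and only then does the filter run.

```rust
use std::ops::Range;

/// Hypothetical stand-in for the file-level storage, which owns the raw
/// bytes and the single per-file key.
struct ToyStorage {
    bytes: Vec<u8>,
    key: Option<u8>,
}

/// Hypothetical lazy stream: it only remembers where its data lives.
struct ToyStream {
    range: Range<usize>,
}

impl ToyStorage {
    /// Resolve a range to bytes, decrypting if the file has a key.
    /// Returns None if the range is out of bounds.
    fn stream_data(&self, range: Range<usize>) -> Option<Vec<u8>> {
        let mut data = self.bytes.get(range)?.to_vec();
        if let Some(key) = self.key {
            for b in &mut data {
                *b ^= key; // toy cipher, stands in for RC4/AES
            }
        }
        Some(data)
    }
}

/// Placeholder for a filter such as FlateDecode; here it just reverses bytes.
fn toy_filter(data: &[u8]) -> Vec<u8> {
    data.iter().rev().copied().collect()
}

impl ToyStream {
    /// Fetch (and thereby decrypt) first, then run the filter.
    fn decode(&self, storage: &ToyStorage) -> Option<Vec<u8>> {
        let data = storage.stream_data(self.range.clone())?;
        Some(toy_filter(&data))
    }
}
```

With this layout the filter never needs to see the key, which matches the point that there is one key per file and that decryption belongs in the storage.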

@RudolfVonKrugstein
Author

Mmmh, should decrypt work "in place"? Maybe https://github.com/pdf-rs/pdf/blob/master/pdf/src/file.rs#L75 should be

            data = Vec::from(t!(decoder.decrypt(id, &mut data)));
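A minimal sketch of how this kind of bug can arise, assuming a decrypt function that works on the buffer in place but returns only the valid sub-slice (for example with an IV prefix stripped). Everything here is hypothetical: `toy_decrypt`, `toy_encrypt`, `load_buggy` and `load_fixed` are illustration names, and the XOR scheme is not the crate's cipher.

```rust
/// Toy decryptor: the first byte of `buf` is a per-object "IV", the rest is
/// the payload XORed with `key ^ iv`. Decrypts in place and returns the
/// payload sub-slice, without the IV byte.
fn toy_decrypt<'a>(key: u8, buf: &'a mut [u8]) -> Result<&'a [u8], String> {
    let (iv, payload) = buf.split_first_mut().ok_or("empty buffer")?;
    let k = key ^ *iv;
    for b in payload.iter_mut() {
        *b ^= k;
    }
    Ok(payload)
}

/// Matching toy encryptor, used only to build test input.
fn toy_encrypt(key: u8, iv: u8, plain: &[u8]) -> Vec<u8> {
    let mut out = vec![iv];
    out.extend(plain.iter().map(|b| *b ^ key ^ iv));
    out
}

/// Buggy variant: the returned slice is discarded, so the caller keeps the
/// whole buffer, including the leading IV byte.
fn load_buggy(key: u8, mut data: Vec<u8>) -> Result<Vec<u8>, String> {
    toy_decrypt(key, &mut data)?;
    Ok(data)
}

/// Fixed variant, same shape as the line suggested above: keep only the
/// slice that the decryptor returned.
fn load_fixed(key: u8, mut data: Vec<u8>) -> Result<Vec<u8>, String> {
    data = Vec::from(toy_decrypt(key, &mut data)?);
    Ok(data)
}
```

In the buggy variant the payload itself is decrypted, but the buffer still starts with the extra prefix byte, so a decoder that expects the stream header at offset 0 fails immediately.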

@s3bk
Contributor

s3bk commented Jan 19, 2023

ugh. yes.

s3bk closed this as completed in 97f4272 on Jan 19, 2023