Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-105333: Preserve trailing spaces on headers for signing #105485

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

gaogaotiantian
Copy link
Member

@gaogaotiantian gaogaotiantian commented Jun 8, 2023

Currently when you use email to parse headers like:

Content-Type: multipart/alternative; 
 boundary="------------l1sFYP2m750zRB2cqq59goTA"
from email.header import Header
# Notice the trailing space after ;
h = 'Content-Type: multipart/alternative; \n boundary="------------l1sFYP2m750zRB2cqq59goTA"'
print(repr(Header(h).encode()))

The trailing space will be removed.

'Content-Type: multipart/alternative;\n boundary="------------l1sFYP2m750zRB2cqq59goTA"'

This won't invalidate the message, but could potentially break the signing.

A small modification is introduced to address this issue when parsing headers to preserve the trailing spaces. The impact is limited and all the tests passed.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it preserve trailing spaces if specify charset? Does it support the header value consisting of only spaces?

It would be interesting to test it also with many spaces and small maxlinelen, so the length of the line with spaces exceed the limit.

@gaogaotiantian
Copy link
Member Author

So first of all, I think the main purpose of the change is to make signing possible for more cases. It's impossible to make signing available in all cases because in some scenario, you can just specify "I want to change this email content" and it won't make sense to sign it. For example, if you parse it with maxlinelen - that just says "I'll change it".

Does it preserve trailing spaces if specify charset?

I think if you specify charset, the encoded message would be competely different so it does not make sense to keep the spaces. The charset will be inserted in the encoded header so it's impossible to keep it the same (from my understanding).

I believe the deal breaker here is - will it work without the extra spaces? If it does not even work without these redundant spaces, it's not the thing that this PR is trying to fix.

Does it support the header value consisting of only spaces?

The test shows so. Not sure if this is a valid header.

This PR has been sitting here for a while now, I don't have the plan to push it. I don't believe this is a silver bullet that solves a large chunk of issues magically. It just fixes a very limited scenario. As I mentioned in the original discussion - I believe if you start parsing, you lose the original format. There's no guarantee that encoding a parsed message will result in the same one as the original.

On the other hand, this patch is compact and has very little side effects. It also does solve a problem.

I'm okay with either closing or merging this PR. I won't fight for it :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants