Wrong charset handling in MailSender #5096

Closed
wRAR opened this issue Apr 11, 2021 · 6 comments · Fixed by #5118

@wRAR
Member

wRAR commented Apr 11, 2021

Passing charset='utf-8' appears to produce a plain text message with a Content-Transfer-Encoding: base64 header but a body that is not actually base64-encoded, so the message can't be read.

At

msg.set_charset(charset)

set_charset is called, but because the payload is not set yet, the underlying class only sets the headers. When set_payload is called later, it doesn't do any encoding, but Content-Transfer-Encoding is already set. It looks like the fix should be to pass the charset to set_payload too, as was proposed in #3722 (set_charset may then be safe to remove, but I'm not sure about this). Note that we have tests, but they don't catch this.

Note also that all of this seems to be compat code, according to the Python docs.
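
The sequence described above can be reproduced with the standard library alone. This is a minimal sketch, not Scrapy's actual code: it assumes MailSender builds its attachment-free message with MIMENonMultipart, and the body string is made up for illustration.

```python
from email.mime.nonmultipart import MIMENonMultipart

body = "Hello, wörld"  # arbitrary sample body (assumption)

msg = MIMENonMultipart("text", "plain")

# The payload is still empty here, so set_charset can only add headers.
msg.set_charset("utf-8")
cte_before_payload = msg["Content-Transfer-Encoding"]

# No charset argument: the body is stored as-is, without any encoding.
msg.set_payload(body)

print(cte_before_payload)   # header value set before any body existed
print(msg.get_payload())    # body as stored in the message
```

The header announces base64 while the stored payload is still the raw text, which matches the mismatch the issue describes.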

wRAR added the bug label Apr 11, 2021
@mmitropoulou
Contributor

mmitropoulou commented Apr 12, 2021

Hello, I am new here and I would like to try to fix this bug. Can you please give me some guidelines and tell me how to recreate this issue?

@wRAR
Member Author

wRAR commented Apr 12, 2021

Sure, you can go ahead.

@Gallaecio
Member

> Can you please give me some guidelines and tell me how to recreate this issue?

The documentation about using the MailSender class is at https://docs.scrapy.org/en/latest/topics/email.html. Then it should be a matter of following the instructions in the issue description (“passing charset='utf-8'”) and verifying that the resulting email message indeed cannot be read.
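
One way to check readability without sending real mail is to serialize a message built the way the issue describes and parse it back as a receiving client would. The following is a rough sketch using only the standard library. The message construction mirrors the issue description and is an assumption about what MailSender does internally, and the sample body is arbitrary.

```python
import email
from email.mime.nonmultipart import MIMENonMultipart

body = "Hello world"  # arbitrary sample body (assumption)

# Build the message in the order the issue describes.
msg = MIMENonMultipart("text", "plain")
msg.set_charset("utf-8")  # adds Content-Transfer-Encoding: base64
msg.set_payload(body)     # body is stored as-is, not base64-encoded

# Parse the serialized message back, as a receiving client would.
received = email.message_from_string(msg.as_string())

# decode=True follows the Content-Transfer-Encoding header, so the plain
# text body gets run through a base64 decoder it was never encoded for.
decoded = received.get_payload(decode=True)

print(received["Content-Transfer-Encoding"])
print(decoded == body.encode("utf-8"))
```

If the decoded bytes differ from the original body, the message is unreadable in the sense the issue describes.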

@mmitropoulou
Contributor

Hello, I have made a small change and now when you send an email with utf-8 encoding the body of the email looks fine and does not have a base64 encoding. Do you want me to do a pull request or maybe a draft pull request to see what I have done and give me feedback?

@Gallaecio
Member

> Do you want me to do a pull request or maybe a draft pull request to see what I have done and give me feedback?

That would be great! Feel free to make it a regular pull request if it’s ready for review and discussion.

mmitropoulou added a commit to marlenachatzigrigoriou/scrapy that referenced this issue Apr 26, 2021
@mmitropoulou
Contributor

mmitropoulou commented Apr 26, 2021

I created a pull request and I am looking forward to your feedback. Thank you!! @Gallaecio @wRAR

mmitropoulou added a commit to marlenachatzigrigoriou/scrapy that referenced this issue Jun 25, 2021
andreastziortz added a commit to AngelikiBoura/scrapy that referenced this issue May 22, 2022
Changes:
Implementation
- Set encoding utf-8 for payload inside send
- Refactor code

Testing
- Test body in payload encoded in utf-8
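
The change listed above, setting the charset on the payload itself, could look roughly like the sketch below. It is not the code from the referenced commits: build_plain_message is a hypothetical helper, and the use of MIMENonMultipart is an assumption about how MailSender builds attachment-free messages. The final round trip mirrors the kind of check the commit message mentions.

```python
from email.mime.nonmultipart import MIMENonMultipart


def build_plain_message(body, charset=None):
    """Hypothetical helper: build a text/plain message with an encoded body."""
    msg = MIMENonMultipart("text", "plain")
    # Passing the charset to set_payload encodes the body and sets the
    # charset and Content-Transfer-Encoding headers in the same step,
    # so the headers and the payload stay consistent.
    msg.set_payload(body, charset)
    return msg


body = "Hello, wörld"  # arbitrary sample body (assumption)
msg = build_plain_message(body, "utf-8")

# Decode according to the headers and compare with the original text.
decoded = msg.get_payload(decode=True).decode("utf-8")
print(msg["Content-Transfer-Encoding"])
print(decoded == body)
```

Note that set_charset is not called separately here. If it were called before the payload is set, the Content-Transfer-Encoding header would already exist and the body would again be left unencoded.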