Python 3.10.6
module email — An email and MIME handling package v3.11.3
Consider this simple message:
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8-bit
MIME-Version: 1.0
From: Dmcc <foobar1@gmail.com>
To: Dmcc <foobar2@gmail.com>
Subject: test msg of 8bit CTE and UTF8

there is the hötel
Notice the o-umlaut in the word "hötel"; it is encoded in UTF-8. I saved this message in a file called msg.eml and then ran this script:
#!/usr/bin/env python3
import email
from email.policy import default
f = open("msg.eml", "r")
msg = email.message_from_file(f, policy=default)
f.close()
print('CTE: ', msg['content-transfer-encoding'])
body = msg.get_content()
print('body:', body)
The output:
CTE: 8-bit
body: there is the h�tel
I expect the output to contain valid UTF-8, since the CTE is 8bit. This problem also happens with the older get_payload() and with any of the message_from_* functions, such as email.message_from_bytes().
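To make the comparison between entry points easier to repeat, here is a minimal self-contained sketch that does not depend on an existing msg.eml. It writes the sample message to a temporary file as UTF-8 bytes, parses it once with email.message_from_file() and once with email.message_from_bytes(), and prints what get_content() and get_payload(decode=True) return for each. The helper name parse_both_ways, the RAW constant, and the explicit encoding="utf-8" on open() are assumptions added for illustration; the original script relies on the platform default encoding. No expected output is shown because the result may differ between Python versions.

```python
import email
import os
import tempfile
from email.policy import default

# The sample message from the report, stored as UTF-8 bytes.
# A blank line separates the headers from the body.
RAW = (
    "Content-Type: text/plain; charset=UTF-8; format=flowed\n"
    "Content-Transfer-Encoding: 8-bit\n"
    "MIME-Version: 1.0\n"
    "From: Dmcc <foobar1@gmail.com>\n"
    "To: Dmcc <foobar2@gmail.com>\n"
    "Subject: test msg of 8bit CTE and UTF8\n"
    "\n"
    "there is the h\u00f6tel\n"
).encode("utf-8")


def parse_both_ways(raw: bytes) -> dict:
    """Parse the same bytes via message_from_file and message_from_bytes.

    Hypothetical helper, not part of the email package.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "msg.eml")
        with open(path, "wb") as f:
            f.write(raw)
        # Text-mode path, as in the report. The encoding is pinned to UTF-8
        # here (an assumption; the report uses the platform default).
        with open(path, "r", encoding="utf-8") as f:
            from_file = email.message_from_file(f, policy=default)
    # Bytes path: the parser sees the raw octets directly.
    from_bytes = email.message_from_bytes(raw, policy=default)
    return {"message_from_file": from_file, "message_from_bytes": from_bytes}


if __name__ == "__main__":
    for name, msg in parse_both_ways(RAW).items():
        print(name)
        print("  CTE:        ", msg["content-transfer-encoding"])
        # ascii() keeps the output unambiguous regardless of terminal encoding.
        print("  get_content:", ascii(msg.get_content()))
        print("  get_payload:", ascii(msg.get_payload(decode=True)))
```

Printing with ascii() shows the body as escape sequences, so a correctly decoded ö (\xf6) can be told apart from a replacement character (\ufffd) even when the terminal cannot display either.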