Charset error with french mails containing accents because of hotmail (Windows-1252) #14

Closed
stouch opened this issue Jul 1, 2022 · 8 comments

Comments

@stouch
Contributor

stouch commented Jul 1, 2022

Hello,

Mails sent from the French versions of Microsoft Outlook and hotmail.com use Content-Type: text/plain; charset="Windows-1252".

As a result, the payload delivered to my hook is badly encoded, and the accented characters I finally receive are unrecoverable replacement characters.

Do you think there is an easy solution to this?

Thank you so much for your work.

@stouch
Contributor Author

stouch commented Jul 1, 2022

Here is the payload I'm currently testing:

It is a typical mail sent from Microsoft outlook.live.com in France.

"Je suis intéressé" is replaced and arrives on my hook as "Je suis int �ress �".


From: Debug from <test@from.com>
To: "2053d27c-6370-4996-a2a4-09bfa8df5c6c@example.io"
	<2053d27c-6370-4996-a2a4-09bfa8df5c6c@example.io>
Subject: RE: TEST
Thread-Topic: TEST
Thread-Index: AQHYhulSipU7FNM+vk2C79M0JiDmAK1cxgla
Date: Thu, 23 Jun 2022 10:15:03 +0000
Message-ID:
	<AM7PR02MB62413F36389FB6BB5F9C778ABDB59@AM7PR02MB6241.eurprd02.prod.outlook.com>
References: <b801f1b1-e5e8-49ec-b375-7fbd43104b72@smtp-relay.sendinblue.com>
In-Reply-To: <b801f1b1-e5e8-49ec-b375-7fbd43104b72@smtp-relay.sendinblue.com>
Content-Language: fr-FR
X-MS-Has-Attach:
X-MS-Exchange-Organization-SCL: -1
X-MS-TNEF-Correlator:
X-MS-Exchange-Organization-RecordReviewCfmType: 0
msip_labels:
Content-Type: multipart/alternative;
	boundary="_000_AM7PR02MB62413F36389FB6BB5F9C778ABDB59AM7PR02MB6241eurp_"
MIME-Version: 1.0

--_000_AM7PR02MB62413F36389FB6BB5F9C778ABDB59AM7PR02MB6241eurp_
Content-Type: text/plain; charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable

Bonjour,

Je suis int=E9ress=E9.

@stouch
Contributor Author

stouch commented Jul 1, 2022

Okay, I think I found the cause.

In parser.go of go-smtpsrv, the mail content is read with ReadAll without converting the bytes according to the charset declared for the mail.

So, for example, instead of:

		case contentTypeTextPlain:
			newPart, err := decodeContent(part, part.Header.Get("Content-Transfer-Encoding"))
			if err != nil {
				return textBody, htmlBody, embeddedFiles, err
			}

			ppContent, err := ioutil.ReadAll(newPart)

We should have:

		case contentTypeTextPlain:
			newPart, err := decodeContent(part, part.Header.Get("Content-Transfer-Encoding"))
			if err != nil {
				return textBody, htmlBody, embeddedFiles, err
			}

			// Choose the decoder from the charset in part.Header.Get("Content-Type"); Windows-1252 is hard-coded here as an example.
			tr := charmap.Windows1252.NewDecoder().Reader(newPart)

			ppContent, err := ioutil.ReadAll(tr)
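
The "depending on the charset" step could look roughly like the sketch below. It is an illustration under stated assumptions, not the code merged into go-smtpsrv: the names decodeBody, decodeWindows1252 and cp1252High are hypothetical, only Windows-1252 is handled besides UTF-8 and US-ASCII, and the Windows-1252 table is written out by hand to keep the example free of dependencies. In real code, charmap.Windows1252 from golang.org/x/text (as in the snippet above) would replace the hand-written table.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// cp1252High maps the Windows-1252 bytes 0x80..0x9F to Unicode.
// The five unassigned bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) are mapped
// to U+FFFD in this sketch.
var cp1252High = [32]rune{
	0x20AC, 0xFFFD, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
	0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0xFFFD, 0x017D, 0xFFFD,
	0xFFFD, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
	0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0xFFFD, 0x017E, 0x0178,
}

// decodeWindows1252 converts Windows-1252 bytes to a UTF-8 string.
func decodeWindows1252(raw []byte) string {
	var b strings.Builder
	for _, c := range raw {
		if c >= 0x80 && c <= 0x9F {
			b.WriteRune(cp1252High[c-0x80])
		} else {
			// Outside 0x80..0x9F the byte value equals the code point.
			b.WriteRune(rune(c))
		}
	}
	return b.String()
}

// decodeBody picks a decoder from the charset parameter of a
// Content-Type header value and returns the body as UTF-8.
func decodeBody(contentType string, raw []byte) (string, error) {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "", err
	}
	switch strings.ToLower(params["charset"]) {
	case "", "utf-8", "us-ascii":
		return string(raw), nil
	case "windows-1252":
		return decodeWindows1252(raw), nil
	default:
		return "", fmt.Errorf("unsupported charset %q", params["charset"])
	}
}

func main() {
	// Bytes as they look after quoted-printable decoding of the test mail.
	body := []byte("Je suis int\xe9ress\xe9.")
	text, err := decodeBody(`text/plain; charset="Windows-1252"`, body)
	if err != nil {
		panic(err)
	}
	fmt.Println(text) // prints "Je suis intéressé."
}
```

Returning an error for unknown charsets is a design choice made for this sketch; a mail parser might prefer to fall back to the raw bytes so that a message is never dropped because of an unusual charset label.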

I'm not experienced enough with Go to fix this myself.

Is there anyone who could help me fix this?

By the way, the Dockerfile of this project no longer works, so I used the working fork at https://github.com/aranajuan/smtp2http by @aranajuan.

Thanks


EDIT: I created a pull request in go-smtpsrv (alash3al/go-smtpsrv@a653a0b) to fix this issue. @alash3al

However, I can't use smtp2http right now because the Dockerfile is outdated and no longer works. An update to this repo that gets it working again would be really appreciated 😇

@aranajuan
Contributor

Hi! I don't know if this repo is being maintained. If not, feel free to make the PR in my fork; I can then push it to Docker Hub :)

@stouch
Contributor Author

stouch commented Jul 3, 2022

> Hi! I don't know if this repo is being maintained. If not, feel free to make the PR in my fork; I can then push it to Docker Hub :)

I pulled your repo into mine, forked and updated go-smtpsrv to fix the charset issue, and renamed a few things to make it work. It's not amazing, but it works :)

https://github.com/stouch/smtp2http

What do you think?

@alash3al
Owner

alash3al commented Jul 4, 2022

Guys, sorry for the long delay, and thanks for your contributions.

You can open your PRs and I'll merge them ASAP.

@stouch
Contributor Author

stouch commented Jul 4, 2022

@alash3al Let's check #16 to close this issue

@alash3al
Owner

alash3al commented Jul 4, 2022

Merged successfully

@stouch stouch closed this as completed Jul 4, 2022
@alash3al
Owner

@aranajuan @stouch could you please tell me which use case you use this software for?
