smtp: Support internationalised characters in envelope addresses #4892
I don't think so. This is about IDN in the domain names, not what the SMTP server supports. Update: did some more thinking and I'm no longer really sure what to think. We either need to test it out or find this documented to learn what the answer is.
Yes I think so. IDN is only for the host name and I would be a bit scared of what could happen by accident if we encode more than necessary.
I'd say should rather than need, but yes probably?
Again should rather than need, but yes: if the host name is in the protocol, there will probably be users with IDN names who would like to use that name rather than the punycoded version.
lib/smtp.c
}
}

auth = aprintf("%s", mail_auth.name);
If this returns NULL we need to set `result` too for failure, right?
Yep - isn't that taken care of by the block that starts at line 568? Unless I've screwed up the indentation:
if(!auth) {
  free(from);
  return CURLE_OUT_OF_MEMORY;
}
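The pattern discussed in this review thread (free what was already allocated, then report out-of-memory) can be sketched in isolation as below. This is a minimal illustration only: `struct mail_params`, `dup_string()` and `build_mail_params()` are hypothetical names, and plain `malloc()` stands in for curl's `aprintf()`; it is not the code in lib/smtp.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the strings prepared before MAIL is sent. */
struct mail_params {
  char *from;
  char *auth;
};

/* Duplicate a string with malloc(); returns NULL on allocation failure. */
static char *dup_string(const char *s)
{
  size_t n = strlen(s) + 1;
  char *p = malloc(n);
  if(p)
    memcpy(p, s, n);
  return p;
}

/* Returns 0 on success, -1 on allocation failure.
   On failure nothing stays allocated, so the caller has nothing to free. */
int build_mail_params(struct mail_params *out, const char *from,
                      const char *auth)
{
  assert(out != NULL && from != NULL);

  out->auth = NULL;
  out->from = dup_string(from);
  if(!out->from)
    return -1;

  if(auth) {
    out->auth = dup_string(auth);
    if(!out->auth) {
      /* Same idea as the block quoted above: release the earlier
         allocation before reporting the failure. */
      free(out->from);
      out->from = NULL;
      return -1;
    }
  }
  return 0;
}
```

A caller would check the return value and, on success, `free()` both members when done.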
This looks good, but we really need a test case or two for this as well, right?
Yep - that's this evening's job. I've dug out the HTTP IDN test cases (165, 1034, 1035, 1448, 2046 and 2047) which will hopefully help me out with the non-ASCII stuff ;-) Just waiting for my munchies to be delivered by Ocado (other supermarket delivery services are available) before I get stuck in ;-)
This avoids the duplication of strings when the optional AUTH and SIZE parameters are required. It also assists with the modifications that are part of curl#4892.
I think this is about there and ready for public scrutiny:
My thoughts were similar when I started this journey and I must admit I am still in two minds even now, having spent a fair amount of time with RFC-6531. My understanding is that if the client wants to transmit UTF-8 in either a) the local part of the mailbox in an envelope command, or b) any part of a mailbox address used in one or more headers, then the SMTPUTF8 extension is required. However, if the client wants to transmit UTF-8 in only the host part of mailboxes in an envelope command (but not in any header) then the client may choose to convert these, at its discretion, using IDN. This is detailed in [1] and I think is where I started out.
You could argue that with client-side SMTPUTF8 support there is no need to convert the host part to an A-label as it could be sent as a U-label - see [2].
Note: The server-side extension is present in the logs from #4828, so in summary I think our only issue there is that we didn't advertise SMTPUTF8 support in the MAIL command. If we had done so, the server should have accepted that address.
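The decision described above can be sketched as a small classifier. This is only an illustration of the reasoning for envelope addresses, not the code in this PR: `enum mailbox_action`, `has_non_ascii()` and `classify_mailbox()` are hypothetical names, and the sketch assumes the address is a bare `local@host` string already encoded as UTF-8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type, not taken from lib/smtp.c. */
enum mailbox_action {
  MAILBOX_SEND_AS_IS,     /* all ASCII, nothing to do */
  MAILBOX_IDN_OR_UTF8,    /* only the host part has non-ASCII bytes */
  MAILBOX_NEEDS_SMTPUTF8  /* the local part has non-ASCII bytes */
};

/* True if any of the first len bytes has the high bit set. */
static bool has_non_ascii(const char *s, size_t len)
{
  size_t i;
  for(i = 0; i < len; i++) {
    if((unsigned char)s[i] & 0x80)
      return true;
  }
  return false;
}

/* Assumption: the last '@' separates the local part from the host part. */
enum mailbox_action classify_mailbox(const char *address)
{
  const char *at;
  size_t local_len;

  assert(address != NULL);

  at = strrchr(address, '@');
  local_len = at ? (size_t)(at - address) : strlen(address);

  /* UTF-8 in the local part cannot be converted with IDN. */
  if(has_non_ascii(address, local_len))
    return MAILBOX_NEEDS_SMTPUTF8;

  /* UTF-8 only in the host part: convert to an A-label, or send as a
     U-label if the server offers SMTPUTF8. */
  if(at && has_non_ascii(at + 1, strlen(at + 1)))
    return MAILBOX_IDN_OR_UTF8;

  return MAILBOX_SEND_AS_IS;
}
```

Which of the two options to pick in the `MAILBOX_IDN_OR_UTF8` case is exactly the "at its discretion" choice discussed above; the sketch deliberately leaves it to the caller.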
Yep - agreed, I have made changes to implement this.
I've ended up implementing this for the MAIL command (both FROM and AUTH parameters), RCPT TO command and VRFY command. I've also added the advertisement of SMTPUTF8 to the EXPN command so the server may respond with UTF-8 based addresses.
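As a rough illustration of what advertising the extension in the MAIL command looks like on the wire, the sketch below formats a MAIL line with the optional AUTH, SIZE and SMTPUTF8 parameters. `format_mail_command()` is a hypothetical helper using plain `snprintf()`, and the CR/LF check is an added assumption; it is not the function used in lib/smtp.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the code in lib/smtp.c.
   from is expected to include its angle brackets, e.g. "<a@example.com>".
   Returns the snprintf() result, or -1 if a value contains CR or LF. */
int format_mail_command(char *buf, size_t buflen, const char *from,
                        const char *auth, const char *size, bool utf8)
{
  assert(buf != NULL && from != NULL);

  /* Assumption: reject line breaks so a value cannot smuggle in a
     second command. */
  if(strpbrk(from, "\r\n") ||
     (auth && strpbrk(auth, "\r\n")) ||
     (size && strpbrk(size, "\r\n")))
    return -1;

  return snprintf(buf, buflen, "MAIL FROM:%s%s%s%s%s%s",
                  from,                    /* mandatory */
                  auth ? " AUTH=" : "",    /* optional */
                  auth ? auth : "",
                  size ? " SIZE=" : "",    /* optional */
                  size ? size : "",
                  utf8 ? " SMTPUTF8" : ""); /* internationalised mailbox */
}
```

Building the whole line from one format string means the mandatory part is written once regardless of which optional parameters are present. The caller should treat a return value greater than or equal to `buflen` as truncation.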
I think I'll leave IMAP for another rainy day.
[1] RFC-6531 Section 3.1 Paragraph 3 states:
However, the paragraph then goes on to state:
So I was tempted to implement support for ASCII-based recipients (i.e. the local part of a mailbox) on UTF-8 hosts using the IDN conversion to A-label.
[2] RFC-6531 Section 3.2 Paragraph 2 states:
…eter Non-ASCII host names will be ACE encoded if IDN is supported.
…meter Support the SMTPUTF8 extension when sending mailbox information in the FROM parameter. Non-ASCII domain names will be ACE encoded if IDN is supported. Reported-by: ygthien on github Fixes curl#4828
Simply notify the server we support the SMTPUTF8 extension if it does.
Hi @mback2k, Are you able to take a look at my Windows builds and suggest a way forward please? I'm trying to update/add some tests that output UTF-8 characters, which then breaks those tests on Windows :( For example: https://ci.appveyor.com/project/curlorg/curl/builds/31045056/job/knhcb7deywn9j1ks
@captain-caveman2k I can try to take a look this week. Unfortunately I have never been able to solve similar issues while porting the test suite to Windows, because the classic Windows Console is not so keen on UTF-8.
This patch set fixes support for internationalised characters in mailbox addresses, which affects the `MAIL`, `RCPT TO` and `VRFY` commands, by either a) using IDN to encode a UTF-8 host name into an ACE, which is 7-bit, or b) using the `SMTPUTF8` extension to support UTF-8 in both the local part of the mailbox and the host name part.

When the `SMTPUTF8` extension is used, the `EXPN` command will also advertise, to the server, its support for the extension.

As such, this PR not only fixes issue #4828 but also allows UTF-8 to be used "legally" in other parts of an SMTP message.

The log from issue #4828 shows that the server supports the `SMTPUTF8` extension, as per RFC-6531, but either of the above two fixes would solve that particular issue. However, if UTF-8 mailboxes are to be used in other headers then only option b can be used, as it implements the `SMTPUTF8` extension fully.

I have utilised the existing IDN conversion code from `url.c` for the UTF-8 to ASCII conversion. We might want to consider moving that code out of `url.c` into an IDN or hostname module but we can do that at a later date.

When I first opened this PR I had the following questions. I have kept them here for historical reasons but I believe we have resolved them now.
--
SMTPUTF8
?MAIL_FROM
field - do we need to?MAIL_AUTH
andRCPT TO