IDN - Internationalized domain name for email address (ASCII conversion) #463

Closed
JamesBoon opened this issue May 12, 2023 · 15 comments

@JamesBoon

Hi, thank you for this great library!

I must send emails to recipients with UTF-8 characters as part of the domain name. E.g. "you@exämple.com".
As I am using Postfix to send my mails, which does no automatic conversion to the ASCII (xn--mumble) form, I thought maybe SimpleJavaMail could/would do this.

I've been searching the docs and issues, but could not find a clear answer.
Is there a config for this? Or do I have to implement it myself?

Some code:

Email email = EmailBuilder.startingBlank()
        .withSubject("Test subject")
        .from("me@example.com")
        .to("you@exämple.com") // Expected: "To: you@xn--exmple-cua.com"
        .withHTMLText("<b>Bold html text!</b>")
        .buildEmail();

Mailer mailer = MailerBuilder
        .withTransportModeLoggingOnly()
        .buildMailer();

mailer.sendMail(email);
@RohanNagar
Contributor

RohanNagar commented May 12, 2023

Just chiming in here, it is pretty easy to convert the domain to ASCII yourself in Java if it is not an option in SimpleJavaMail:

String convertedDomain = IDN.toASCII(domain, IDN.ALLOW_UNASSIGNED);
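
For anyone who wants to try this without an extra dependency, here is a minimal sketch built only on `java.net.IDN`. The class `IdnAddressSketch` and the helper `toAsciiAddress` are hypothetical names made up for illustration; they are not part of Simple Java Mail or JMail. It assumes a plain `local@domain` address and naively splits on the last `@`, so comments and other exotic address forms are not handled.

```java
import java.net.IDN;

final class IdnAddressSketch {

    // Hypothetical helper: converts only the domain part to its ASCII (xn--) form.
    static String toAsciiAddress(String address) {
        int at = address.lastIndexOf('@');
        if (at <= 0 || at == address.length() - 1) {
            throw new IllegalArgumentException("Not a simple local@domain address: " + address);
        }
        String localPart = address.substring(0, at);
        String domain = address.substring(at + 1);
        // IDN.toASCII throws IllegalArgumentException if the domain cannot be converted.
        return localPart + "@" + IDN.toASCII(domain, IDN.ALLOW_UNASSIGNED);
    }

    public static void main(String[] args) {
        // The a-umlaut is written as a Unicode escape to avoid source-encoding surprises.
        System.out.println(toAsciiAddress("you@ex\u00e4mple.com"));
        // prints "you@xn--exmple-cua.com"
    }
}
```

The local part is left untouched on purpose: the ASCII conversion discussed here applies to the domain only.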

@JamesBoon
Author

Hi @RohanNagar, thank you very much.

Does something like the following lines make sense?

com.sanctionco.jmail.Email rawTo = JMail.validator().tryParse("you@exämple.com").get();

String to = rawTo.localPartWithoutComments() + "@" +
        rawTo.domainParts().stream()
                .map(part -> IDN.toASCII(part, IDN.ALLOW_UNASSIGNED))
                .collect(Collectors.joining("."));
// ...

btw. thank you for JMail 😃

If I may, I would like to ask you one more question: why IDN.ALLOW_UNASSIGNED?

@RohanNagar
Contributor

Your example should work! I think you could make it even simpler if you want:

com.sanctionco.jmail.Email rawTo = JMail.tryParse("you@exämple.com").get();

String to = rawTo.localPartWithoutComments() + "@" +
        IDN.toASCII(rawTo.domainWithoutComments(), IDN.ALLOW_UNASSIGNED);

// ...

I started using IDN.ALLOW_UNASSIGNED once I realized that emojis could be included in the domain. If you know that your domain will not contain emojis you could probably leave that out.
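
A small sketch to see the effect of the flag for yourself. As far as I understand, `java.net.IDN` follows RFC 3490, which is tied to Unicode 3.2, so code points added later (emojis among them) count as "unassigned" and are rejected unless `IDN.ALLOW_UNASSIGNED` is set. The class `AllowUnassignedSketch` and the helper `tryToAscii` are hypothetical names used only for this illustration.

```java
import java.net.IDN;

final class AllowUnassignedSketch {

    // Hypothetical helper: returns the ASCII form, or null if java.net.IDN rejects the input.
    static String tryToAscii(String domain, int flags) {
        try {
            return IDN.toASCII(domain, flags);
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Emoji U+1F4A9 written as a surrogate pair, followed by an ordinary TLD.
        String emojiDomain = "\uD83D\uDCA9.la";
        System.out.println("default flags:    " + tryToAscii(emojiDomain, 0));
        System.out.println("ALLOW_UNASSIGNED: " + tryToAscii(emojiDomain, IDN.ALLOW_UNASSIGNED));
    }
}
```

Running it with both flag values shows whether your JDK accepts the emoji domain in each case.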

@bbottema
Owner

Hey guys, thanks for reaching out. I'm wondering now: is this something Simple Java Mail should take care of internally, and is it safe to make those assumptions in all cases? Or is this really something that's up to the users?

@JamesBoon
Author

Hi @RohanNagar, thank you, that is just perfect 😄 Didn't think it could be so easy.

Hi @bbottema, I think it would be super helpful if Simple Java Mail at least had an option to switch this transformation on (opt-in, so there would be no breaking change).

@RohanNagar
Contributor

@bbottema Internationalized domain names are valid as of RFC 6530. I believe it is supposed to be the mail server's responsibility to map the IDN to ASCII, but I'm sure there are many legacy servers not doing that.

I think it would probably be ok to make the assumption to convert to ASCII in all cases, but I can't be 100% sure.

Wikipedia has a high level explanation: https://en.wikipedia.org/wiki/Email_address#Internationalization

@JamesBoon
Author

I am far from being an expert on IDN, but from what I have read, there are two specifications: the old "IDNA2003" and the current "IDNA2008". They treat a few characters very differently. See https://unicode.org/faq/idn.html

When testing the example given on that page, the Java method IDN.toASCII returns the old "IDNA2003" result (tested with Java 19):

System.out.println(IDN.toASCII("faß.de", IDN.ALLOW_UNASSIGNED));
// Result: "fass.de"
// Expected: "xn--fa-hia.de"

If you have been using Simple Java Mail with a service like Gmail, it will do the conversion for you and will probably do it correctly, that is, according to the current specification.

That is why I think it would be a breaking change if Simple Java Mail automatically converted all domain names.
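
One cautious way to keep using `java.net.IDN` in the meantime is to refuse the inputs where the two specifications are known to disagree. The sketch below checks for the four "deviation" characters listed in Unicode UTS #46 (sharp s, Greek final sigma, zero width non-joiner, zero width joiner) and fails instead of silently producing the IDNA2003 result. The names `DeviationGuardSketch`, `hasDeviationCharacter` and `toAsciiOrFail` are hypothetical, and this is not a full IDNA2008 check: it only covers those four characters.

```java
import java.net.IDN;

final class DeviationGuardSketch {

    // UTS #46 deviation characters: U+00DF, U+03C2, U+200C, U+200D.
    private static final String DEVIATIONS = "\u00DF\u03C2\u200C\u200D";

    // True if the domain contains a character that IDNA2003 and IDNA2008 map differently.
    static boolean hasDeviationCharacter(String domain) {
        return domain.chars().anyMatch(c -> DEVIATIONS.indexOf(c) >= 0);
    }

    // Converts with java.net.IDN (IDNA2003 behaviour) unless a deviation character is present.
    static String toAsciiOrFail(String domain) {
        if (hasDeviationCharacter(domain)) {
            throw new IllegalArgumentException(
                    "Domain contains a character handled differently by IDNA2003 and IDNA2008: " + domain);
        }
        return IDN.toASCII(domain, IDN.ALLOW_UNASSIGNED);
    }

    public static void main(String[] args) {
        System.out.println(hasDeviationCharacter("fa\u00DF.de"));       // prints "true"
        System.out.println(hasDeviationCharacter("ex\u00e4mple.com"));  // prints "false"
    }
}
```

Failing loudly here is a design choice: a rejected address can be handled by hand, while a silently rewritten domain may deliver mail to the wrong host.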

@JamesBoon
Author

This IDN problem is a lot more complicated than I thought. There are only very few libraries that actually handle these names correctly according to the current "IDNA2008" standard.

I have found some very interesting readings at: https://community.icann.org/display/TUA/UA+Training+Materials

The one library that was referred to as being "The gold standard library for Unicode" is: https://mvnrepository.com/artifact/com.ibm.icu/icu4j/73.1

Using it with the example "faß.de" above, it returns the expected result:

import com.ibm.icu.text.IDNA;

IDNA validator = IDNA.getUTS46Instance(
        IDNA.NONTRANSITIONAL_TO_ASCII
                | IDNA.NONTRANSITIONAL_TO_UNICODE
                | IDNA.CHECK_BIDI
                | IDNA.CHECK_CONTEXTJ
                | IDNA.CHECK_CONTEXTO
                | IDNA.USE_STD3_RULES);

IDNA.Info info = new IDNA.Info();
StringBuilder output = new StringBuilder();

validator.nameToASCII("faß.de", output, info);

System.out.println(output);
// Result: "xn--fa-hia.de"

I do not know what this means for Simple Java Mail or JMail. I just wanted to share my findings and hope it helps.

@bbottema
Owner

I think this is beyond me, and beyond the scope of Simple Java Mail. I'm going to close this as won't fix, unless someone has a very good argument to the contrary. Thanks for looking into this!

@JamesBoon
Author

Closing this is perfectly fine with me.
However, I have one question: would it be OK for you to add a note to the documentation saying that users may have to handle internationalized domain names themselves?
That would have helped me a lot in the first place 😃

@bbottema
Owner

Where exactly? I would be happy to accept a pull request too, btw

@JamesBoon
Author

I'd love to contribute! But I'm not sure where the note will fit.

@JamesBoon
Author

Sigh. I give up. I couldn't find a good place to add such a note to the docs. Maybe it is enough that this issue exists and has enough resources to get someone started.

Thank you @bbottema and @RohanNagar for your time and help!

@JamesBoon
Author

If anyone is interested, I've created a gist to play around with the email domain names: IdnEmailExample.java

@RohanNagar
Contributor

@JamesBoon thanks so much for digging into this! I'm actually very interested in implementing this better in JMail and have opened an issue to implement this (linked just above this comment).

3 participants