utf8-domains are not valid #47

Open
mborko opened this issue Jun 2, 2017 · 13 comments

Comments

@mborko

mborko commented Jun 2, 2017

"Invalid domain name schönwetterwolke.at, fails regexp check"

Please, edit the regexp filter for adding domains to allow domains with UTF-8 encoding. Postfix [1] enabled the utf8-support by default from version >= 3.0. So it would be great, if postfixadmin (functions.inc.php:203) allows creating such domains too. UI should also be checked afterwards to ensure error free usage!

[1] http://www.postfix.org/SMTPUTF8_README.html
[2] https://stackoverflow.com/questions/14313849/how-to-validate-internationalized-domain-names
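
One way to read this request is: convert the name to its ASCII (xn--) form first, then run an ordinary ASCII check on the result. A minimal Python sketch of that idea follows. It is not postfixadmin's code: `ASCII_DOMAIN` and `is_valid_domain` are hypothetical names, the regexp only stands in for the existing check, and Python's built-in `idna` codec implements the older IDNA 2003 rules.

```python
import re

# Hypothetical ASCII-only pattern, standing in for the existing regexp check.
ASCII_DOMAIN = re.compile(
    r"^(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z][a-z0-9-]{0,61}[a-z0-9]$"
)

def is_valid_domain(name: str) -> bool:
    """Return True if `name` passes the ASCII check after IDNA conversion."""
    try:
        # Non-ASCII labels become xn-- labels; ASCII labels pass through.
        ascii_name = name.encode("idna").decode("ascii").lower()
    except UnicodeError:
        # Raised for empty or over-long labels and for disallowed characters.
        return False
    return bool(ASCII_DOMAIN.match(ascii_name))

print(is_valid_domain("schönwetterwolke.at"))
print(is_valid_domain("bad_domain.com"))
```

With this approach the UTF-8 spelling is accepted at input time, while everything after validation only ever sees ASCII.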

@DavidGoodwin
Member

That commit seems to work for me; it requires the intl php extension however.
The domain is saved in the database in its unicode encoded form.
I've not tested whether that works with Postfix, but am assuming it does.

@mborko
Author

mborko commented Jun 2, 2017

I'm sorry, I was using version 3.0.2 from SourceForge, which still had the old regex. I have now cloned it from GitHub, and the creation process works.

But I'm getting an "Invalid parameter!" error when accessing the link "./postfixadmin/list-virtual.php?domain=sch%26ouml%3Bnwetterwolke.at" in the domain list. The database entries are correct. Do you check the domain parameter again, in some other way, for the UI?

@cboltz
Member

cboltz commented Jun 7, 2017

I'm afraid utf8 in domains is opening a very big can of worms :-(

The encoding in the url parameter is probably the easiest fix (urlencode() instead of htmlentities()).
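
To illustrate why the link above breaks, here is a small Python sketch. It only mimics the two PHP functions: `html_entities` is a hypothetical stand-in for `htmlentities()`, and `urllib.parse.quote` plays the role of `urlencode()`.

```python
from html.entities import codepoint2name
from urllib.parse import quote

def html_entities(text: str) -> str:
    """Rough stand-in for PHP's htmlentities(): use named entities where known."""
    return "".join(
        f"&{codepoint2name[ord(ch)]};" if ord(ch) in codepoint2name else ch
        for ch in text
    )

domain = "schönwetterwolke.at"

# Entity-encoding first turns "ö" into "&ouml;", and the URL then carries the entity.
broken = quote(html_entities(domain), safe="")
# Percent-encoding the raw UTF-8 string keeps the original characters recoverable.
fixed = quote(domain, safe="")

print(broken)  # sch%26ouml%3Bnwetterwolke.at
print(fixed)   # sch%C3%B6nwetterwolke.at
```

The first output matches the link from the report: the receiving page decodes it to `sch&ouml;nwetterwolke.at`, which is not the name stored in the database.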

The more interesting part is on the database side. Currently all fields holding a domain or mail address are latin1. Making them utf8 (or even utf8mb4) will cause trouble with the index length and make indexes covering two varchar(255) impossible (we do that in the vacation_notification table).
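
The index-length concern is simple arithmetic, sketched below in Python. The bytes-per-character values are the worst cases for each MySQL character set. The actual limits depend on storage engine and row format; the figures usually quoted from the MySQL documentation are 1000 bytes per key for MyISAM and a 767-byte per-column prefix for older InnoDB row formats (3072 with large prefixes). Treat those as assumptions to verify, not as facts from this thread.

```python
# Worst-case bytes per character for the character sets discussed here.
BYTES_PER_CHAR = {"latin1": 1, "utf8": 3, "utf8mb4": 4}

def index_key_bytes(charset: str, lengths=(255, 255)) -> int:
    """Worst-case key size of an index covering varchar columns of these lengths."""
    return sum(BYTES_PER_CHAR[charset] * n for n in lengths)

for charset in BYTES_PER_CHAR:
    print(charset, index_key_bytes(charset))
# latin1 510
# utf8 1530
# utf8mb4 2040
```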

I'd prefer to have the punycode domain (xn--whatever) in the database to avoid all the utf8 "fun" unless there is a good reason to use "real utf8" domains. The frontend could then display the utf8 name, or even both utf8 and punycode.

Can you please test if everything (especially sending and receiving mails) works with the xn--whatever variant instead of real utf8 in the database?

@mborko
Author

mborko commented Jun 7, 2017

I had a problem receiving emails sent from gmail/gmx/... to the managed domain on my mail server. I was using the ASCII (xn--mumble) form, but Postfix switched its default to UTF-8 [1]. After disabling the UTF-8 form ("smtputf8_enable = no"), receiving mail works again, because automatic conversions are not supported. I hope there will be no trouble in the future, but I don't know when the Postfix developers will switch entirely to UTF-8.

Maybe you will consider switching to UTF-8 in the future ... So, for now it works.

[1] http://www.postfix.org/SMTPUTF8_README.html

cboltz added a commit that referenced this issue Jun 22, 2017
Unicode support is a much bigger can of worms (see the discussion in #47),
and having just a little part of unicode support in is a bad idea.

You can of course use the xn--whatever notation for unicode domains ;-)
@xpunkt

xpunkt commented Jun 27, 2017

utf8 is not used on any dns servers, if i fail here, please list the buggy dns servers :=)

all we need is idn support in postfixadmin, nothing more nothing less

note when this is done, remaining part is to make postfix dual mapping for idn domain AND smtputf8 variant, else one of them gets Relay denied :)

so i changed postfix default to have smtputf8 disabled, it would be an error to enable it with missing maps files for utf8, what charset is used in sql is less relevant, if it supports all 7bit domains
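
A rough Python sketch of the "dual mapping" idea: for every hosted domain, emit both the xn-- spelling and the UTF-8 spelling as lookup keys, so that either form matches. The function names and the `OK` right-hand value are hypothetical, and the output is only meant to resemble a Postfix lookup table source file.

```python
def both_forms(domain: str) -> list:
    """Return the A-label form and, if it differs, the U-label form."""
    a_label = domain.encode("idna").decode("ascii")
    u_label = a_label.encode("ascii").decode("idna")
    return [a_label] if a_label == u_label else [a_label, u_label]

def domain_map_lines(domains, value="OK") -> list:
    """Build 'key<TAB>value' lines listing every spelling of every domain."""
    return [f"{form}\t{value}" for d in domains for form in both_forms(d)]

for line in domain_map_lines(["example.com", "société.org"]):
    print(line)
```

Plain ASCII domains produce one line, internationalized ones produce two.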

@gnanet

gnanet commented Jun 29, 2017

"SMTP Extension for Internationalized Email" (RFC 6531) clearly states that domain names must conform to IDNA, which means that even if the DNS library understands Unicode, in the end the A-label format (xn--) is used in the query to a DNS server:

Any domain name to be looked up in the DNS MUST conform to and be processed as specified for Internationalizing Domain Names in Applications (IDNA) [RFC5890]. When doing lookups, the SMTPUTF8-aware SMTP client or server MUST either use a Unicode-aware DNS library, or transform the internationalized domain name to A-label form (i.e., a fully-qualified domain name that contains one or more A-labels but no U-labels) as specified in RFC 5890 [RFC5890].

So the best is to store and validate domains in the xn-- form

my 2 cents (as a native Hungarian and a DNS admin for more than 15 years, 9 of which I worked at a registrar)

@gessel

gessel commented Nov 11, 2017

It appears the same regex check also refuses UTF-8 user names. Since Postfix has no trouble with UTF-8 domain names, it would seem reasonable to allow them. I run against PostgreSQL and have no trouble with UTF-8 characters.

@phoehnel

Has anything changed on this topic yet?
I "fixed" the problem for myself by editing the regex in check_domain for my needs, but why isn't this in the official version yet?

@DavidGoodwin
Member

No, I'm not aware of any changes having taken place to fix it.

I assume the proper fix is to store the domain name Punycode-encoded (xn-- form), which would presumably require postfixadmin to translate it to UTF-8 when rendering the web UI?
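
As a sketch of what that translation could look like (in Python rather than postfixadmin's PHP, with the hypothetical helper name `display_name`, and assuming the stored value is always the ASCII xn-- form):

```python
def display_name(stored: str) -> str:
    """Render a stored A-label domain for the UI, showing both spellings."""
    # Decoding with the idna codec turns xn-- labels back into Unicode.
    unicode_form = stored.encode("ascii").decode("idna")
    if unicode_form == stored:
        return stored  # plain ASCII domain, nothing to translate
    return f"{unicode_form} ({stored})"

print(display_name("xn--socit-esab.org"))  # société.org (xn--socit-esab.org)
print(display_name("example.com"))         # example.com
```

Showing both spellings keeps the database value visible to the admin while still being readable.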

@gessel

gessel commented Jan 25, 2021

I realize this was closed last year, but... corona time. Also when last I checked, gmail was converting UTF8 domains to punycode before sending so mail could be delivered with UTF8 enabled on Postfix. This is no longer the case.

It appears to be an issue that is destined to become more pressing as "international mail" is more widely adopted. As it stands:

Postfix currently does not convert internationalized domain names from UTF-8 into ASCII (or from ASCII into UTF-8) before using domain names in SMTP commands and responses, before looking up domain names in lists such as mydestination, relay_domains or in lookup tables such as access tables, etc., before using domain names in a policy daemon or Milter request, or before logging events.

http://www.postfix.org/SMTPUTF8_README.html

If you enable SMTPUTF8 support with smtputf8_enable = yes, gmail (at least) will convert email addresses entered as punycode to UTF8, which will be rejected as unknown if postfix is configured by postfixadmin, which won't validate UTF8 domain names. If smtputf8_enable is disabled, then gmail will convert utf8 into punycode and mail will be delivered.

I understand the varchar(255) issue was a hassle with utf8mb4 on MySQL/MariaDB, but for those of us on PostgreSQL this has never been an issue. Further, with MySQL/MariaDB, innodb_large_prefix should resolve the issue and permit setting the storage engine to InnoDB and the character set to utf8mb4 (collation utf8mb4_general_ci), no?

Forming a workable validation regular expression appears to still be a work in progress. Perhaps a simple validation override would be a convenient way to allow UTF8/EAI/RFC 6532 compliant servers to host UTF8 mailboxes and domains.

@xpunkt

xpunkt commented Feb 17, 2021

i came to think one more time now :=)

why not just virtual alias from eai to virtual alias idn map it ?

i still have not found a dns server that supports eai, dead end imho

@gessel

gessel commented May 1, 2021

Windows 2000 and 2003 apparently support UTF-8 domain names (https://docs.microsoft.com/en-us/troubleshoot/windows-server/identity/naming-conventions-for-computer-domain-site-ou)

It appears there are patches for BIND 9 to support UTF-8 encoded domains (https://www.nic.ad.jp/ja/idn/mdnkit/download/documents/mdnkit-1.2-doc/en/spec/bind9.html)

Thunderbird 89.0a1 (at least) supports UTF-8 mailboxes (EAI localpart) and UTF-8 domains (IDN). (TB 68 does not support EAI localpart but does support IDN).

Dovecot supports UTF8 mailbox names, I can't find clear confirmation that LMTP is SMTPUTF8 compatible yet.

The problem is that setting "smtputf8_enable = yes" is necessary to signal support for EAI, which is the local part in UTF-8, to the left of the @. Setting "smtputf8_enable = no" permits delivery to EAI@IDN (at least from gmail) as gmail uses punycode IDN which matches punycode virtual_mailbox_domains, but this should block EAI, that is utf-8 mailboxes.

For the local part (mailbox), the lack of UTF-8 support (EAI) seems pretty hard to defend, no? The lack of UTF8 localpart support does absolutely break EAI delivery as there's no punycode translation of UTF8 localparts.

Using the language of https://community.icann.org/download/attachments/132940436/Configuring%20EAI%20-%20African%20Association%20of%20Universities%20EAI%20webinar%20-%2016Nov20.pdf?version=1&modificationDate=1608023031000&api=v2 Slide 15

Works: ASCII@ASCII ekrem@misal.istanbul
Works: ASCII@IDN marc@société.org (assuming sender conversion to marc@xn--socit-esab.org)
Works: punycode@punycode xn--ugb0x@xn--ngbeu.com
Fails: Unicode@ASCII 测试@example.com
Fails: Unicode@IDN अजय@SOCIété.org (Full EAI support) (also fails as अजय@xn--socit-esab.org)

That is, not RFC 6855/6856 compliant.
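
The categories in that list can be told apart mechanically. The Python sketch below uses hypothetical helper names (`classify`, `requires_smtputf8`) and a simplified rule: any non-ASCII local part needs SMTPUTF8, because only the domain has an ASCII (xn--) equivalent.

```python
def classify(address: str) -> str:
    """Label an address in the local@domain terms used in the list above."""
    local, _, domain = address.rpartition("@")
    local_kind = "ASCII" if local.isascii() else "Unicode"
    if not domain.isascii():
        domain_kind = "IDN"
    elif any(label.lower().startswith("xn--") for label in domain.split(".")):
        domain_kind = "IDN (A-label)"
    else:
        domain_kind = "ASCII"
    return f"{local_kind}@{domain_kind}"

def requires_smtputf8(address: str) -> bool:
    """A non-ASCII local part has no ASCII fallback; the domain does."""
    local, _, _ = address.rpartition("@")
    return not local.isascii()

for addr in ["ekrem@misal.istanbul", "marc@société.org", "测试@example.com"]:
    print(addr, classify(addr), requires_smtputf8(addr))
```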

this presentation has some interesting points: https://uasg.tech/wp-content/uploads/2018/02/UASG019B-Email-Address-Internationalization-A-technical-perspective.pdf

specifically relevant is the advice to:

  • Validate EAI mailboxes using LGR rules so they're punycode compatible
  • Automatically (or at least offer to) generate a punycode alias to the same mailbox
  • case folding isn't practical
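
A minimal Python sketch of the second bullet, generating an all-ASCII alias for a UTF-8 mailbox. This is an illustration only: `punycode_alias` is a hypothetical helper, prefixing the local part with `xn--` mirrors the xn--ugb0x example above rather than any standard, and the LGR validation from the first bullet is not implemented.

```python
def punycode_alias(address: str) -> str:
    """Return an all-ASCII alias for an address with Unicode parts."""
    local, _, domain = address.rpartition("@")
    if not local.isascii():
        # Raw Punycode of the local part, prefixed the way IDN labels are.
        local = "xn--" + local.encode("punycode").decode("ascii")
    # The domain goes through the normal IDNA conversion.
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"

alias = punycode_alias("测试@société.org")
print(alias)
```

An alias table could then map the returned address to the original UTF-8 mailbox.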

** gmail autoconverts punycode IDN to UTF8 on receipt. Neither gmail nor thunderbird autoconvert punycode mailboxes to UTF8. Gmail displays the mailbox as xn--ugb0x@برت.com

It seems the original objection to UTF-8 local parts was based on breaking legacy database structures, but innodb_large_prefix was introduced back in MySQL 5.6, is on by default now, and soon it won't be possible to disable it. Most (all?) current file systems are UTF-8 compatible now. I'd argue that the expectations of internationalization and alignment with the capabilities of the rest of the email ecosystem are more compelling than preserving compatibility with pre-2013 MySQL versions.

EAI support requires validating valid UTF-8 virtual_mailbox_domains and UTF-8 mailboxes (slide 54). This might be a nice goal for the 10th anniversary of RFC 6855.

@xpunkt

xpunkt commented May 1, 2021 via email
