Unicode support in EmailField #1527
Merged
Fixes #1228
Previously, email addresses containing Unicode characters didn't work at all. Now, Unicode domains are supported out of the box, and UTF-8 validation of the user part can be explicitly enabled. This PR also adds validation of IP-based domain parts (e.g. "user@[127.0.0.1]") and introduces support for whitelisting otherwise invalid domains (e.g. "root@localhost").
Support for Unicode usernames is still somewhat limited in many packages, so that option should be enabled only after a thorough inspection of the other parts of the system. The domain whitelist is empty and IP validation is disabled by default, primarily to stay consistent with past behavior, and because many applications may choose to reject such addresses.
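For reviewers who want a feel for how these options interact, here is a rough standalone sketch. It is not the code in this PR: the function name `is_valid_email`, its parameter names, and the regexes are all assumptions made up for illustration. It only mirrors the behavior described above (Unicode domains accepted via IDNA encoding, UTF-8 user parts opt-in, IP domains opt-in, whitelisted domains always accepted).

```python
import ipaddress
import re

# Illustrative patterns only; the PR's actual regexes may differ.
_ATOM_ASCII = r"[-!#$%&'*+/=?^_`{}|~0-9A-Za-z]+"
_ATOM_UTF8 = r"[-!#$%&'*+/=?^_`{}|~\w]+"  # \w matches Unicode letters on str
ASCII_USER_RE = re.compile(rf"{_ATOM_ASCII}(\.{_ATOM_ASCII})*")
UTF8_USER_RE = re.compile(rf"{_ATOM_UTF8}(\.{_ATOM_UTF8})*")
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
# At least one dot; the TLD is alphabetic or a punycode ("xn--") label.
DOMAIN_RE = re.compile(
    rf"(?:{_LABEL}\.)+(?:[A-Za-z]{{2,63}}|[Xx][Nn]--[A-Za-z0-9-]{{1,59}})"
)


def is_valid_email(value, domain_whitelist=(), allow_utf8_user=False,
                   allow_ip_domain=False):
    """Hypothetical validator mirroring the options described in this PR."""
    if "@" not in value:
        return False
    user, domain = value.rsplit("@", 1)

    # User part: ASCII-only unless UTF-8 validation is explicitly enabled.
    user_re = UTF8_USER_RE if allow_utf8_user else ASCII_USER_RE
    if not user_re.fullmatch(user):
        return False

    # Whitelisted domains (e.g. "localhost") bypass the domain checks.
    if domain in domain_whitelist:
        return True

    # IP literal such as "[127.0.0.1]": only accepted when enabled.
    if domain.startswith("[") and domain.endswith("]"):
        if not allow_ip_domain:
            return False
        try:
            ipaddress.ip_address(domain[1:-1])
        except ValueError:
            return False
        return True

    # Unicode domains: convert to the ASCII (punycode) form, then validate.
    try:
        ascii_domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        return False
    return bool(DOMAIN_RE.fullmatch(ascii_domain))
```

With the defaults, `is_valid_email("root@localhost")` and `is_valid_email("user@[127.0.0.1]")` are both rejected, which matches the "consistent with past behavior" rationale; passing `domain_whitelist=("localhost",)` or `allow_ip_domain=True` opts in to each case separately.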
Performance
TL;DR: validation of regular ASCII-only email addresses is slightly slower than before, but still fast enough for the vast majority of use cases (< 5 µs). Validation of Unicode email addresses is slower by an order of magnitude (~75-85 µs), but it works as expected, and the extra cost applies only to these less common addresses.
On master:
On this branch: