sql: support an EMAIL data type which implements correct address validation #24437
Comments
This is a cool idea! What do you think is the appropriate extent for validation? One possibility for implementing this without supporting an entirely new data type could be to create a new builtin
though I might have to defer to the wisdom I've heard before that I'm not sure there's much benefit gained here over just doing
given that foo@bar.com is a perfectly valid email that I give to sites all the time :)
In the linked SO answer, they do something like:
CREATE EXTENSION citext;
CREATE DOMAIN email AS citext
  CHECK ( value ~ '^[a-zA-Z0-9.!#$%&''*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$' );
which seems pretty good without going crazy checking MX records and such (though I am not an email expert so IDK). If you want to lose your mind you can look at the
Based on personal experience a lot of marketing email software does not seem to do anywhere near this level of validation, and as a result I get a lot of e-commerce email (including purchase receipts with PII, even!) addressed to other Richard Lovelands who have similar (but not the same) email addresses. I suspect they are doing something like the
A data type or function would not necessarily prevent such sloppiness on the application side, but could make doing something closer to the right thing easier. In any case I definitely don't want to take away your
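To get a feel for what the CHECK constraint's regex accepts and rejects, here is a minimal Python sketch that ports the same pattern. The function name `looks_like_email` is made up for illustration, and Python's `re` engine is assumed to behave like Postgres's for this particular pattern; the SQL-escaped quote (`''`) becomes a single apostrophe.

```python
import re

# Port of the regex from the CREATE DOMAIN example in this thread.
# Local part: one or more of the listed characters.
# Domain: dot-separated labels of 1-63 characters that start and end
# with a letter or digit.
EMAIL_RE = re.compile(
    r"^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$"
)


def looks_like_email(value: str) -> bool:
    """Return True if value matches the pattern (syntax check only)."""
    # fullmatch avoids Python's `$` also matching before a trailing newline.
    return EMAIL_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["foo@bar.com", "foo@bar", "foo@-bar.com", "foo bar@baz.com"]:
        print(candidate, looks_like_email(candidate))
```

Note that the pattern accepts a dotless domain such as foo@bar, and that it only checks syntax: it cannot tell whether the mailbox exists or whether the person typed their own address correctly.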
We have marked this issue as stale because it has been inactive for
I still think this is a valid feature request. Lots of apps out there are still getting this wrong in 2021.
I'm fine to leave this open, but I am not entirely convinced we'd want this. Speaking personally, I've always felt that the question "is this a valid email address?" is almost always asking one of two things:
In other words, email addresses are a business logic concern, so the business logic should make sure the addresses satisfy whatever conditions are needed for the app to work. That's just my own 2c though. In order for us to prioritize this, we'd need to know what problems "invalid email addresses" are causing for users. Then we can decide whether those problems are best solved with this database feature.
I relate to this problem! My feeling has always been that I am getting these emails because my equally-named counterparts make typos in their email address when they sign up for things or enter their address anywhere.
Based on a quick google search it seems that this is not a data type that is implemented in other databases.
There is an apparently not-much-used (28 stars) Postgres extension: pgemailaddr.
However, it seems like a feature that many applications de facto implement by hand-rolling it, and they almost always get it wrong, since it is very difficult to implement a correct email validator (up to and including checking MX records, depending on how thorough one wants to be).
For example, see this StackOverflow question about the best way to store an email in Postgres.
The accepted response has two parts: (1) uses a pretty crazy regex, and the "more correct" (2) uses plperlu/links to a Perl library, Email::Valid, which is maintained by the CTO at Fastmail and presumably is in use there. My assumption is that it's pretty battle-tested if Fastmail is using it. The library not only determines whether an email address is valid, it also (optionally) checks whether a mail host exists for the domain.
I am not. However I would wager that many applications that use databases are doing workarounds due to the lack of this data type.
Jira issue: CRDB-5767