Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Validator] In "strict" mode, email validator inaccurately claims certain valid emails are invalid #18156

Closed
natechicago opened this issue Mar 13, 2016 · 14 comments

Comments

@natechicago
Copy link
Contributor

Symfony's email validation constraint symfony/src/Symfony/Component/Validator/Constraints/EmailValidator.phpcan be configured in one of two modes. Non-strict mode (default) will cause the validator to compare an email address against a trivial '/^.+\@\S+\.\S+$/' regex. It's sufficient for most traditional email addresses. Strict mode, on the other hand, delegates validation to an external library, Egulias\EmailValidator, which provides complex validation against the guidelines defined mostly in RFC-5321 and RFC-5322.

This external library can also be run in strict vs. non-strict mode. However, in the context of this library, "strict" means something very different than what it means in the context of Symfony's email validator. More on that in a moment. But for now, realize that Symfony's email validator chooses to run the external library in "strict" mode.

Consider the following code that demonstrates the bug. Please pardon the inline HTML.
https://gist.github.com/natechicago/e679b14d74cd84865c68

Its output can be seen here. Note that each email address shown is valid, per the RFCs: http://i.imgur.com/E44cp0o.png

In the "Symfony validator: Non-strict validation" column, the validation failures are not unexpected, since the simple regex cannot possibly account for every variety of valid email address.

However, in the "Symfony validator: Strict validation" column, we would expect that the validator would judge each of the listed email addresses as valid, since (a) each address is valid, per the RFCs, and (b) we've delegated this validation to a library designed to use the RFC standards to validate emails. As you can see, some of the more strange/uncommon (yet still legitimate) email addresses are inaccurately judged by Symfony as invalid.

The reason for this problem is that Symfony uses the Egulias\EmailValidator library in strict mode, rather than non-strict mode. In the context of the external library, "strict" means: "Even if an email address is valid per the RFCs, it will be considered invalid if any warnings (non-critical informational alerts) were raised during the parsing of the email address." For example, one such informational alert indicates the presence of a double-quoted local part in the email address. In general, a warning will occur if the address contains deprecated elements, has unusual elements (but remains valid per RFC-5321), or meets RFC-5322's broad definition of an email address (but might not satisfy the definitions in older/obsolete RFCs).

In the "Egulias validator: with $strict === false" column, you'll see the results of invoking the external validator directly, but in non-strict mode rather than strict mode. In my opinion, the corresponding results are in line with how Symfony's validator should behave when Symfony's validator is used in strict mode.

To summarize, currently:

Symfony's validator in non-strict mode: Validates against a simple regex.
Symfony's validator in strict mode: Validates against standard RFC rules. Uses external library in strict mode, which results in the unnecessary flagging of legitimate email addresses as invalid.

Proposed solution 1:

Symfony's validator in non-strict mode: Validates against a simple regex.
Symfony's validator in strict mode: Validates against standard RFC rules. Uses external library in non-strict mode, which won't consider strange/uncommon (yet still legitimate) emails as invalid.

Proposed solution 2:

Symfony's validator in non-strict mode: Validate against a simple regex.
Symfony's validator in strict mode: Validates against standard RFC rules. An option will be added to the constraint class to allow client code to determine whether or not the external library should be used in strict or non-strict mode. For backwards-compatibility, this would default to using the library in strict mode.

I'm submitting a patch in line with proposed solution 1, because it seems like solution 2 would lead to an unnecessarily-confusing constraint class (potentially needing to juggle two different definitions of the word "strict"). But if anyone feels differently (or for that matter, feels this shouldn't even be considered a bug in the first place), please let me know!

@natechicago
Copy link
Contributor Author

Also, the same issue was previously reported here: egulias/EmailValidator#20

From the author of the library:

But indeed current Symfony integration is not being strict in the same way the validator, is being 'real world' strict; that was discussed during development and was agreed that should be RFC strict so we should try to adhere to that. This means warnings should be treated as valid emails with warnings. [emphasis added]

While it's not entirely clear, I do think that statement supports the idea that Symfony's current usage of the library (using it in "strict" mode) is flawed.

@backbone87
Copy link
Contributor

i would deprecate the strict option in the EmailValidator and introduce a variable mode with a value of simple (symfony internal regex), rfc (external non-strict) or strict (external strict). the strict option then maps to simple if false and strict if true.

@nicolas-grekas
Copy link
Member

I like @backbone87's proposal, so this would be a new feature. I also suggest to add an html5 mode that validates the email against the HTML5 email validation regexp.

@egulias
Copy link
Contributor

egulias commented Mar 15, 2016

I agree with @natechicago 's proposal 1.
Yes, warnings need a thought but the fix could probably improve.
I'm working on v2 of the lib to admit several strategies (something similar to https://github.com/egulias/EmailValidator4J#validation-strategies), instead of flags in the isValidmethod, which could provide all the functionality @backbone87 proposes.

@webmozart
Copy link
Contributor

I like @backbone87's suggestion as well. What do you think about level instead of mode? I think it's easy to think about this option as increasing levels of strictness (even though not technically correct):

level Description
basic uses a basic regex
html5 uses the HTML5 regex
rfc complies to the RFC
rfc-no-warn complies to the RFC, disallows warnings

Example usage:

/**
 * @Email(level="rfc")
 */
private $email;

@backbone87
Copy link
Contributor

like it, the only change i would make is renaming rfc-no-warn to rfc-strict

@backbone87
Copy link
Contributor

also you could rename level to profile to indicate that there is no order derived from the values

@natechicago
Copy link
Contributor Author

Thanks for the feedback, everyone. I'll submit a new PR soon to implement the suggestions of @backbone87 and @webmozart .

@webmozart
Copy link
Contributor

like it, the only change i would make is renaming rfc-no-warn to rfc-strict

What is "strict"? I think that word is too arbitrary. "no-warn" is not beautiful, but much more expressive and in line with the RFC.

@backbone87
Copy link
Contributor

didnt read the RFC, if they dont use the "strict" terminology or prefer "warnings" over it, i am fine with it. was just a tought :)

still for profile over level

natechicago added a commit to natechicago/symfony that referenced this issue Apr 4, 2016
…ely claims certain valid emails are invalid

Email validation can now occur in one of four different modes/profiles:
 - Basic regex
 - HTML5 regex
 - RFC-compliant (non-strict)
 - RFC-compliant (strict; warnings will cause an otherwise-valid email to be considered invalid)
natechicago added a commit to natechicago/symfony that referenced this issue Apr 4, 2016
…ely claims certain valid emails are invalid

Email validation can now occur in one of four different modes/profiles:
 - Basic regex
 - HTML5 regex
 - RFC-compliant (non-strict; informational warnings raised during validation are ignored)
 - RFC-compliant (strict; warnings will cause an otherwise-valid email to be considered invalid)
natechicago added a commit to natechicago/symfony that referenced this issue Apr 4, 2016
…ely claims certain valid emails are invalid

Email validation can now occur in one of four different modes/profiles:
 - Basic regex
 - HTML5 regex
 - RFC-compliant (non-strict; informational warnings raised during validation are ignored)
 - RFC-compliant (strict; warnings will cause an otherwise-valid email to be considered invalid)
@natechicago
Copy link
Contributor Author

Hi all, pull request #18428 implements the new features that were suggested here.

natechicago added a commit to natechicago/symfony that referenced this issue Apr 5, 2016
…ely claims certain valid emails are invalid

Changes from code review.
natechicago added a commit to natechicago/symfony that referenced this issue Apr 10, 2016
…ely claims certain valid emails are invalid

Changes from code review.
@javiereguiluz
Copy link
Member

This looks incredibly over-engineered to me. Validating an email address without sending an actual email is almost impossible ... but then Symfony proposes not only one but four different solutions 😖

natechicago added a commit to natechicago/symfony that referenced this issue Apr 11, 2016
…ely claims certain valid emails are invalid

Changes from code review.
@lunetics
Copy link

lunetics commented Sep 1, 2016

any update on this one?

@xabbuh
Copy link
Member

xabbuh commented Jan 13, 2017

I am closing here as the related PR was rejected and it does not seem that we will change the validator in the future. Though as explained in #18428 (comment) one can simply create their own validator if you have a need for this.

@xabbuh xabbuh closed this as completed Jan 13, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants