#31 fixing and clarifying DEFAULT
bbottema committed Feb 29, 2016
1 parent e83b5f7 commit 4748e3d
Showing 1 changed file with 33 additions and 8 deletions.
@@ -6,6 +6,28 @@

/**
* Defines a set of restriction flags for email address validation. To remain completely true to RFC 2822, all flags should be set to <code>true</code>.
* <p>
* There are a few basic use cases:
* <ol>
* <li>
* User wants to scrape as much data from a possibly-ugly address as they can and make a sensible address from it; these users typically allow all
* kinds of addresses (except perhaps for single-domain addresses) because in the wild, legitimate senders often violate RFC 2822. E.g., if your goal is to
* parse spammy emails for analysis, you may want to allow every variation out there just so you can parse something useful.
* </li>
* <li>
* User wants to check to see if an email address is of proper, normal syntax; e.g. checking the value entered in a form. These users typically make
* everything strict, since what most people consider a "valid" email address is a drastic subset of 2822. For users with the strictest requirements,
* this library may not be enough, since although it checks most of RFC 2822, it might still be too 'tolerant' for their needs (on the other side of
* the spectrum, most libraries use a simple blah@blah.blah.com type regex, which as we of course know is
* <a href="http://www.troyhunt.com/2013/11/dont-trust-net-web-forms-email-regex.html">rarely a good idea</a>)
* </li>
* <li>
* User wants to intelligently parse a possibly-ugly address with the goal being a cleaned up usable address that other software
* (MTAs, databases, whatever) can use / parse without breaking; {@link #DEFAULT} tailors to this use case (with the possible exception of
* {@link #ALLOW_DOT_IN_A_TEXT}, to taste). In our experience these settings accepted "real" addresses the highest percentage of the time, and the
* addresses they rejected were almost all ridiculous.
* </li>
* </ol>
*
* @author Benny Bottema
*/
@@ -20,6 +42,7 @@ public enum EmailAddressCriteria {
* ("example.com"), then don't include this criteria.
*/
ALLOW_DOMAIN_LITERALS,

/**
* This criteria states that as per RFC 2822, quoted identifiers are allowed (using quotes and angle brackets around the raw address), e.g.:
* <p>
@@ -29,6 +52,7 @@ public enum EmailAddressCriteria {
* (<tt>john.smith@somewhere.com</tt> - no quotes or angle brackets), then don't include this criteria.
*/
ALLOW_QUOTED_IDENTIFIERS,

/**
* This criteria allows &quot;.&quot; to appear in atext (note: only atext which appears in the 2822 &quot;name-addr&quot; part of the address, not the
* other instances)
@@ -42,6 +66,7 @@ public enum EmailAddressCriteria {
* quotes.
*/
ALLOW_DOT_IN_A_TEXT,

/**
* This criteria allows &quot;[&quot; or &quot;]&quot; to appear in atext. Not very useful, maybe, but there it is.
* <p>
@@ -58,12 +83,12 @@ public enum EmailAddressCriteria {
* you.
*/
ALLOW_SQUARE_BRACKETS_IN_A_TEXT,

/**
* This criteria allows as per RFC 2822 &quot;)&quot; or &quot;(&quot; to appear in quoted versions of the localpart (they are never allowed in unquoted
* versions)
* <p>
- * You can disallow it, but better to include this criteria. I left this hanging around (from an earlier incarnation of the code) as a random option you
- * can
+ * You can disallow it, but better to include this criteria. I left this hanging around (from an earlier incarnation of the code) as a random option you can
* switch off. No, it's not necessarily useful. Long story.
* <p>
* If this criteria is not included, it will prevent such addresses from being valid, even though they are: &quot;bob(hi)smith&quot;@test.com
@@ -72,15 +97,15 @@ public enum EmailAddressCriteria {

/**
* The default setting is not strictly 2822 compliant. For example, it does not include the {@link #ALLOW_DOMAIN_LITERALS} criteria, which results in
- * exclusions on single domains.
+ * exclusions on single domains. Useful for cleaning up email strings that other middleware (i.e. the next server) will be able to understand.
* <p>
- * Included in the defaults are: <ul> <li>{@link #ALLOW_QUOTED_IDENTIFIERS}</li> <li>{@link #ALLOW_PARENS_IN_LOCALPART}</li> </ul>
+ * Included in the defaults are: <ul> <li>{@link #ALLOW_QUOTED_IDENTIFIERS}</li> <li>{@link #ALLOW_PARENS_IN_LOCALPART}</li> </ul>.
*/
- public static final EnumSet<EmailAddressCriteria> DEFAULT = of(ALLOW_DOMAIN_LITERALS);
+ public static final EnumSet<EmailAddressCriteria> DEFAULT = of(ALLOW_QUOTED_IDENTIFIERS, ALLOW_PARENS_IN_LOCALPART);

/**
- * Criteria which is most RFC 2822 compliant and allows all compliant address forms, including the more exotic ones.
+ * Criteria which is most RFC 2822 compliant and allows all compliant address forms, including the more exotic ones. Most useful for validating the broadest
+ * range of email addresses that should be allowed within the boundaries of RFC compliance.
*/
- public static final EnumSet<EmailAddressCriteria> RFC_COMPLIANT = of(ALLOW_DOMAIN_LITERALS, ALLOW_QUOTED_IDENTIFIERS, ALLOW_DOT_IN_A_TEXT,
- ALLOW_SQUARE_BRACKETS_IN_A_TEXT, ALLOW_PARENS_IN_LOCALPART);
+ public static final EnumSet<EmailAddressCriteria> RFC_COMPLIANT = EnumSet.allOf(EmailAddressCriteria.class);
}
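
To make the effect of this change concrete, here is a minimal sketch of how the two presets relate after the commit. It is not the library's code: `Criteria` and `CriteriaDemo` are hypothetical stand-ins, and only the five constant names and the contents of the two sets are taken from the diff. It assumes nothing beyond `java.util.EnumSet`.

```java
import java.util.EnumSet;

// Hypothetical stand-in for EmailAddressCriteria; only the constant names come from the diff.
enum Criteria {
    ALLOW_DOMAIN_LITERALS,
    ALLOW_QUOTED_IDENTIFIERS,
    ALLOW_DOT_IN_A_TEXT,
    ALLOW_SQUARE_BRACKETS_IN_A_TEXT,
    ALLOW_PARENS_IN_LOCALPART
}

// Hypothetical holder class, used here only to keep the sketch self-contained.
final class CriteriaDemo {
    // Same members as the new DEFAULT in the commit.
    static final EnumSet<Criteria> DEFAULT =
            EnumSet.of(Criteria.ALLOW_QUOTED_IDENTIFIERS, Criteria.ALLOW_PARENS_IN_LOCALPART);

    // Same construction as the new RFC_COMPLIANT: every constant of the enum.
    static final EnumSet<Criteria> RFC_COMPLIANT = EnumSet.allOf(Criteria.class);

    // The criteria that DEFAULT leaves out, i.e. everything RFC_COMPLIANT adds on top of it.
    static EnumSet<Criteria> missingFromDefault() {
        return EnumSet.complementOf(DEFAULT);
    }

    // DEFAULT plus ALLOW_DOT_IN_A_TEXT ("to taste" in the Javadoc).
    // EnumSet is mutable, so this works on a copy and leaves the shared constant untouched.
    static EnumSet<Criteria> defaultPlusDots() {
        EnumSet<Criteria> custom = EnumSet.copyOf(DEFAULT);
        custom.add(Criteria.ALLOW_DOT_IN_A_TEXT);
        return custom;
    }

    public static void main(String[] args) {
        System.out.println("DEFAULT:              " + DEFAULT);
        System.out.println("RFC_COMPLIANT:        " + RFC_COMPLIANT);
        System.out.println("Missing from DEFAULT: " + missingFromDefault());
        System.out.println("DEFAULT plus dots:    " + defaultPlusDots());
    }
}
```

Building the full set with `EnumSet.allOf` means a constant added to the enum later is picked up automatically, whereas an explicit `of(...)` list would have to be updated by hand. That is presumably why the commit switched `RFC_COMPLIANT` over, though the commit message does not say so.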
