Skip to content

Consider all domain suffixes for blacklisting and normalise implementations #54

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Mar 18, 2016

Conversation

owst
Copy link
Contributor

@owst owst commented Mar 18, 2016

This change was triggered by the Ruby implementation not matching any blacklisted domains with more than 1 subdomain, e.g. test@spamthis.co.uk would not be considered invalid, even though spamthis.co.uk is a blacklisted domain.

This refactoring normalises the implementation across the 7 languages:

  1. We're now using the same regexp for email wellformed-checking,
  2. All suffixes of an email's domain are checked for blacklist membership in the same way (modulo language/library differences),
  3. The test cases are now the same for each implementation.

This removes false-negatives (e.g. where the domain includes > 2 subdomains) in some cases, and false-positives (e.g. where a blacklisted domain appears as a subdomain) in other cases.

As a side effect, several improvements were made to the tests of PHP, Python in particular, using standard libraries for unit testing rather than ad-hoc tests.

Meta: I think this will be easiest to review on a commit-by-commit basis. This PR is based off my other pending PR, so that PR's commit also shows up here.

owst added 14 commits March 14, 2016 15:02
- Use the (better) regex from the Ruby code for checking for well-formed
  email addresses,
- Check for matches using identity equality (is/is not None),
- Use stdlib unittest module for tests and import testcases from Ruby
- Add instructions to run tests
- Ensure minitest is a development dependency
- Add tests for blank emails
- Use PHPUnit test framework (via composer dependency)
- Add instructions to run tests
- Multiple expectations per test
- Add blank/empty invalid email tests
- Use the same valid email regexp everywhere - based on PHP's
  FILTER_VALIDATE_EMAIL,
- Lowercase the list entries in the generator, rather than in each
  match implementation.
@owst
Copy link
Contributor Author

owst commented Mar 18, 2016

I can't work out why circle ci is failing on composer install --prefer-source --no-interaction I was able to run that locally with no problem. Perhaps a version issue, I'm on:

~ composer --version
Composer version 1.0.0-alpha11 2015-11-14 16:21:07

FGRibreau added a commit that referenced this pull request Mar 18, 2016
Consider all domain suffixes for blacklisting and normalise implementations
@FGRibreau FGRibreau merged commit d665557 into FGRibreau:master Mar 18, 2016
@FGRibreau
Copy link
Owner

Very impressive work @owst!! Thanks a lot!

@owst
Copy link
Contributor Author

owst commented Mar 18, 2016

You're welcome, thanks for maintaining the project!

How do we go about getting a new Ruby gem published? At work we're currently overriding MailChecker.valid_email?, so it'd be good to get rid of that code by using a newly published version.

By the way, do you have any idea why composer failed on Circle CI?

@FGRibreau
Copy link
Owner

Hello @owst ,

It's now fixed, I pushed ruby-mailchecker-1.6.3.gem :)

@owst
Copy link
Contributor Author

owst commented Mar 19, 2016

Great, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants