
Consider all domain suffixes for blacklisting and normalise implementations#54

Merged
FGRibreau merged 14 commits into FGRibreau:master from owst:master
Mar 18, 2016


@owst
Contributor

owst commented Mar 18, 2016

This change was triggered by the Ruby implementation not matching any blacklisted domain with more than one subdomain: for example, test@spamthis.co.uk would not be considered invalid, even though spamthis.co.uk is a blacklisted domain.

This refactoring normalises the implementation across the 7 languages:

  1. We're now using the same regexp to check that an email is well-formed,
  2. All suffixes of an email's domain are checked for blacklist membership in the same way (modulo language/library differences),
  3. The test cases are now the same for each implementation.

This removes false negatives (e.g. where the domain includes more than 2 subdomains) in some cases, and false positives (e.g. where a blacklisted domain appears as a subdomain) in other cases.
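
The suffix check described in point 2 can be sketched roughly as follows. This is a minimal Python illustration, not the code from this PR: the names `domain_suffixes`, `is_blacklisted` and `BLACKLIST` are hypothetical, and the one-entry blacklist only reuses the spamthis.co.uk example from above.

```python
def domain_suffixes(domain):
    """Return every dot-separated suffix of a domain, longest first."""
    labels = domain.lower().split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def is_blacklisted(email, blacklist):
    """Return True if any suffix of the email's domain is in the blacklist."""
    # rpartition splits on the last "@"; well-formedness is assumed to be
    # checked separately before this function is called.
    _, _, domain = email.rpartition("@")
    return any(suffix in blacklist for suffix in domain_suffixes(domain))


# Hypothetical one-entry blacklist, for illustration only.
BLACKLIST = {"spamthis.co.uk"}

print(is_blacklisted("test@spamthis.co.uk", BLACKLIST))              # True
print(is_blacklisted("test@mail.spamthis.co.uk", BLACKLIST))         # True
print(is_blacklisted("test@spamthis.co.uk.example.org", BLACKLIST))  # False
```

The third call shows the false-positive case: the blacklisted name appears inside the domain, but no suffix of spamthis.co.uk.example.org equals it, so the address is not rejected.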

As a side effect, several improvements were made to the tests, for PHP and Python in particular, which now use standard unit-testing libraries rather than ad-hoc tests.

Meta: I think this will be easiest to review on a commit-by-commit basis. This PR is based off my other pending PR, so that PR's commit also shows up here.

owst added 14 commits March 14, 2016 15:02
- Use the (better) regex from the Ruby code for checking for well-formed
  email addresses,
- Check for matches using identity equality (is/is not None),
- Use stdlib unittest module for tests and import testcases from Ruby
- Add instructions to run tests
- Ensure minitest is a development dependency
- Add tests for blank emails
- Use PHPUnit test framework (via composer dependency)
- Add instructions to run tests
- Multiple expectations per test
- Add blank/empty invalid email tests
- Use the same valid email regexp everywhere - based on PHP's
  FILTER_VALIDATE_EMAIL,
- Lowercase the list entries in the generator, rather than in each
  match implementation.
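
Two of the points in these commit messages, lowercasing the list once in the generator and testing regexp matches with `is not None`, might look roughly like this in Python. It is a sketch under assumptions: `build_blacklist`, `is_well_formed` and `SIMPLE_EMAIL_RE` are hypothetical names, and the pattern is deliberately simplified; it is not the FILTER_VALIDATE_EMAIL-based regexp the PR adopts.

```python
import re

# Simplified stand-in pattern for illustration only; NOT the regexp used in the PR.
SIMPLE_EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def build_blacklist(raw_entries):
    """Normalise entries once, at generation time, so matchers need not lowercase the list."""
    return frozenset(entry.strip().lower() for entry in raw_entries if entry.strip())


def is_well_formed(email):
    """Compare the match result to None explicitly instead of relying on truthiness."""
    return SIMPLE_EMAIL_RE.match(email) is not None


# Hypothetical raw entries, as a generator script might read them from a list file.
BLACKLIST = build_blacklist(["SpamThis.co.uk", " Example-Spam.com ", ""])

print(sorted(BLACKLIST))                   # ['example-spam.com', 'spamthis.co.uk']
print(is_well_formed("test@example.com"))  # True
print(is_well_formed(""))                  # False
```

Normalising in the generator means each language's matcher only has to lowercase the email's domain before the lookup, which keeps the per-language implementations smaller and easier to keep in step.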
@owst
Contributor Author

owst commented Mar 18, 2016

I can't work out why Circle CI is failing on `composer install --prefer-source --no-interaction`; I was able to run that locally with no problem. Perhaps it's a version issue. I'm on:

~ composer --version
Composer version 1.0.0-alpha11 2015-11-14 16:21:07

FGRibreau added a commit that referenced this pull request Mar 18, 2016
Consider all domain suffixes for blacklisting and normalise implementations
@FGRibreau FGRibreau merged commit d665557 into FGRibreau:master Mar 18, 2016
@FGRibreau
Owner

Very impressive work @owst!! Thanks a lot!

@owst
Contributor Author

owst commented Mar 18, 2016

You're welcome, thanks for maintaining the project!

How do we go about getting a new Ruby gem published? At work we're currently overriding MailChecker.valid_email?, so it'd be good to get rid of that code by using a newly published version.

By the way, do you have any idea why composer failed on Circle CI?

@FGRibreau
Owner

Hello @owst,

It's now fixed, I pushed ruby-mailchecker-1.6.3.gem :)

@owst
Contributor Author

owst commented Mar 19, 2016

Great, thanks!
