Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test(git): relax email validation in commitEntry results #1876

Merged
merged 8 commits into from Mar 16, 2023

Conversation

matthewmayer
Copy link
Contributor

@matthewmayer matthewmayer commented Feb 24, 2023

extracted from #1872

the test for email addresses in the commitEntry test is stricter than what is allowed for faker.internet.email(), specifically hyphens are not allowed. This changes the regex to match that used for stripping the localPart in faker.internet.email().

@codecov
Copy link

codecov bot commented Feb 24, 2023

Codecov Report

Merging #1876 (1139cd3) into next (256631d) will increase coverage by 0.00%.
The diff coverage is n/a.

❗ Current head 1139cd3 differs from pull request most recent head 4ac5716. Consider uploading reports for the commit 4ac5716 to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##             next    #1876   +/-   ##
=======================================
  Coverage   99.62%   99.63%           
=======================================
  Files        2357     2357           
  Lines      236856   236856           
  Branches     1228     1229    +1     
=======================================
+ Hits       235973   235991   +18     
+ Misses        861      843   -18     
  Partials       22       22           

see 3 files with indirect coverage changes

@Shinigami92 Shinigami92 added the s: needs decision Needs team/maintainer decision label Feb 24, 2023
@Shinigami92
Copy link
Member

Lets discuss in a meeting if hyphens are valid or at least common enough to be considered as generate-able inside a commitEntry or if we should just strip them away and consider them as uncommon

@matthewmayer
Copy link
Contributor Author

if they are valid in faker.internet.email(), they should also be valid as part of an email address in commitEntry

@Shinigami92
Copy link
Member

if they are valid in faker.internet.email(), they should also be valid as part of an email address in commitEntry

Or we remove that in email function as well

The implementation is still from v5.5.3 and not written by the new maintainers

@ST-DDT
Copy link
Member

ST-DDT commented Feb 24, 2023

I think we can easily test this by using https://github.com/validatorjs/validator.js/blob/master/src/lib/isEmail.js once.

@matthewmayer
Copy link
Contributor Author

Hyphens are allowed in email addresses through some providers like gmail don't allow registering usernames including a -

@ST-DDT ST-DDT requested review from a team February 24, 2023 22:58
@Shinigami92
Copy link
Member

We have another report about "email not valid"

I start to feel more and more unsafe about our current email implementation and we should think about to prevent such cases directly inside the email function.
E.g. generating a mail with or without including - could be an option that needs to be toggled on explicitly

@matthewmayer
Copy link
Contributor Author

We have another report about "email not valid"

I start to feel more and more unsafe about our current email implementation and we should think about to prevent such cases directly inside the email function. E.g. generating a mail with or without including - could be an option that needs to be toggled on explicitly

Perhaps could deprecate allowSpecialCharacters and replace it with toggles for individual characters. But i think thats out of scope for this PR.

@Shinigami92
Copy link
Member

It's not fully out of scope, but affected by a side-effect

const email = this.faker.internet.email(firstName, lastName);

Here we don't pass any options to the email function
So because allowSpecialCharacters is default to false, I would expected that this also strips out special characters like -
So therefore the current implementation does not fulfill my expectations

... its getting even madder 👀
checking these lines:

// Strip any special characters from the local part of the email address
// This could happen if invalid chars are passed in manually in the firstName/lastName
localPart = localPart.replace(/[^A-Za-z0-9._+\-]+/g, '');

why does the - even survive at the end at all?!

@matthewmayer
Copy link
Contributor Author

matthewmayer commented Feb 28, 2023

- is permitted by the regex

I dislike the current allowSpecialCharacters implementation. It's weird which is why I haven't touched it!

@matthewmayer
Copy link
Contributor Author

AllowSpecialCharacters is really "substitute some random special chars in which may break things in interesting ways" which is almost certainly never desirable.

@Shinigami92
Copy link
Member

@matthewmayer don't get me wrong, I'm thankful that you bring this forward and are patient to this
But I think we need more in depth decisions with the team of how we should proceed here
This "bug" is currently not in our top prio, so this wont get released in a bug-fix version soon anyway, therefore we have a bit time to no stress the merge of this PR and can focus on try to make it right and analyze not how to fight the symptom but also fix the root-cause

@matthewmayer
Copy link
Contributor Author

@matthewmayer don't get me wrong, I'm thankful that you bring this forward and are patient to this But I think we need more in depth decisions with the team of how we should proceed here This "bug" is currently not in our top prio, so this wont get released in a bug-fix version soon anyway, therefore we have a bit time to no stress the merge of this PR and can focus on try to make it right and analyze not how to fight the symptom but also fix the root-cause

understood, though note this blocks #1872. Non-english locales already generate emails with hyphens in them, its just that the tests only check this in en at the moment.

@matthewmayer
Copy link
Contributor Author

matthewmayer commented Feb 28, 2023

So i think there are really two discussions

  1. With allowSpecialCharacters=false (the default, so 95% of users will probably be doing this), what characters should be allowed in email addresses - currently A-Za-z0-9 and then four special chars: _ - + .

_ seems well supported - this is unlikely to appear in a locale name but could in theory be passed in manually, and is also used as a seperator.
- is well supported for sending but some providers dont allow registration in local parts. This appears in some names in some locales.
+ has fairly good support, it is used in emails like somebody+extension@gmail.com - this is unlikely to appear in a locale name but could in theory be passed in manually.
. is well supported but has special rules about not being at start and end. This appears in some names in some locales, and is used as a seperator https://gmail.googleblog.com/2008/03/2-hidden-ways-to-get-more-from-your.html

Once that's decided, the regexes for testing in faker.internet.email and faker.git.commitEntry should be aligned

  1. Then figure out what to do with allowSpecialCharacters

import-brain
import-brain previously approved these changes Mar 1, 2023
@ST-DDT ST-DDT changed the title fix(test): commit entry emails dont allow hyphens test(git): relax email validation in commitEntry results Mar 11, 2023
Copy link
Member

@ST-DDT ST-DDT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we forgot to actually post our decision on this PR.

Team Decision

Instead of using a custom regex, that might fail with any refactor, we should grep the email address from that line and check it with validatorjs.isEmail.
The validator function has an option require_display_name, that can be used to require it to have a display name like the commit entry has.

expect(parts[n]).toMatch(/^Author: .*$/);
expect(parts[n].substring(8)).toSatisfy(v -> isEmail(v, { require_display_name: true }));

We also discussed changing the email function:

@ST-DDT ST-DDT added s: accepted Accepted feature / Confirmed bug and removed s: needs decision Needs team/maintainer decision labels Mar 11, 2023
@matthewmayer
Copy link
Contributor Author

Instead of using a custom regex, that might fail with any refactor, we should grep the email address from that line and check it with validatorjs.isEmail.

but how are you gonna extract the email address from the line without using a regex in the first place :D

@ST-DDT
Copy link
Member

ST-DDT commented Mar 14, 2023

but how are you gonna extract the email address from the line without using a regex in the first place :D

As written above (using substring):

expect(parts[n]).toMatch(/^Author: .*$/);
expect(parts[n].substring(8)).toSatisfy(v -> isEmail(v, { require_display_name: true }));

ST-DDT
ST-DDT previously approved these changes Mar 15, 2023
@ST-DDT
Copy link
Member

ST-DDT commented Mar 15, 2023

Any idea why the tests fail? Did I make a mistake with the require_display_name part?

@matthewmayer
Copy link
Contributor Author

Any idea why the tests fail? Did I make a mistake with the require_display_name part?

it appears dots in the display name fail the validator. it expects them to be quoted.

v.isEmail('Mavis.Wunsch57 Mavis.Wunsch81@hotmail.com',{allow_display_name:true})
false
v.isEmail('Mavis Wunsch57 Mavis.Wunsch81@hotmail.com',{allow_display_name:true})
true
v.isEmail('"Mavis.Wunsch57" Mavis.Wunsch81@hotmail.com',{allow_display_name:true})
true

@matthewmayer
Copy link
Contributor Author

it seems git doesnt follow the exact same rules as display names for email addresses per RFC 5322

For example i can run

git config --global user.name "Mona,Lisa"
or
git config --global user.name "Mona.Lisa"

and then in git log commits show without quoting the display name, although those would be required by RFC 5322

@ST-DDT
Copy link
Member

ST-DDT commented Mar 15, 2023

I think for our purposes it is enough if we modify that part to always include the quotes.

x.replace(/^(.*) </, '"$1" <')

Would that work?

@matthewmayer
Copy link
Contributor Author

You mean replace it in the test? Or replace what commitEntry produces?

@ST-DDT
Copy link
Member

ST-DDT commented Mar 16, 2023

You mean replace it in the test? Or replace what commitEntry produces?

Yes, only in the test as the git output seems to omit the quotes.

@ST-DDT ST-DDT requested review from a team March 16, 2023 09:14
@ST-DDT ST-DDT enabled auto-merge (squash) March 16, 2023 18:00
@ST-DDT ST-DDT merged commit 65e6b7f into faker-js:next Mar 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c: test p: 1-normal Nothing urgent s: accepted Accepted feature / Confirmed bug
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

5 participants