Enhance email regex in keyring #257
Good finding. Maybe we want to do something like
to also match user-ids that look like "foo@bar.baz".
I just noticed that the new regex would match some illegal email addresses, such as "<john doe@mail.tld>" (contains a space).
Hi Paul, thanks for looking into this. Yes, my regex is deliberately non-validating: it aims to have as few false negatives as possible. In other words, I think it's better to extract some addresses that are not valid than to exclude exotic but correct addresses. My regex should not match spaces (\s is excluded). Some tests of my regex: https://regex101.com/r/E9wBzS/1 If you take this to its conclusion and really want 99.99% correctness, you end up with a regex that is very hard to read: https://www.emailregex.com/ So let's take a step back: what do we really want this method to do? As far as I can see, it should extract an address from an identity. Whether that address is valid should, imho, be checked by a specialized library or tool in the calling application (sticking to the one-tool-one-task philosophy). That way it's up to the calling application to handle invalid addresses, including the possibility of ignoring the problem. Maybe a Javadoc comment on this method explaining the non-validating behaviour would be good?
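To make the "extract here, validate in the caller" split concrete, here is a minimal sketch. It is not the regex from this PR: the class name `LenientAddressExtractor`, both method names, and the pattern itself are assumptions chosen for illustration. The pattern accepts anything without whitespace or angle brackets around an `@` inside `<>`, and the caller supplies its own validator.

```java
import java.util.Optional;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the library discussed in this PR.
class LenientAddressExtractor {
    // Lenient on purpose: any run of non-whitespace, non-bracket characters
    // containing an '@', enclosed in angle brackets.
    private static final Pattern BRACKETED =
            Pattern.compile("<([^\\s<>]+@[^\\s<>]+)>");

    // Extracts the bracketed address without judging whether it is valid.
    static Optional<String> extract(String userId) {
        Matcher m = BRACKETED.matcher(userId);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Lets the calling application plug in whatever validation it trusts.
    static Optional<String> extractValid(String userId, Predicate<String> validator) {
        return extract(userId).filter(validator);
    }

    public static void main(String[] args) {
        System.out.println(extract("Alice <alice@example.org>")); // Optional[alice@example.org]
        System.out.println(extract("<john doe@mail.tld>"));       // Optional.empty (contains a space)

        // Stand-in for a real validation library: reject consecutive dots.
        Predicate<String> noConsecutiveDots = a -> !a.contains("..");
        System.out.println(extractValid("Bob <bob..x@example.org>", noConsecutiveDots)); // Optional.empty
    }
}
```

The `Predicate` parameter is only a placeholder for a dedicated validation library; the point is that the extraction method stays small and the strictness decision moves to the caller.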
This new regex better matches the string format for identities as specified by the RFC. The email address is surrounded by <>. The new regex correctly parses identities like this: foo@bar.com <baz@foo.com> Resulting in baz@foo.com being returned. Without this fix, foo@bar.com was returned.
Based on #257. Thanks to @bratkartoffel for the initially proposed changes.
I just pushed d55d6a1, which should fix this. Edit: Sorry, I missed your comment explaining your reasoning. Let me reply to that here. I tend to disagree with allowing almost anything between angle brackets to match. Users are lazy, and if there is a method named "getEMailAddresses()", they will likely assume that it only returns valid email addresses. If the developer of an email application notices that this method misses some exotic mail addresses, they can still write a custom function to extract the addresses themselves. Anyway, I greatly appreciate your contribution. Thanks for pointing out that the original method was improperly matching some addresses!
Thank you, looking forward to the next release so I can test this with my application.
I just released
This new regex better matches the string format for identities as
specified by the RFC. The email address is surrounded by <>.
The new regex correctly parses identities like this:
foo@bar.com <baz@foo.com>
Resulting in baz@foo.com being returned. Without this fix, foo@bar.com
was returned. Furthermore, this new regex should better match
international characters.
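
The behaviour described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the regex that was committed: the class name `UserIdEmailSketch`, the method name, and both patterns are hypothetical. The sketch prefers an address enclosed in `<>` and only falls back to treating the whole user-ID as a bare address, as suggested earlier in the conversation for user-IDs like "foo@bar.baz".

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the code merged in this PR.
class UserIdEmailSketch {
    // Address enclosed in angle brackets, e.g. "Name <local@domain>".
    // Negated character classes do not restrict the match to ASCII letters.
    private static final Pattern BRACKETED =
            Pattern.compile("<([^\\s<>@]+@[^\\s<>@]+)>");
    // The whole user-ID is a bare address, e.g. "foo@bar.baz".
    private static final Pattern BARE =
            Pattern.compile("^([^\\s<>@]+@[^\\s<>@]+)$");

    static Optional<String> extract(String userId) {
        // Prefer the bracketed form so that "foo@bar.com <baz@foo.com>"
        // yields the address inside the brackets.
        Matcher m = BRACKETED.matcher(userId);
        if (m.find()) {
            return Optional.of(m.group(1));
        }
        // Fall back to a user-ID that consists of nothing but an address.
        m = BARE.matcher(userId.trim());
        if (m.find()) {
            return Optional.of(m.group(1));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("foo@bar.com <baz@foo.com>")); // Optional[baz@foo.com]
        System.out.println(extract("foo@bar.baz"));               // Optional[foo@bar.baz]
        System.out.println(extract("<john doe@mail.tld>"));       // Optional.empty
    }
}
```

Checking the bracketed form first is what keeps the display-name part of an identity from being mistaken for the address.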