Replace e-mail regex with something that matches all e-mail addresses #579

Open
jaapio opened this issue Sep 4, 2023 · 3 comments

@jaapio
Member

jaapio commented Sep 4, 2023

E-mail addresses are more complex than the regex we are using now. According to the e-mail RFC, far more forms are valid than the ones people commonly use.

Originally posted by @jaapio in #573 (comment)

@linawolf
Contributor

linawolf commented Sep 8, 2023

While that is true, I don't think it is the responsibility of the token parser. If you really have to link an e-mail address that uses many special characters, which I think is a rare use case, you can make an explicit anonymous link.

@wouterj
Contributor

wouterj commented Sep 8, 2023

Docutils uses this regex to detect e-mail addresses: [-_!~*'{|}\/#?^`&=+$%a-zA-Z0-9\x00]+(?:\.[-_!~*'{|}\/#?^`&=+$%a-zA-Z0-9\x00]+)*(?<!\x00)@[-_!~*'{|}\/#?^`&=+$%a-zA-Z0-9\x00]+(?:\.[-_!~*'{|}\/#?^`&=+$%a-zA-Z0-9\x00]*)*(?:[_~*\/=+a-zA-Z0-9]|[-_.!~*'()[\];\/:@&=+$,%a-zA-Z0-9\x00](?=[>]))

https://github.com/docutils/docutils/blob/master/docutils/docutils/parsers/rst/states.py#L674-L690

@linawolf
Contributor

linawolf commented Sep 8, 2023

The issue is that we have text with other special characters in it and want to detect e-mail addresses within it. If we allow certain special characters (yours allows the backtick, for example), they become part of the detected address, even when the address was written inside backticks and was meant to be part of a role.
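
To make the problem concrete, here is a minimal Python sketch. It is a simplification, not the Docutils implementation: it reuses only the character class from the regex quoted above (without `\x00`) and leaves out the trailing URI-end rule. The names `PERMISSIVE_CHARS` and `PERMISSIVE_EMAIL` and the sample role text are made up for illustration.

```python
import re

# Character class taken from the quoted Docutils regex, minus \x00.
# Note that it contains the backtick.
PERMISSIVE_CHARS = r"[-_!~*'{|}/#?^`&=+$%a-zA-Z0-9]"

# Simplified "name @ host" pattern (assumption: the URI-end rule is omitted).
PERMISSIVE_EMAIL = re.compile(
    rf"{PERMISSIVE_CHARS}+(?:\.{PERMISSIVE_CHARS}+)*"
    rf"@{PERMISSIVE_CHARS}+(?:\.{PERMISSIVE_CHARS}+)*"
)

# Hypothetical input: an address written in backticks as part of a role.
text = "Write to :role:`jane@example.org` for details."

match = PERMISSIVE_EMAIL.search(text)
print(match.group(0))  # prints `jane@example.org` including both backticks
```

Because the backtick is a permitted character, the match swallows the role delimiters on both sides of the address.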

So I would suggest testing only for common e-mail addresses (alphanumeric characters plus a few common special characters) and accepting that not every address with an unusual combination of characters is auto-detected. Whoever wants to link an e-mail address that contains such characters and is not detected automatically will have to use an anonymous link and proper escapes.
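
A rough Python sketch of what such a conservative pattern could look like follows. The exact set of allowed characters is an assumption chosen for illustration, not the regex this project uses, and `COMMON_EMAIL` and `find_emails` are hypothetical names.

```python
import re

# Assumed "common" character set: letters, digits and . _ % + - in the local
# part; letters, digits and hyphens in each host label. Backticks are excluded.
COMMON_EMAIL = re.compile(
    r"[A-Za-z0-9][A-Za-z0-9._%+-]*@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"
)


def find_emails(text: str) -> list[str]:
    """Return all substrings of text that look like common e-mail addresses."""
    return COMMON_EMAIL.findall(text)


sample = "Write to :role:`jane@example.org` or bob.smith+docs@example.co.uk."
print(find_emails(sample))
```

With this character set the backticks around the first address stay outside the match, so the role can still be parsed. The trade-off is the one described above: an address that uses rarer characters is either missed or matched only in part, and then needs an explicit link.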
