-
Notifications
You must be signed in to change notification settings - Fork 251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Linkify creates http link in mailto link when host contains a hyphen #300
Comments
I don't think this is something I'm going to get to fixing any time soon. I'll happily accepts PRs, though. |
Do you have any idea, why this happens? |
Not offhand. You'd have to go through the code and debug it. Sorry! |
I found out that the problem also exists if >>> linkify('foo@bar-baz.com')
'foo@bar<a href="http://-baz.com" rel="nofollow">-baz.com</a>' Maybe this helps locating the issue... |
Ah I could fix the outcome of this bug like described in the first message 🔝 (but I don't know how to solve the root cause (last message 🔼 )) by applying this patch: diff --git a/bleach/linkifier.py b/bleach/linkifier.py
index fc346c3..b8f801e 100644
--- a/bleach/linkifier.py
+++ b/bleach/linkifier.py
@@ -344,7 +344,17 @@ class LinkifyFilter(Filter):
def handle_links(self, src_iter):
"""Handle links in character tokens"""
+ in_a = False # happens, if parse_email=True and if a mail was found
for token in src_iter:
+ if in_a:
+ if token['type'] == 'EndTag' and token['name'] == 'a':
+ in_a = False
+ yield token
+ continue
+ elif token['type'] == 'StartTag' and token['name'] == 'a':
+ in_a = True
+ yield token
+ continue
if token['type'] == 'Characters':
text = token['data']
new_tokens = [] Is this (and some test cases) enough for a pull request or do you rather want to fix the bug, that parts of mail addresses must not be interpreted as http links? |
I'm wondering if the description and #300 (comment) are actually two different issues. Are the causes related? If not, then maybe the second one should be a separate issue. We can keep them in this issue, too. At this point, whatever bookkeeping is easier for you is fine with me. Pull requests to fix things are always helpful! I think it's fine to do a PR to cover one part of this issue and another PR to cover another part later. Thank you for looking into this! |
I think, if we could fix the issue, that |
Partly fixes mozilla#300: ```py >>> linkify('foo@exa-mple.com', parse_email=True) '<a href="mailto:foo@exa-mple.com">foo@exa<a href="http://-mple.com" rel="nofollow">-mple.com</a></a>' '<a href="mailto:foo@exa-mple.com">foo@exa-mple.com</a>' ``` but it does not fix this issue: ```py >>> linkify('foo@bar-baz.com') 'foo@bar<a href="http://-baz.com" rel="nofollow">-baz.com</a>' ```
@sedrubal When you have "fixes #xyz" in a PR comment, then GitHub closes the issue automatically. In #300 (comment), I suggested this is two issues and we should open up a new issue, but I would leave the decision on that to you. You didn't say anything, so I figured between you marking the PR as having fixed this issue, then you were satisfied with everything being done. It'd help me if you opened up a new issue with a failing test of the outstanding things. Regarding releases, you can check the milestone this is assigned to. Thank you! |
Oops I'm sorry. I opened a new one (#322). Thanks. |
Linkify creates http links in mailto links if the mail host name contains a hyphen
-
(andparse_email=True
). The http link starts with the hyphen, even if host names mustn't start with hyphens at all (https://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_hostnames).This is what the link looks like: foo@exa-mple.com 😄
The text was updated successfully, but these errors were encountered: