linkify is eager about top level domain URLs #35
Comments
I don't consider linkifying something that looks like a domain a bug: the bug here is that |
One thing to consider is that top level xpi files might exist without any prefixed path, something like "somescript.sh" -- this would be linked by linkify, which is not desired for amo-validator output. |
Right, losing example.com would be fine. We just use it for URLs that are already prefixed with http(s). We want those links to go through the outgoing server so using bleach for this seemed like a good idea. |
The default behavior can't change, because I don't think this is expected behavior. I'd nominate |
As an aside, these false positives are why Django's urlize filter only linkifies URLs prefixed with a scheme or 'www.', or domain-only URLs ending in .com, .org and .net. |
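To make that rule concrete, here is a minimal sketch of a urlize-style check. It is not Django's actual implementation: `looks_linkable`, `SCHEMES` and `BARE_TLDS` are hypothetical names, and limiting the schemes to http and https is an assumption made for illustration.

```python
# Hypothetical helper: a sketch of the conservative rule described above,
# not Django's real urlize code.
SCHEMES = ("http://", "https://")    # assumed set of accepted schemes
BARE_TLDS = (".com", ".org", ".net")  # TLDs accepted for domain-only text


def looks_linkable(token):
    """Return True if the conservative rule would turn `token` into a link."""
    lowered = token.lower()
    # Anything with an explicit scheme or a leading "www." is linked.
    if lowered.startswith(SCHEMES) or lowered.startswith("www."):
        return True
    # Domain-only text: no path separator, and one of the listed TLDs.
    return "/" not in lowered and lowered.endswith(BARE_TLDS)
```

Under this rule a bare filename such as somescript.sh or settings.py is left alone, while example.com is still linked.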
@SmileyChris: That difference is actually one of the reasons this method exists in the first place, which is why I'm not keen on changing it. |
Fair enough then! :) |
Could linkify be more configurable? Sometimes example.com should be linked, sometimes not. Right now the only way to configure linkify is to copy-paste the whole linkify method. I think it would be good to do one of the following:
My use case: linkify IDN domains (e.g. сайт.рф) + add Would you accept a patch with such changes, or is this all unreasonable, or would you rather implement something yourselves? |
The cleanest of those options feels like the class, but I really struggle to see how any of them could be completely backwards compatible (except maybe callbacks for "common tasks" but I don't want a 40-kwarg method). I'm open to any of them as long as Any of these changes, I think, necessitates moving the @kmike: I'm certainly open to a big rewrite but it's not my highest priority right now, especially not given how much bigger that scope is than this issue. |
Another consequence of this eager linking is that linkify links the word "settings.py" on my crate.io page. Is there any way to tell it to suppress the link? |
That said, #56 would make linkify customizable enough that anyone could refuse to linkify whatever TLDs they want. The solution will be to write callbacks that test it, e.g.:

    def no_python(text, attrs):
        # Refuse to link anything whose target ends in '.py'.
        if attrs['href'].endswith('.py'):
            return None
        return attrs  # assumed: returning attrs keeps the link

Or more broadly:

    def only_with_protocol(text, attrs):
        # Refuse to link anything without an explicit protocol.
        if not attrs['href'].startswith('http://'):
            return None
        return attrs  # assumed: returning attrs keeps the link

The latter seems like a logical thing to include, maybe not as a default, but as an option. Closing this in favor of #56. |
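For comparison, here is a rough sketch of the same two callbacks written against the callback interface that, to my knowledge, later bleach releases document: `callback(attrs, new=False)`, where `attrs` is keyed by `(namespace, name)` tuples and the original text is stored under `'_text'`. Treat those details as assumptions and check them against the version you have installed. The sketch inspects `'_text'` rather than the href, on the assumption that linkify has already prefixed the href of a bare domain with a protocol.

```python
PROTOCOLS = ("http://", "https://")


def no_python(attrs, new=False):
    """Skip new links whose text looks like a Python filename."""
    if not new:
        return attrs  # a link that already existed in the markup: keep it
    text = attrs.get("_text", "")
    if text.endswith(".py") and not text.startswith(PROTOCOLS):
        return None  # looks like a filename, not a URL
    return attrs


def only_with_protocol(attrs, new=False):
    """Only create new links for text that carries an explicit protocol."""
    if not new:
        return attrs
    if not attrs.get("_text", "").startswith(PROTOCOLS):
        return None
    return attrs

# Assumed usage: bleach.linkify(text, callbacks=[only_with_protocol])
```

Returning `attrs` for everything a callback does not reject matters: a callback that falls through and returns `None` would suppress every link.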
It seems that linkify will turn anything that looks like a domain into a URL. There should be a way to turn this off. Maybe eager_domain_linking=False. For example, the filenames in amo-validator output are getting linked. Like /path/to/somescript.sh gets linked as http://somescript.sh. Screenshot: https://bug670047.bugzilla.mozilla.org/attachment.cgi?id=544675 More details: https://bugzilla.mozilla.org/show_bug.cgi?id=670047
(I have the AMO issue assigned to me but it keeps getting bumped due to priority. I should get to it soon!)