bad URL recognition #47
Comments
Although a dot is a legal character in a URL, I think we can assume that URLs will not end with a dot... XD Thanks for reporting this bug. I'll fix it ASAP.
Thanks. This URL recognition is a feature that makes lilyterm a better choice than other terminal emulators ;-) Recognizing "sftp" would be nice too, but it is not in the "must have" category.
Hi, I found another problem. The text::
is recognized as::
Regards
Something similar happens with question marks and trailing parentheses ( Also, URLs whose domains don't contain a dot are not recognized, so I can't click on
What is a good rule for which chars should be stripped at the end? Oh, I found it: http://daringfireball.net/2010/07/improved_regex_for_matching_urls
DO NOT use that regex; it's susceptible to DoS via catastrophic backtracking. I'm not sure a pure regex is appropriate for this; more likely you want a dead simple regex with some manual inspection to weed out false positives.
Can you further elaborate why this should be dangerous? As far as security goes, there are no bad comments here: https://news.ycombinator.com/item?id=1552766
Try matching that regex against Depends on the engine, of course, but most engines handle nested It's an artifact of naïve backtracking that makes pathological input take exponential time. And you'd never notice on normal input, just like the author and the HN commenters didn't. But I'd rather not have every single terminal window potentially lock up because some jerk posted a malformed URL on IRC. :) Don't use hairy regexes, and particularly not ones you just got off some guy's website, unless you really really understand what you're doing. Just use something broad and touch it up with postprocessing. Or use a real parser, I guess.
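For illustration, here is a minimal sketch of the "something broad plus postprocessing" idea from the comment above. It is written in Python for brevity and is not LilyTerm's actual code. The scheme list, the set of trailing characters, and the names `CANDIDATE`, `trim_url` and `find_urls` are all assumptions made for this example.

```python
import re

# Deliberately simple pattern: a scheme, "://", then a run of characters
# that are not whitespace, angle brackets or quotes. There are no nested
# quantifiers, so there is nothing for the engine to backtrack over badly.
# The scheme list is an assumption for this sketch.
CANDIDATE = re.compile(r"\b(?:https?|s?ftp)://[^\s<>\"']+", re.IGNORECASE)

# Punctuation that is legal in a URL but rarely intended as its last character.
TRAILING = ".,;:!?"

# Closing brackets are only stripped when they have no matching opener.
CLOSERS = {")": "(", "]": "[", "}": "{"}


def trim_url(candidate: str) -> str:
    """Strip trailing punctuation and unbalanced closing brackets."""
    url = candidate
    while url:
        last = url[-1]
        if last in TRAILING:
            url = url[:-1]
        elif last in CLOSERS and url.count(last) > url.count(CLOSERS[last]):
            url = url[:-1]
        else:
            break
    return url


def find_urls(text: str) -> list[str]:
    """Return every candidate match in text after trimming."""
    return [trim_url(m.group(0)) for m in CANDIDATE.finditer(text)]


print(find_urls("connect to sftp://raspi.skk."))
print(find_urls("(see http://example.com/foo_(bar))"))
```

The trimming loop runs in plain code rather than in the pattern, so the rules for dots, question marks and parentheses can be adjusted without making the regex any hairier.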
Hm, I think I understand. How about something like http://uriparser.sourceforge.net/.
Hi again :-) Here is the opposite problem: the URL "http://localhost:8000/" is not recognized as a URL. Perhaps due to the port number?
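As a rough illustration of the "real parser" route, the sketch below checks a candidate with Python's standard `urllib.parse` instead of a regex, so a host without a dot and an explicit port are both accepted. This is only an assumption about how such a check could look, not how LilyTerm or uriparser work. The `ALLOWED_SCHEMES` set and the `looks_like_url` helper are hypothetical names.

```python
from urllib.parse import urlsplit

# Schemes treated as clickable; this list is an assumption for the sketch.
ALLOWED_SCHEMES = {"http", "https", "ftp", "sftp"}


def looks_like_url(candidate: str) -> bool:
    """Accept a candidate if it parses with a known scheme and a host."""
    try:
        parts = urlsplit(candidate)
        # Reading .port raises ValueError for a non-numeric or
        # out-of-range port, so touch it inside the try block.
        _ = parts.port
    except ValueError:
        return False
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.hostname)


print(looks_like_url("http://localhost:8000/"))
print(looks_like_url("http://host:port/"))
```

Because the host is only required to be non-empty, "localhost" and other dotless names pass, at the cost of accepting more false positives than a check that demands a dot in the domain.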
@Tetralet You should close this issue. It is not closed.
Hm, I still get Cool would be if URLs with
I often see URLs recognized incorrectly. For example, take this text:
sftp://raspi.skk.
(with a dot at the end). The URL is recognized as "ftp://raspi.skk.", still with the dot at the end. I can understand SFTP being changed to FTP, but the trailing dot is bad, and this happens with parentheses and other characters too...