Prevent autoLinkUrls from including ending punctuation in the URL #1947
The regular expression that implements autoLinkUrls includes an ending character match that is designed to keep trailing punctuation out of the URL. For some reason, this last character match has been short-circuited by a final question mark meta-character, which makes the match optional and therefore useless. This patch corrects that mistake and adds the same UTF-8 Letter mode matching that is permitted before the final character in the URL.
It looks like the regex for ftp also suffers from this problem, but this update does not fix that one.
Signed-off-by: Noteworthy Software, Inc <online@noteworthysoftware.com>
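To illustrate the mechanism described above, here is a minimal Python sketch. The patterns are hypothetical and heavily simplified; they are not the actual autoLinkUrls regex, and the names `BODY`, `FINAL`, `link_optional`, and `link_required` are invented for this example. Python's `\w` stands in loosely for the Unicode letter matching mentioned in the description. The sketch only shows why a trailing `?` on the final character class turns it into dead code.

```python
import re

# Hypothetical, simplified pieces; NOT the real autoLinkUrls pattern.
BODY = r"https?://[\w\-./?=&%#~:;,!()]*"  # permissive body that allows punctuation
FINAL = r"[\w/]"                          # intended rule: last char is not punctuation

# With the trailing "?", FINAL may match nothing, so the greedy body
# keeps whatever punctuation it already consumed.
OPTIONAL_FINAL = re.compile(BODY + FINAL + "?")

# Without the "?", the engine must backtrack until FINAL matches,
# which pushes trailing punctuation out of the match.
REQUIRED_FINAL = re.compile(BODY + FINAL)


def link_optional(text):
    """Return URLs found when the final-character rule is optional."""
    return OPTIONAL_FINAL.findall(text)


def link_required(text):
    """Return URLs found when the final-character rule is enforced."""
    return REQUIRED_FINAL.findall(text)


sample = "See (http://example.com/page), or http://example.com/a.b."
print(link_optional(sample))
print(link_required(sample))
```

With the optional form, the closing parenthesis, comma, and period stay inside the matched URLs; with the required form they are left outside.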
It looks like this issue was originally introduced by an old @Spuds update: SimpleMachines/SMF@97cda43. There are other ways to fix the problems with autoLinkUrls. For example, final characters could be detected using something like |
Relevant discussion: |
Thanks. I did not see that topic before creating this one. I generally view the autoLinkUrls mechanism as a casual courtesy, and don't think that edge cases should override basic user expectations. Regardless, the existing character match |
That's the main problem: it is not "strictly speaking" broken, but it's not always doing what people would expect. 😜 Yes, I know I'm nitpicking. 👼 Out of curiosity, what the |
Actually, it is absolutely broken as currently written (or, at the least, it is dead code that does nothing but make the regex look more intimidating). The |
ohh... okay, regular expressions are not really my "competence area". lol @Spuds you are the one that changed it, what do you think? |
Since this seems to be stagnating, I'll just add/repeat a final thought. In my view, autoLinkUrls should not be greedy. Unusual URLs and edge cases should not be the focus, as these can always be linked by a BBC when (rarely) necessary. The behavior of autoLinkUrls should not force the use of BBC when engaging in the common practice of enclosing a URL in parentheses. Perhaps a much simplified regex that just avoids the inclusion of common punctuation in the auto-linked URL is a better fix. However, the existing regex does have the benefit of standing the test of time (used in SMF 1.1 and 2.0). |
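As a rough sketch of the "much simplified regex" idea in the comment above, the following Python snippet matches a URL lazily and stops before any run of common punctuation that is followed by whitespace or the end of the text. The pattern and the names `SIMPLE_URL` and `find_urls` are assumptions made for illustration, not a proposal taken from this thread or from the codebase.

```python
import re

# Hypothetical simplified pattern: a lazy URL body, then a lookahead that
# requires only common punctuation (if any) before whitespace or end of text.
SIMPLE_URL = re.compile(r"https?://\S+?(?=[.,;:!?)\]]*(?:\s|$))")


def find_urls(text):
    """Return URLs with trailing punctuation left outside the match."""
    return SIMPLE_URL.findall(text)


print(find_urls("(see http://example.com/a?b=1)."))
print(find_urls("Visit https://example.org/x, then http://example.org/y!"))
```

The trade-off matches the one described in the comment: a URL that legitimately ends in a closing parenthesis or other punctuation would be cut short and would need an explicit BBC link instead.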
Sorry I've not been able to chime in on this at all. I think it's fine to revert the ? back out and go back to what was there. I think this part |
Thanks for the comment. I'll review this based on your comment. |
I agree with your comment. I went ahead and used the Unicode equivalent The Travis CI error is 'Your last search was less than 5 seconds ago' which I do not believe is related to this PR. |
Yeah, sometimes Travis goes bump; I restarted the job to clear it up. Thanks for checking into what |
I guess it would be good to backport it to 1.0.3, right? |
The problem with travis at the moment is that elk.net doesn't like too many searches:
in the Curl tests. |
Yes
🍅 |