fixed clickability of URLs if they are surrounded by special characters#570
fixed clickability of URLs if they are surrounded by special characters#570simensBat wants to merge 7 commits into
Conversation
|
I think it could be simpler to just use one single regex. |
|
@AnatoleAM Because when there are special characters next to the link, it is no longer a link, but just a set of strings. I have to remove special characters and check if there is a link in the string after removal. There are no special characters in the transition link itself, but they are in "displayText" If you try to remove them, you will have to render each special character as a separate token, and you will also need to follow the order so that the message text does not change. |
|
Alternatively, I can offer a solution that will simply remove special characters from the message: screencast-www.twitch.tv-2023.05.09-14_29_19.webmThis option is better, because now all text is not styled as a link. The value of these special characters is not very high. The main thing is that the link is clickable. I tried to parse the message, separate the special characters into separate tokens and render them separately from the link, but I couldn't make a stable behavior where nothing would break into the chat message. Maybe you have an idea how to do it better. |
|
Removing the special characters isn't a good solution. You can modify the range parameter to correctly highlight only the URL part. |
459f983 to
f5f0d78
Compare
0631999 to
9e3729c
Compare
|
@berghall I modified the code, there were problems with determining the range in order to correctly insert a link between special characters and so that all these edits would not break the original text of the message. I put this logic in a separate class so as not to clog up the Base rules: Rules for special characters: |
|
@Batyodie I mean that you don't need to use a regex to filter out special characters. What happens if I send a link surronded by characters you haven't specified in your special character regex? |
|
@berghall In different ways, a lot of curls, where the symbol is located, what kind of symbol, etc. In the worst case, the link will become just text, another option is that this special character will be in the link display text, but it will not affect the link url. It all comes down to the fact that we must find where the link starts and where it ends, in order to set the correct range for the token based on this information. If you do not use 2 regular expressions to prepare the url string for parsing through tldParse, you still have to comb out these characters somehow. Otherwise, the current url validation check will not pass. private isValidLink(link: string): URL | null {
try {
// without clearing, the url comes here, where there is an extra character that breaks the check.
const url = new URL(link);
const { isIcann, domain } = tldParse(url.hostname);
if (domain && isIcann) {
return url;
}
} catch (e) {
void 0;
}
return null;
}for the start of the link is even clearer if everything to the left of http | https is there, just cut it out. And what to do with the end of the link, without a regular expression, you still have to check that there are no strange characters there. |
|
Here's a solution where the entire URL is matched. It could be cleaned though https://github.com/kuel27/Extension/blob/improvements/src/common/chat/Tokenizer.ts#L139 |
|
@kuel27 Overall yes, your solution is better. The regularity looks terrible, but it works. I would still make a check for links from the tails of else if.... It doesn't look very declarative, there are several conditions where we still redefine the variable inside this condition. But this is so, the little things of life. As I understand it, this was done as part of another task, how should it be in such cases? Close this pr? |
|
Or you can postpone this court decision, as part of this task. |
075a9f5 to
15a9694
Compare
|
Whether or not this is a desirable feature is debatable. Some popular twitch clients do not create hyperlinks when wrapped in parentheses or brackets. The only real solution would be change the separation of text fragments (currently it's separated by spaces), but this would add extra complexity to the tokenization process, which is hard to justify for a relatively small improvement. tl;dr this could be solved by changing a few characters on a single line, but likely unnecessary. |




Fixed #548 Added an additional regular expression that fixes links inside messages if there are special characters near the links.
Special symbols:
All of these characters are stripped during url validation.
As it was before:
As it is now:
Links are now clickable. But there is one feature, the URL of the link will be correct, without extra characters. But the link is displayed along with special characters.
I did not invent anything, because I would have to parse the string, split it into characters, and manually add each element as a token.
If this fix option is enough, I'll be glad. If not, I will try to fix it to the end.