Emoji domain not recognised as link #11247

stophecom · 2021-04-29T21:08:49Z

I have searched open and closed issues for duplicates
I am submitting a bug report for existing functionality that does not work as intended
I have read https://github.com/signalapp/Signal-Android/wiki/Submitting-useful-bug-reports
This isn't a feature request or a discussion topic

Bug description

Links that contain emojis are not recognised as proper links.

Steps to reproduce

Open Signal
Go to any chat conversation or Note to Self
Share a link that contains an emoji. E.g. https://🤫.st

Actual result:
There is no link preview, the URL is not recognised as a proper link (not clickable)

Expected result: Valid domains should be handled as proper links.

It's a bit of an edge-case bug, and might even count as a feature request. However, I'd be happy for a fix, which most likely goes somewhere here:

Signal-Android/app/src/main/java/org/thoughtcrime/securesms/linkpreview/LinkPreviewUtil.java

Line 35 in dc6dc19

public final class LinkPreviewUtil {

cc @greyson-signal @alan-signal Unfortunately I'm not familiar with Java, otherwise I'd do a PR myself. Let me know if I can help nonetheless.

Screenshots

Device info

Device: OnePlus 6
Android version: 10
Signal version: 5.7.1

Link to debug log

Nothing to log

hiqua · 2021-05-02T21:35:53Z

With your URL this variable is false:

      boolean validCharacters = ALL_ASCII_PATTERN.matcher(cleanedDomain).matches() ||
                                ALL_NON_ASCII_PATTERN.matcher(cleanedDomain).matches();

because cleanedDomain ends up being 🤫st. Not sure what the reasoning behind this check is, though...
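
For anyone who wants to reproduce this outside the app, here is a minimal standalone sketch of the check. It is not the actual Signal source: the two pattern definitions, the dot-stripping step, and the class and method names are assumptions inferred from the snippet above and from the cleanedDomain value it produces.

```java
import java.util.regex.Pattern;

// Sketch only: the patterns and names below are assumptions, not copied from Signal.
final class DomainCharacterCheck {
  // Assumed definition: the whole string consists of ASCII characters.
  private static final Pattern ALL_ASCII_PATTERN = Pattern.compile("^[\\x00-\\x7F]*$");
  // Assumed definition: the whole string consists of non-ASCII characters.
  private static final Pattern ALL_NON_ASCII_PATTERN = Pattern.compile("^[^\\x00-\\x7F]*$");

  static boolean hasValidCharacters(String domain) {
    // Drop the dots so that only the label characters are compared.
    String cleanedDomain = domain.replace(".", "");
    return ALL_ASCII_PATTERN.matcher(cleanedDomain).matches() ||
           ALL_NON_ASCII_PATTERN.matcher(cleanedDomain).matches();
  }

  public static void main(String[] args) {
    // U+1F92B (the emoji from the report) written as a surrogate pair.
    String emoji = "\uD83E\uDD2B";
    System.out.println(hasValidCharacters("signal.org"));  // true: all ASCII
    System.out.println(hasValidCharacters(emoji + ".st")); // false: emoji mixed with ASCII "st"
  }
}
```

Under these assumptions the emoji domain fails only because its TLD is ASCII while the label is not, so neither pattern matches the whole string.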

mayjs · 2021-06-04T21:07:05Z

@hiqua I noticed a similar issue with domains containing umlauts (e.g. https://üei.de). Could this be related?

hiqua · 2021-06-04T21:55:12Z

I would say so.

stale · 2022-01-26T09:49:44Z

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

stophecom · 2022-01-26T16:37:05Z

Yes, this is still relevant. Links with emoji or umlauts (ä, ö, ü) are not recognized as proper links.

stale · 2022-03-27T16:53:24Z

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

ckujau · 2022-03-27T17:56:21Z

Yup, it's still relevant. Btw, this also affects the desktop version.

greyson-signal · 2022-03-28T16:07:37Z

We use the stock Android linkifier. I don't anticipate that we'll be making a custom one anytime soon, apologies.

ckujau · 2022-03-28T21:29:41Z

Hm, both links work in Google Chat (formerly "Hangouts") just fine. And also in other applications (e.g. "Slack").

greyson-signal · 2022-03-28T23:31:41Z

Ah, I'm sorry, I misspoke. So we do use the default linkifier, but we filter out some results that fail certain rules. Right now, the main rule is that we don't linkify links whose domain has a mix of ASCII and non-ASCII characters. This is to help prevent homograph attacks. Our rule for that is a little coarse, but honestly, given the huge character space, we're trying to play it safe here.

So summary:

If a link is linkified, it's because the default linkifier identified it as a link
If it's not linkified, it's because the default linkifier missed it or it violates one of the rules we set up to prevent homograph attacks.

That said, I still think the outcome for this ticket is the same: I don't anticipate we'll do more nuanced stuff here just to be safe. But apologies for the previous incorrect reasoning.
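
To illustrate the homograph concern behind the mixed ASCII/non-ASCII rule, here is a rough Java sketch. It is not Signal's code: the class name, the helper `mixesAsciiAndNonAscii`, and the `example.com` lookalike are all made up for illustration. The only library call it relies on is `java.net.IDN.toASCII`, which converts an internationalized domain to its Punycode form.

```java
import java.net.IDN;

// Illustration only: names and sample domains are hypothetical.
final class HomographDemo {
  // True when the domain (dots ignored) contains both ASCII and non-ASCII code points.
  static boolean mixesAsciiAndNonAscii(String domain) {
    boolean sawAscii = false;
    boolean sawNonAscii = false;
    for (int cp : domain.replace(".", "").codePoints().toArray()) {
      if (cp < 0x80) {
        sawAscii = true;
      } else {
        sawNonAscii = true;
      }
    }
    return sawAscii && sawNonAscii;
  }

  public static void main(String[] args) {
    String genuine = "example.com";
    // The third letter is Cyrillic U+0430, which renders much like Latin 'a'.
    String lookalike = "ex\u0430mple.com";

    System.out.println(genuine.equals(lookalike));        // false: different code points
    System.out.println(mixesAsciiAndNonAscii(genuine));   // false: all ASCII
    System.out.println(mixesAsciiAndNonAscii(lookalike)); // true: a mixed-character rule would skip it
    System.out.println(IDN.toASCII(lookalike));           // Punycode form, which begins with "xn--"
  }
}
```

The two strings look nearly identical on screen but resolve to different hosts, which is the kind of case a coarse mixed-character rule catches.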

stophecom · 2022-03-30T08:49:17Z

Thanks for the explanation.
I would argue it's a nice trick and this rule definitely helps. But it's also a bit arbitrary. And with newer TLDs this "trick" might become less effective. E.g.

(You hardly notice that this is now within Unicode-land)

And what about my emoji domain 😭😂😂😂. So sad.

hiqua mentioned this issue May 2, 2021

Update domain auto linking signalapp/Signal-Desktop#5170

Merged

donovaly mentioned this issue Sep 20, 2021

Internationalized domain names not recognized as link #11636

Closed

stale bot added the wontfix label Jan 26, 2022

stale bot removed the wontfix label Jan 26, 2022

stale bot added the wontfix label Mar 27, 2022

stale bot removed the wontfix label Mar 27, 2022

greyson-signal closed this as completed Mar 28, 2022
