New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify parsing of hyperlinks on mobile and desktop #598

Closed
shaaati opened this Issue Jan 11, 2016 · 9 comments

Comments

Projects
None yet
10 participants
@shaaati
Copy link

shaaati commented Jan 11, 2016

I tried sending the following links on Signal Desktop:

github.com
www.github.com
https://github.com
ftp://github.com
smb://github.com

Out of those, only the variants prefixed with https and ftp are rendered as clickable hyperlinks inside the Desktop app. On Android, the first three links are made clickable.
In my opinion, all of the above hyperlinks should be parsed as such, but there should at least be a unified situation in which users of all platforms can expect messages to look the same (as I don't have any iDevice, I cannot check the look of the same message on Signal for iOS).

@liliakai liliakai added this to the On the roadmap milestone Jan 11, 2016

@liliakai

This comment has been minimized.

Copy link
Contributor

liliakai commented Jan 11, 2016

Also just noticed that a parenthetical link, i.e., (https://github.com) linkifies on android but not desktop. Whoops.

@psychofisch

This comment has been minimized.

Copy link

psychofisch commented May 25, 2016

When you send a message with a link and there is no space between the last character of the text and the first character of the link it doesn't get rendered correctly. The message is shown correctly on the phone.
Additional Info from the debug log:
Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36 Signal-Desktop/0.12.5
signaldesktopbug

sharonghae added a commit to sharonghae/Signal-Desktop that referenced this issue Jul 9, 2016

@ghost

This comment has been minimized.

Copy link

ghost commented Oct 8, 2016

have you tried?

var URL_REGEX = /((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?([a-z0-9-.]*)\.([a-z]{2,4})(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?((?:\:|\?)?[a-z+&\$_.-][a-z0-9';:@&%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?/gi;

and
if (body.length > 0) {
var escaped = body.html();
body.html(escaped.replace(URL_REGEX, "<a href='$&' target='_blank'>$&</a>"));
}

screen shot 2016-10-08 at 23 49 54

@troyjfarrell

This comment has been minimized.

Copy link

troyjfarrell commented Feb 2, 2017

@ping133 Your regular expression fails with Cyrillic (and probably other Unicode) in URLs. While these should probably be url-encoded, this issue is about unifying the parsing of URLs. The Android application converts https://en.wiktionary.org/wiki/сообщиться into a link. Your suggested URL_REGEX does not include any possibility for Unicode values in the path, query or fragment, and creates a link which points to https://en.wiktionary.org/wiki/.

@weienwong

This comment has been minimized.

Copy link

weienwong commented Sep 2, 2017

@troyjfarrell I believe the above can be fixed using new URL(...) (https://developer.mozilla.org/en-US/docs/Web/API/URL/URL) such that it would parse the URL string. However we need to handle the scenario in which we have something like

hellohttps://en.wiktionary.org/wiki/сообщиться

I have attached a screenshot

screenshot from 2017-08-30 13-44-26

Also on mobile Android, the UI highlights only the wiktionary.org/wiki/сообщиться portion of the URL

screenshot from 2017-09-01 17-29-04

@scottnonnenberg scottnonnenberg removed the easy label Nov 2, 2017

@0xbb

This comment has been minimized.

Copy link

0xbb commented Nov 6, 2017

https://en.wikipedia.org/wiki/Mother! is also corner case that works on Android but fails on the Desktop.

@mattimac

This comment has been minimized.

Copy link

mattimac commented Jan 5, 2018

When URL contains '[" sign this sign and following are not part of the url on Desktop but should be. Ok on android.
Example:
https://www.example.com/holiday/HIT/spain-ES/lanzarote-ACE/20137-h10-lanzarote-princess.html?m=test_obj_view&show[trip_typ]=EA&show[code]=201&show[arrival]=ACE&show[season]=W8&show[region]=ACE&show[departure]=&show[date_start]=2018-04-07&show[period]=3&show[adults]=3&show[code_type]=A&show[room]=1&show[room_typ]=A&show[food]=AI&show[catalog]=HIT

@Shalmanese

This comment has been minimized.

Copy link

Shalmanese commented Apr 10, 2018

Links that contain non-ASCII letters are still being parsed incorrectly.

For example, https://zh.wikipedia.org/zh-hans/信号 will form the correct clickable link on iOS but only the https://zh.wikipedia.org/zh-hans/ is parsed on desktop.

Chrome automatically converts non-ASCII characters to escaped form (eg: https://zh.wikipedia.org/zh-hans/%E4%BF%A1%E5%8F%B7) when copied and pasted from the URL bar but many Chinese browsers do not.

gasi-signal added a commit that referenced this issue Apr 10, 2018

gasi-signal added a commit that referenced this issue Apr 10, 2018

@gasi-signal gasi-signal referenced this issue Apr 10, 2018

Merged

Improve Messages Link Detection #2240

12 of 12 tasks complete

gasi-signal added a commit that referenced this issue Apr 10, 2018

gasi-signal added a commit that referenced this issue Apr 11, 2018

gasi-signal added a commit that referenced this issue Apr 11, 2018

gasi-signal added a commit that referenced this issue Apr 11, 2018

gasi-signal added a commit that referenced this issue Apr 11, 2018

Improve URL Auto-Linking In Messages (#2240)
As a user, I’d like the app to autolink as many possible URL formats I write in
messages as possible, e.g.

- [x] URLs without protocol: `github.com`
- [x] URLs with in different languages (Unicode):
  - [x] `https://zh.wikipedia.org/zh-hans/信号`
  - [x] `https://ru.wikipedia.org/wiki/Сигнал`
- [x] URLs with single quotes: `https://www.example.com/this-couldn't-be-true`
- [x] Messages with URLs right after special characters:
      `wink ;)https://www.youtube.com/watch?v=oHg5SJYRHA0`
- [x] URLs with square brackets:
      `https://www.example.com/test.html?foo=bar&baz[qux]=quux`
- [x] **Infrastructure:** Include TypeScript files in build.
- [x] **Infrastructure:** Rename `ts/test` to `ts/styleguide`.
- [x] **Infrastructure:** Decouple linting from testing.
- [x] **Infrastructure:** Run all tests in CI.
- [x] **Infrastructure:** Compile TypeScript on CI.

### Dependencies
- Forked `link-text` to disable HTML escaping: It only has the minimum required
  dependencies:
    - `linkify-it`: Best-in-class link detection library with support for
                    Unicode/IDN. Popular alternative: `linkifyjs`.
                    Doesn’t handle Unicode in URLs.
    - ~~`escape-html`: Standalone dependency for escaping HTML.~~
    - `uc.micro`: Standalone dependency of Unicode data files.

### Known Issues

We don’t auto-link trailing exclamation points which in most cases would be
expected to be part of the message body rather than the link.
**Counterexample:** `https://en.wikipedia.org/wiki/Mother!`.
N.B. GitHub doesn’t do this right either.

Fixes #598.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment