[Bug] Not encoded non-normative characters when in url or href, not seperate non-normative characters from English ones #70449

Closed
ghost opened this issue Mar 14, 2019 · 7 comments
Labels
editor-contrib Editor collection of extras help wanted Issues identified as good community contribution opportunities markdown Markdown support issues

Comments


ghost commented Mar 14, 2019

Wrong


Steps to Reproduce:

  1. Create an 'md' file.
  2. Write your href like this (mixed with Chinese characters):

image

The "汉字" (or other non-Latin characters) shouldn't be included as part of the URL, href, or mailto.

Ref: nodejs/nodejs.org#1612


vscodebot bot commented Mar 14, 2019

(Experimental duplicate detection)
Thanks for submitting this issue. Please also check if it is already covered by an existing one, like:

@vscodebot vscodebot bot added the editor-contrib Editor collection of extras label Mar 14, 2019
Author

ghost commented Mar 14, 2019

@vscodebot: Not a duplicate.

@mjbvz mjbvz self-assigned this Mar 14, 2019
Contributor

mjbvz commented Mar 14, 2019

Please share the text of the code along with screenshots

Author

ghost commented Mar 14, 2019

@mjbvz: Thanks for your very quick reply. Here's the text to reproduce this issue:

  1. The screenshot is in my first post (non-Latin characters shouldn't be part of the URL, href, or mailto, so "汉字" and "我们" should be white instead of blue).

  2. Here's the test file for you to download :)
    README.zip

@mjbvz mjbvz added the markdown Markdown support issues label Mar 14, 2019
Contributor

mjbvz commented Mar 14, 2019

Thanks. Linked text for quick future reference:

http://www.baidu.com。

http://www.baidu.com汉字。

mailto:m@abc.com我们。

Author

ghost commented Mar 15, 2019

@mjbvz: I just had an idea: how about encoding these non-normative characters?

RFC 1738 says:

Octets MUST be encoded if they have no corresponding graphic
character within the US-ASCII coded character set, if the use of the
corresponding character is unsafe, or if the corresponding character
is reserved for some other interpretation within the particular URL
scheme.

No corresponding graphic US-ASCII:

URLs are written only with the graphic printable characters of the
US-ASCII coded character set. The octets 80-FF hexadecimal are not
used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent
control characters; these must be encoded.
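
The quoted ranges can be sketched as a small TypeScript check. This is only an illustration of the "no corresponding graphic US-ASCII" rule quoted above; the function names are hypothetical, and the "unsafe" and "reserved" categories from the RFC are not covered.

```typescript
// Hypothetical helper: true for octets the quoted passage says must be encoded
// (control characters 00-1F and 7F, and the non-ASCII range 80-FF).
function hasNoGraphicAscii(octet: number): boolean {
  return octet <= 0x1f || octet === 0x7f || octet >= 0x80;
}

// Hypothetical helper: format one octet as a percent-escape such as "%0A".
function percentEncodeOctet(octet: number): string {
  const hex = octet.toString(16).toUpperCase();
  return "%" + (hex.length < 2 ? "0" + hex : hex);
}

// Encode only the octets flagged above; keep every other octet as its ASCII character.
function encodeOctets(octets: number[]): string {
  return octets
    .map((o) => (hasNoGraphicAscii(o) ? percentEncodeOctet(o) : String.fromCharCode(o)))
    .join("");
}

// "a", then the three UTF-8 octets of "汉", then a line feed.
console.log(encodeOctets([0x61, 0xe6, 0xb1, 0x89, 0x0a])); // a%E6%B1%89%0A
```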

It seems that you could keep the Chinese characters and just apply something like 'encodeURI' or 'encodeURIComponent'. What do you think of this? If you are OK with this idea, I'll change the issue title.

This means that 'http://www.baidu.com汉字' can be displayed as it is, but when it is clicked, "汉字" is automatically encoded with something like 'encodeURI'.

So 'http://www.baidu.com。' should encode '。', because this isn't a normative character.
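
A minimal sketch of that behavior, assuming the displayed link text is kept unchanged and only the click target is encoded. `toClickTarget` is a hypothetical name, not an existing VS Code function. One caveat: `encodeURI` also escapes `%`, so a link that is already percent-encoded would be encoded twice.

```typescript
// Hypothetical helper: derive the URL to open from the text shown in the editor.
// The standard encodeURI leaves ASCII URL syntax (":", "/", ".") untouched and
// percent-encodes non-ASCII characters as UTF-8.
function toClickTarget(displayedLink: string): string {
  return encodeURI(displayedLink);
}

console.log(toClickTarget("http://www.baidu.com汉字")); // http://www.baidu.com%E6%B1%89%E5%AD%97
console.log(toClickTarget("http://www.baidu.com。")); // http://www.baidu.com%E3%80%82
```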

But for mailto, we should still separate non-normative characters from the English ones.
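
For the mailto case, one possible approach is to cut the detected link at the first non-ASCII character. The sketch below is only an assumption about how that could look; `splitAtFirstNonAscii` and `SplitLink` are hypothetical names, and a real link detector would need more rules than this.

```typescript
// Hypothetical result type: the part treated as the link and the text left after it.
interface SplitLink {
  link: string;
  trailing: string;
}

// Hypothetical helper: end the link just before the first character outside US-ASCII.
function splitAtFirstNonAscii(text: string): SplitLink {
  const index = text.search(/[^\x00-\x7F]/);
  if (index === -1) {
    return { link: text, trailing: "" };
  }
  return { link: text.slice(0, index), trailing: text.slice(index) };
}

const parts = splitAtFirstNonAscii("mailto:m@abc.com我们。");
console.log(parts.link); // mailto:m@abc.com
console.log(parts.trailing); // 我们。
```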

Just waiting for your ideas on this.

@ghost ghost changed the title [Bug] Allow non-lartain characters as the part of href, url or mailto [Bug] Not encoded non-normative characters when in url or href, not seperate non-normative characters from English ones Mar 15, 2019
@mjbvz mjbvz added the help wanted Issues identified as good community contribution opportunities label Aug 16, 2019
Contributor

mjbvz commented May 6, 2021

Closing since there hasn't been any further interest in this issue for two years. We'd still take a PR if someone wants to try to fix this and the solution is reasonable enough.

@mjbvz mjbvz closed this as completed May 6, 2021
@github-actions github-actions bot locked and limited conversation to collaborators Jun 20, 2021