Characters are not percent-encoded in URIs #101
The corresponding unit test is:
% ---RESULT--- "example": 346,
%
% <p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p>
%
% ---\RESULT---
<<<
<http://foo.bar.`baz>`
>>>
documentBegin
BEGIN link
- label: http://foo.bar.`baz
- URI: http://foo.bar.%60baz
- title:
END link
documentEnd
Here is the result of running
The issue seems related to the following code:
if options.hybrid then
  self.string = self.escape_minimal
  self.uri = function(s)
    s = util.uri_encode(s)
    s = self.escape_minimal(s)
    return s
  end
else
  self.string = self.escape
  self.uri = function(s)
    s = util.uri_encode(s)
    s = self.escape_uri(s)
    return s
  end
end
Note also that we (ab)use
< self.escape = util.escaper(self.escaped_chars, self.escaped_minimal_strings)
< self.escape_uri = util.escaper(self.escaped_uri_chars, self.escaped_minimal_strings)
< self.escape_minimal = util.escaper({}, self.escaped_minimal_strings)
<
< if options.hybrid then
<   self.string = self.escape_minimal
<   self.uri = function(s)
<     s = util.uri_encode(s)
<     s = self.escape_minimal(s)
<     return s
<   end
< else
<   self.string = self.escape
<   self.uri = function(s)
<     s = util.uri_encode(s)
<     s = self.escape_uri(s)
<     return s
<   end
< end
---
> local escape_typographic_text = util.escaper(self.escaped_chars, self.escaped_minimal_strings)
> local escape_programmatic_text = util.escaper(self.escaped_uri_chars, self.escaped_minimal_strings)
> local escape_hybrid_text = util.escaper({}, self.escaped_minimal_strings)
>
> if options.hybrid then
>   self.string = escape_hybrid_text
>   self.uri = function(s)
>     s = util.uri_encode(s)
>     s = escape_hybrid_text(s)
>     return s
>   end
>   self.key = escape_hybrid_text
> else
>   self.string = escape_typographic_text
>   self.uri = function(s)
>     s = util.uri_encode(s)
>     s = escape_programmatic_text(s)
>     return s
>   end
>   self.key = escape_programmatic_text
> end
We will need to replace any direct calls to
As for the issue title: It is percent-encoded rather than mapped to HTML entities.
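To make the distinction concrete, here is a minimal sketch contrasting the two escaping schemes for the backtick from example 346. Both helper names are hypothetical and are not part of the library discussed in this thread.

```lua
-- Hypothetical helpers for illustration only.

-- Percent-encode a single byte, as done in the href attribute of the expected HTML.
local function percent_encode_char(c)
  return string.format("%%%02X", string.byte(c))
end

-- Map a single byte to a numeric HTML character reference.
local function html_entity_char(c)
  return string.format("&#%d;", string.byte(c))
end

print(percent_encode_char("`"))  --> %60
print(html_entity_char("`"))     --> &#96;
```

The expected output of example 346 contains `%60` in the `href` attribute, which matches the first form.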
CommonMark is not percent-encoding some characters in URIs. Some of them are:
Consider a direct link:
What CommonMark expects:
How the URI is actually percent-encoded:
Do we update the tests to reflect the actual percent-encoding, or do we skip the mentioned characters when encoding?
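The second option, skipping selected characters, could be sketched as follows. This is an illustration under assumptions: `percent_encode` and its `keep` parameter are hypothetical names, the set of always-unescaped characters is the RFC 3986 unreserved set, and the actual list of characters that CommonMark leaves alone is not reproduced here.

```lua
-- Hypothetical sketch: percent-encode every byte of `s` that is neither an
-- RFC 3986 unreserved character (A-Z a-z 0-9 - . _ ~) nor listed in `keep`.
local function percent_encode(s, keep)
  keep = keep or ""
  return (s:gsub("[^%w%-%._~]", function(c)
    if keep:find(c, 1, true) then
      return c  -- the caller asked for this character to be left as-is
    end
    return string.format("%%%02X", c:byte())
  end))
end

print(percent_encode("http://foo.bar.`baz", ":/"))  --> http://foo.bar.%60baz
```

With such a parameter, either resolution of the question reduces to choosing the contents of `keep`, so the tests and the encoder can be aligned in one place.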
@lostenderman Can you ask whether this is a bug or a feature at https://github.com/commonmark/commonmark-spec/issues?
I don't suppose it is a bug, given the multiple discussions about it: commonmark/commonmark-spec#334 and commonmark/commonmark-spec#270.
If percent-encoding is considered optional and "up to the renderer", then I would insert raw Unicode codepoints in Lua and let the TeX renderer do the percent-encoding if it needs to.
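A rough sketch of that division of labour, written entirely in Lua for illustration: the parser side passes the URI through untouched, and a stand-in for the renderer encodes non-ASCII bytes only when it needs to. Both function names are hypothetical, and a real TeX renderer would do the second step in TeX rather than in Lua.

```lua
-- Hypothetical parser-side hook: keep raw Unicode codepoints, no encoding.
local function parser_uri(s)
  return s
end

-- Hypothetical renderer-side step: percent-encode each non-ASCII UTF-8 byte.
local function renderer_percent_encode(s)
  return (s:gsub("[\128-\255]", function(c)
    return string.format("%%%02X", c:byte())
  end))
end

local uri = parser_uri("http://example.com/\195\164")  -- "ä" as UTF-8 bytes
print(renderer_percent_encode(uri))  --> http://example.com/%C3%A4
```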
I am not sure I understand the proposed solution correctly. What output should be produced in this example?
How do
fit into all of this?
As per the quotes you listed in #101 (comment), the percent-encoding should be left up to the renderer. Therefore, it seems to me that we don't need to convert
@lostenderman The changes suggested in #101 (comment) were implemented in Witiko#271 and Witiko#272. I suppose we can close this issue now.
See https://spec.commonmark.org/0.30/#example-346