stringify: incorrect entity encoding #498

ysiw · 2020-05-28T06:43:36Z

remark-stringify outputs incorrectly escaped entities in tables if options.entities is "escape"

$ cat a.md
| aaa        | aaa |
| ---------- | --- |
| http://abc |     |
$ remark a.md
| aaa             | aaa |
| --------------- | --- |
| http&#x3A;//abc |     |
a.md: no issues found
$ remark --setting '"entities":"escape"' a.md
| aaa                 | aaa |
| ------------------- | --- |
| http&amp;#x3A;//abc |     |
a.md: no issues found
$

wooorm · 2020-05-30T09:08:02Z

What is the reason for using entities: 'escape'?

wooorm · 2020-05-30T09:12:07Z

This is indeed a bug. Double escaping is going on if entities: 'escape' is defined.
I think it makes the most sense to remove the option altogether.
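To illustrate the double escaping described here, the following is a minimal sketch, not the remark-stringify or stringify-entities source. The helper names `encodeUnsafe` and `escapeHtml` are hypothetical, and the assumption is that one pass encodes the colon as a character reference and a later "escape" pass then encodes the ampersand of that reference.

```javascript
// Hypothetical helpers for illustration only; not the actual library code.

// Pass 1: encode a character considered unsafe in this context
// (only ':' here) as a hexadecimal character reference.
function encodeUnsafe(value) {
  return value.replace(/:/g, (char) =>
    '&#x' + char.charCodeAt(0).toString(16).toUpperCase() + ';'
  )
}

// Pass 2: an "escape"-style pass that encodes HTML-significant characters,
// including the '&' that pass 1 just produced.
function escapeHtml(value) {
  const names = {'&': 'amp', '<': 'lt', '>': 'gt', '"': 'quot'}
  return value.replace(/[&<>"]/g, (char) => '&' + names[char] + ';')
}

const once = encodeUnsafe('http://abc')
const twice = escapeHtml(once)

console.log(once) // http&#x3A;//abc
console.log(twice) // http&amp;#x3A;//abc
```

The second line of output matches the table cell in the report above: the reference is no longer decoded back to a colon because its ampersand has itself been encoded.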

ysiw · 2020-05-30T23:15:58Z

I prefer some entities are escaped, such as &check; (✓), &cross; (✗). It is the only way to do it. entities: true is not an option for me, because it escapes all CJK characters.

wooorm · 2020-06-09T11:31:51Z

> I prefer some entities are escaped, such as &check; (✓), &cross; (✗)

That behavior is closer to entities: true than entities: 'escape'.
escape used to work but is now broken, and should do the inverse: https://github.com/wooorm/stringify-entities/tree/355100188e1f3661359e7404115cd7e04c75ed41#optionsescapeonly.

I think removing 'escape' makes the most sense.

ysiw · 2020-06-12T13:09:23Z

I suggest modifying the behavior of entities: 'escape' to escape ONLY the character entities listed in https://dev.w3.org/html5/html-author/charref

There is a package, character-entities, which has the complete list.

The current implementation gives me no option to keep my document unchanged.

wooorm · 2020-06-12T17:18:38Z

> I suggest modifying the behavior of entities: 'escape' to escape ONLY the character entities listed in https://dev.w3.org/html5/html-author/charref

That does include many ASCII / other characters, such as a space, a tab, a line feed, +, -, :, and so many more!
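The trade-off can be made concrete with a small sketch. It assumes a hypothetical `encodeNamed` helper and a hand-written map of names; it does not use the character-entities package or any remark API. With only check and cross in the map, CJK text and URLs are left alone, but adding ASCII names from the full list (such as `colon` and `plus`) rewrites ordinary text.

```javascript
// Hypothetical subset of named character references, written by hand.
const wanted = {'✓': 'check', '✗': 'cross'}

// Encode only the characters present in `names`; leave everything else as-is.
function encodeNamed(value, names) {
  return Array.from(value, (char) =>
    Object.prototype.hasOwnProperty.call(names, char)
      ? '&' + names[char] + ';'
      : char
  ).join('')
}

const selective = encodeNamed('完成 ✓ http://abc', wanted)
console.log(selective) // 完成 &check; http://abc

// The full HTML list also names ASCII characters, for example ':' and '+'.
const withAscii = {...wanted, ':': 'colon', '+': 'plus'}
const overEncoded = encodeNamed('http://abc', withAscii)
console.log(overEncoded) // http&colon;//abc
```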

As you are already invested in Unicode, with Chinese characters, I don’t really get why check and cross can’t be in Unicode as well?

I instead strongly think we should drop options on encoding character references. /CC @ChristianMurphy, what do you think?

ChristianMurphy · 2020-06-12T18:20:13Z

> drop options on encoding character references

As in remove options.entities from the project entirely?

wooorm · 2020-06-12T18:22:35Z

Yes

wooorm · 2020-06-12T18:23:29Z

Background: it was added because I recently made stringify-entities, which had those options. Now, years later, we definitely live in a Unicode world, and there is basically no reason anymore to use character references.

ChristianMurphy · 2020-06-12T19:59:52Z

Removing the option works for me.
The only character entity that I've used in projects recently is &nbsp;, which I believe has a Unicode equivalent.

ysiw added 🐛 type/bug This is a problem 🙉 open/needs-info This needs some more info labels May 28, 2020

wooorm changed the title ~~incorrect escaped entities in table~~ stringify: incorrect entity encoding Jul 17, 2020

wooorm added remark-stringify 🐛 type/bug This is a problem 🗄 area/interface This affects the public interface 🙅 no/wontfix This is not (enough of) an issue for this project and removed 🐛 type/bug This is a problem 🙉 open/needs-info This needs some more info labels Jul 17, 2020

wooorm closed this as completed in 4a49fbc Jul 17, 2020

spl mentioned this issue Jul 22, 2020

stringify: zero-width space HTML character entity replaced by Unicode #518

Closed
