Skip to content

Conversation

GULPF
Copy link
Member

@GULPF GULPF commented Apr 11, 2018

  • addQuoted and strutils.escape no longer escapes multibyte UTF-8 characters
  • Use addQuoted/addEscapedChar in renderer.nim instead of custom procs
  • Change escaping scheme used by addEscapedChar to match the one previously implemented in renderer.nim (except that '\n' is now escaped as "\\n" instead of "\\L"). strutils.unescape has been updated to match this as well.

Fixes #7559
Fixes #3056

@krux02
Copy link
Contributor

krux02 commented Apr 11, 2018

Well I already recommended to deprecate strutils.escape. It is not clear what it is good for. Obviously it is not good to use it for ansi C code. Aparently it is also not good to use it for html. And constantly strutils.escape will be use where escaping is necessary and then people realize that strutils.escape doesn't work.

@GULPF
Copy link
Member Author

GULPF commented Apr 11, 2018

Well I already recommended to deprecate strutils.escape. It is not clear what it is good for. Obviously it is not good to use it for ansi C code. Aparently it is also not good to use it for html. And constantly strutils.escape will be use where escaping is necessary and then people realize that strutils.escape doesn't work.

That's mostly an issue with naming/docs imo. I think the use case is this: make the content of a string readable for a human in a non-ambiguous way. That's what addQuoted is used for, and it's also why renderer.nim escapes strings.

@krux02
Copy link
Contributor

krux02 commented Apr 11, 2018

Well that is what you inerpreted into it. But escaping is used for a lot of things. For example to prevent code injection, or to allow spaces in arguments on a unix shell. The use case here is simply not clear.

My recommendation would be to have these procedures instead shellEscape, cStringEscape, nimStringEscape, htmlEscape and their inverse, the unescape procedures.

@skilchen
Copy link
Contributor

@GULPF your PR fails on the js backend... I think you need to add:

`c` = unescape(encodeURIComponent(`c`));

as the first line in the emitted code in makeNimstrLit in lib/system/jssys.nim to utf8-encode the string passed as parameter. Javascripts charCodeAt gives utf-16 code units and this might cause some more problems on the js backend, if we no longer escape multibyte characters in Nim.

@GULPF
Copy link
Member Author

GULPF commented Apr 12, 2018

After looking at the places where strutils.escape is used in the compiler/stdlib I agree with @krux02 that it should be deprecated. I think that this PR should be changed so that only the behavior of addQuoted/addEscapedChar is modified, strutils.escape should keep the current behavior to avoid breaking things.

I do think it's useful to have a convenience wrapper around addQuoted/addEscapedChar in strutils though. It could be named nimStringEscape like @krux02 suggested.

Opinions?

@Araq Araq merged commit 8caf257 into nim-lang:devel Apr 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants