Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Singe quote strings - issues with special characters that need escaping #3073

Closed
pjanik opened this issue Oct 6, 2023 · 7 comments
Closed
Labels

Comments

@pjanik
Copy link

pjanik commented Oct 6, 2023

Describe the bug
It seems that single quote strings don't work too well with special characters that need to be escaped, like ", or '.

To Reproduce
Open the MathJS website, the webbrowser console, and type:

  1. math.evaluate("'te\"st'")
  2. math.evaluate("'te\'st'")

Result

  1. Uncaught SyntaxError: Unexpected non-whitespace character after JSON at position 4 (line 1 column 5) at JSON.parse ()
  2. SymbolNode.js:109 Uncaught Error: Undefined symbol st at Function.value

Exptected result

  1. 'te"st'
  2. "te'st"

Other comments

It is worth noting that the first issue (1.) can be workaround by typing math.evaluate("'te\\\"st'"). However, the second issue (2.) cannot be fixed by using any number of backslashes.

This problem may be related to the following line:

return JSON.parse('"' + str + '"') // unescape escaped characters

where single-quoted string content is actually enclosed in double quotes. Also, I understand that there are some possible conflicts with the transpose operator, but I am not sure how this could or should affect characters escaping.

@josdejong
Copy link
Owner

Thanks for reporting, this will be fixed via ac9052d.

I think your second case is not valid: the escape character there is already eaten by JavaScript, so effectively the parser receives a string "'te'st'" and not "'te\'st'". I think this example should be "'te\\'st'".

@josdejong
Copy link
Owner

Should be fixed now in v11.11.2, can you give that a try Piotr?

@pjanik
Copy link
Author

pjanik commented Oct 11, 2023

@josdejong, thank you very much for the quick response and the fix. Yes, this definitely improves the handling of our issues, and you're right about the second test case being wrong.

I still have some problems with special characters though. In particular, I can show them using this single line:

math.parse(math.parse(`fn('te"st')`).toString())

It results in a syntax error.

It happens because:

math.parse(`fn('te"st')`).toString() 

returns 'fn("te"st")' which is no longer valid string indeed.

So, a single quote string is not preserved when an expression is stringified again. It sounds similar to the issue of whitespace preserving, but I think it can be a bit different category. While ignoring whitespace will never break an expression, updating string delimiters can.

Does it seem like something that could be fixed in MathJS without huge effort?
I guess the parser would also need to unescape these two different kinds of string in a different way (not just with JSON.parse('"' + ... + '"')). It seems that this recent fix still makes each string constant a "" delimited string and unescapes it that way. But I might be wrong here, as I'm slowly getting lost when things are escaped and unescaped. 😉

@josdejong
Copy link
Owner

Ah, that is a good point. We indeed have to escape the " character when calling toString (and the same with toTex and toHTML). It's not that complicated, I'll have a look into it next week. Maybe it is better to get rid of the JSON.parse and only escape/unescape the quote character, I'll think that through a bit more.

@pjanik
Copy link
Author

pjanik commented Oct 13, 2023

Right, that's one approach. Another approach would be to maintain the string delimiters while calling .toString(). So, string constants would need to identify what kind of string they are and escape characters according to the original delimiter. That's what I actually meant in my previous message. However, maybe it's not worth the effort if MathJS doesn't preserve other details of the original string, such as whitespace, etc.

Having said all that, I managed to fix all the problems on our side, and v11.11.2 made that possible. Thanks for the quick fix!

@josdejong josdejong changed the title Singe quite strings - issues with special characters that need escaping Singe quote strings - issues with special characters that need escaping Oct 18, 2023
josdejong added a commit that referenced this issue Oct 18, 2023
@josdejong
Copy link
Owner

It would indeed be neat to remember whether a string was created with single or double quotes. I will leave that out for now though (it is similar to maintaining all whitespaces for example, that would be nice too).

I've done a bit of refactoring to not rely on JSON.parse when parsing strings, it is a bit hacky. And I've created a fix for the method node.toString() and function format (used by node.toString()) to correctly escape control characters and double quotes in a string, see PR: #3082. Feedback would be welcome.

josdejong added a commit that referenced this issue Oct 25, 2023
* chore: refactor parsing strings to not rely on `JSON.parse`

* fix: #3073 function `format` not escaping control characters and double quotes in strings

* chore: add more unit tests
@josdejong
Copy link
Owner

The issue with escaping characters in function format and node.toString() is fixed now in v11.12.0, fixed via #3082.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants