repair_json incorrectly truncates JSON strings with escaped quotes and commas #44

Closed
mlxyz opened this issue May 13, 2024 · 3 comments

@mlxyz
Contributor

mlxyz commented May 13, 2024

Describe the bug
The valid JSON {"foo": "bar \"foo\", baz"} gets turned into the broken JSON {"foo": "bar \\"foo"} when using repair_json.
I think this is related to the escaped quotes and comma.

To Reproduce
Steps to reproduce the behavior:

  1. Call repair_json('{"foo": "bar \"foo\", baz"}')
  2. Check the output {"foo": "bar \\"foo"}

Expected behavior
Correct output {"foo": "bar \"foo\", baz"}

@mangiucugna
Owner

So this is a tough one. If you pass that string to Python without using r"", the backslashes are consumed by the Python string literal, so the string actually passed is: {"foo": "bar "foo", baz"}. For that input the result is correct, because it's impossible to know in advance whether that comma closes the key/value pair or is part of the string.

If you call repair_json(r'{"foo": "bar \"foo\", baz"}'), the result is, correctly, {"foo": "bar \"foo\", baz"}
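To see the difference without involving repair_json at all, here is a minimal sketch using only the standard library's json module. The variable names are just for illustration; it only shows what each literal contains by the time a function would receive it.

```python
import json

# Plain literal: Python itself treats \" as an escaped quote,
# so the backslashes never reach the function being called.
plain = '{"foo": "bar \"foo\", baz"}'

# Raw literal: the backslashes are kept, so this is the JSON text as written.
raw = r'{"foo": "bar \"foo\", baz"}'

print(plain)  # {"foo": "bar "foo", baz"}
print(raw)    # {"foo": "bar \"foo\", baz"}

# The raw string is valid JSON and parses to the intended value.
parsed = json.loads(raw)
print(parsed["foo"])  # bar "foo", baz

# The plain string is no longer valid JSON: the quote before foo ends the string.
try:
    json.loads(plain)
    plain_is_valid_json = True
except json.JSONDecodeError:
    plain_is_valid_json = False

print(plain_is_valid_json)  # False
```

The same applies to JSON read from a file or an API response: the backslashes are already in the data, so no raw prefix is involved there. The issue only shows up when the JSON is typed as a Python string literal.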

@mlxyz
Contributor Author

mlxyz commented May 13, 2024

Ah, got it - thanks!

I thought my problem was related to escaped quotes, but apparently it is not, and now I can't figure it out:
I would expect {"foo": "bar \"e\", .", "g": ["h:"]]} to get repaired to {"foo": "bar \"e\", .", "g": ["h:"]} but repair_json(r'{"foo": "bar \"e\", .", "g": ["h:"]]}') returns {"foo": "bar \\\"e\\", ",": ": [\"h:"}.

Any idea why?
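For reference, a small standard-library check of the two strings in this report (it does not call repair_json, and the variable names are only illustrative). It confirms that the input is invalid solely because of the extra closing bracket, and that the expected output is valid JSON.

```python
import json

# Raw literals, so the backslashes are part of the JSON text.
broken = r'{"foo": "bar \"e\", .", "g": ["h:"]]}'
expected = r'{"foo": "bar \"e\", .", "g": ["h:"]}'

# The strict parser rejects the input because of the stray "]".
try:
    json.loads(broken)
    broken_parses = True
except json.JSONDecodeError as err:
    broken_parses = False
    print(f"input rejected at offset {err.pos}: {err.msg}")

# The expected repair is valid JSON and keeps the escaped quotes and the comma.
repaired = json.loads(expected)
print(repaired)

# The only difference between the two strings is the extra bracket.
print(broken.replace("]]", "]") == expected)  # True
```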

@mangiucugna
Owner

I hate escaping problems. I will push a new version in a few minutes with what is hopefully a definitive fix.
