Deserialising unicode escape gives non-UTF8 String #228
Comments
Hi, good catch! Let me investigate.
@hkratz I think this tracks back to simdutf8. I added the following test case
and it doesn't throw an error.
The simdutf8 usage in simd-json only confirms that the raw input byte sequence, before any escape processing, is valid UTF-8. Here the input byte sequence Lines 22 to 73 in 6af8b2a
This does not seem to handle lone low surrogates such as
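To illustrate the point above, here is a minimal sketch in plain Rust (standard library only) of why a lone low surrogate in a `\u` escape is a problem. It is not simd-json's code or its actual fix; `decode_unicode_escape` and `encode_3byte_unchecked` are hypothetical names made up for this example. The first function rejects unpaired surrogates, the second shows what happens when a surrogate code unit is encoded without that check.

```rust
/// Hypothetical helper: turn the code unit(s) of one or two `\uXXXX` escapes
/// into a Unicode scalar value. `second` is only consulted when `first` is a
/// high surrogate. Returns `None` for any unpaired surrogate.
fn decode_unicode_escape(first: u16, second: Option<u16>) -> Option<char> {
    match first {
        // High surrogate: must be followed by a low surrogate.
        0xD800..=0xDBFF => {
            let low = second?;
            if !(0xDC00..=0xDFFF).contains(&low) {
                return None;
            }
            let cp = 0x10000 + (((first as u32) - 0xD800) << 10) + ((low as u32) - 0xDC00);
            char::from_u32(cp)
        }
        // Lone low surrogate: not a valid scalar value on its own.
        0xDC00..=0xDFFF => None,
        // Any other BMP code unit maps directly to a char.
        _ => char::from_u32(first as u32),
    }
}

/// Hypothetical naive encoder: writes a code unit as a 3-byte UTF-8 sequence
/// with no surrogate check. Only meaningful for code units >= 0x0800.
fn encode_3byte_unchecked(cu: u16) -> [u8; 3] {
    [
        0xE0 | (cu >> 12) as u8,
        0x80 | ((cu >> 6) & 0x3F) as u8,
        0x80 | (cu & 0x3F) as u8,
    ]
}
```

With these definitions, `decode_unicode_escape(0xDC00, None)` returns `None`, while `String::from_utf8(encode_3byte_unchecked(0xDC00).to_vec())` returns an `Err`, because the standard library does not accept encoded surrogates as UTF-8. Validating only the raw input bytes cannot catch this, since the escape text itself is plain ASCII.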
Heya, sorry about that, too many distractions these days. You're completely right: I missed that there is a \u in there that is not escaped during the test itself 🤦
Heya @5225225, sorry this took so long. I was a bit too all over the place and totally missed the real issue here ... on the bright side, it should be fixed now and released as 0.5.1 :)
Here, the second unwrap fails, which means the library gave me a String containing invalid UTF-8.