Invalid UTF-8 Strings when compiling with MRB_UTF8_STRING compute length incorrectly #5269
Comments
lopopolo added a commit to artichoke/artichoke that referenced this issue on Jan 8, 2021
- Fix quickcheck test setup for owned and borrowed bytes tests. An upstream mruby bug (mruby/mruby#5269) prevents the length of invalid UTF-8 strings from being correctly calculated. The quickcheck harness generates a wider range of inputs, which requires converting the tests to use `bytes`, `byteslice`, and `bytesize`.
- Add an additional `convert_with_trailing_nul` test to the `bytes` converter module.
- Fix quickcheck test setup for float tests. The quickcheck harness generates a wider range of inputs, which revealed issues with the test when comparing NaN, infinities, and some pairs that caused a subtraction overflow.
- Fix a test in `spinoso-securerandom` to no longer use the `chars()` iterator because `alphanumeric` returns a `Vec<u8>` now.

This commit also renames the cargo features in `spinoso-random` from `rand_core` to `rand-traits` and from `rand` to `random-rand`. These changes more closely align `spinoso-random` with its related `rand_mt` crate. These changes remove the package name overrides for `rand` and `rand_core` in `Cargo.toml`. Doc comments and doc tests have been updated for these changes.
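The byte-oriented methods named in the commit message can be sketched as follows. This is a minimal illustration in plain Ruby, not code from the artichoke test suite; the input string and variable names are hypothetical. The point is that `bytesize`, `bytes`, and `byteslice` operate on raw bytes, so their results do not depend on how an invalid sequence is decoded into characters.

```ruby
# Hypothetical input: three ASCII bytes followed by 0xFF,
# a byte that can never appear in well-formed UTF-8.
invalid = "abc\xFF".dup.force_encoding(Encoding::UTF_8)

# None of these calls decode characters; they only look at bytes.
byte_count = invalid.bytesize        # number of bytes in the string
raw_bytes  = invalid.bytes           # array of integer byte values
last_byte  = invalid.byteslice(3, 1) # one-byte substring starting at byte offset 3

puts byte_count
p raw_bytes
p last_byte.bytes
```

Assertions written against these values stay meaningful even when character-based `length` disagrees between implementations.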
The length of broken UTF-8 strings is undefined, I think. You may get whatever length.
For this particular case (the first byte is in |
When compiling mruby with UTF-8 Strings (e.g. setting `CFLAGS="-DMRB_UTF8_STRING"`), mruby incorrectly computes the length of strings with invalid UTF-8 byte sequences.
Reproduction steps
`rake clean`
`CFLAGS="-DMRB_UTF8_STRING" rake`
Executing in `mirb`:
Reference MRI execution
With forced UTF-8 encoding:
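As a stand-in illustration of MRI's behavior (the input below is hypothetical and is not the string from this report), the following sketch forces UTF-8 encoding onto a byte sequence that is not valid UTF-8. On MRI, the stray `0xFF` byte is counted as one character, so `length` matches `bytesize` for this input.

```ruby
# Hypothetical input, assumed for illustration only.
s = "abc\xFF".dup.force_encoding(Encoding::UTF_8)

is_valid = s.valid_encoding? # false: 0xFF is not valid in UTF-8
char_len = s.length          # MRI counts the invalid byte as a single character
byte_len = s.bytesize        # raw byte count, independent of encoding

puts "valid=#{is_valid} length=#{char_len} bytesize=#{byte_len}"
```

Running the same lines in a `mirb` built with `MRB_UTF8_STRING` gives a way to compare the two implementations on one input.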