Error in python code #5
Comments
I tried to replicate it but couldn't:
|
@VectorASD Would you be able to provide a minimal reproduction example?
If you ignore the fact that str_data_b is my own class, which lets you write the sections of a dex file independently of each other and then glue them together and fill in the linking data at the end, then this is exactly what a fully working MUTF-8 encoder looks like:
It will print eda0bdedb38b instead of the erroneous eda1bdedb38b.
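For readers without access to the original snippet, here is a minimal, self-contained sketch of the surrogate-pair encoding step. It is not the code from this comment and does not use the str_data_b class; the function name `mutf8_encode_supplementary` is hypothetical. It assumes the usual MUTF-8 rule that a code point above U+FFFF is split into a UTF-16 surrogate pair and each surrogate is written as a 3-byte sequence.

```python
def mutf8_encode_supplementary(cp: int) -> bytes:
    """Encode one code point above U+FFFF as a 6-byte MUTF-8 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("expected a supplementary code point")
    # Subtract 0x10000 first to get the 20-bit value; skipping this step
    # is what yields eda1bd... instead of eda0bd... for U+1F4CB.
    v = cp - 0x10000
    high = 0xD800 | (v >> 10)    # high surrogate carries the top 10 bits
    low = 0xDC00 | (v & 0x3FF)   # low surrogate carries the bottom 10 bits
    out = bytearray()
    for s in (high, low):
        # Each surrogate is written as a plain 3-byte sequence: 1110xxxx 10xxxxxx 10xxxxxx
        out += bytes([0xE0 | (s >> 12), 0x80 | ((s >> 6) & 0x3F), 0x80 | (s & 0x3F)])
    return bytes(out)


print(mutf8_encode_supplementary(ord("📋")).hex())  # eda0bdedb38b
```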
I tried one of the examples from https://docs.rs/residua-mutf8/latest/mutf8/ and I can see that the two implementations are giving different results. The Rust version converts
When testing with some data from Android ( https://android.googlesource.com/platform/development/+/63bf1087ebb06b59e3d82cbc5ccd4485704c6b91/vndk/tools/definition-tool/tests/test_dex_file.py#29 ) I see the same thing happen. So it seems that @VectorASD is correct that there is an error.
It seems that the problem is in the encoding, not the decoding:
|
While decoding a 6-byte value, you have "0x10000 |". That is not correct. Ordinary UTF-8 in the form 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx can encode 21 bits, but the 6-byte MUTF-8 form only has 20 bits available, so you need to ADD 0x10000 rather than OR it in. When encoding, this 0x10000 is not taken into account at all. Try encoding, for example, "📋" yourself and then decoding it. The result is 🥴.
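A minimal sketch of the decoding step described above, to show where ADD and OR diverge. This is an illustration, not the library's code: the function name `mutf8_decode_pair` is hypothetical, and it only handles a single, well-formed 6-byte surrogate pair.

```python
def mutf8_decode_pair(data: bytes) -> int:
    """Decode one 6-byte MUTF-8 surrogate pair into a code point."""
    if len(data) != 6 or data[0] != 0xED or data[3] != 0xED:
        raise ValueError("not a 6-byte MUTF-8 surrogate pair")
    if (data[1] & 0xF0) != 0xA0 or (data[4] & 0xF0) != 0xB0:
        raise ValueError("expected a high surrogate followed by a low surrogate")
    if (data[2] & 0xC0) != 0x80 or (data[5] & 0xC0) != 0x80:
        raise ValueError("invalid continuation byte")
    # 4 + 6 + 4 + 6 = 20 payload bits
    v = (
        ((data[1] & 0x0F) << 16)
        | ((data[2] & 0x3F) << 10)
        | ((data[4] & 0x0F) << 6)
        | (data[5] & 0x3F)
    )
    # ADD, not OR: the 20-bit value may itself have bit 16 set.
    return 0x10000 + v


pair = bytes.fromhex("eda180edb080")   # surrogates D840 DC00, i.e. U+20000
print(hex(mutf8_decode_pair(pair)))    # 0x20000
```

For this input the 20-bit value is 0x10000, so `0x10000 | v` would give 0x10000 rather than 0x20000.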