There's currently a bug in getUInt when it reads a byte whose value is negative, i.e. 0x80 or above when viewed as unsigned.
For example, consider the following:
val bytes = byteArrayOf(0x2e, 0xf6.toByte())
println(bytes.getUInt(0)) // prints 46
println(bytes.getUInt(1)) // prints 4294967286 but should be 246
The cause is that Byte.toUInt() sign-extends, so a negative byte fills the upper 24 bits with ones. The fix looks as simple as masking the value, i.e. private fun ByteArray.getUInt(index: Int) = get(index).toUInt() and 0xffu, although I've not thoroughly tested this. I assume the same needs to be applied to getULong too.
The test suite is probably not broad enough in terms of the byte values under test, which is why it doesn't pick up this issue. I'm currently adding an incremental version of murmurhash to my own Multiplatform library, cryptohash, and while looking into that I came across this test suite from Apache Commons Codec, which helped highlight this issue.
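To make the difference concrete, here is a minimal sketch comparing the sign-extending read with the masked one on the 2-byte array above. The function names (getUIntSignExtended, getUIntMasked, getULongMasked) are hypothetical and chosen for illustration; they are not the library's actual private helpers.

```kotlin
// Hypothetical helper mirroring the buggy behaviour: Byte.toUInt() sign-extends.
fun ByteArray.getUIntSignExtended(index: Int): UInt = get(index).toUInt()

// Hypothetical helper with the suggested fix: keep only the low 8 bits.
fun ByteArray.getUIntMasked(index: Int): UInt = get(index).toUInt() and 0xffu

// Assumed equivalent for the ULong variant, using an explicit ULong mask.
fun ByteArray.getULongMasked(index: Int): ULong = get(index).toULong() and 0xffuL

fun main() {
    val bytes = byteArrayOf(0x2e, 0xf6.toByte())

    println(bytes.getUIntSignExtended(1)) // 4294967286 (0xfffffff6)
    println(bytes.getUIntMasked(0))       // 46
    println(bytes.getUIntMasked(1))       // 246
    println(bytes.getULongMasked(1))      // 246
}
```

Non-negative bytes such as 0x2e are unaffected by the mask, so the change only alters results for inputs containing bytes of 0x80 and above.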
From a testing point of view, with a fix in place, hash32x86 with a zero seed would be expected to return 0x368f9df1 for the 2-byte array above.
Which behind the scenes does the masking you suggest, but keeps it “Kotlin-esque” by leveraging the expressive conversion methods from/to signed/unsigned that the platform provides.
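The snippet this reply refers to is not preserved here. As an assumption only, one conversion-based form that matches the description goes through UByte first: Byte.toUByte() reinterprets the 8 bits as unsigned, and UByte.toUInt() then zero-extends. The name getUIntViaUByte is hypothetical.

```kotlin
// Assumed conversion-based variant; not confirmed to be the maintainer's actual change.
fun ByteArray.getUIntViaUByte(index: Int): UInt = get(index).toUByte().toUInt()

fun main() {
    val bytes = byteArrayOf(0x2e, 0xf6.toByte())

    println(bytes.getUIntViaUByte(0)) // 46
    println(bytes.getUIntViaUByte(1)) // 246

    // Same result as explicit masking for the negative byte.
    println(bytes.getUIntViaUByte(1) == (bytes[1].toUInt() and 0xffu)) // true
}
```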
You're right that the test suite should have caught these. The wordlist is extensive, but it's mostly "a" to "z", i.e., "\u0061" through "\u007a", all of which encode to bytes below 0x80 and so conveniently miss all of these edge cases. I didn't consider this before, but your Apache Commons Codec share looks like a great resource (good license, too). I'll likely borrow bits of it.
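A test that would have caught this does not need a large wordlist; sweeping every possible byte value is enough. The sketch below is an illustration under assumptions: readUnsigned is a hypothetical stand-in for the library's private getUInt, and the check uses plain Kotlin rather than any particular test framework.

```kotlin
// Hypothetical stand-in for the helper under test.
fun ByteArray.readUnsigned(index: Int): UInt = get(index).toUInt() and 0xffu

// Returns the indices (0..255) whose byte does not read back as its unsigned value.
fun findMismatches(read: (ByteArray, Int) -> UInt): List<Int> {
    // Index i holds the byte with unsigned value i, so 128..255 are negative bytes.
    val allBytes = ByteArray(256) { it.toByte() }
    return allBytes.indices.filter { i -> read(allBytes, i) != i.toUInt() }
}

fun main() {
    // The masked read matches for all 256 values.
    println(findMismatches { b, i -> b.readUnsigned(i) }.isEmpty()) // true

    // The unmasked read fails for exactly the 128 negative bytes.
    println(findMismatches { b, i -> b[i].toUInt() }.size) // 128
}
```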