Hashes for Slots in ScriptableObject should be seeded #290

botic · 2016-12-18T15:19:50Z

As far as I can interpret the slot allocation in ScriptableObject it uses String.hashCode() to generate a hash number, which then is used to determine a slot for the value. This makes it easy to exploit with an algorithmically attack.

Take the following example in RingoJS:

function loadJSON(file) {
   const start = Date.now();
   const data = require("./" + file);
   console.log("File:", file, "Time: " + (Date.now() - start) + " ms");
}

loadJSON(require("system").args[1]);

For a 20 MB huge JSON with randomized keys it takes under a second to load. An attacker could use colliding hash codes since it's easy to generate them with the knowledge that slots are allocated with String.hashCode(). Just an example sequence of colliding hashes: AaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAa, AaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaBB, AaAaAaAaAaAaAaAaAaAaAaAaAaAaAaAaBBAa, ……… Then the example above takes 20 minutes on my machine. And just imagine an application calling JSON.parse() without any bounds for the request's body.

To fix this problem Java changed its hash-based data structure's implementation with Java 7: https://vaskoz.wordpress.com/2013/04/06/java-7-hashing-drastically-better-than-java-6/

You can find the full code here: https://gist.github.com/botic/d409564dca727eea8086fb6a1cb00623

Further infos / links:

The text was updated successfully, but these errors were encountered:

davidmann4 · 2016-12-18T15:43:54Z

maybe useful lib if you want to find some collisions; (I guess you need to modify the hashing algorithm)
https://github.com/hastebrot/V8-Hash-Collision-Generator

gbrail · 2016-12-19T17:28:13Z

Thanks -- that's an important issue. There seem to be very few people able to work actively on Rhino right now. This looks like a great issue for a new author who might be looking for a way to contribute to a new project.

botic · 2017-01-04T14:37:00Z

I started with including a public domain Java-aware (chars != bytes) port of MurmurHash3: https://github.com/botic/rhino/tree/issue290-seededhashes

Various versions with more or less adaptions of the implementations can be found all over Github. Just some of them:

Feedback / other ideas welcome, since this is the first time I work directly on Rhino ;)

whitlockjc · 2017-08-15T00:13:57Z

@botic What's stopping you from creating a PR from this? I think your approach is sound. I was just about to do this work myself when I saw you had done it already.

gbrail · 2017-08-15T00:33:14Z

It'd be great if either of you could do this! At any rate, I'd prefer to see two more things: First, the seed for the hash should come out of SecureRandom. That'll increase startup time a bit but the vast majority of things that embed Rhino already have the default instance of that initialized I bet, and since the whole point of this is security we should do it securely ;-) Second, I'd prefer to see the hash algorithm in a separate class since ScriptableObject.java is so gigantic already. Bonus points if we have a unit test for the class that at least verifies that the basic code paths execute...

…

On Mon, Aug 14, 2017 at 5:13 PM, Jeremy Whitlock ***@***.***> wrote: @botic <https://github.com/botic> What's stopping you from creating a PR from this? I think your approach is sound. I was just about to do this work myself when I saw you had done it already. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#290 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAf0a38-ngxH8XLOMgz7lKRHhXI1Xx4wks5sYOLGgaJpZM4LQJvC> .

botic · 2017-08-16T20:14:55Z

Sorry to say, but my time is extremely limited right now. I see no time to do this in the next months due my parental leave and other projects. We implemented some workarounds in RingoJS and the web packages, which avoid the issue "good enough" for the moment. Afai-remember: some unit tests went red after my change and I could not spot the source of the problems.

gbrail · 2017-08-30T21:04:35Z

Here's an update on this issue. I've done two experiments so far:

-- Used the contribution from @botic to use MurmurHash on every String object rather than the built-in hashCode() method on java.lang.String.
-- Replaced the hand-rolled hash table in the ScriptableObject class with java.lang.LinkedHashMap. (This will help because when a hash bucket has too many collisions it switches that bucket from a linked list to a balanced tree, so the worst case performance for a hash-flooding attack is much less bad.)

Here are the results of the "v8 benchmarks" in these cases. (If you are new to Rhino then you will be shocked at how bad these are in comparison to V8 or even Nashorn):

master: 88.8
MurmurHash: 60.6
java.lang.Hash: 63.2
In other words, so far we don't have a way yet to fix this problem that does not have a large performance impact.

One simple proposal would be to simply permute the hash that's generated in ScriptableObject.java by some sort of a random seed. For instance we could do the following:

At class initialization time (aka once per JVM) we would generate a random 32-bit seed, possibly using SecureRandom.
Use this seed to modify the output from the built-in hashCode() function. This should make it more difficult to guess a hash collision because different instances of Rhino on different VMs will hash things differently.

I'm going to try this. I'm looking for feedback on the various pros and cons from a security perspective.

gbrail · 2017-09-22T00:50:26Z

Pull request #328 addresses this problem by dividing up the hash table that ScriptableObject uses into two implementations: One with very low overhead based on the existing code, and one that uses java.util.HashMap for its collision resistance. ScriptableObject changes from one to the other when the object reaches a threshold. (Currently 2500 properties based on some testing I did.)

@botic -- thank you very much for bringing this to our attention. I used part of the test file you sent me to create a regression test. Please let me know if you have any issues with this.

gbrail · 2017-11-20T20:05:59Z

Pushed to master.

botic · 2017-11-21T08:29:14Z

Thanks, the approach sounds reasonable. Since most objects don't have 2.500 or more properties, attacks would run into the "trap".

@gbrail no, perfect that it helped!

gbrail mentioned this issue Sep 22, 2017

Improve hashing logic in ScriptableObject for collision resistance. #328

Closed

gbrail closed this as completed Nov 20, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Hashes for Slots in ScriptableObject should be seeded #290

Hashes for Slots in ScriptableObject should be seeded #290

botic commented Dec 18, 2016

davidmann4 commented Dec 18, 2016

gbrail commented Dec 19, 2016

botic commented Jan 4, 2017 •

edited

Loading

whitlockjc commented Aug 15, 2017

gbrail commented Aug 15, 2017 via email

botic commented Aug 16, 2017

gbrail commented Aug 30, 2017 •

edited

Loading

gbrail commented Sep 22, 2017

gbrail commented Nov 20, 2017

botic commented Nov 21, 2017

Hashes for Slots in ScriptableObject should be seeded #290

Hashes for Slots in ScriptableObject should be seeded #290

Comments

botic commented Dec 18, 2016

davidmann4 commented Dec 18, 2016

gbrail commented Dec 19, 2016

botic commented Jan 4, 2017 • edited Loading

whitlockjc commented Aug 15, 2017

gbrail commented Aug 15, 2017 via email

botic commented Aug 16, 2017

gbrail commented Aug 30, 2017 • edited Loading

gbrail commented Sep 22, 2017

gbrail commented Nov 20, 2017

botic commented Nov 21, 2017

botic commented Jan 4, 2017 •

edited

Loading

gbrail commented Aug 30, 2017 •

edited

Loading