Update HashMap docs regarding DoS protection #35371

Merged
1 commit merged into rust-lang:master on Aug 10, 2016

Conversation

mgattozzi
Contributor

Because of changes to how Rust acquires randomness, HashMap is not
guaranteed to be DoS resistant. This commit reflects these changes in
the docs themselves and provides an alternative method for creating
a hasher that is resistant if needed.

This fixes #33817 and includes relevant information regarding changes made in #33086
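
As a rough sketch of the kind of alternative the commit points at (the `KeyedSipState` type and the keys below are illustrative, not taken from the PR), a `HashMap` can be built around an explicitly keyed SipHash:

```rust
#![allow(deprecated)] // SipHasher's keyed constructor is deprecated in later Rust

use std::collections::HashMap;
use std::hash::{BuildHasher, SipHasher};

// Illustrative BuildHasher that seeds SipHash with caller-supplied keys.
struct KeyedSipState {
    k0: u64,
    k1: u64,
}

impl BuildHasher for KeyedSipState {
    type Hasher = SipHasher;

    fn build_hasher(&self) -> SipHasher {
        SipHasher::new_with_keys(self.k0, self.k1)
    }
}

fn main() {
    // Placeholder keys; a real application would draw these from a
    // random source it trusts.
    let state = KeyedSipState {
        k0: 0x0123_4567_89ab_cdef,
        k1: 0xfedc_ba98_7654_3210,
    };
    let mut map: HashMap<String, u32, KeyedSipState> = HashMap::with_hasher(state);
    map.insert("key".to_string(), 1);
    assert_eq!(map.get("key"), Some(&1));
}
```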

@rust-highfive
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@mgattozzi
Contributor Author

r? @steveklabnik

@steveklabnik
Member

@bors: r+ rollup

looks great, thanks a ton!

@bors
Contributor

bors commented Aug 8, 2016

📌 Commit 2683e84 has been approved by steveklabnik

@mgattozzi
Contributor Author

Glad I could help!

sophiajt pushed a commit to sophiajt/rust that referenced this pull request Aug 8, 2016
Update HashMap docs regarding DoS protection
steveklabnik added a commit to steveklabnik/rust that referenced this pull request Aug 10, 2016
Update HashMap docs regarding DoS protection
bors added a commit that referenced this pull request Aug 10, 2016
@sfackler
Member

How was our HashMap "guaranteed" to be HashDoS resistant before? Did we ever make that guarantee?

@steveklabnik
Member

@sfackler through the algorithm choice. That hasn't changed, but the source of random numbers can be weaker.

@sfackler
Member

How did our algorithm choice guarantee HashDoS resistance? What does it even mean to "guarantee resistance"? The change made was to fall back to a random source we already use on many systems. If we guaranteed resistance before, I don't see how that is no longer a guarantee. Or was it only "really" guaranteed on Linux 3.17+?

Could you clarify what specifically was incorrect about the old documentation?

bors merged commit 2683e84 into rust-lang:master Aug 10, 2016
@sfackler
Member

I am still interested in an answer to my questions, btw.

@steveklabnik
Member

@sfackler sorry, it got lost somehow. My bad.

How did our algorithm choice guarantee HashDoS resistance?

Normally, inserting into a HashMap is O(1), but when there's a collision, it can be worse. This depends on the hashing algorithm, but it can get out of hand. Like, O(N^2) out of hand. So the attack here requires two things:

  1. The attacker can control data that goes into your hashmap.
  2. The hash algorithm in use is one for which it's easy for someone to calculate values that hash to the same thing, i.e., generate a collision.

This means that you wrote your code expecting roughly O(1) amount of work, but you're instead doing O(N^2) amount of work. So the attacker overwhelms you with these collisions, causing the DoS.

"DoS resistance" here, then, is predicated on number two: how easy is it to generate collisions? "DoS-resistant hash algorithms" are then ones that don't have the degenerate algorithmic complexity when it comes to collisions.


Okay, so what's that have to do with Rust and HashMap? Well, SipHash is a "DDoS-resistant" hashing algorithm, but this property is only really guaranteed by the quality of the random numbers fed into it. If we can't guarantee that they're good enough, then SipHash can't guarantee that collisions are tough to figure out.
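
To see the seed dependence concretely, here is a small sketch; the keys are arbitrary placeholders, and `sip_hash` is a hypothetical helper around the standard library's `SipHasher` (deprecated in later Rust versions):

```rust
#![allow(deprecated)] // SipHasher is deprecated in later Rust versions

use std::hash::{Hash, Hasher, SipHasher};

// Hypothetical helper: hash one value under a given 128-bit SipHash key.
fn sip_hash(k0: u64, k1: u64, value: &str) -> u64 {
    let mut hasher = SipHasher::new_with_keys(k0, k1);
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // The same input hashes differently under different seeds, so an
    // attacker who cannot predict the seed cannot precompute colliding
    // keys. If the seed is guessable, that defense evaporates.
    println!("{:016x}", sip_hash(1, 2, "hello"));
    println!("{:016x}", sip_hash(3, 4, "hello"));
}
```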

So, you are right that it's always dependent on where those random numbers came from, and maybe we were overstepping our bounds before. This change is just the one that brought the issue far enough to the forefront for us to hedge a bit in the docs.

Does that all make sense?

@steveklabnik
Member

(@gankro points out that O(N^2) is the cost of building a whole table of N elements; inserting a single element is O(N). The general point stands, though.)

@matDobek

( not a member, but this is a good read 👍 )

@sfackler
Member

sfackler commented Aug 17, 2016

So I think I have two core issues here:

I think the original docs hedged just fine - they even explicitly said that we make no guarantees about the quality of the random seed! I would probably weaken the guarantee that we use the highest-quality RNG to say that we make a "reasonable best effort" to choose the highest-quality RNG. This has nothing to do with the recent change, though, but rather with the fact that we don't want to guarantee that we'll be able to immediately jump on new RNG APIs the second they're introduced on all of the platforms we support.

The new docs make the picture less clear by pulling in two concepts that really shouldn't be involved at all. First, they say that we use a "slow" hash function. If we're going to talk about the speed of the default hash function, it really needs to be elaborated on in a way that discusses its performance relative to other hash functions in different use cases (e.g. I believe SipHash is very competitive for medium-to-long strings) and documents when it is appropriate to switch to a different hash function and how to do so. If it's mentioned offhand with no context, people will just come away with the impression that they shouldn't use HashMap because "it's slow".
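
As one sketch of the "when and how to switch" that this comment asks for (it assumes the third-party `fnv` crate, used here only as an example of a fast, non-keyed hasher; this is not the documentation being proposed):

```rust
use std::collections::HashMap;

// Assumes the third-party `fnv` crate, shown as one example of a fast,
// non-keyed hasher with no HashDoS resistance.
use fnv::FnvBuildHasher;

fn main() {
    // Keys are trusted and small: a fast non-resistant hasher is a
    // reasonable trade-off.
    let mut counters: HashMap<u32, u64, FnvBuildHasher> = HashMap::default();
    counters.insert(7, 1);

    // Keys are attacker-controlled (e.g. query parameters): keep the
    // default randomly seeded SipHash.
    let mut sessions: HashMap<String, u64> = HashMap::new();
    sessions.insert("session-token".to_string(), 1);
}
```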

Secondly, it talks about "truly random numbers", which is maybe a meaningful concept in philosophy, but certainly not in this context. I would not describe the data you get out of any RNG we're going to use as "truly random" (whatever that even means). All that matters is that it is sufficiently difficult (i.e. practically impossible, as far as we know) to predict what will come out of the RNG based on what has already come out; in this context, that means an attacker cannot guess the random key we select.

@sfackler
Member

To be more clear about the randomness point: the case in which we fall back to /dev/urandom is when some heuristic decides that the kernel RNG doesn't have enough entropy. It's probably right, in that the RNG is weaker than it could be, but "true randomness" is not involved in any way.
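
A minimal sketch of the fallback path being described, assuming a Unix system; the real standard-library logic is more involved, and `urandom_u64` is a hypothetical helper:

```rust
use std::fs::File;
use std::io::Read;

// Hypothetical helper, not the real std code path: read 8 bytes from
// the kernel RNG via /dev/urandom, the fallback source discussed above.
fn urandom_u64() -> std::io::Result<u64> {
    let mut buf = [0u8; 8];
    File::open("/dev/urandom")?.read_exact(&mut buf)?;
    Ok(buf.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b)))
}

fn main() {
    // What matters for the hasher seed is that this output is
    // unpredictable to an attacker, not that it is "truly random".
    println!("seed: {:016x}", urandom_u64().expect("failed to read /dev/urandom"));
}
```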

@steveklabnik
Member

@sfackler I would be happy to take more PRs to fix this up even more, for sure.

bors added a commit that referenced this pull request Sep 30, 2016
Clean up hasher discussion on HashMap

* We never want to make guarantees about protecting against attacks.
* "True randomness" is not the right terminology to be using in this
    context.
* There is significantly more nuance to the performance of SipHash than
    "somewhat slow".

r? @steveklabnik

Follow up to discussion on #35371