Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hash/maphash, runtime: Fallback versions of the hash function are likely vulnerable to hash-flooding DoS #66841

Closed
sugawarayuuta opened this issue Apr 15, 2024 · 10 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@sugawarayuuta
Copy link

Go version

go version go1.22.2 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/ys/.cache/go-build'
GOENV='/home/ys/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/ys/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/ys/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/snap/go/10585'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/snap/go/10585/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.22.2'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/ys/dos/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build2137305734=/tmp/go-build -gno-record-gcc-switches'

What did you do?

Consider these examples:

package main

import (
	"encoding/binary"
	"fmt"
	"hash/maphash"
	"math/rand"
)

func main() {
	// Note: only works on -tags purego.
	seed := maphash.MakeSeed()
	for n := 0; n < 10; n++ {
		var buf [16]byte
		binary.LittleEndian.PutUint32(buf[:], 0xe7037ed1)      // Upper bits of constant m2.
		binary.LittleEndian.PutUint32(buf[4:], rand.Uint32())  // Arbitrary bits.
		binary.LittleEndian.PutUint32(buf[8:], 0xa0b428db)     // Lower bits of constant m2.
		binary.LittleEndian.PutUint32(buf[12:], rand.Uint32()) // Arbitrary bits.
		fmt.Printf("%x\n", buf[:])
		fmt.Printf("%016x\n", maphash.Bytes(seed, buf[:]))
	}
}
package main

import (
	"encoding/binary"
	"math/rand"
	"testing"

	_ "unsafe" // Only for the linkname.
)

const (
	width = 1 << 10
)

var (
	sink bool
)

var (
	//go:linkname useAeshash runtime.useAeshash
	useAeshash bool
)

func BenchmarkMap(b *testing.B) {
	useAeshash = false
	tab := make(map[string]bool, width)
	var keys [width]string
	for n := range width {
		var buf [16]byte
		binary.LittleEndian.PutUint64(buf[:], 0xe7037ed1a0b428db)
		binary.LittleEndian.PutUint64(buf[8:], rand.Uint64())
		key := string(buf[:])
		keys[n] = key
		tab[key] = true
	}
	b.ResetTimer()
	b.ReportAllocs()
	for n := range b.N {
		sink = sink != tab[keys[n%width]]
	}
}

What did you see happen?

If you run the first program with go run -tags purego ., you will see all the values hashed into 0.
If you run the second program like go test -bench . -count 10, it will report ~2000 ns/op, while it will report ~18 ns/op on other cases.

What did you expect to see?

While machines without AES or hash/maphash with purego tag are not so common, I think it's important to prevent issues like this.

The cause is that it relies on simple multiplications, which means we can use the XORed constant as inputs to cancel the effect, essentially making one of the operand 0, which in turn makes the whole product 0. I don't really have good fixes to make without losing much speed.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Apr 15, 2024
@ianlancetaylor
Copy link
Contributor

CC @golang/runtime @randall77

@cherrymui cherrymui added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Apr 15, 2024
@cherrymui cherrymui added this to the Backlog milestone Apr 15, 2024
@randall77
Copy link
Contributor

There are a bunch of discussions in #9365 about this. TL;DR we're not really providing a hash-flooding DoS guarantee, but we should probably mitigate easy seed-independent collisions (like this one is).

I also said:

I intend to add some startup randomness to the new hash functions once they are in. It should only require a judicious xor or two to add that in.

Which I haven't done. Maybe we should do that.

@randall77
Copy link
Contributor

Which I haven't done. Maybe we should do that.

Maybe I did do that? And then it was undone by https://go-review.googlesource.com/c/go/+/279612

@randall77
Copy link
Contributor

@randall77
Copy link
Contributor

@mengzhuo

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/579115 mentions this issue: runtime: make it harder to find collisions in the 64-bit fallback hash

@mengzhuo
Copy link
Contributor

mengzhuo commented Apr 16, 2024

Good finding!
Since original wyhash isn't crypto-safed hash algorithm, I think process-wide seeding is good enough to resolve hashmap collision attack as Keith's CL do.

@sugawarayuuta
Copy link
Author

@randall77 Thank you for the quick CL, but I'm wondering if this is a proper fix to this issue (I'm assuming it's OK to ping, do tell me otherwise).
I downloaded your patch and it definitely prevents the exact snippet I posted, but I believe there is a workaround. For example consider this example.

package main

import (
	"encoding/binary"
	"math/rand"
	"testing"
)

const (
	width = 1 << 10
)

var (
	sink bool
)

func BenchmarkMap(b *testing.B) {
	tab := make(map[string]bool, width)
	var keys [width]string
	for n := range width {
		var buf [48 + 16]byte
		// Must use this random 8 bytes on both buf[24:] and buf[40:].
		random := rand.Uint64()
		binary.LittleEndian.PutUint64(buf[16:], 0x8ebc6af09c88c6e3) // constant m3.
		binary.LittleEndian.PutUint64(buf[24:], random)
		binary.LittleEndian.PutUint64(buf[32:], 0x589965cc75374cc3) // constant m4.
		binary.LittleEndian.PutUint64(buf[40:], random)
		key := string(buf[:])
		keys[n] = key
		tab[key] = true
	}
	b.ResetTimer()
	b.ReportAllocs()
	for n := range b.N {
		sink = sink != tab[keys[n%width]]
	}
}

(Note: I didn't realize there was a GODEBUG env variable, so to reproduce, use a command like GODEBUG="cpu.aes=off" go test -bench . -count 10 instead.)
This will still report ~2000 ns/op, again. The cause is this part:

	seed1 := seed
	seed2 := seed
	for ; l > 48; l -= 48 {
		seed = mix(r8(p)^m2, r8(add(p, 8))^seed)
		seed1 = mix(r8(add(p, 16))^m3, r8(add(p, 24))^seed1)
		seed2 = mix(r8(add(p, 32))^m4, r8(add(p, 40))^seed2)
		p = add(p, 48)
	}
	seed ^= seed1 ^ seed2

Since it's doing seed1 ^ seed2, it's possible to collide the result, as long as seed1 and seed2 evaluates to the same value, without knowing secrets like seeds or hashkeys.
In this program, I made it so that the m3/m4 at the left side cancels, leaving the value just hashkey[1], and the r8(add(p, 24))^seed1 and r8(add(p, 40))^seed2 at the right side become the same value.

@mengzhuo
Copy link
Contributor

FYI it is based on

https://github.com/wangyi-fudan/wyhash/blob/408620b6d12b7d667b3dd6ae39b7929a39e8fa05/wyhash.h#L128-L135

      uint64_t see1=seed, see2=seed;
      do{
        seed=_wymix(_wyr8(p)^secret[1],_wyr8(p+8)^seed);
        see1=_wymix(_wyr8(p+16)^secret[2],_wyr8(p+24)^see1);
        see2=_wymix(_wyr8(p+32)^secret[3],_wyr8(p+40)^see2);
        p+=48; i-=48;
      }while(_likely_(i>=48));
      seed^=see1^see2;

Maybe we should replace hard-wired secret(primes) XORed process-wide seed?

	seed1 := seed
	seed2 := seed
	for ; l > 48; l -= 48 {
		seed = mix(r8(p)^m2^hashkey[1], r8(add(p, 8))^seed)
		seed1 = mix(r8(add(p, 16))^m3^hashkey[2], r8(add(p, 24))^seed1)
		seed2 = mix(r8(add(p, 32))^m4^hashkey[3], r8(add(p, 40))^seed2)
		p = add(p, 48)
	}
	seed ^= seed1 ^ seed2

@randall77
Copy link
Contributor

Yeah, the long-key case can still get key-independent collisions.
Your fix sounds fine to me, xoring different process seeds into the different streams.
At that point, not sure what the point of ^mX even is. a ^ c ^ secret is completely equivalent to a ^ secret.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants