Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: TestSmhasherWindowed is still flaky on linux-386-longtest #43130

Closed
bcmills opened this issue Dec 11, 2020 · 17 comments
Closed

runtime: TestSmhasherWindowed is still flaky on linux-386-longtest #43130

bcmills opened this issue Dec 11, 2020 · 17 comments
Labels
NeedsInvestigation release-blocker Testing
Milestone

Comments

@bcmills
Copy link
Member

@bcmills bcmills commented Dec 11, 2020

2020-12-10T22:58:49-6d3d3fb/linux-386-longtest

##### GOMAXPROCS=2 runtime -cpu=1,2,4 -quick
--- FAIL: TestSmhasherWindowed (27.87s)
    hash_test.go:566: 32 bit keys
    hash_test.go:161: unexpected number of collisions: got=512 mean=0.499992 stddev=0.707101 threshold=156.565200
    hash_test.go:568: 64 bit keys
    hash_test.go:570: string keys
--- FAIL: TestSmhasherWindowed (26.69s)
    hash_test.go:566: 32 bit keys
    hash_test.go:161: unexpected number of collisions: got=512 mean=0.499992 stddev=0.707101 threshold=156.565200
    hash_test.go:568: 64 bit keys
    hash_test.go:570: string keys
--- FAIL: TestSmhasherWindowed (26.69s)
    hash_test.go:566: 32 bit keys
    hash_test.go:161: unexpected number of collisions: got=512 mean=0.499992 stddev=0.707101 threshold=156.565200
    hash_test.go:568: 64 bit keys
    hash_test.go:570: string keys
FAIL
FAIL	runtime	598.996s

2020-10-26T21:47:49-36c5edd/linux-386-longtest

--- FAIL: TestSmhasherWindowed (26.23s)
    hash_test.go:566: 32 bit keys
    hash_test.go:161: unexpected number of collisions: got=512 mean=0.499992 stddev=0.707101 threshold=156.565200
    hash_test.go:568: 64 bit keys
    hash_test.go:161: unexpected number of collisions: got=512 mean=0.499992 stddev=0.707101 threshold=156.565200
    hash_test.go:570: string keys
FAIL
FAIL	runtime	169.624s

See previously #39352.

I would guess that it's flaky on 32-bit ARM too, but we don't have a longtest builder for that configuration.

CC @randall77

@bcmills bcmills added Testing NeedsInvestigation labels Dec 11, 2020
@bcmills bcmills added this to the Backlog milestone Dec 11, 2020
@bcmills
Copy link
Member Author

@bcmills bcmills commented Jan 25, 2021

2021-01-22T23:03:58-25c39e4/linux-386-longtest

--- FAIL: TestSmhasherWindowed (26.54s)
    hash_test.go:566: 32 bit keys
    hash_test.go:568: 64 bit keys
    hash_test.go:570: string keys
    hash_test.go:161: unexpected number of collisions: got=514 mean=0.499992 stddev=0.707101 threshold=156.565200
    hash_test.go:161: unexpected number of collisions: got=256 mean=0.499992 stddev=0.707101 threshold=156.565200
FAIL
FAIL	runtime	182.398s

@gopherbot
Copy link

@gopherbot gopherbot commented Mar 5, 2021

Change https://golang.org/cl/280372 mentions this issue: runtime: using wyhash for memhashFallback on 32bit platform

gopherbot pushed a commit that referenced this issue Mar 16, 2021
wyhash is a general hash function that:

1. Default hash function of Zig, Nim
2. Passed Smhasher, BigCrush and PractRand
3. Less code
4. 3~26% faster than internal hashmap

name                  old time/op    new time/op    delta
Hash5                   67.8ns ± 0%    65.4ns ± 0%   -3.45%  (p=0.000 n=7+10)
Hash16                  82.5ns ± 0%    74.2ns ± 0%  -10.12%  (p=0.000 n=6+8)
Hash64                   121ns ± 0%     102ns ± 0%  -15.82%  (p=0.000 n=7+10)
Hash1024                1.13µs ± 0%    0.89µs ± 0%  -20.58%  (p=0.000 n=10+9)
Hash65536               68.9µs ± 0%    54.4µs ± 0%  -21.04%  (p=0.000 n=10+7)
HashStringSpeed          103ns ± 2%      93ns ± 3%  -10.24%  (p=0.000 n=9+10)
HashBytesSpeed           191ns ± 2%     180ns ± 1%   -5.40%  (p=0.000 n=10+8)
HashInt32Speed          59.0ns ± 2%    59.1ns ± 1%     ~     (p=0.655 n=9+8)
HashInt64Speed          72.7ns ± 3%    66.1ns ± 5%   -9.04%  (p=0.000 n=10+10)
HashStringArraySpeed     270ns ± 1%     222ns ± 2%  -17.91%  (p=0.000 n=10+10)
FastrandHashiter         108ns ± 0%     109ns ± 1%   +0.96%  (p=0.002 n=10+10)

name                  old speed      new speed      delta
Hash5                 73.8MB/s ± 0%  76.4MB/s ± 0%   +3.58%  (p=0.000 n=7+10)
Hash16                 194MB/s ± 0%   216MB/s ± 0%  +11.25%  (p=0.000 n=10+8)
Hash64                 530MB/s ± 0%   630MB/s ± 0%  +18.74%  (p=0.000 n=8+10)
Hash1024               910MB/s ± 0%  1145MB/s ± 0%  +25.88%  (p=0.000 n=10+9)
Hash65536              951MB/s ± 0%  1204MB/s ± 0%  +26.64%  (p=0.000 n=10+7)

Update #43130

Change-Id: Id00c54b116a2411fcf675e95896fffb85f0e25b6
Reviewed-on: https://go-review.googlesource.com/c/go/+/280372
Trust: Meng Zhuo <mzh@golangcn.org>
Run-TryBot: Meng Zhuo <mzh@golangcn.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
@mengzhuo
Copy link
Contributor

@mengzhuo mengzhuo commented Jan 6, 2022

@bcmills Unfortunately even after 483533d is merged It still failed.

Good news is it failed at the rate 1 failure per month and 256 collisions instead of 512.

I'll check whether my implement same with original wyhash :-(

@mengzhuo
Copy link
Contributor

@mengzhuo mengzhuo commented Jan 7, 2022

Updates: I've run 32 bits SmhasherWindow for 4770 times (~24h) without failures...

@bcmills
Copy link
Member Author

@bcmills bcmills commented Jan 20, 2022

Indeed, still failing about once a month (and that's including the lower rate of test runs due to the code freeze). The more recent failures are back to 512 collisions each.

greplogs --dashboard -md -l -e '(?ms)FAIL: TestSmhasherWindowed.*unexpected number of collisions' --since=2021-09-24

2022-01-19T20:54:49-1efc581/linux-386-longtest
2021-12-19T20:16:45-87b2a54/linux-386-longtest
2021-11-15T21:22:17-e08aae2/linux-386-longtest
2021-10-23T12:44:47-b0f7eb6/linux-386-longtest

@mengzhuo
Copy link
Contributor

@mengzhuo mengzhuo commented Jan 20, 2022

Indeed, still failing about once a month (and that's including the lower rate of test runs due to the code freeze). The more recent failures are back to 512 collisions each.

Recently a new version of Smhasher is released with the "bad seed" testcase, which wyhash is failed ( caused by multiply 0).
Maybe we can skip "bad seed" on 32 bit platforms and regain a "good seed" in the alginit procedure.

@bcmills
Copy link
Member Author

@bcmills bcmills commented Feb 3, 2022

@mengzhuo, at the very least would it be useful to have the test log the seed(s) that produced the unexpected collision rate? That would help to verify that this test failure is due to the expected bad seeds.

@bcmills bcmills removed this from the Backlog milestone Feb 3, 2022
@bcmills bcmills added this to the Go1.19 milestone Feb 3, 2022
@bcmills
Copy link
Member Author

@bcmills bcmills commented Feb 3, 2022

Marking as release-blocker for Go 1.19: linux/386 is a first-class port, but it's probably too late in the 1.18 cycle to fully fix this without introducing too much risk.

@mengzhuo
Copy link
Contributor

@mengzhuo mengzhuo commented Feb 8, 2022

My mistake: 386 using AES asm part do the hashing, I will take a look.

------Original comment-------------------------------
@randall77 Keith,

This issue can be reproduced within 6000 times run on 64 bit keys SmhasherWindow test.

After I deleted these 4 lines, no more flaky (12K runs):

go/src/runtime/alg.go

Lines 325 to 328 in 69e1711

hashkey[0] |= 1 // make sure these numbers are odd
hashkey[1] |= 1
hashkey[2] |= 1
hashkey[3] |= 1

I think the OR itself cancel the randomness of last bit (always 1)

But when I tried XOR

hashkey[0] ^= 1

there are flaky failures.

@gopherbot
Copy link

@gopherbot gopherbot commented Feb 9, 2022

Change https://go.dev/cl/384075 mentions this issue: runtime: AES maphash scramble 3 times on 386

@mengzhuo
Copy link
Contributor

@mengzhuo mengzhuo commented Feb 13, 2022

Unfortunatly, it still flaky within 1/40000 and the key and seed seems random enough to me.

    hash_test.go:119: aeskey 7c3fcff1e80c0a0a9a13ce82b08c0ed80192f93e919b702a728c7bee2b6be56b86146ea040eee7a85d17efc794474ef83465567cd588569ca13945feaec28df3
    hash_test.go:120: unexpected number of collisions: got=256 mean=0.499992 stddev=0.707101 threshold=156.565200

According to the smhasher, some hashing algorithm based on AES failed the test.
Is it the AES itself is not qualified as a hashing algorithm?

@bcmills
Copy link
Member Author

@bcmills bcmills commented Mar 30, 2022

greplogs --dashboard -md -l -e 'FAIL: TestSmhasherWindowed' --since=2022-02-08

2022-03-21T18:58:42-79103fa/linux-386-longtest

@bcmills bcmills reopened this Mar 30, 2022
@bcmills
Copy link
Member Author

@bcmills bcmills commented Apr 12, 2022

greplogs --dashboard -md -l -e 'FAIL: TestSmhasherWindowed' --since=2022-03-30

2022-04-11T22:03:44-0605bf6/linux-386-longtest

@bcmills
Copy link
Member Author

@bcmills bcmills commented Apr 22, 2022

greplogs --dashboard -md -l -e 'FAIL: TestSmhasherWindowed' --since=2022-04-12

2022-04-21T23:45:36-67d6be1/linux-386-longtest

@bcmills
Copy link
Member Author

@bcmills bcmills commented Apr 22, 2022

@golang/runtime: this is a recurring failure on linux/386, which is a first class port. Could someone take a look at this and either fix it (if feasible) or add a test skip (if this turns out not to be a priority)?

@gopherbot
Copy link

@gopherbot gopherbot commented Apr 26, 2022

Change https://go.dev/cl/402456 mentions this issue: runtime: disable windowed Smhasher test on 32-bit systems

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation release-blocker Testing
Projects
None yet
Development

No branches or pull requests

3 participants