-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
math/rand: tests are too weak due to misuse of statsResults.maxError #21211
Comments
Change https://golang.org/cl/51891 mentions this issue: |
Shuffle uses the Fisher-Yates algorithm. Since this is new API, it affords us the opportunity to use a much faster Int31n implementation that mostly avoids division. As a result, BenchmarkPerm30ViaShuffle is about 30% faster than BenchmarkPerm30, despite requiring a separate initialization loop and using function calls to swap elements. Fixes golang#20480 Updates golang#16213 Updates golang#21211 Change-Id: Ib8956c4bebed9d84f193eb98282ec16ee7c2b2d5
This is even worse than I thought. The existing code reads: func (this *statsResults) checkSimilarDistribution(expected *statsResults) error {
if !nearEqual(this.mean, expected.mean, expected.closeEnough, expected.maxError) {
s := fmt.Sprintf("mean %v != %v (allowed error %v, %v)", this.mean, expected.mean, expected.closeEnough, expected.maxError)
fmt.Println(s)
return errors.New(s)
}
if !nearEqual(this.stddev, expected.stddev, 0, expected.maxError) {
s := fmt.Sprintf("stddev %v != %v (allowed error %v, %v)", this.stddev, expected.stddev, expected.closeEnough, expected.maxError)
fmt.Println(s)
return errors.New(s)
}
return nil
} Note that when checking stddev, closeEnough is set to 0 in the call to nearEqual. This means that all the work is being done by expected.maxError, which is generally too large. Also, there should probably be separate values for closeEnough and maxError for mean and stddev, given that they are generally of very different scales. |
Change https://golang.org/cl/55972 mentions this issue: |
Shuffle uses the Fisher-Yates algorithm. Since this is new API, it affords us the opportunity to use a much faster Int31n implementation that mostly avoids division. As a result, BenchmarkPerm30ViaShuffle is about 30% faster than BenchmarkPerm30, despite requiring a separate initialization loop and using function calls to swap elements. Fixes #20480 Updates #16213 Updates #21211 Change-Id: Ib8956c4bebed9d84f193eb98282ec16ee7c2b2d5 Reviewed-on: https://go-review.googlesource.com/51891 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>
Perm and Shuffle are fundamentally doing the same work. This change makes Perm's algorithm match Shuffle's. In addition to allowing developers to switch more easily between the two methods, it affords a nice speed-up: name old time/op new time/op delta Perm3-8 75.7ns ± 1% 51.8ns ± 1% -31.59% (p=0.000 n=9+8) Perm30-8 610ns ± 1% 405ns ± 1% -33.67% (p=0.000 n=9+9) This change alters the output from Perm, given the same Source and seed. This is a change from Go 1.0 behavior. This necessitates updating the regression test. This also changes the number of calls made to the Source during Perm, which changes the output of the math/rand examples. This also slightly perturbs the output of Perm, nudging it out of the range currently accepted by TestUniformFactorial. However, it is complete unclear that the helpers relied on by TestUniformFactorial are correct. That is #21211. This change updates checkSimilarDistribution to respect closeEnough for standard deviations, which makes the test pass. The whole situation is muddy; see #21211 for details. There is an alternative implementation of Perm that avoids initializing m, which is more similar to the existing implementation, plus some optimizations: func (r *Rand) Perm(n int) []int { m := make([]int, n) max31 := n if n > 1<<31-1-1 { max31 = 1<<31 - 1 - 1 } i := 1 for ; i < max31; i++ { j := r.int31n(int32(i + 1)) m[i] = m[j] m[j] = i } for ; i < n; i++ { j := r.Int63n(int64(i + 1)) m[i] = m[j] m[j] = i } return m } This is a tiny bit faster than the implementation actually used in this change: name old time/op new time/op delta Perm3-8 51.8ns ± 1% 50.3ns ± 1% -2.83% (p=0.000 n=8+9) Perm30-8 405ns ± 1% 394ns ± 1% -2.66% (p=0.000 n=9+8) However, 3% in performance doesn't seem worth having the two algorithms diverge, nor the reduced readability of this alternative. Updates #16213. Change-Id: I11a7441ff8837ee9c241b4c88f7aa905348be781 Reviewed-on: https://go-review.googlesource.com/55972 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rob Pike <r@golang.org>
Shuffle uses the Fisher-Yates algorithm. Since this is new API, it affords us the opportunity to use a much faster Int31n implementation that mostly avoids division. As a result, BenchmarkPerm30ViaShuffle is about 30% faster than BenchmarkPerm30, despite requiring a separate initialization loop and using function calls to swap elements. Fixes golang#20480 Updates golang#16213 Updates golang#21211 Change-Id: Ib8956c4bebed9d84f193eb98282ec16ee7c2b2d5
Perm and Shuffle are fundamentally doing the same work. This change makes Perm's algorithm match Shuffle's. In addition to allowing developers to switch more easily between the two methods, it affords a nice speed-up: name old time/op new time/op delta Perm3-8 75.7ns ± 1% 51.8ns ± 1% -31.59% (p=0.000 n=9+8) Perm30-8 610ns ± 1% 405ns ± 1% -33.67% (p=0.000 n=9+9) This change alters the output from Perm, given the same Source and seed. This is a change from Go 1.0 behavior. This necessitates updating regress test. This also changes the number of calls made to the Source during Perm, which changes the output of the math/rand examples. This also slightly perturbs the output of Perm, nudging it out of the range currently accepted by TestUniformFactorial. However, it is complete unclear that the helpers relied on by TestUniformFactorial are correct. That is golang#21211. This change updates checkSimilarDistribution to respect closeEnough for standard deviations, which makes the test pass. The whole situation is muddy; see golang#21211 for details. There is an alternative implementation of Perm that avoids initializing m, which is more similar to the existing implementation, plus some optimizations: func (r *Rand) Perm(n int) []int { m := make([]int, n) max31 := n if n > 1<<31-1-1 { max31 = 1<<31 - 1 - 1 } i := 1 for ; i < max31; i++ { j := r.int31n(int32(i + 1)) m[i] = m[j] m[j] = i } for ; i < n; i++ { j := r.Int63n(int64(i + 1)) m[i] = m[j] m[j] = i } return m } This is a tiny bit faster than the implementation actually used in this change: name old time/op new time/op delta Perm3-8 51.8ns ± 1% 50.3ns ± 1% -2.83% (p=0.000 n=8+9) Perm30-8 405ns ± 1% 394ns ± 1% -2.66% (p=0.000 n=9+8) However, 3% in performance doesn't seem worth having the two algorithms diverge, nor the reduced readability of this alternative. Updates golang#16213. Change-Id: I11a7441ff8837ee9c241b4c88f7aa905348be781
The math/rand tests use the function
nearEqual
to compare floats:A bit of thought suggests that
maxError
should always be pretty small, if this function is ever to return false. However, maxError is usually scaled everywhere it is used (e.g. stddev * 0.08), and in practice it frequently ends up >> 2.As a result, many of the math/rand tests have no teeth. An extreme example of this is TestReadUniformity, which requires (among other things) that the mean of just 2 randomly selected bytes must be very near 255.0/2, a test which should almost always fail, but which currently succeeds, due to a large value of maxError. (Related: CL 51310.)
Replacing maxError with a constant
0.08
causes most math/rand tests to pass, except TestReadUniformity when called with small n (in which case it should fail).However, I don't have enough domain expertise to know whether this is reasonable or whether an alternative bound would be better. Advice requested.
golang.org/s/owners lists no one for math/rand, so I'll guess and cc: @rsc @aclements @robpike
The text was updated successfully, but these errors were encountered: