-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
testing: Feature Request: Better flaky test support #27181
Comments
And to be clear, flaky tests are bad and they should be fixed ideally. However, flaky tests do happen from time to time and feature like this is important for productivity and velocity reasons. And here is the use case I have in mind:
Another approach is to completely disable flaky tests in CI but it clearly reduces test coverage, and these tests can be further broken before they can be restored. CI can be more resilient to flaky tests and developers won't get blocked just because a test starts to flake. This is also why test retry should be noted so that people can take action to address them as well. |
I disagree pretty strongly. The point of a test is to detect an error. A flaky test is worthless, because you don't learn anything when it fails. It just wastes time. If a test is potentially flaky, then it should be rewritten. A typical simple rewrite is to loop enough times to make it extremely unlikely that all iterations will fail for flaky reasons. With that attitude, I think the only support the testing package needs for flaky tests is |
As mentioned above, t.Skip() (i.e. disable test) is an alternative, but it reduces coverage. The related feature can be broken by other changes and it will be more difficult to reenable the test. A flaky test is bad, but it is not always worthless. As the project and number of developers grow, a 10% flaky test affects more and more developers, but the fact that it passes 90% of the time is still a valuable signal. I totally agree that all flaky tests need to be fixed/addressed, but until then, we also want minimize the impact to the 10% developers. Correct me if I am wrong, but I think the simple rewrite suggestion won't work in case TestMain() is involved. A test may require setup/cleanup steps in TestMain() that are not available in the test method itself. |
Yes, if If your goal for "things are OK" is that your test passes 90% of the time, then you should write your test that way. Don't write it so that |
No I am not saying tests passing 90% is OK. Having flaky tests is not an end state. Over the course of development, some tests are going to become flaky for various reasons. The end goal is to fix the cause of flakiness and make the tests not flaky, but until then, we are hoping to minimize impact to developers (i.e. presubmit failing) without having to remove the test. |
Just as the language does not permit unused imports or unused variables, I don't see a compelling reason that the testing package should provide additional mechanisms for permitting flaky tests. |
I don't see why this couldn't be solved in a third party package. func TestThingThatMayFlake(t *testing.T) {
otherpkg.RerunOnFlake(t, func(t testing.TB) {
if err := ThingThatMayFlake(); err != nil {
t.Fatal(err)
}
})
} |
I am in agreement with @ianlancetaylor. I don't believe we should be adding additional functionality to the go tool that placates (and inherently encourages) flaky tests. |
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (
go version
)?go version go1.10 darwin/amd64.
Does this issue reproduce with the latest release?
N/A
What operating system and processor architecture are you using (
go env
)?go version go1.10 darwin/amd64
hklai-macbookpro2:e2e hklai$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/hklai/Library/Caches/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/hklai/go"
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/3q/dk8r61l944q42b8g7d2bq57r005d_5/T/go-build136728580=/tmp/go-build -gno-record-gcc-switches -fno-common"
What did you do?
N/A
What did you expect to see?
Better flaky test support, as outlined in the following areas:
Provide a way to annotate flaky tests. For example, there can be a flaky() method in the testing package, and go developers can add a line like "testing.flaky()" in test methods to indicate the tests are flaky. Obviously, this can be done using other mechanisms as well but this needs to be express-able in the test method level (i.e. build tag will not work).
"go test" retries flaky tests natively, if the tests are marked as flaky.
Test options to control the above retry behavior (i.e. number of retries until test passes, delay between retry, etc).
Test retry should work with the -json option, such that downstream processing can determine if a test had been retried, and/or how many times a test is retired.
I understand bazel provides some form of flaky test support, but it does not work well for us because:
In absence of the above, our only option is to implement a wrapper program that calls "go test", and then run "go test" again on the failed tests that are known to be flaky.
What did you see instead?
N/A
The text was updated successfully, but these errors were encountered: