Skip to content

cmd/roachtest: remove the use of runtime.Goexit#50703

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
petermattis:pmattis/roachtest-goexit
Jun 26, 2020
Merged

cmd/roachtest: remove the use of runtime.Goexit#50703
craig[bot] merged 1 commit intocockroachdb:masterfrom
petermattis:pmattis/roachtest-goexit

Conversation

@petermattis
Copy link
Copy Markdown
Collaborator

@petermattis petermattis commented Jun 26, 2020

We were abusing runtime.Goexit to kill a goroutine when test.Fatal
was called, but then transforming then aborting the resulting panic into
a panic(errGoexit). This abuse of runtime.Goexit was broken by
go1.14 which makes the panic generated by Goexit truly
unrecoverable (previously it was only documented as so). Now we simply
panic(errGoexit) from the get-go. This opens up the possibility that a
roachtest may recover from this panic, but no current roachtests call
recover().

See #48737

Release note: None

@petermattis petermattis requested review from bdarnell, otan and tbg June 26, 2020 16:22
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@petermattis petermattis force-pushed the pmattis/roachtest-goexit branch from 7fca96d to b2fedf2 Compare June 26, 2020 16:23
Copy link
Copy Markdown
Contributor

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message in errGoexit is now a little misleading.

Reviewed 3 of 4 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @otan, @petermattis, and @tbg)


pkg/cmd/roachtest/test_runner.go, line 596 at r1 (raw file):

		t.end = timeutil.Now()

		if err := recover(); err != nil && err != errGoexit {

If errGoexit reaches this point (I don't think it should), shouldn't we still treat it as a failed test?


pkg/cmd/roachtest/test_runner.go, line 747 at r1 (raw file):

		// This is the call to actually run the test.
		defer func() {
			if r := recover(); r != nil && r != errGoexit {

Ditto.

Copy link
Copy Markdown
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair, we weren't strictly abusing runtime.Goexit but rather trying to stay true to *testing.T. (There's no question that then suppressing the Goexit in some goroutines is a hack). I wish *testing.T would also stop using Goexit, but that won't ever happen.

Comment thread pkg/cmd/roachtest/cluster_test.go Outdated
panic(errGoexit)
return errors.New("unreachable")
})
expectedErr := regexp.QuoteMeta(`Goexit() was called`)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This message is now misleading

@tbg
Copy link
Copy Markdown
Member

tbg commented Jun 26, 2020

Just for my sanity, can you run a local roachtest that you've changed to t.Fatal (on the main goroutine) and make sure that the test failure will propagate properly?

@tbg tbg requested a review from bdarnell June 26, 2020 16:32
Copy link
Copy Markdown
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @otan, and @petermattis)


pkg/cmd/roachtest/test_runner.go, line 596 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

If errGoexit reaches this point (I don't think it should), shouldn't we still treat it as a failed test?

If errGoexit reaches this point, t.mu.failed is already true. But I had to look at this as well to figure this out. Worthy of a comment.

@petermattis petermattis force-pushed the pmattis/roachtest-goexit branch 2 times, most recently from bbc5272 to 9625f7d Compare June 26, 2020 16:38
Copy link
Copy Markdown
Collaborator Author

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @otan, and @tbg)


pkg/cmd/roachtest/cluster_test.go, line 248 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

This message is now misleading

Heh, that didn't bother me, but since you noticed I renamed errGoexit to errTestFatal and changed the message.


pkg/cmd/roachtest/test_runner.go, line 596 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

If errGoexit reaches this point, t.mu.failed is already true. But I had to look at this as well to figure this out. Worthy of a comment.

Ditto what @tbg said. I'll add a comment.


pkg/cmd/roachtest/test_runner.go, line 747 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Ditto.

Done.

@petermattis
Copy link
Copy Markdown
Collaborator Author

Just for my sanity, can you run a local roachtest that you've changed to t.Fatal (on the main goroutine) and make sure that the test failure will propagate properly?

Sure. I added a t.Fatalf("testing failure") to the beginning of the kv* roachtests. Running it locally produced:

...
--- FAIL: kv0/enc=false/nodes=1 (0.70s)
test artifacts and logs in: artifacts/kv0/enc=false/nodes=1/run_1
        kv.go:57,kv.go:185,test_runner.go:757: testing failure
...

Taking the same change and running it on master produced identical looking behavior and output.

We were abusing `runtime.Goexit` to kill a goroutine when `test.Fatal`
was called, but then transforming then aborting the resulting panic into
a `panic(errGoexit)`. This abuse of `runtime.Goexit` was broken by
go1.14 which makes the panic generated by `Goexit` truly
unrecoverable (previously it was only documented as so). Now we simply
`panic(errTestFatal)` from the get-go. This opens up the possibility
that a roachtest may recover from this panic, but no current roachtests
call `recover()`.

See cockroachdb#48737

Release note: None
@petermattis petermattis force-pushed the pmattis/roachtest-goexit branch from 9625f7d to 9fc7c73 Compare June 26, 2020 17:33
@petermattis
Copy link
Copy Markdown
Collaborator Author

TFTR! CI finally passed.

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig Bot commented Jun 26, 2020

Build succeeded

@craig craig Bot merged commit 438589d into cockroachdb:master Jun 26, 2020
@petermattis petermattis deleted the pmattis/roachtest-goexit branch June 26, 2020 21:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants