New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest/bank: don't start chaos until nodes are serving SQL #30989

Merged
merged 3 commits into from Oct 15, 2018

Conversation

Projects
None yet
5 participants
@nvanbenschoten
Member

nvanbenschoten commented Oct 4, 2018

Informs #30064.

Release note: None

@nvanbenschoten nvanbenschoten requested a review from petermattis Oct 4, 2018

@nvanbenschoten

This comment has been minimized.

Show comment
Hide comment
@nvanbenschoten

nvanbenschoten Oct 4, 2018

Member

Please hold off on this. I thought it would fix the second class of failures we saw in #30064, but I just had a repro so it doesn't seem like it did.

Member

nvanbenschoten commented Oct 4, 2018

Please hold off on this. I thought it would fix the second class of failures we saw in #30064, but I just had a repro so it doesn't seem like it did.

@tschottdorf

I'm surprised why you didn't change the break on l156 into a return.

@nvanbenschoten

This comment has been minimized.

Show comment
Hide comment
@nvanbenschoten

nvanbenschoten Oct 11, 2018

Member

I still have this cleanup on my plate, including the l156 fix, but I'd like to actually get this test working without flakes first, which is turning out to be surprisingly tricky.

Member

nvanbenschoten commented Oct 11, 2018

I still have this cleanup on my plate, including the l156 fix, but I'd like to actually get this test working without flakes first, which is turning out to be surprisingly tricky.

@cockroach-teamcity

This comment has been minimized.

Show comment
Hide comment
@cockroach-teamcity

cockroach-teamcity Oct 15, 2018

Member

This change is Reviewable

Member

cockroach-teamcity commented Oct 15, 2018

This change is Reviewable

@nvanbenschoten nvanbenschoten changed the title from roachtest/bank: don't deadlock on teardown to roachtest/bank: don't start chaos until nodes are serving SQL Oct 15, 2018

@nvanbenschoten

This comment has been minimized.

Show comment
Hide comment
@nvanbenschoten

nvanbenschoten Oct 15, 2018

Member

Ok, this has been updated based on the discussion in #30064. PTAL.

Things changed a bit in #31011, so the first commit was replaced with a bit of cleanup around that change.

Member

nvanbenschoten commented Oct 15, 2018

Ok, this has been updated based on the discussion in #30064. PTAL.

Things changed a bit in #31011, so the first commit was replaced with a bit of cleanup around that change.

@petermattis

:lgtm:

Reviewed 1 of 1 files at r1, 1 of 1 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/cmd/roachtest/bank.go, line 210 at r1 (raw file):

	consistentIdx int,
) {
	s.waitGroup.Add(1)

Is this kosher? You're calling Add from a goroutine and then calling Wait from the main routine. I think this would hit an internal assertion in WaitGroup, except that the Wait doesn't happen right away.


pkg/cmd/roachtest/bank.go, line 215 at r3 (raw file):

	// Don't begin the chaos monkey until all nodes are serving SQL connections.
	// This ensures that we don't test cluster initialization under chaos.
	for i := 0; i < c.nodes; i++ {

Nit: for i := 0; i <= c.nodes; i++. Then in the body you can use i instead of i+1.

@nvanbenschoten

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/cmd/roachtest/bank.go, line 210 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Is this kosher? You're calling Add from a goroutine and then calling Wait from the main routine. I think this would hit an internal assertion in WaitGroup, except that the Wait doesn't happen right away.

Yeah, you're absolutely right.


pkg/cmd/roachtest/bank.go, line 215 at r3 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Nit: for i := 0; i <= c.nodes; i++. Then in the body you can use i instead of i+1.

Make it for i := 1; i <= c.nodes; i++ and you've got a deal.

@tschottdorf

:lgtm: mod Peter's comments which it looks you're fixing up right now.

Reviewed 1 of 1 files at r1, 1 of 1 files at r2, 1 of 1 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)

@nvanbenschoten

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale)


pkg/cmd/roachtest/bank.go, line 210 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Yeah, you're absolutely right.

Done.


pkg/cmd/roachtest/bank.go, line 215 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Make it for i := 1; i <= c.nodes; i++ and you've got a deal.

Done.

@petermattis

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale)

@a-robinson

Having just read through the recent comments on #30064 I agree this is the right change to make. We should also reduce the migration lease duration, though. There's no need for it to be so long given that we renew it in the background while running migrations.

nvanbenschoten added some commits Oct 4, 2018

roachtest/bank: don't start chaos until nodes are serving SQL
Informs #30064.

This prevents us from running into issues with SQL migrations.

Release note: None
@nvanbenschoten

This comment has been minimized.

Show comment
Hide comment
@nvanbenschoten

nvanbenschoten Oct 15, 2018

Member

TFTRs.

We should also reduce the migration lease duration, though.

I was hoping for a quick fix here, but it looks like this is tied to the sql.LeaseManager, which uses the same duration for all leases. This will have to wait on other improvements to the leasing system, like #24041.

bors r+

Member

nvanbenschoten commented Oct 15, 2018

TFTRs.

We should also reduce the migration lease duration, though.

I was hoping for a quick fix here, but it looks like this is tied to the sql.LeaseManager, which uses the same duration for all leases. This will have to wait on other improvements to the leasing system, like #24041.

bors r+

craig bot pushed a commit that referenced this pull request Oct 15, 2018

Merge #30989
30989: roachtest/bank: don't start chaos until nodes are serving SQL r=nvanbenschoten a=nvanbenschoten

Informs #30064.

Release note: None

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
@craig

This comment has been minimized.

Show comment
Hide comment
@craig

craig bot commented Oct 15, 2018

Build succeeded

@craig craig bot merged commit b1ab7dd into cockroachdb:master Oct 15, 2018

3 checks passed

GitHub CI (Cockroach) TeamCity build finished
Details
bors Build succeeded
Details
license/cla Contributor License Agreement is signed.
Details

@nvanbenschoten nvanbenschoten deleted the nvanbenschoten:nvabenschoten/backFix branch Oct 16, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment