Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

peer/test: fix race in TestHandleNewPendingChannel #8180

Merged
merged 1 commit into from Nov 16, 2023

Conversation

carlaKC
Copy link
Collaborator

@carlaKC carlaKC commented Nov 14, 2023

While flakehunting #8128, I ran into another flake in the peer tests:

--- FAIL: TestHandleNewPendingChannel (0.00s)
--- FAIL: TestHandleNewPendingChannel/noop_on_active_channel (0.00s)
brontide_test.go:1250:
Error Trace:    /Users/carla/Work/lnd/peer/brontide_test.go:1250
Error:          Not equal:
expected: 2
actual  : 3
Test:           TestHandleNewPendingChannel/noop_on_active_channel
FAIL
FAIL    [github.com/lightningnetwork/lnd/peer](http://github.com/lightningnetwork/lnd/peer)    9.065s

We're not getting the number of channels that we expect because the parallel test cases are all updating the same peer. It can therefore be the case that:

  • noop on active channel checks how many channels we start with = 2
  • new channel should be added checks how many channels we start with = 2
  • new channel should be added adds a new channel (bumping the count to 3)
  • noop on active channel checks how many channels we end with an unexpectedly sees 3 channels

This all just depends on the order that the various test cases acquire the mutex for the underlying sync.Map that's used to track channels. Easy fix - just move state setup into test run.

@carlaKC carlaKC changed the title peer/test: run each TestHandleNewPendingChannel case in isolation peer/test: fix race in TestHandleNewPendingChannel Nov 14, 2023
@carlaKC carlaKC added tests data-race Related to data-races no-changelog labels Nov 14, 2023
Running test cases in parallel with shared state means that they
can intermingle if they're run at the same time and set up incorrect
values when we assert on the number of channels we have. Moving the
peer setup into the parallel test run fixes this because no single
test case can interfere with the other.
Copy link
Collaborator

@yyforyongyu yyforyongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix! LGTM🙏

Copy link
Collaborator

@ellemouton ellemouton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noice 🔥 LGTM!

@ellemouton ellemouton merged commit ed179e3 into lightningnetwork:master Nov 16, 2023
24 of 25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants