test,job: Fix deadlock in TestOneShot_RetryRecoverNoShutdown #25293

Merged
merged 1 commit into from May 8, 2023

dylandreimerink
Member

This test asserts that the function under test is called again if the first call to the callback returned an error. The callback signals to the test function that it has started by closing a channel. Because a channel may only be closed once, the code set the channel to `nil` after closing it and wrapped the close in a nil check. This works 99.999% of the time, since the consumer is usually already waiting on the channel and continues once it is closed. However, the channel is sometimes set to `nil` before the consumer's flow has started, which results in a deadlock because receiving from a nil channel blocks forever. To fix it, the callback now checks the `i` variable and no longer sets the channel to `nil`.

Fixes: #25276

Fixed flake in pkg/hive/job tests.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
@dylandreimerink added the release-note/ci label (This PR makes changes to the CI.) May 5, 2023
@dylandreimerink requested a review from a team as a code owner May 5, 2023 18:19
@dylandreimerink
Member Author

/test

@dylandreimerink added the ready-to-merge label (This PR has passed all tests and received consensus from code owners to merge.) May 6, 2023
@dylandreimerink
Member Author

All codeowner teams are covered and all tests are green, marking ready-to-merge.

@maintainer-s-little-helper (bot) removed the ready-to-merge label May 8, 2023
@jrajahalme jrajahalme merged commit 030bb1a into cilium:main May 8, 2023
58 checks passed
Development

Successfully merging this pull request may close these issues.

CI: hive/job: Timed out after 10m
3 participants