Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix RingBuffer shutdown #620

Merged
merged 1 commit into from
Mar 15, 2021

Conversation

eyakubovich
Copy link
Contributor

Fixes RingBuffer's Stop to do a clean exit.
Adds a polling 300ms timeout to check for stop condition.
The Stop() method waits for the polling goroutine to exit
before returning to the caller.

Fixes #619

Fixes RingBuffer's Stop to do a clean exit.
Adds a polling 300ms timeout to check for stop condition.
The Stop() method waits for the polling goroutine to exit
before returning to the caller.
Copy link
Contributor

@grantseltzer grantseltzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for this @eyakubovich! Figuring out how to avoid blocking while completely avoiding a race condition at shutdown was throwing me for quite a loop. It's a great feeling not only to get a code contribution but to also learn from it.

Can you share the command(s) you used for running the race detector against this? I'd love to add it to our github actions CI/CD.

// may have stopped at this point. Failure to drain it will
// result in a deadlock: the channel will fill up and the poll
// goroutine will block in the callback.
eventChan := eventChannels[uintptr(rb.bpfMap.fd)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah this was the pattern that I failed to recognize the need for in previous iterations of this code. I couldn't get past the fact that the callback was blocking (and the timeout passed to ring_buffer__poll would not interrupt it).

@eyakubovich
Copy link
Contributor Author

The race detector ran as part of the unit tests (go test -race) but it's in another project that uses libbpfgo.

There's actually another race: eventChannels map is not protected by a mutex. However it's probably best to try to remove the map itself (at least for the ring buffer case) and just store the channel inside the RingBuffer object. However that will be a separate issue/PR.

@grantseltzer
Copy link
Contributor

Got it! If you'd like to work on that it's more than welcome, looking into it right now just to understand it. Is that project open source? If so could you share it?

@grantseltzer grantseltzer merged commit b995873 into aquasecurity:main Mar 15, 2021
@eyakubovich
Copy link
Contributor Author

Yes, it's open source and called Teleport (https://github.com/gravitational/teleport).
The patch that moves it from BCC to just libbpf is not there yet as I'm working out the kinks.
However you can look at the Makefile to see how the tests are ran.

@grantseltzer
Copy link
Contributor

@eyakubovich Thanks for the info. Please feel free to file issues as you work on moving it over. I'm actively adding functionality to libbpf, among other things I'm hoping to have the full libbpf API implemented in the next couple months.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RingBuffer.Stop is racy
2 participants