Several race conditions in CaptureManager #117
Comments
Note: This is on my local working branch of #88 and uses the changes in this PR: fako1024/slimcap#34
Raised fako1024/slimcap#35 to cover the packet counter issue.
@els0r I was playing around a little: The first two are quite easily solvable without introducing any performance penalty (albeit with a bit of a cringe because they require additional synchronization outside of the existing state machine / run group stuff). The ...
More issues (or rather details) found when assessing:
I encountered this when trying to extract the state of each capture (not Status, since that also resets the counters) using a ...
First off: thanks for looking into this and embracing the ugly. As discussed, let's try to rework this part and keep only what we need. The thing with the logger is concerning. I wonder if it has to do with using ... I want to get rid of storing the context inside the structs.
Context makes sense given the stack trace from the race, so maybe that's the culprit indeed. Nice digging!! 💪
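To make the "no context inside the structs" idea concrete, here is a minimal sketch of what that could look like. The `capture` type, its `iface` field, `run` and `runDemo` are hypothetical names for illustration only and are not taken from the actual code; the point is merely that the context is handed to the method by the caller instead of living in a field that several goroutines might read and replace concurrently.

```go
package main

import (
	"context"
	"errors"
)

// capture is a hypothetical stand-in for a single interface capture.
// Note that it has no context.Context field.
type capture struct {
	iface string // hypothetical field, only here to give the struct some state
}

// run counts packets from src until ctx is cancelled or src is closed.
// The context is a parameter, so there is no shared ctx field to race on.
func (c *capture) run(ctx context.Context, src <-chan []byte) (int, error) {
	n := 0
	for {
		select {
		case <-ctx.Done():
			return n, ctx.Err()
		case _, ok := <-src:
			if !ok {
				return n, nil
			}
			n++
		}
	}
}

// runDemo feeds three packets through an unbuffered channel and then cancels
// the context. It returns the number of packets seen and whether run stopped
// because of the cancellation.
func runDemo() (int, bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	src := make(chan []byte)
	go func() {
		for i := 0; i < 3; i++ {
			src <- []byte{byte(i)}
		}
		cancel()
	}()

	c := &capture{iface: "eth0"}
	n, err := c.run(ctx, src)
	return n, errors.Is(err, context.Canceled)
}
```

Whether this also gets rid of the logger race depends on how the logger is obtained, which this sketch does not model.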
After straightening out a couple of issues found in #88 via fako1024/slimcap#33, running the E2E test on a basic flow (single interface, data being piped through a mock ring buffer source from a pcap file) revealed several data races, all of which seem to be related to the `CaptureManager` (respectively the state machine of the individual captures). The following ones I have found:

- `Free()` is called: Way back I think we had a quick chat about whether it's ensured that the capture is actually stopped when `closing()` is run. According to the test this doesn't seem to be the case.
- It seems that `process()` is still adding packets to the initial map while `rotate()` checks if there are any flows present at all (this is probably the easiest thing to fix because it seems to me that this pre-check is simply not guarded by the current rotation copy-on-write magic).
- `Close()`: The writeout handler seems to still be attempting to perform its routines while / after the channel is being closed. Probably also not too tricky (I think I saw some random errors regarding sending on closed channel, maybe that's related).
- There is a race when accessing the (mock) stats in `rotate()` because packets are being processed (and hence the counter is accessed) at the same time. This is arguably more of a problem in the slimcap mock source (I'll address that one in a separate issue, I guess an atomic counter won't add too much overhead) but I'm still raising it here because it might hint at an issue with the state machine.
- It seems that there is at least one race condition when interacting with the logger still (although I recall that the new logger should(tm) be race-free and safe for concurrent use).
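For the `rotate()` pre-check and the packet counter, a rough sketch of the kind of guarding I have in mind is below. This is not the actual `CaptureManager` or slimcap code: `flowTable`, `newFlowTable` and the field names are made up for illustration, and a plain mutex stands in for the existing copy-on-write rotation. It only shows the emptiness check running under the same lock as the writes, plus an atomic counter that is reset on read.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// flowTable is a hypothetical stand-in for the per-capture flow map.
type flowTable struct {
	// seen counts packets since the last rotation. It is only accessed via
	// sync/atomic and is kept first in the struct so that it stays 64-bit
	// aligned on 32-bit platforms.
	seen uint64

	mu    sync.Mutex
	flows map[string]uint64 // flow key -> packet count
}

func newFlowTable() *flowTable {
	return &flowTable{flows: make(map[string]uint64)}
}

// process records one packet for the given flow key.
func (t *flowTable) process(key string) {
	atomic.AddUint64(&t.seen, 1)
	t.mu.Lock()
	t.flows[key]++
	t.mu.Unlock()
}

// rotate swaps out the current map and returns it together with the packet
// counter, which is reset to zero. The "any flows present?" check runs under
// the same lock as the writes in process, so it cannot race with them.
// The counter and the map are not updated as one unit, so under concurrent
// load the two return values may differ by packets that are in flight.
func (t *flowTable) rotate() (map[string]uint64, uint64) {
	seen := atomic.SwapUint64(&t.seen, 0)

	t.mu.Lock()
	defer t.mu.Unlock()
	if len(t.flows) == 0 {
		return nil, seen
	}
	old := t.flows
	t.flows = make(map[string]uint64)
	return old, seen
}
```

A lock in `process()` is of course exactly the kind of hot-path cost we want to avoid, so treat this as an illustration of what needs to be ordered, not as a proposal for how to do it.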
@els0r TBH I'm not sure if that's all of them, but I think we'll have to urgently take a look at them (they are not cosmetic, the E2E test randomly fails even without race detection turned on, and since they are not solely related to the mock sources I'm pretty sure we might see weird issues in the wild at some point)... Maybe something we can discuss in a call as well?
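Regarding the writeout handler and the closed channel: one common way to rule out "send on closed channel" is to close the channel only after every sender has returned. A minimal sketch, assuming a simplified handler with hypothetical names (`writeout`, `produce`, `closeWhenDone`, `drain`), none of which exist in the code base:

```go
package main

import "sync"

// writeout is a hypothetical stand-in for the writeout handler and its channel.
type writeout struct {
	wg sync.WaitGroup
	ch chan int
}

func newWriteout(buf int) *writeout {
	return &writeout{ch: make(chan int, buf)}
}

// produce starts a goroutine that sends n results on the channel.
// All produce calls must happen before closeWhenDone is called.
func (w *writeout) produce(n int) {
	w.wg.Add(1)
	go func() {
		defer w.wg.Done()
		for i := 0; i < n; i++ {
			w.ch <- 1
		}
	}()
}

// closeWhenDone closes the channel once all producers have returned, so no
// goroutine can still be sending when close is executed.
func (w *writeout) closeWhenDone() {
	go func() {
		w.wg.Wait()
		close(w.ch)
	}()
}

// drain consumes results until the channel is closed and returns their sum.
func (w *writeout) drain() int {
	total := 0
	for v := range w.ch {
		total += v
	}
	return total
}
```

The same ordering (wait for the senders, then close) would have to be enforced by the state machine / run group, wherever the close currently happens.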