New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pkg/containers: fix deadlock #3311
Conversation
8ed7334
to
cce32d1
Compare
IMO this should be backported, thanks for finding this (and sorry for introducing it 😅). |
@NDStrahilevitz can you review? Also, do you think backport, or can we release 0.16.1? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct as is, so merge like this if you want, or optimize as I suggested if you think it's ok.
About the backporting, are you asking if we should branch out again to a new release branch, or backport to the existing v0.16.0 branch? If so, I believe we agreed on always doing the latter from now on. |
d095ee2
to
7a73083
Compare
2489c6e
to
bf7e685
Compare
bf7e685
to
7905e39
Compare
@NDStrahilevitz changed, can you take another look? pls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Cherry-picked at #3315 to v0.16.0 branch (will be at v0.16.1 release as well). |
The write Lock() is not getting released because of the defer, blocking the read RLock(), and the pipeline.
1. Explain what the PR does
Fix #3310
We had a deadlock introduced on the last release on the PR #3285
The write
Lock()
isn't getting release due to thedefer
, which blocks the readRLock
and the pipeline.To fix it, I remove the defer and did the lock() and unlock() on the critical path, also removed the
return
, we only update the cgroupMap inside the if logic, then move on to get the cgroupInfo and return. There are other ways of fixing it, let me know if you think it should be different.2. Explain how to test it
The simplest to simulate the bug is to run tracee into kind or minukube.
3. Other comments