node-manager: fix race-condition #32606
Conversation
There was a potential race condition in handling nodes in the manager. In the background sync, we were releasing the manager mutex before holding the entry mutex, which could result in updating node information with stale information. This race condition could leave node information stale until the next background sync or node update, so it eventually resolved itself, but that could take 15+ minutes for large clusters, depending on the number of nodes. Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
c2e6fb4 to 76d19ca
/test
@marseel Thank you for reviewing these, Marcel!
I think it makes sense, but it could also entail a deadlock. If someone else holds a lock on the entry, the node map can no longer be read. This introduces a dependency and would also prevent the sync from exiting should the context be canceled. Given the reconciliation loop at stake, I think it's safer to read potentially stale state.
Am I wrong about that?
This is the official order in which we should always acquire the nodeManager mutexes: cilium/pkg/node/manager/manager.go Lines 88 to 104 in 598209e
Also, we've had it in this order forever. So my question is whether that change in #29415 was intentional (if yes, what was the reason?) or accidental.
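The ordering convention referenced above boils down to: acquire the entry mutex while still holding the manager mutex, and only then release the manager mutex, so the entry cannot be swapped out between lookup and use. A minimal sketch of that handoff, using hypothetical, simplified types (not the actual Cilium code):

```go
package main

import (
	"fmt"
	"sync"
)

// entry is a hypothetical stand-in for a per-node entry.
type entry struct {
	mu   sync.Mutex
	info string
}

// manager is a hypothetical stand-in for the node manager.
type manager struct {
	mu    sync.RWMutex
	nodes map[string]*entry
}

// lookupAndLock follows the documented ordering: the manager mutex is
// held while the entry mutex is acquired, and released only afterwards.
// Acquiring them in the opposite order (entry first, then manager)
// risks deadlock against code that follows this convention.
func (m *manager) lookupAndLock(name string) *entry {
	m.mu.RLock()
	e, ok := m.nodes[name]
	if !ok {
		m.mu.RUnlock()
		return nil
	}
	e.mu.Lock() // taken under the manager lock: the entry cannot be replaced meanwhile
	m.mu.RUnlock()
	return e // caller is responsible for e.mu.Unlock()
}

func main() {
	m := &manager{nodes: map[string]*entry{"node1": {info: "v1"}}}
	if e := m.lookupAndLock("node1"); e != nil {
		fmt.Println(e.info) // prints "v1"
		e.mu.Unlock()
	}
}
```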
@marseel Ah, I see. Thanks for the correction, Marcel. This is indeed my mistake ;(
Change looks good to me, and now aligns with the comments in the manager 👍
There was a potential race condition in handling nodes in the manager. In the background sync, we were releasing the manager mutex before holding the entry mutex, which could result in updating node information with stale information.
This race condition could leave node information stale until the next background sync or node update, so it eventually resolved itself, but that could take 15+ minutes for large clusters, depending on the number of nodes.
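The race described above can be sketched as follows: if the manager mutex is released before the entry mutex is taken, another goroutine can update the entry in the gap, and the sync then applies the snapshot it read earlier. The names below are hypothetical and simplified; this is an illustration of the pattern, not the actual Cilium code:

```go
package main

import (
	"fmt"
	"sync"
)

// entry and manager are hypothetical stand-ins for the real types.
type entry struct {
	mu   sync.Mutex
	info string
}

type manager struct {
	mu    sync.RWMutex
	nodes map[string]*entry
}

// buggySync releases the manager mutex before taking the entry mutex.
// In the gap between RUnlock and Lock, another goroutine may update
// the entry, so the snapshot read earlier can be stale.
func (m *manager) buggySync(name string) string {
	m.mu.RLock()
	e := m.nodes[name]
	snapshot := e.info // read under the manager lock only
	m.mu.RUnlock()
	// <-- a concurrent update to e.info can land here
	e.mu.Lock()
	defer e.mu.Unlock()
	return snapshot // possibly stale
}

// fixedSync takes the entry mutex while still holding the manager
// mutex, so no update can slip in between lookup and processing.
func (m *manager) fixedSync(name string) string {
	m.mu.RLock()
	e := m.nodes[name]
	e.mu.Lock()
	m.mu.RUnlock()
	defer e.mu.Unlock()
	return e.info // read under the entry lock: always current
}

func main() {
	m := &manager{nodes: map[string]*entry{"node1": {info: "fresh"}}}
	fmt.Println(m.fixedSync("node1")) // prints "fresh"
}
```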
Follow up from #32577
Edit: it seems this was introduced on main and is not present on the v1.15 branch, so I'm removing the release-note.
Regression was introduced by #29415
I've requested a review from Fernand, just to make sure I haven't missed anything.