Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential data race with localNode in NodeDiscovery #19884

Closed
gandro opened this issue May 19, 2022 · 1 comment
Closed

Potential data race with localNode in NodeDiscovery #19884

gandro opened this issue May 19, 2022 · 1 comment
Assignees
Labels
kind/bug This is a bug in the Cilium logic.

Comments

@gandro
Copy link
Member

gandro commented May 19, 2022

As pointed out by Louis, there is a potential data race with the NodeDiscovery module. It protects local node updates via the following mutex:

n.localNodeLock.Lock()
defer n.localNodeLock.Unlock()
if option.Config.EnableIPv4 && len(ipv4AllocCIDRs) > 0 {
ipv4PrimaryCIDR, ipv4SecondaryCIDRs := splitAllocCIDRs(ipv4AllocCIDRs)
validatePrimaryCIDR(n.localNode.IPv4AllocCIDR, ipv4PrimaryCIDR, ipam.IPv4)
n.localNode.IPv4AllocCIDR = ipv4PrimaryCIDR
n.localNode.IPv4SecondaryAllocCIDRs = ipv4SecondaryCIDRs
}

But the getters of the localNode object are not aware of the lock, and therefore could concurrently read the fields while they are getting updated.

func (n *Node) GetIPv4AllocCIDRs() []*cidr.CIDR {
result := make([]*cidr.CIDR, 0, len(n.IPv4SecondaryAllocCIDRs)+1)
if n.IPv4AllocCIDR != nil {
result = append(result, n.IPv4AllocCIDR)
}
if len(n.IPv4SecondaryAllocCIDRs) > 0 {
result = append(result, n.IPv4SecondaryAllocCIDRs...)
}
return result
}

To my knowledge, the n.localNode reference is not stored by any consumer at the moment, so I don't think the race can happen in practise at the moment. This is because the lock is held during the call to n.Manager.NodeUpdated(n.localNode) where all the data access happens via getters happens.

But regardless, we should still fix this, for example by creating a copy of the local node object, to ensure that any future consumers will not introduce a data race by accident.

@gandro gandro added the kind/bug This is a bug in the Cilium logic. label May 19, 2022
@gandro gandro self-assigned this May 19, 2022
@gandro
Copy link
Member Author

gandro commented Oct 3, 2023

LocalAllocCIDRsUpdated has been removed from the tree. This particular data race here is no longer present. #21085 looks like it remains though.

@gandro gandro closed this as completed Oct 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug This is a bug in the Cilium logic.
Projects
None yet
Development

No branches or pull requests

1 participant