-
Notifications
You must be signed in to change notification settings - Fork 260
fix: [NPM] update endpointcache after remove policy #1804
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
huntergregory
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
noting changes from other day's discussion offline
npm/pkg/dataplane/dataplane.go
Outdated
| // because policy Manager will remove from policy from cache | ||
| // keep a local copy to remove references for ipsets | ||
| policy, ok := dp.policyMgr.GetPolicy(policyKey) | ||
| endpoints := policy.PodEndpoints |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will need to make a deep copy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
npm/pkg/dataplane/dataplane.go
Outdated
| for npmEndpoint := range endpoints { | ||
| // if the endpoint is not in the policy's endpoint list, delete from cache | ||
| if _, ok := policy.PodEndpoints[npmEndpoint]; !ok { | ||
| delete(dp.endpointCache.cache, npmEndpoint) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
general structure is good, but we're looking to delete the policy reference from the given endpoint the endpoint cache is also indexed by IP
The endpoint cache is also indexed by IP. If Policy.Endpoints doesn't have an IP, then the lookup may not be efficient but that's ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
npm/pkg/dataplane/dataplane.go
Outdated
| // if the endpoint is not in the policy's endpoint list, delete policy reference from cache | ||
| if _, ok := policy.PodEndpoints[podIP]; !ok { | ||
| endpoint := dp.endpointCache.cache[podIP] | ||
| delete(endpoint.netPolReference, policyKey) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
endpoint may not exist (chance of nil pointer exception)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
huntergregory
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm after locking endpointCache instead of updatePodCache
npm/pkg/dataplane/dataplane.go
Outdated
| klog.Infof("[DataPlane] Policy %s is not found. Might been deleted already", policyKey) | ||
| return nil | ||
| } | ||
| dp.updatePodCache.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lock the endpoint cache instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
npm/pkg/dataplane/dataplane.go
Outdated
| klog.Infof("[DataPlane] Policy %s is not found. Might been deleted already", policyKey) | ||
| return nil | ||
| } | ||
| dp.endpointCache.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why cant we take endpointCache after the RemovePolicy ? if a new ep is created or existing one is deleted we will be fine right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussed offline:
The goal of this was to not block NetPol controller on making SetPolicy SysCalls while UpdatePod() is running.
New Proposal:
- Put endpoint lock below removePolicy()
- Address the blocking in another PR that makes an endpoint-level lock (instead of locking the whole cache)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated 1
| if err != nil { | ||
| return fmt.Errorf("[DataPlane] error while removing policy: %w", err) | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check dp.ShouldUpdatePod() or something which is always true for windows and then execute this section, this is unnecessary for linux.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
* update endpointcache * update delete logic * fix lint issue * added check for endpoint in cache * add unit test * fix type lint issue * fix naming lint error * updated endpoint cache lock * moved cache lock down and added check for windows
Reason for Change:
Issue Fixed:
Requirements:
Notes: