Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

endpoint: Enhance policy map sync #14370

Merged
merged 1 commit into from
Jan 25, 2021

Conversation

jrajahalme
Copy link
Member

@jrajahalme jrajahalme commented Dec 10, 2020

When syncing policy map with dump, compare the desired policy map to
the dumped map for both deletes and adds. Record and log any
differences found.

Fixes: #14358
Fixes: #14357
Signed-off-by: Jarno Rajahalme jarno@covalent.io

@jrajahalme jrajahalme added kind/bug This is a bug in the Cilium logic. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. release-note/misc This PR makes changes that have no direct user impact. labels Dec 10, 2020
@jrajahalme jrajahalme requested a review from a team December 10, 2020 23:55
@jrajahalme jrajahalme requested a review from a team as a code owner December 10, 2020 23:55
@maintainer-s-little-helper maintainer-s-little-helper bot added this to In progress in 1.10.0 Dec 10, 2020
@jrajahalme jrajahalme marked this pull request as draft December 11, 2020 00:00
@jrajahalme jrajahalme force-pushed the pr/jrajahalme/enhance-policy-map-sync branch from 663d5f0 to 9a140fa Compare December 11, 2020 00:05
@jrajahalme
Copy link
Member Author

test-me-please

@jrajahalme jrajahalme force-pushed the pr/jrajahalme/enhance-policy-map-sync branch from 9a140fa to 9992ca5 Compare December 11, 2020 00:20
@jrajahalme
Copy link
Member Author

test-me-please

@jrajahalme
Copy link
Member Author

Known flake (#12511) on netnext, no other failures.

Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good, but I don't see how it addresses #14353, as claimed by PR description and commit log. Did I miss something, or did you mean another PR number?

@jrajahalme
Copy link
Member Author

Code looks good, but I don't see how it addresses #14353, as claimed by PR description and commit log. Did I miss something, or did you mean another PR number?

Yes, wrong number, should have been #14358 ;-)

@qmonnet
Copy link
Member

qmonnet commented Dec 14, 2020

Could you please fix the commit log as well?

// syncDesiredPolicyMapWith updates the bpf policy map state based on the
// difference between the realized and desired policy state without
// dumping the bpf policy map.
func (e *Endpoint) syncDesiredPolicyMapWith(realized policy.MapState, diffs []policy.MapChange) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of passing diffs as a parameter, why don't we return the slice? I know that you will say because the usage of the returned diffs is optional, but this can be achieved with a boolean:

func (e *Endpoint) syncDesiredPolicyMapWith(realized policy.MapState, withDiff bool) (count int, diffs []policy.MapChange, err error) {

Which allows us to do [0]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, updates to a slice header passed as a parameter would not even be visible to the caller, fixed this.

Comment on lines 1365 to 1388
diffs := []policy.MapChange{} // non-nil empty slice
err = e.syncDesiredPolicyMapWith(currentMap, diffs)

err = e.addPolicyMapDelta()

if errors > 0 {
return fmt.Errorf("synchronizing desired PolicyMap state failed")
if len(diffs) > 0 {
e.getLogger().WithField(logfields.Count, len(diffs)).Warning("Policy map sync fixed errors, consider running with debug verbose = policy to get detailed dumps")
e.policyDebug(logrus.Fields{"dumpedDiffs": diffs}, "syncPolicyMapWithDump")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[0]

basically we are consuming memory, by storing the diff in the slice, just for debug messages. Following my suggestion above we could call this as follows:

isDebug := option.Config.Debug
count, diffs, err = e.syncDesiredPolicyMapWith(currentMap, isDebug)

if len(count) > 0 {
...

which makes sure the memory usage does not grows in production environments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if e.policyMap == nil {
return fmt.Errorf("not syncing PolicyMap state for endpoint because PolicyMap is nil")
}

currentMapContents, err := e.policyMap.DumpKeysToSlice()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this PR the DumpKeysToSlice function is no longer used anywhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will remove.

return err
}

currentMap, err := e.dumpPolicyMapToMapState()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a small difference that you might have noticed which is previously we were only dumping the keys, not keys + values. Do we have some benchmarks, specially for the mem-allocs, for these changes vs the previous changes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the values for comparison is required for correct operation, and as this dump is only done on a periodically run controller I would not be so concerned about the memory allocations that are done to store the values in the map. While we could code the deletes directly to the dump loop itself to avoid storing the dump results in a map, we do need to store the dump keys and values in a map to be able to compare with the desired mapstate. This comparison between the dumped values and desired values was missing before this change.

@aanm aanm removed their assignment Jan 4, 2021
@jrajahalme
Copy link
Member Author

retest-runtime

@jrajahalme
Copy link
Member Author

known net-next flake #12511

@jrajahalme
Copy link
Member Author

retest-net-next

@jrajahalme
Copy link
Member Author

retest-runtime

@jrajahalme
Copy link
Member Author

retest-net-next

@qmonnet qmonnet removed their assignment Jan 5, 2021
@jrajahalme jrajahalme force-pushed the pr/jrajahalme/enhance-policy-map-sync branch from 0ba54f8 to 8f00268 Compare January 8, 2021 19:30
@jrajahalme
Copy link
Member Author

Removed duplicate unit test.

@jrajahalme
Copy link
Member Author

test-me-please

@jrajahalme
Copy link
Member Author

Test failures seem legit, will investigate.

When syncing policy map with dump, compare the desired policy map to
the dumped map for both deletes and adds. Record and log any
differences found.

Fixes: #14358
Fixes: #14357
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>
@jrajahalme jrajahalme force-pushed the pr/jrajahalme/enhance-policy-map-sync branch from 8f00268 to 365de70 Compare January 9, 2021 04:59
@jrajahalme
Copy link
Member Author

test-me-please

@jrajahalme
Copy link
Member Author

retest-gke

@jrajahalme
Copy link
Member Author

Net-next hit by a rare (?) flake: #13833

@jrajahalme
Copy link
Member Author

retest-net-next

@jrajahalme
Copy link
Member Author

retest-gke

@jrajahalme
Copy link
Member Author

retest-net-next

@jrajahalme
Copy link
Member Author

netnext known flake #13275

@jrajahalme
Copy link
Member Author

retest-net-next

@jrajahalme jrajahalme merged commit 9dc1350 into master Jan 25, 2021
@jrajahalme jrajahalme deleted the pr/jrajahalme/enhance-policy-map-sync branch January 25, 2021 02:45
jrajahalme added a commit to jrajahalme/cilium that referenced this pull request Mar 7, 2022
Shift the realized map update to the caller of addPolicyKey() and
deletePolicyKey() so that the update can be made to the correct map in
syncDesiredPolicyMapWith().

Rename addPolicyKey() and deletePolicyKey() as addBPFPolicyKey() and
deleteBPFPolicyKey(), respectively, to make the role of there functions
clearer.

Fixes: cilium#14370
Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug This is a bug in the Cilium logic. release-note/misc This PR makes changes that have no direct user impact. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies.
Projects
No open projects
6 participants