Improve NetworkPolicy batch installation #2479
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2479 +/- ##
==========================================
- Coverage 59.83% 57.83% -2.00%
==========================================
Files 284 284
Lines 22175 22227 +52
==========================================
- Hits 13269 12856 -413
- Misses 7487 7970 +483
+ Partials 1419 1401 -18
Flags with carried forward coverage won't be shown.
/test-all
LGTM, but I am not an expert with respect to that part of the code.
I do wonder if all the differences in the code between initial batch installation and regular ongoing rule processing are completely necessary. In both cases, it seems that we should be computing a diff, and applying the required operations as one bundle. The difference is that the bundle for initial installation should only include Adds (and would be much larger). I may be missing something obvious though, and anyway it is beyond the scope of this PR.
{
Direction: v1beta2.DirectionOut,
// conjunctive match flow for nw_src=192.168.1.40 should commit to conjunction 10 and 11.
// conjunctive match flow for nw_src=192.168.1.45 should commit to conjunction 11 only.
what's 192.168.1.45? Did you mean 192.168.1.51 maybe?
I don't think "commit" is the right verb here, but I'm no specialist. I would suggest "tie" instead.
fixed, thanks
},
},
},
expectedFlowsFn: func(c *client) []binding.Flow {
do you think it would be possible to generate these expectations in a more human-readable way? Right now there is a lot of noise because of the cookie stuff, etc.
in other words would it be possible to have something like this:
type ExpectedFlowsBuilder struct {...}
fb := &ExpectedFlowsBuilder{}
fb.AddConjunction(10).AddConjunction(11).
   AddConjunctiveMatch(<match info>, ConjunctionAction(10, 1, 2), ConjunctionAction(11, 1, 3)).
   AddConjunctiveMatch(<match info>, ConjunctionAction(11, 1, 3))
...
fb.Flows() // return the list of Flows
It would take care of generating metric flows and all the "noise".
The conjunction rules may differ in priority and table as well. If the builder is too customized for NetworkPolicy flows, it seems we would need similar builders when we add verification for other flows, like Service and Pod flows. But if it's generic, it would be quite similar to the current FlowBuilder, and maybe we should just make FlowBuilder simpler to use by removing some redundant parts.
Maybe I missed your point; I can follow up after the fix is merged, and feel free to update the code directly if you have a mature idea.
Maybe I missed your point; I can follow up after the fix is merged, and feel free to update the code directly if you have a mature idea.
No, it seems that we are on the same page. I feared it may not be as simple as I hoped :) It definitely doesn't need to be part of this PR. It may indeed be an opportunity to simplify the FlowBuilder, although I feel like some customization for testing would always be needed. Ideally, the expectations I define would not need to include the metrics flow separately, for example.
LGTM overall. I assume this would supersede #2465?
I do wonder if all the differences in the code between initial batch installation and regular ongoing rule processing are completely necessary. In both cases, it seems that we should be computing a diff, and applying the required operations as one bundle.
My understanding is that in the case of batch rule install, we don't actually care about the intermediate conjunctive context changes caused by a single rule -- we only care about the final OVS flows, and in the case of bundle failure, we simply revert the entire cache. This PR takes advantage of this fact and skips calculating those intermediate flow states, so the logic here is a bit different from async rule install.
@Dyanngg Yes
@antoninbas Yes, the main motivation of batch install was to avoid priority reassignment, but it can avoid intermediate flows for conjunctive match flow updates as well, if rules are installed asynchronously.
This is what I got from the ovs-vswitchd log. Based on v1.2.0, if we simply remove batch install, the number of modification flows is the same as above but it took more time to install. If the rules have different priorities, there would be more modification flows caused by priority reassignment. It didn't have a memory peak, as flows in memory were generated and released, but it took a longer time, and the overhead of calculating and realizing the incremental flows was the same as above. With this patch, there are no modification flows and all NetworkPolicy flows are installed via a single bundle. It's faster and uses less CPU and memory.
/test-all
LGTM, only one minor comment.
if err := c.sendConjunctiveFlows(allCtxChanges, allFlows); err != nil {
if err := c.ofEntryOperations.AddAll(allFlows); err != nil {
	// Discard the cache if the bundle fails.
	c.globalConjMatchFlowCache = map[string]*conjMatchFlowContext{}
Is it safe to clear all existing entries in globalConjMatchFlowCache? I mean, if BatchInstallPolicyRuleFlows is only called once, in the restart case, it should be OK. But if some other cases also call it, it might introduce new bugs. Better to explain in the comments that this function is called only in the restart case and only one time.
Yes, we only use the method when the agent starts. Will add a comment about it.
Actually we already have a bug because we don't clean the cache, see the fix here: #2465.
added comment
BatchInstallPolicyRuleFlows first generates all flows, then installs them via a single bundle. However, it generates the flows incrementally. If an address is shared by multiple rules, intermediate flows will be generated, for instance:

expected flow:
add "nw_dst=10.128.119.108 actions=conjunction(64,2/2),conjunction(65,2/2),conjunction(66,2/2)"

actual flows:
add "nw_dst=10.128.119.108 actions=conjunction(64,2/2)"
mod "nw_dst=10.128.119.108 actions=conjunction(64,2/2),conjunction(65,2/2)"
mod "nw_dst=10.128.119.108 actions=conjunction(64,2/2),conjunction(65,2/2),conjunction(66,2/2)"

The number of flows for this address will be the same as the number of the actions. The number of actual actions in these flows will be O(N^2), N = number of desired actions. This increases CPU and memory usage greatly in a high-scale cluster.

This patch optimizes it by generating flows based on the final state, reducing the time and space complexity from O(N^2) to O(N). In the same cluster, it was observed that memory usage was reduced from 1.1G to 500M, and the execution time from 10s to 2s.

benchmark comparison:

name                            old time/op    new time/op    delta
BatchInstallPolicyRuleFlows-48    458ms ± 4%     169ms ± 1%   -63.22%  (p=0.008 n=5+5)

name                            old alloc/op   new alloc/op   delta
BatchInstallPolicyRuleFlows-48    205MB ± 0%      56MB ± 0%   -72.47%  (p=0.008 n=5+5)

name                            old allocs/op  new allocs/op  delta
BatchInstallPolicyRuleFlows-48    4.29M ± 0%     0.92M ± 0%   -78.60%  (p=0.008 n=5+5)

Signed-off-by: Quan Tian <qtian@vmware.com>
/test-all
I understand the rationale for this change, and for treating the restart differently from regular rule updates, which is not new to this PR. But ultimately it seems to me that we are just reconciling between a current state and a desired state. In the restart case the current state is empty.
LGTM
I'll let you decide if this needs backporting beyond v1.2
Good suggestion. I actually thought a little about this when I was working on this change and tried to unify single-rule installation and multiple-rule installation, but stopped when the change became much bigger than needed for this bug itself. I will continue thinking about your suggestion and try to simplify/unify the approach as a follow-up.
For #2457