Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve NetworkPolicy batch installation #2479

Merged
merged 1 commit into from Jul 30, 2021

Conversation

tnqn
Copy link
Member

@tnqn tnqn commented Jul 27, 2021

BatchInstallPolicyRuleFlows first generates all flows then installs them
via single bundle. However, it generates the flows incrementally. If an
address is shared by multiple rules, intermediate flows will be
generated, for instance:

expected flow:

add "nw_dst=10.128.119.108 actions=conjunction(64,2/2),conjunction(65,2/2),conjunction(66,2/2)"

actual flows:

add "nw_dst=10.128.119.108 actions=conjunction(64,2/2)"
mod "nw_dst=10.128.119.108 actions=conjunction(64,2/2),conjunction(65,2/2)"
mod "nw_dst=10.128.119.108 actions=conjunction(64,2/2),conjunction(65,2/2),conjunction(66,2/2)"

The number of flows for this address will be same as the number of the
actions. The number of actual actions in these flows will be O(N^2),
N=number of desired actions. This increases CPU and memory usage
greatly in a high scale cluster.

This patches optimizes it by generating flows based on final state,
reducing the time and space complexity from O(N^2) to O(N). In same
cluster, it's observed that the memory usage was reduced from 1.1G to
500M, the execution time was reduced from 10s to 2s.

benchmark comparison:

name                            old time/op    new time/op    delta
BatchInstallPolicyRuleFlows-48     458ms ± 4%     169ms ± 1%  -63.22%  (p=0.008 n=5+5)

name                            old alloc/op   new alloc/op   delta
BatchInstallPolicyRuleFlows-48     205MB ± 0%      56MB ± 0%  -72.47%  (p=0.008 n=5+5)

name                            old allocs/op  new allocs/op  delta
BatchInstallPolicyRuleFlows-48     4.29M ± 0%     0.92M ± 0%  -78.60%  (p=0.008 n=5+5)

Signed-off-by: Quan Tian qtian@vmware.com

For #2457

@codecov-commenter
Copy link

codecov-commenter commented Jul 27, 2021

Codecov Report

Merging #2479 (55d74a8) into main (7ad23ef) will decrease coverage by 1.99%.
The diff coverage is 71.23%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2479      +/-   ##
==========================================
- Coverage   59.83%   57.83%   -2.00%     
==========================================
  Files         284      284              
  Lines       22175    22227      +52     
==========================================
- Hits        13269    12856     -413     
- Misses       7487     7970     +483     
+ Partials     1419     1401      -18     
Flag Coverage Δ
e2e-tests ∅ <ø> (?)
kind-e2e-tests 43.73% <19.17%> (-3.26%) ⬇️
unit-tests 42.21% <71.23%> (+0.16%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/ovs/openflow/ofctrl_flow.go 40.00% <0.00%> (-5.62%) ⬇️
pkg/util/ip/ip.go 75.71% <0.00%> (-4.59%) ⬇️
pkg/agent/openflow/pipeline.go 58.38% <66.66%> (-14.59%) ⬇️
pkg/agent/openflow/network_policy.go 75.92% <88.46%> (+0.21%) ⬆️
pkg/agent/proxy/metrics/metrics.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/agent/proxy/types/groupcounter.go 0.00% <0.00%> (-87.50%) ⬇️
pkg/agent/proxy/types/types.go 0.00% <0.00%> (-46.67%) ⬇️
pkg/ovs/openflow/ofctrl_group.go 0.00% <0.00%> (-45.95%) ⬇️
pkg/agent/openflow/pipeline_other.go 40.00% <0.00%> (-20.00%) ⬇️
pkg/agent/proxy/proxier.go 45.68% <0.00%> (-18.22%) ⬇️
... and 22 more

@tnqn tnqn force-pushed the improve-batch-install branch 2 times, most recently from 48c16c8 to 84c43b7 Compare July 28, 2021 17:08
@tnqn tnqn changed the title Improve BatchInstallPolicyRuleFlows Improve NetworkPolicy batch installation Jul 28, 2021
@tnqn tnqn marked this pull request as ready for review July 28, 2021 17:31
@tnqn tnqn force-pushed the improve-batch-install branch 2 times, most recently from 4baa443 to 9f24052 Compare July 28, 2021 17:39
@tnqn
Copy link
Member Author

tnqn commented Jul 28, 2021

/test-all

Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I am not an expert with respect to that part of the code.

I do wonder if all the differences in the code between initial batch installation and regular ongoing rule processing are completely necessary. In both cases, it seems that we should be computing a diff, and applying the required operations as one bundle. The difference is that the bundle for initial installation should only include Adds (and would be much larger). I may be missing something obvious though, and anyway it is beyond the scope of this PR.

pkg/agent/openflow/network_policy_test.go Show resolved Hide resolved
{
Direction: v1beta2.DirectionOut,
// conjunctive match flow for nw_src=192.168.1.40 should commit to conjunction 10 and 11.
// conjunctive match flow for nw_src=192.168.1.45 should commit to conjunction 11 only.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's 192.168.1.45? Did you mean 192.168.1.51 maybe?

I don't think "commit" is the right verb here, but I'm no specialist. I would suggest "tie" instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, thanks

},
},
},
expectedFlowsFn: func(c *client) []binding.Flow {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think it would be possible to generate these expectations in a more human-readable way? Right now there is a lot of noise because of the cookie stuff, etc.

in other words would it be possible to have something like this:

type ExpectedFlowsBuilder struct {...}

fb := &ExpectedFlowsBuilder{}
  .AddConjunction(10).AddConjunction(11)
  .AddConjunctiveMatch(<match info>, ConjunctionAction(10, 1, 2), ConjunctionAction(11, 1, 3))
  .AddConjunctiveMatch(<match info>, ConjunctionAction(11, 1, 3))
...
fb.Flows() // return the list of Flows

It would take care of generating metric flows and all the "noise".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conjunction rule may have different in priority and table as well. If too customized for NetworkPolicy flow, it seems we will need such builders when we add verficiation for other flows, like service, pod. But if it's generic, it would be quite similar to the current FlowBuilder, and maybe we should just make FlowBuilder more simpler to use, removing some redundant part.
Maybe I missed your point, I can follow up after the fix is merged, and feel free to update the code directly if you have some mature idea.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I missed your point, I can follow up after the fix is merged, and feel free to update the code directly if you have some mature idea.

No, it seems that we are on the same page. I feared it may not be as simple as I hoped :) Definitely doesn't need to be part of this PR. It may be indeed be an opportunity to simplify the FlowBuilder, although I feel like some customization for testing would always be needed. Ideally the expectations I define would not need to define the metrics flow separately for example.

pkg/agent/openflow/network_policy.go Outdated Show resolved Hide resolved
Copy link
Contributor

@Dyanngg Dyanngg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall. I assume this would supercede #2465?

I do wonder if all the differences in the code between initial batch installation and regular
ongoing rule processing are completely necessary. In both cases, it seems that we should be computing a diff, and applying the required operations as one bundle.

My understanding is that in case of batch rule install, we don't actually care about the intermediate conjunctive context changes caused by a single rule -- we only care about the final OVS flows, and in case of bundle failure, we simply revert the entire cache. This PR takes advantage of this fact and skips calculating those intermediate flow states, so the logic here is a bit different than async rule install.

@tnqn
Copy link
Member Author

tnqn commented Jul 29, 2021

LGTM overall. I assume this would supercede #2465?

@Dyanngg Yes

I do wonder if all the differences in the code between initial batch installation and regular
ongoing rule processing are completely necessary. In both cases, it seems that we should be computing a diff, and applying the required operations as one bundle.

My understanding is that in case of batch rule install, we don't actually care about the intermediate conjunctive context changes caused by a single rule -- we only care about the final OVS flows, and in case of bundle failure, we simply revert the entire cache. This PR takes advantage of this fact and skips calculating those intermediate flow states, so the logic here is a bit different than async rule install.

@antoninbas Yes, the main motivation of batch install was to avoid priority reassigning but it can avoid intermediate flows for conjunctive match flow update as well. If rules are installed asynchrously:

  1. Installing rule B may update priority of rule A's flows, and installing rule C may affect rule A and B.
  2. Installing rule B may update the shared conjunctive match flow if it has same from/to address as A, and installing rule C may update it again.

This is what I got from ovs-vswitchd log:
With v1.2.0, flows were calculated incrementally, 100136 modification are for conjunction match flow update. It used 1.1G memory during the peak as all flows were kept in memory before they were installed together.

2021-07-27T07:09:05.226Z|00197|connmgr|INFO|br-int<->unix#0: 159657 flow_mods in the 9 s starting 10 s ago (59520 adds, 1 deletes, 100136 modifications)

Based on v1.2.0, if we simply remove batch install, the number of modification flows is same as above but took more time to install. If the rules have different priorties, there would be more modification flows caused by priority-reassigning. It didn't have memory peak as flows in memory were generated and released but it took longer time and the overhead of calculating and realizing the incremental flows were same as above.


2021-07-27T07:10:59.359Z|00139|connmgr|INFO|br-int<->unix#0: 133058 flow_mods in the 9 s starting 10 s ago (59230 adds, 1 deletes, 73827 modifications)
2021-07-27T07:11:59.359Z|00140|connmgr|INFO|br-int<->unix#0: 51223 flow_mods in the 59 s starting 60 s ago (24913 adds, 1 deletes, 26309 modifications)

With this patch, there is no modification flow and all NetworkPolicy flows were installed via single bundle. It's faster and used less CPU and memory.

2021-07-27T11:23:07.830Z|00139|connmgr|INFO|br-int<->unix#0: 61759 flow_mods in the last 10 s (61758 adds, 1 deletes)

@tnqn
Copy link
Member Author

tnqn commented Jul 29, 2021

/test-all

Copy link
Contributor

@wenyingd wenyingd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, only one minor comment.

if err := c.sendConjunctiveFlows(allCtxChanges, allFlows); err != nil {
if err := c.ofEntryOperations.AddAll(allFlows); err != nil {
// Discard the cache if the bundle fails.
c.globalConjMatchFlowCache = map[string]*conjMatchFlowContext{}
Copy link
Contributor

@wenyingd wenyingd Jul 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to clear up all existing entries in globalConjMatchFlowCache. I mean if BatchInstallPolicyRuleFlows is only called in restart case once, it should be OK. But if some other cases also call it, it might introduce new bugs. Better to explain this function is called only after restart case and only one time in the comments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we only use the method when agent starts. Will add comment about it.
Actually we already have a bug because we don't clean the cache, see the fix here: #2465.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comment

BatchInstallPolicyRuleFlows first generates all flows then installs them
via single bundle. However, it generates the flows incrementally. If an
address is shared by multiple rules, intermediate flows will be
generated, for instance:

expected flow:
add "nw_dst=10.128.119.108 actions=conjunction(64,2/2),conjunction(65,2/2),conjunction(66,2/2)"

actual flows:
add "nw_dst=10.128.119.108 actions=conjunction(64,2/2)"
mod "nw_dst=10.128.119.108 actions=conjunction(64,2/2),conjunction(65,2/2)"
mod "nw_dst=10.128.119.108 actions=conjunction(64,2/2),conjunction(65,2/2),conjunction(66,2/2)"

The number of flows for this address will be same as the number of the
actions. The number of actual actions in these flows will be O(N^2),
N=number of desired actions. This increases CPU and memory usage
greatly in a high scale cluster.

This patches optimizes it by generating flows based on final state,
reducing the time and space complexity from O(N^2) to O(N). In same
cluster, it's observed that the memory usage was reduced from 1.1G to
500M, the execution time was reduced from 10s to 2s.

benchmark comparison:

name                            old time/op    new time/op    delta
BatchInstallPolicyRuleFlows-48     458ms ± 4%     169ms ± 1%  -63.22%  (p=0.008 n=5+5)

name                            old alloc/op   new alloc/op   delta
BatchInstallPolicyRuleFlows-48     205MB ± 0%      56MB ± 0%  -72.47%  (p=0.008 n=5+5)

name                            old allocs/op  new allocs/op  delta
BatchInstallPolicyRuleFlows-48     4.29M ± 0%     0.92M ± 0%  -78.60%  (p=0.008 n=5+5)

Signed-off-by: Quan Tian <qtian@vmware.com>
@tnqn
Copy link
Member Author

tnqn commented Jul 29, 2021

/test-all

@antoninbas
Copy link
Contributor

I understand the rationale for this change, or for treating the restart differently from regular rule updates, which is not new to this PR. But ultimately it seems to me that we are just reconciling between a current state and a desired state. In the restart case the current state is EMPTY, and we can compute the desired state from all the received rules, and do the actual reconciliation in one go using only ADD operations. In the case of ongoing rule updates, the diff between current & desired state is typically small, and requires a combination of ADD / UPDATE / DELETE operations. But ultimately, we do reconciliation between 2 states using an OpenFlow bundle. So my question was more future-looking, and I was wondering about whether we could represent the state in such a fashion that the logic is mostly the same in both cases, without sacrificing performance. That may not be possible though...

Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
I'll let you decide if this needs backporting beyond v1.2

@tnqn
Copy link
Member Author

tnqn commented Jul 30, 2021

I understand the rationale for this change, or for treating the restart differently from regular rule updates, which is not new to this PR. But ultimately it seems to me that we are just reconciling between a current state and a desired state. In the restart case the current state is EMPTY, and we can compute the desired state from all the received rules, and do the actual reconciliation in one go using only ADD operations. In the case of ongoing rule updates, the diff between current & desired state is typically small, and requires a combination of ADD / UPDATE / DELETE operations. But ultimately, we do reconciliation between 2 states using an OpenFlow bundle. So my question was more future-looking, and I was wondering about whether we could represent the state in such a fashion that the logic is mostly the same in both cases, without sacrificing performance. That may not be possible though...

Good suggestion. I actually thought a little about this when I was working on this change and tried to unify the single rule installation and multiple rules installation but stopped when the change was much bigger than needed for this bug itself. I will continue thinking about your suggestion and try to simplify/unify the approach as follow-up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants