-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
iptables: Add rules runtime reconciliation #31372
iptables: Add rules runtime reconciliation #31372
Conversation
1818cf0
to
a592c68
Compare
a592c68
to
a5c5816
Compare
This comment was marked as outdated.
This comment was marked as outdated.
a5c5816
to
fda5710
Compare
fda5710
to
c2658f9
Compare
0d4ef88
to
bf39519
Compare
Added a further commit to bail out from waiting in case of reconciliation error. In that case, the reconciler retries the rules update at a later time, but unblock the waiting goroutines to avoid delaying other (potentially more important) operations indefinitely. |
bf39519
to
0a920d6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! - Couple comments/questions.
Also, this may be a bit deeper on sig-proxy specific stuff than I can give a review, we may want to ping someone with a bit more experience here.
0a920d6
to
4a6c89d
Compare
Adding @jrajahalme as a reviewer 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good on my end!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor question for future enhancement :-)
As an attempt to make the agent sensitive to runtime changes, the iptables manager reinstalls all the iptables rules, including the ones related to the proxies, when the datapath is explicitly reinitialized. This approach does not consider the updates to the external facing network devices, that are read only once at the module startup through the daemon configuration options. This commit adds a reconciler to update the rules following any updates to the external facing network devices or to the local node. The reconciler listens for those updates and independently rewrites all the rules accordingly, keeping into account proxy rules and no track pod rules too. This reconciler is not based on the stateDB generic one but is designed from scratch for this specific case. This is needed since the order in which the iptables rules are installed in the chains matters, and the stateDB generic reconciler does not care about ordering. To migrate to this new asynchronous reconciliation approach, the manager is reworked so that the methods exposed do not immediately reconcile the rules, but collect the desired state and ask the underlying reconciler to rewrite the rules accordingly. To rate limit the calls to the iptables command, a ticker is used to decide when to reconcile. This also has the advantage of coalescing a burst of changes into a single call to the iptables/ip6tables commands. Since a rewrite operation can be pretty expensive in terms of time, this results in an overall lower latency for those burst updates. With this new iptables manager implementation the rules reinstallation request from the datapath reinitilization is not needed anymore, hence it is removed. Signed-off-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
A full iptables rules rewrite is pretty expensive, as all the chains must be rewritten by the iptables/ip6tables command. To speed up the reconciliation in case of a proxy rule update, the reconciler first tries to synchronously update only the proxy related rules, falling back to a full asynchronous rules reconciliation only in case of a failure. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
As already done for the proxy rules, an incremental reconciliation is added for the rules to skip kernel connection tracking. This is needed to speed up reconciliation related to a change to no track rules only. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
The current manager implementation relies on the reconciler to rewrite the rules in case of no track pods or proxy updates. Unlike the previous implementation, this is carried out asynchronously. Since all the tests in the CI expects these operations to be executed synchronously, a notification channel is returned and closed when the reconciliation has been completed successfully. Despite this should not be strictly needed, waiting for the iptables rules to be updated avoids a lot of flakes in the current e2e tests implementation. Being needed just for backward compatibility with the tests suite, it may be removed in the future when the tests are rewritten to be less unforgiven. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Manage ipv4 and ipv6 native routing CIDRs in the local node struct. This allows the iptables reconciler to subscribe to the local node update and: - get the native routing CIDRs set at startup as a config option - get the updated v4 native routing CIDRs in case of auto detection (when IPAM is one of CRD, ENI, Azure or Alibaba) at runtime Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Add a unit-style integration test for the reconciler. The test verifies that the reconciler emits the correct desired state in a handful of scenarios (devices updates, local node allocation CIDR updates, proxy rules and so on). Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
In case of a failed iptables reconciliation, unblock the waiting goroutines closing the channel. This is done to avoid keeping the consumer goroutines stuck forever and potentially blocking other and more important agent operations. In case of an error, the reconciler retries the iptables rules update at a later time, in order to achieve eventual consistency. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Rework the helper to check if we should install notrack rules. Notrack rules installation can be skipped in case of IPv6 or if install-no-conntrack-iptables-rules option is set to true (that is, if we are already skipping CT for all pod traffic). Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Add rate limiting to errors logging in both partial (proxy and no track rules) and full reconciliation operations. Two separate rate limiters are used so that they will not interfere with each other. Also, raise the log level to error in case of a failed reconciliaton. Finally, improve the wording of the full reconciliation error log. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
4a6c89d
to
b45d3db
Compare
/test |
As an attempt to make the agent sensitive to runtime changes, the iptables manager reinstalls all the iptables rules, including the ones related to the proxies, when the datapath is explicitly reinitialized.
This approach does not consider the updates to the external facing network devices, that are read only once at the module startup through the daemon configuration options.
This PR adds a reconciler to update the rules following any updates to the external facing network devices or to the local node. The reconciler listens for those updates and independently rewrites all the rules accordingly, keeping into account proxy rules and no track pod rules too.
This reconciler is not based on the stateDB generic one but is designed from scratch for this specific case. This is needed since the order in which the iptables rules are installed in the chains matters, and the stateDB generic reconciler does not care about ordering.
To migrate to this new asynchronous reconciliation approach, the manager is reworked so that the methods exposed do not immediately reconcile the rules, but collect the desired state and ask the underlying reconciler to rewrite the rules accordingly.
With this new iptables manager implementation the rules reinstallation request from the datapath reinitilization is not needed anymore, hence it is removed.
For more details, please refer to the individual commit messages.
Note to the reviewers: please review per commit.
Related: #30024
Depends on
#31068,#31099and#31368