Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iptables: Add rules runtime reconciliation #31372

Merged

Conversation

pippolo84
Copy link
Member

@pippolo84 pippolo84 commented Mar 13, 2024

As an attempt to make the agent sensitive to runtime changes, the iptables manager reinstalls all the iptables rules, including the ones related to the proxies, when the datapath is explicitly reinitialized.

This approach does not consider the updates to the external facing network devices, that are read only once at the module startup through the daemon configuration options.

This PR adds a reconciler to update the rules following any updates to the external facing network devices or to the local node. The reconciler listens for those updates and independently rewrites all the rules accordingly, keeping into account proxy rules and no track pod rules too.

This reconciler is not based on the stateDB generic one but is designed from scratch for this specific case. This is needed since the order in which the iptables rules are installed in the chains matters, and the stateDB generic reconciler does not care about ordering.

To migrate to this new asynchronous reconciliation approach, the manager is reworked so that the methods exposed do not immediately reconcile the rules, but collect the desired state and ask the underlying reconciler to rewrite the rules accordingly.

With this new iptables manager implementation the rules reinstallation request from the datapath reinitilization is not needed anymore, hence it is removed.

For more details, please refer to the individual commit messages.

Note to the reviewers: please review per commit.

Related: #30024

Depends on #31068, #31099 and #31368

@pippolo84 pippolo84 added release-note/major This PR introduces major new functionality to Cilium. dont-merge/preview-only Only for preview or testing, don't merge it. area/iptables Impacts how Cilium interacts with iptables. labels Mar 13, 2024
@pippolo84 pippolo84 force-pushed the pr/pippolo84/iptables-reconciler branch 2 times, most recently from 1818cf0 to a592c68 Compare March 14, 2024 11:09
@pippolo84 pippolo84 changed the title iptables: Runtime reconciliation for rules iptables: Add rules runtime reconciliation Mar 14, 2024
@pippolo84 pippolo84 force-pushed the pr/pippolo84/iptables-reconciler branch from a592c68 to a5c5816 Compare March 14, 2024 11:11
@pippolo84

This comment was marked as outdated.

@pippolo84 pippolo84 force-pushed the pr/pippolo84/iptables-reconciler branch from a5c5816 to fda5710 Compare March 22, 2024 10:00
@pippolo84 pippolo84 force-pushed the pr/pippolo84/iptables-reconciler branch from fda5710 to c2658f9 Compare April 2, 2024 19:57
@pippolo84 pippolo84 requested a review from joamaki April 2, 2024 20:04
@pippolo84 pippolo84 marked this pull request as ready for review April 2, 2024 20:21
@pippolo84 pippolo84 requested review from a team as code owners April 2, 2024 20:21
@pippolo84 pippolo84 removed the dont-merge/preview-only Only for preview or testing, don't merge it. label Apr 2, 2024
pkg/datapath/types/loader.go Outdated Show resolved Hide resolved
@pippolo84 pippolo84 force-pushed the pr/pippolo84/iptables-reconciler branch 3 times, most recently from 0d4ef88 to bf39519 Compare April 4, 2024 13:44
@pippolo84
Copy link
Member Author

Added a further commit to bail out from waiting in case of reconciliation error. In that case, the reconciler retries the rules update at a later time, but unblock the waiting goroutines to avoid delaying other (potentially more important) operations indefinitely.

@pippolo84 pippolo84 requested a review from lmb April 4, 2024 13:46
@pippolo84 pippolo84 force-pushed the pr/pippolo84/iptables-reconciler branch from bf39519 to 0a920d6 Compare April 4, 2024 15:42
Copy link
Contributor

@tommyp1ckles tommyp1ckles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! - Couple comments/questions.

Also, this may be a bit deeper on sig-proxy specific stuff than I can give a review, we may want to ping someone with a bit more experience here.

pkg/datapath/iptables/reconciler.go Outdated Show resolved Hide resolved
pkg/datapath/iptables/reconciler.go Outdated Show resolved Hide resolved
pkg/datapath/iptables/iptables.go Outdated Show resolved Hide resolved
pkg/datapath/iptables/reconciler.go Show resolved Hide resolved
pkg/datapath/iptables/reconciler.go Show resolved Hide resolved
pkg/datapath/iptables/reconciler.go Outdated Show resolved Hide resolved
pkg/datapath/iptables/reconciler.go Outdated Show resolved Hide resolved
@pippolo84 pippolo84 force-pushed the pr/pippolo84/iptables-reconciler branch from 0a920d6 to 4a6c89d Compare April 10, 2024 19:08
@pippolo84
Copy link
Member Author

Also, this may be a bit deeper on sig-proxy specific stuff than I can give a review, we may want to ping someone with a bit more experience here.

Adding @jrajahalme as a reviewer 👍

Copy link
Contributor

@tommyp1ckles tommyp1ckles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good on my end!

Copy link
Member

@jrajahalme jrajahalme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor question for future enhancement :-)

pkg/datapath/iptables/reconciler.go Show resolved Hide resolved
As an attempt to make the agent sensitive to runtime changes, the
iptables manager reinstalls all the iptables rules, including the ones
related to the proxies, when the datapath is explicitly reinitialized.

This approach does not consider the updates to the external facing
network devices, that are read only once at the module startup through
the daemon configuration options.

This commit adds a reconciler to update the rules following any updates
to the external facing network devices or to the local node.  The
reconciler listens for those updates and independently rewrites all the
rules accordingly, keeping into account proxy rules and no track pod
rules too.

This reconciler is not based on the stateDB generic one but is designed
from scratch for this specific case. This is needed since the order in
which the iptables rules are installed in the chains matters, and the
stateDB generic reconciler does not care about ordering.

To migrate to this new asynchronous reconciliation approach, the manager
is reworked so that the methods exposed do not immediately reconcile the
rules, but collect the desired state and ask the underlying reconciler
to rewrite the rules accordingly.

To rate limit the calls to the iptables command, a ticker is used to
decide when to reconcile.  This also has the advantage of coalescing a
burst of changes into a single call to the iptables/ip6tables commands.
Since a rewrite operation can be pretty expensive in terms of time, this
results in an overall lower latency for those burst updates.

With this new iptables manager implementation the rules reinstallation
request from the datapath reinitilization is not needed anymore, hence
it is removed.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
A full iptables rules rewrite is pretty expensive, as all the chains
must be rewritten by the iptables/ip6tables command.

To speed up the reconciliation in case of a proxy rule update, the
reconciler first tries to synchronously update only the proxy related
rules, falling back to a full asynchronous rules reconciliation only in
case of a failure.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
As already done for the proxy rules, an incremental reconciliation is
added for the rules to skip kernel connection tracking.

This is needed to speed up reconciliation related to a change to no
track rules only.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
The current manager implementation relies on the reconciler to rewrite
the rules in case of no track pods or proxy updates.  Unlike the
previous implementation, this is carried out asynchronously.

Since all the tests in the CI expects these operations to be executed
synchronously, a notification channel is returned and closed when the
reconciliation has been completed successfully.

Despite this should not be strictly needed, waiting for the iptables
rules to be updated avoids a lot of flakes in the current e2e tests
implementation.

Being needed just for backward compatibility with the tests suite, it
may be removed in the future when the tests are rewritten to be less
unforgiven.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Manage ipv4 and ipv6 native routing CIDRs in the local node struct. This
allows the iptables reconciler to subscribe to the local node update
and:

- get the native routing CIDRs set at startup as a config option
- get the updated v4 native routing CIDRs in case of auto detection
  (when IPAM is one of CRD, ENI, Azure or Alibaba) at runtime

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Add a unit-style integration test for the reconciler. The test verifies
that the reconciler emits the correct desired state in a handful of
scenarios (devices updates, local node allocation CIDR updates, proxy
rules and so on).

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
In case of a failed iptables reconciliation, unblock the waiting
goroutines closing the channel. This is done to avoid keeping the
consumer goroutines stuck forever and potentially blocking other and
more important agent operations.

In case of an error, the reconciler retries the iptables rules update at
a later time, in order to achieve eventual consistency.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Rework the helper to check if we should install notrack rules.

Notrack rules installation can be skipped in case of IPv6 or if
install-no-conntrack-iptables-rules option is set to true (that is, if
we are already skipping CT for all pod traffic).

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Add rate limiting to errors logging in both partial (proxy and no track
rules) and full reconciliation operations.

Two separate rate limiters are used so that they will not interfere with
each other.

Also, raise the log level to error in case of a failed reconciliaton.

Finally, improve the wording of the full reconciliation error log.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
@pippolo84 pippolo84 force-pushed the pr/pippolo84/iptables-reconciler branch from 4a6c89d to b45d3db Compare April 15, 2024 16:39
@pippolo84
Copy link
Member Author

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Apr 16, 2024
@julianwiedmann julianwiedmann added this pull request to the merge queue Apr 16, 2024
Merged via the queue into cilium:main with commit 77917b6 Apr 16, 2024
62 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/iptables Impacts how Cilium interacts with iptables. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/major This PR introduces major new functionality to Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants