Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v1.15] route: Specify "proto kernel" for ip routes and rules #31777

Merged
merged 2 commits into from Apr 9, 2024

Conversation

jschwinger233
Copy link
Member

@jschwinger233 jschwinger233 commented Apr 4, 2024

This is a customized backport to take care of upgrade/downgrade. Please see commit message for details.

@maintainer-s-little-helper maintainer-s-little-helper bot added backport/1.15 This PR represents a backport for Cilium 1.15.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master. labels Apr 4, 2024
@jschwinger233
Copy link
Member Author

/ci-ipsec-upgrade

@jschwinger233 jschwinger233 force-pushed the gray/1.15/route-fix branch 2 times, most recently from 8c1da7e to 144e080 Compare April 5, 2024 05:19
@jschwinger233 jschwinger233 changed the title Gray/1.15/route fix [v1.15] route: Specify "proto kernel" for ip routes and rules Apr 5, 2024
@jschwinger233
Copy link
Member Author

/test-backport-1.15

Comment on lines 211 to 237
// removeStaleProxyRulesIPv6 removes stale proxy rules. This is a v1.15 only function.
func removeStaleProxyRulesIPv6() error {
return removeProtoUnspecRules(netlink.FAMILY_V6)
}

// removeProtoUnspecRules removes all routing rules with protocol RTPROT_UNSPEC. This is a v1.15 only function.
func removeProtoUnspecRules(family int) error {
rules, err := route.ListRules(family, nil)
if err != nil {
return fmt.Errorf("listing routing rules: %w", err)
}
for _, r := range rules {
if r.Protocol == unix.RTPROT_UNSPEC {
if err := netlink.RuleDel(&r); err != nil {
if !errors.Is(err, syscall.ENOENT) && !errors.Is(err, syscall.EAFNOSUPPORT) {
return fmt.Errorf("removing proto unspec routing rule: %w", err)
}
}
}
}
return nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it help for review to split the custom cleanup logic into a separate patch (and make the first patch a clean cherry-pick from upstream) ? Or is it crucial for bisectability that all these changes go in as one patch?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems no harm to split into two patches. 1.15-old -> 1.15-tip upgrade is the only one that could be affected in terms of git bisect, but I don't see too many risks.

@jschwinger233 jschwinger233 force-pushed the gray/1.15/route-fix branch 3 times, most recently from 0d13ff3 to 9ca1a9a Compare April 5, 2024 07:22
@jschwinger233
Copy link
Member Author

/test-backport-1.15

pkg/proxy/routes.go Outdated Show resolved Hide resolved
[ upstream commit: 318a648 ]

v1.14 installs routes and rules with specific "proto kernel", v1.15
missed them. Without "proto kernel", it causes troubles when downgrade
from v1.15 to v1.14, as v1.14 is deleting routes with "proto kernel" but
there are no matching ones.

Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
This commit adds "removeStaleProxyRulesIPvX()" which removes any ip
rules with "proto unspec" to ensure upgrade/downgrade goes smoothly.

Scenario 1: upgrade from v1.15-old to v1.15-tip
  v1.15-old cilium installs ip rules with "proto unspec", then
  v1.15-tip will install "duplicate" ip rules with "proto kernel".
  This is the moment when "removeStaleProxyRulesIPvX()" plays a role,
  it cleans those "proto unspec" stale rules without breaking
  connectivity.

Scenario 2: downgrade from v1.15-tip to v1.15-old
  v1.15-tip has rules with "proto kernel". When v1.15-old tries to
  "ReplaceRule()" with "proto unspec", thanks to
  [this](https://github.com/cilium/cilium/blob/v1.15.3/pkg/datapath/linux/route/route_linux.go#L402),
  "ReplaceRule()" won't replace the rules because they already exist
  (with a different proto). This ensures connectivity can survive
  the downgrade too.

Scenario 3: upgrade from v1.15-tip to v1.16
  Since v1.15-tip installs correct rules with "proto kernel", v1.16
  will do nothing after confirming existance by "lookupRule()". It
  should be painless as well.

This is a v1.15-only commit because:
1. v1.14 is still using bpf/init.sh which sets rules with "proto kernel"
   properly;
2. v1.16 has been fixed to set "proto kernel";
3. v1.15-tip -> v1.16 upgrade has been discussed above without any
   issue;

Also please note that we don't have to clean up leftover ip routes with
"proto unspec", because we replace them via "route.Upsert()" which
replaces the old ones unconditionally, leaving no stale routes.

Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>
@jschwinger233
Copy link
Member Author

/test-backport-1.15

@jschwinger233 jschwinger233 marked this pull request as ready for review April 9, 2024 14:00
@jschwinger233 jschwinger233 requested a review from a team as a code owner April 9, 2024 14:00
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Apr 9, 2024
@lmb lmb merged commit a3fd8ce into cilium:v1.15 Apr 9, 2024
59 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.15 This PR represents a backport for Cilium 1.15.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants