ipam: fix inconsistent behavior for same-prefix routes between IPv4 and IPv6#1261

Open
yushoyamaguchi wants to merge 1 commit into containernetworking:main from yushoyamaguchi:multiple_routes_same_prefix1

yushoyamaguchi commented May 24, 2026

Overview

When multiple routes share the same prefix, IPv4 inserts them as separate entries, while for IPv6 the kernel implicitly merges them into a single ECMP route.
As a result, the same config behaves differently depending on the address family.
This patch absorbs that kernel behavior difference on the plugin side so that ECMP behavior is consistent regardless of address family.

Related issue/PR

#615
#1249

Current behavior

IPv4

Given the following routes in the CNI config JSON:

      "routes": [
        { "dst": "0.0.0.0/0", "gw": "10.0.0.1" },
        { "dst": "0.0.0.0/0", "gw": "10.0.0.2" }
      ]

The pod's FIB is then:

default via 10.0.0.1 dev ens3 
default via 10.0.0.2 dev ens3 

IPv6

Given the following routes in the CNI config JSON:

      "routes": [
        { "dst": "::/0", "gw": "2001:db8::1" },
        { "dst": "::/0", "gw": "2001:db8::2" }
      ]

The pod's FIB is then:

default metric 1024 pref medium
        nexthop via 2001:db8::1 dev ens3 weight 1 
        nexthop via 2001:db8::2 dev ens3 weight 1 
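
The divergence shown above only arises when a config lists the same destination prefix more than once. As a rough illustration, the following sketch scans a config for such prefixes. It is not part of this PR: the `routeEntry` and `ipamConfig` types and the `duplicatePrefixes` helper are hypothetical names, and it assumes `routes` sits at the top level of the JSON passed in.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/netip"
)

// routeEntry and ipamConfig are hypothetical, simplified types for this sketch.
type routeEntry struct {
	Dst string `json:"dst"`
	GW  string `json:"gw"`
}

type ipamConfig struct {
	Routes []routeEntry `json:"routes"`
}

// duplicatePrefixes returns, in first-seen order, every destination prefix
// that appears in more than one route entry.
func duplicatePrefixes(raw []byte) ([]netip.Prefix, error) {
	var cfg ipamConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	count := make(map[netip.Prefix]int)
	var dups []netip.Prefix
	for _, r := range cfg.Routes {
		p, err := netip.ParsePrefix(r.Dst)
		if err != nil {
			return nil, fmt.Errorf("invalid dst %q: %w", r.Dst, err)
		}
		p = p.Masked() // normalize so equal prefixes compare equal
		count[p]++
		if count[p] == 2 { // record each duplicated prefix only once
			dups = append(dups, p)
		}
	}
	return dups, nil
}

func main() {
	raw := []byte(`{"routes": [
		{ "dst": "0.0.0.0/0", "gw": "10.0.0.1" },
		{ "dst": "0.0.0.0/0", "gw": "10.0.0.2" },
		{ "dst": "::/0", "gw": "2001:db8::1" },
		{ "dst": "::/0", "gw": "2001:db8::2" }
	]}`)
	dups, err := duplicatePrefixes(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range dups {
		family := "IPv4"
		if p.Addr().Is6() {
			family = "IPv6"
		}
		fmt.Printf("%s prefix %s has multiple gateways\n", family, p)
	}
}
```

A check like this could help operators find configs whose resulting FIB currently depends on the address family.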

Implementation

Group routes with the same destination prefix, table, scope, and priority, then apply them as a single ECMP route via RouteAddEcmp with MultiPath.

Previously, routes were added one by one without grouping. When
multiple routes shared the same prefix, IPv4 inserted them as
separate entries while IPv6 kernel behavior implicitly merged
them into ECMP. This caused the same config to behave differently
depending on the address family.

To fix the inconsistency, group same-prefix routes in the plugin
and apply them as a single ECMP route, aligning IPv4 behavior
with IPv6.

Signed-off-by: Yusho Yamaguchi <ys-yamaguchi@kddi.com>
@yushoyamaguchi

cc @s1061123

yushoyamaguchi commented May 24, 2026

This inconsistency has been causing confusion when operating our production environment, so I am submitting this as a bug fix rather than a feature change.

@s1061123 s1061123 left a comment

/hold

As mentioned in #1249, this change completely alters IPAM behavior and introduces a regression (i.e. this bug fix causes bigger bugs): a currently valid configuration would be deprecated. In addition, the CNI spec does not specify how the CNI result object should represent an ECMP path.

It should be held until we conclude the design discussion in #1249.


2 participants