node with wrong ipv6 route table after ipv6 vip failover #8381

ltgentoo · 2023-12-29T15:27:38Z

We run k8s with dual stack enabled and use HAProxy and Keepalived for HA. After a VIP failover, the node that previously held the VIP gets a wrong IPv6 route table.

Expected Behavior

VIP address: 2001::201 on 2001::21
We have a test cluster with 3 nodes:
2001::21
2001::22
2001::23
Calico mode: IPIP CrossSubnet
Before failover, the IPv6 route table is:

2001::23:
2000:100:100:100:19ca:52ab:2617:eac0/122 2001::22 UG 1024 1 0 ens33
2000:100:100:100:891c:ddc:b181:4840/122 2001::21 UG 1024 1 0 ens33
2001::22:
2000:100:100:100:891c:ddc:b181:4840/122 2001::21 UG 1024 1 0 ens33
2000:100:100:100:97a2:de77:c193:200/122 2001::23 UG 1024 2 0 ens33
2001::21:
2000:100:100:100:19ca:52ab:2617:eac0/122 2001::22 UG 1024 1 0 ens33
2000:100:100:100:97a2:de77:c193:200/122 2001::23 UG 1024 1 0 ens33
Before the VIP failover, everything works fine.
After the VIP fails over, the IPv6 routes should not change.

Current Behavior

After failover, the VIP address 2001::201 is on 2001::22, and the IPv6 route tables are:

2001::22:
2000:100:100:100:891c:ddc:b181:4840/122 2001::21 UG 1024 2 0 ens33
2000:100:100:100:97a2:de77:c193:200/122 2001::23 UG 1024 3 0 ens33
2001::23:
2000:100:100:100:19ca:52ab:2617:eac0/122 2001::22 UG 1024 2 0 ens33
2000:100:100:100:891c:ddc:b181:4840/122 2001::21 UG 1024 2 0 ens33
2001::21:
2000:100:100:100:19ca:52ab:2617:eac0/122 2001::2839:3654:bcd8:88c3 UG 1024 2 0 ens33
2000:100:100:100:97a2:de77:c193:200/122 2001::2839:3654:bcd8:88c3 UG 1024 1 0 ens33

The IPv6 routes on node 2001::21 changed. 2001::2839:3654:bcd8:88c3 is our default IPv6 gateway; I don't know why the routes now point to it.
Of course, with the wrong IPv6 routes, pods on other nodes can't be reached from 2001::21.
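The check described above (pod-CIDR routes whose next hop is not one of the cluster nodes) can be sketched as a small filter. This is only an illustration: the function name `find_bad_routes` is hypothetical, and it assumes the "destination next-hop flags metric ref use iface" column layout shown in the tables above.

```shell
# Print pod-CIDR routes whose next hop is not a known node address.
# stdin: route lines laid out as "destination next-hop flags metric ref use iface".
# $1: pod prefix to match at the start of the destination; remaining args: node addresses.
find_bad_routes() {
  prefix="$1"; shift
  awk -v prefix="$prefix" -v nodes="$*" '
    BEGIN { n = split(nodes, a, " "); for (i = 1; i <= n; i++) ok[a[i]] = 1 }
    # Keep lines whose destination starts with the prefix and whose next hop is unknown.
    index($1, prefix) == 1 && !($2 in ok) { print $1, "via", $2 }
  '
}
```

On a node it might be fed from the route listing, for example `route -n -A inet6 | find_bad_routes 2000:100:100:100: 2001::21 2001::22 2001::23`; any line printed is a pod route that no longer points at a peer node.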

Possible Solution

We changed the calico-node env IP6_AUTODETECTION_METHOD to kubernetes-internal-ip and to interface=ens33.
With either setting, the result is the same.
We checked the output of calicoctl get nodes -oyaml, and bgp.ipv6Address is correct.
If we delete the calico-node pod on 2001::21, the IPv6 route table becomes correct again.
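The pod-deletion workaround could be scripted roughly as follows. This is a sketch under assumptions not stated in the report: a manifest-based install where calico-node runs in `kube-system` with the label `k8s-app=calico-node` (operator installs typically use `calico-system`). The function name `restart_calico_node` and the `KUBECTL`/`CALICO_NS` variables are made up for illustration.

```shell
# Assumed defaults; override CALICO_NS for operator-based installs.
KUBECTL="${KUBECTL:-kubectl}"
CALICO_NS="${CALICO_NS:-kube-system}"

# Delete the calico-node pod running on the given Kubernetes node ($1).
# The DaemonSet recreates the pod, which is what restored the routes above.
restart_calico_node() {
  "$KUBECTL" delete pod -n "$CALICO_NS" \
    -l k8s-app=calico-node \
    --field-selector "spec.nodeName=$1"
}
```

The argument is the Kubernetes node name as shown by `kubectl get nodes`, not the node's IPv6 address. This only works around the symptom; it does not explain why the routes changed.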

Steps to Reproduce (for bugs)

Deploy a k8s cluster with dual stack enabled.
Use Keepalived + HAProxy for HA.
Stop HAProxy manually to trigger a VIP failover.
Check the IPv6 routes.

Context

Your Environment

Calico version: 3.23.1, 3.26, 3.27
Orchestrator version (e.g. kubernetes, mesos, rkt): k8s 1.23.6
Operating System and version: CentOS 7
Link to your project (optional):


mazdakn · 2024-01-09T17:45:48Z

@ltgentoo have you followed our docs for this setup? We don't use HAProxy/Keepalived for high availability. In this failover case, Calico is not aware of the change. This is controlled by keepalived, and the routes are added by it.

ltgentoo · 2024-01-12T06:38:54Z

Thanks for your reply. We use HAProxy/Keepalived for apiserver HA. I know Calico doesn't need HAProxy/Keepalived; maybe there are some conflicts between them. We are trying to solve the problem.

I would like to add some information.
With this config, IP6_AUTODETECTION_METHOD: kubernetes-internal-ip, the IPv6 routes are OK after an IPv6 VIP failover in VXLAN mode (calico_backend: vxlan).
But with calico_backend: bird (BGP mode), the routes are wrong after the VIP failover.

My confusion is this:
It looks like the problem is that when the VIP is deleted from the interface, the Calico route is lost, and then a route via the default IPv6 gateway is added.
I don't even know whether the problem is in Keepalived, Felix, or BIRD.
Do you have any suggestions?

ltgentoo · 2024-01-12T09:29:30Z

I would like to add more information.
We found that this is not related to Keepalived.

We stopped Keepalived/HAProxy and used this Calico config:

IP6_AUTODETECTION_METHOD: kubernetes-internal-ip
calico_backend: bird
CALICO_IPV6POOL_VXLAN: Never

  - name: CALICO_IPV4POOL_IPIP
    value: Never
  - name: CALICO_IPV6POOL_IPIP
    value: Never
  - name: CALICO_IPV4POOL_VXLAN
    value: Never
  - name: CALICO_IPV6POOL_VXLAN
    value: Never

We manually add an IPv6 address to ens33, then delete it.
Routes for the pod CIDR before the IPv6 address is deleted:

ip addr add 2001::201/64 dev ens33
ip addr del 2001::201/64 dev ens33

After we delete this address, the route table for the pod CIDR is:

It seems that deleting an IP address leads to incorrect IPv6 routes.
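To make the before/after comparison in this reproduction easier to read, the two route listings can be saved to files and reduced to only the lines that differ. A minimal sketch; the helper name `route_changes` is hypothetical and it works on any two text snapshots.

```shell
# Show only the route lines that differ between two saved snapshots.
# Lines prefixed "- " exist only in the first file, "+ " only in the second.
# Note: writes temporary "<file>.sorted" copies next to the inputs.
route_changes() {
  before="$1"; after="$2"
  # comm requires sorted input.
  sort "$before" > "$before.sorted"
  sort "$after"  > "$after.sorted"
  comm -23 "$before.sorted" "$after.sorted" | sed 's/^/- /'
  comm -13 "$before.sorted" "$after.sorted" | sed 's/^/+ /'
  rm -f "$before.sorted" "$after.sorted"
}
```

For instance, save `ip -6 route show` to `before.txt`, run the `ip addr add` and `ip addr del` commands above, save `ip -6 route show` again to `after.txt`, then run `route_changes before.txt after.txt`.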

im-jinxinwang · 2024-01-12T09:50:51Z

Hi @mazdakn
I have encountered the same problem and can run a failover test on a Kubernetes cluster with dual stack enabled.

im-jinxinwang · 2024-01-15T08:14:53Z

Hi @mazdakn
I found from the log that Felix updated the routing gateway address multiple times. This is not correct, because 2001::2839:3654:bcd8:88c3 is the default gateway address for the host.

node1.txt

mazdakn · 2024-01-23T18:10:13Z

@fasaxc can you please comment on this? It seems we ignore non local routes here:

calico/felix/ifacemonitor/update_filter.go

Line 116 in 126ddce

if !routeIsLocalUnicast(routeUpd.Route) {

but routes for virtual addresses are not local. WDYT?

im-jinxinwang · 2024-01-24T01:06:30Z

@mazdakn The newly added IPv6 address theoretically does not belong to a local route, so why does it cause changes in Calico routing?

im-jinxinwang · 2024-01-24T01:09:17Z

@mazdakn The normal logic is that Calico uses the node's IPv6 address as the gateway address for the pod IPv6 network segment, but the behavior here is abnormal.

fasaxc · 2024-01-24T11:13:18Z

Please can you add the output from these commands:

ip addr show
ip -6 route show

I'm not sure that route shows all the information that we need. If you don't have ip route installed, you can exec it in the calico-node pod.
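If ip is not available on the host, the two commands above could be run through kubectl exec. A rough sketch, assuming a manifest-based install where the pod lives in `kube-system` and the container is named `calico-node`; the function name `collect_from_calico_node` and the `KUBECTL`/`CALICO_NS` variables are illustrative only.

```shell
# Assumed defaults; override CALICO_NS for operator-based installs.
KUBECTL="${KUBECTL:-kubectl}"
CALICO_NS="${CALICO_NS:-kube-system}"

# Run the two requested commands inside the calico-node pod named in $1.
collect_from_calico_node() {
  "$KUBECTL" exec -n "$CALICO_NS" "$1" -c calico-node -- ip addr show
  "$KUBECTL" exec -n "$CALICO_NS" "$1" -c calico-node -- ip -6 route show
}
```

The pod name for the affected node can be looked up with `kubectl get pods -n kube-system -o wide`. Because calico-node uses the host network, the output should reflect the node's own addresses and routes.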

Note that IPIP is not an option for IPv6. The options are to

Form a mesh over a permissive L2 fabric (i.e. all nodes in one L2 broadcast domain)
Peer with your routers
Use VXLANv6 (added in Calico v3.23).

The first two options use BIRD to distribute routes. At a guess, BIRD is picking up the extra IP address and concluding that it is not in the same subnet as the other nodes so it routes via the default gateway instead. I'm not sure why BIRD would be preferring that IP, hopefully the above output will shed some light.

With VXLAN, I think we explicitly use the autodetected IP so that might work here.

ltgentoo · 2024-02-02T07:51:31Z

@fasaxc
OK, let's just talk about BGP, with IPIP not included.
before vip failover:

after vip failover:

abasitt · 2024-05-17T09:15:37Z

possibly related to #8739

lwr20 · 2024-06-18T16:30:35Z

With VXLAN, I think we explicitly use the autodetected IP so that might work here.

@ltgentoo did you get a chance to try VXLANv6?

coutinhop · 2024-09-24T17:08:20Z

@ltgentoo any updates on trying VXLANv6?

mazdakn added the kind/support label Jan 9, 2024
