After gateway failover sometimes an extra 220 routing table (strongswan) will be left and break connectivity #346

Closed
mangelajo opened this issue Feb 12, 2020 · 0 comments · Fixed by #350
Assignees: mangelajo
Labels: datapath (Datapath related issues or enhancements), ocs
Milestone: v0.1.0

mangelajo commented Feb 12, 2020

After a gateway failover, a worker that was previously the master gateway can leave some entries in routing table 220 (strongswan), which take precedence over the default routing rules.

This breaks connectivity to the remote cluster from that node: traffic to the remote cluster is still sent via eth0 using the stale table 220 routes instead of via vx-submariner. As a result, the E2E tests sometimes fail.

NOTE: 10.246.224.3 is a remote cluster pod

root@cluster2-worker:/# ip r get 10.246.224.3
10.246.224.3 via 172.17.0.6 dev eth0 table 220 src 172.17.0.5 uid 0

root@cluster2-worker:/# ip r
default via 172.17.0.1 dev eth0
10.245.0.0/16 dev weave proto kernel scope link src 10.245.0.1
10.246.0.0/16 via 240.17.0.8 dev vx-submariner proto static
100.96.0.0/16 via 240.17.0.8 dev vx-submariner proto static
172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.0.5
240.0.0.0/8 dev vx-submariner proto kernel scope link src 240.17.0.5

root@cluster2-worker:/# ip r show table 220
10.246.0.0/16 via 172.17.0.6 dev eth0 proto static src 172.17.0.5
100.96.0.0/16 via 172.17.0.6 dev eth0 proto static src 172.17.0.5
172.17.0.6 via 172.17.0.6 dev eth0 proto static src 172.17.0.5
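
The precedence of table 220 over the main table comes from the policy routing rules, which can be inspected with `ip rule show`. The following is a minimal sketch, not part of Submariner: `rule_priority_for_table` is a hypothetical helper name, and it assumes the usual `ip rule show` line layout (`<priority>: from ... lookup <table>`). It also assumes strongSwan is running with its default settings, where the rule for table 220 is installed with priority 220.

```shell
#!/bin/sh
# Hypothetical helper (not part of Submariner): read `ip rule show` output
# on stdin and print the priority of the first rule that looks up the
# given table.
rule_priority_for_table() {
    # $1: table id or name as printed by `ip rule show` (e.g. 220, main)
    awk -v table="$1" '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "lookup" && $(i + 1) == table) {
                    sub(/:$/, "", $1)   # strip the trailing colon from the priority
                    print $1
                    exit
                }
            }
        }'
}

# Example (on the affected node):
#   ip rule show | rule_priority_for_table 220
#   ip rule show | rule_priority_for_table main
```

Rules with a lower priority number are evaluated first, and the main table is normally consulted at priority 32766, so any route left in table 220 wins over the vx-submariner routes in the main table.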

Flushing table 220 fixes it:

root@cluster2-worker:/# ip r flush table 220
root@cluster2-worker:/# ip r get 10.246.224.3
10.246.224.3 via 240.17.0.8 dev vx-submariner src 240.17.0.5 uid 0
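
The manual check and flush above can be sketched as a small script. This is only an illustration under stated assumptions, not the change that was merged for this issue: `route_uses_table` and `flush_if_stale` are hypothetical names, the script needs root and iproute2, and it should only be run on a node that is no longer the gateway, since the active gateway presumably still needs its table 220 routes.

```shell
#!/bin/sh
# Hypothetical helper (not part of Submariner): succeed if the given
# `ip route get` output was resolved from the given routing table.
route_uses_table() {
    # $1: output of `ip route get <dst>`, $2: table id (e.g. 220)
    case " $1 " in
        *" table $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: flush the table only if the lookup for the
# destination is currently being answered from it.
flush_if_stale() {
    # $1: an IP in the remote cluster, $2: table id
    out=$(ip route get "$1") || return 1
    if route_uses_table "$out" "$2"; then
        ip route flush table "$2"
    fi
}

# Example (run as root on the former gateway node):
#   flush_if_stale 10.246.224.3 220
```

Checking the lookup first keeps the script from touching table 220 on nodes where the remote pod is already reached via vx-submariner.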

mangelajo added the datapath (Datapath related issues or enhancements) label Feb 12, 2020
mangelajo added this to the v0.1.0 milestone Feb 12, 2020
mangelajo self-assigned this Feb 13, 2020
mangelajo added a commit to mangelajo/submariner that referenced this issue Feb 13, 2020
HA failover tests start working flawlessly after this change. Otherwise,
the node that was previously the master gateway sometimes left routes
behind in routing table 220, which conflicted with the default routes
managed by the route agent.

Fixes: submariner-io#346
mangelajo added a commit that referenced this issue Feb 13, 2020
HA failover tests start working flawlessly after this change. Otherwise,
the node that was previously the master gateway sometimes left routes
behind in routing table 220, which conflicted with the default routes
managed by the route agent.

Fixes: #346
deanlorenz pushed a commit to deanlorenz/submariner that referenced this issue Feb 16, 2020
HA failover tests start working flawlessly after this change. Otherwise,
the node that was previously the master gateway sometimes left routes
behind in routing table 220, which conflicted with the default routes
managed by the route agent.

Fixes: submariner-io#346
deanlorenz pushed a commit to deanlorenz/submariner that referenced this issue Feb 20, 2020
HA failover tests start working flawlessly after this change. Otherwise,
the node that was previously the master gateway sometimes left routes
behind in routing table 220, which conflicted with the default routes
managed by the route agent.

Fixes: submariner-io#346
mangelajo added the ocs label Feb 24, 2020