Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete IPv6 route and neigh to peer gateway if necessary #3490

Merged
merged 1 commit into from Mar 25, 2022

Conversation

xliuxu
Copy link
Contributor

@xliuxu xliuxu commented Mar 21, 2022

Check whether the route and neigh to peer gateway are still needed
and delete the route and neigh if necessary. This can happen when
the traffic encryption mode or traffic encapsulation mode changes.

Fixes #3437

Signed-off-by: Xu Liu xliu2@vmware.com

@xliuxu xliuxu force-pushed the clean_gw_routes_for_wireguard branch 2 times, most recently from 047d9df to 48cf995 Compare March 21, 2022 01:40
@xliuxu
Copy link
Contributor Author

xliuxu commented Mar 21, 2022

/test-ipv6-only-e2e

@codecov-commenter
Copy link

codecov-commenter commented Mar 21, 2022

Codecov Report

Merging #3490 (5973624) into main (3c2bfe0) will decrease coverage by 10.19%.
The diff coverage is 46.51%.

Impacted file tree graph

@@             Coverage Diff             @@
##             main    #3490       +/-   ##
===========================================
- Coverage   65.47%   55.28%   -10.20%     
===========================================
  Files         268      384      +116     
  Lines       26909    52707    +25798     
===========================================
+ Hits        17620    29137    +11517     
- Misses       7366    21096    +13730     
- Partials     1923     2474      +551     
Flag Coverage Δ
e2e-tests 53.50% <39.53%> (?)
integration-tests 35.77% <ø> (?)
kind-e2e-tests 55.20% <51.72%> (-0.45%) ⬇️
unit-tests 43.38% <ø> (+0.70%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/agent/route/route_linux.go 44.33% <46.51%> (-5.60%) ⬇️
pkg/agent/cniserver/pod_configuration_linux.go 26.31% <0.00%> (-40.36%) ⬇️
pkg/controller/ipam/antrea_ipam_controller.go 48.71% <0.00%> (-31.57%) ⬇️
pkg/controller/networkpolicy/endpoint_querier.go 61.46% <0.00%> (-29.97%) ⬇️
pkg/controller/egress/controller.go 62.19% <0.00%> (-26.26%) ⬇️
pkg/controller/ipam/validate.go 57.95% <0.00%> (-24.31%) ⬇️
pkg/agent/cniserver/ipam/antrea_ipam_controller.go 56.66% <0.00%> (-22.86%) ⬇️
pkg/apiserver/handlers/endpoint/handler.go 47.82% <0.00%> (-22.77%) ⬇️
pkg/agent/util/iptables/lock.go 60.00% <0.00%> (-21.82%) ⬇️
pkg/controller/externalippool/validate.go 55.17% <0.00%> (-21.02%) ⬇️
... and 357 more

@@ -726,7 +726,9 @@ func (c *Client) Reconcile(podCIDRs []string, svcIPs map[string]bool) error {
}
// IPv6 doesn't support "on-link" route, routes to the peer IPv6 gateways need to
// be added separately. So don't delete such routes.
if desiredIPv6GWs.Has(route.Dst.IP.String()) {
// If WireGuard is enabled, the route to peer IPv6 gateway should be deleted since
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When it starts, if there is an IPv6 "onlink" route, it means it was normal mode and the remote Node still exists. I think it's more proper to "replace" the routes together when we call InstallNodeRoute for this Node, instead of removing "onlink" route here and the other route in InstallNodeRoute.
The issue is essentially we thought route.Replace can achieve route removal and creation, but it didn't work for wireguard case as the latter has only 1 route.
I think we could probably first check if the normal case really needs two routes. If yes, we should fix how "replace" is implemented. If no, we should just remove the "onlink" route.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some digging, I found that kernel > 4.16 supports adding IPv6 routes with the onlink flag as IPv4 does. Do you think we can remove the routes for the peer gateway and add routes to peer CIDRs with the onlink flag directly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current kernel version requirement is >4.6. I think it's not worth to bump up the requirement to 4.16 just for this, given that it's easy to resolve with existing solution. Besides, Ubuntu 18.04 (<18.04.2) ships with 4.15, which might still be used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @tnqn. I think we can remove routes for peer gateway if WireGuard is enabled in AddRoutes or create a gateway route for WireGuard to replace the one in normal mode. I am not sure which one is better. I initially thought that AddRoutes is responsible for adding or replacing routes. But we did need to delete the route for WireGuard.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not only WireGuard, there is also an issue when changing from encap to noEncap mode (in same subnet case).
I think we shouldn't add an unneeded route for this bug, and it doesn't fix encap -> noEncap (in different subnet case) anyway.
I feel the proper fix is to clean up possible stable routes installed by other modes when it installs routes for one mode. For example:

routeToNodeGwIPReplaced := false
routeToPodCIDRReplaced := false
for _, route := range routes {
	if err := netlink.RouteReplace(route); err != nil {
		return fmt.Errorf("failed to install route to peer %s (%s) with netlink. Route config: %s. Error: %v", nodeName, nodeIP, route.String(), err)
	}
	if route.Dst == podCIDR {
		routeToPodCIDRReplaced = true
	} else if route.Dst == ... {
		routeToNodeGwIPReplaced = true
	}
}
if !routeToNodeGwIPReplaced {
	if err := netlink.RouteDel(...); err != nil {
		...
	}
}
if !routeToPodCIDRReplaced {
	if err := netlink.RouteDel(...); err != nil {
		...
	}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the code per discussion. But I did not remove the route to peer Pod CIDR when the route to peer Pod CIDR was not replaced. I am not confident about removing the route by just setting the Pod CIDR as the only parameter. Will it delete the route added manually in noEncap mode plus different subnet cases?

@tnqn tnqn added this to the Antrea v1.6 release milestone Mar 21, 2022
@xliuxu xliuxu force-pushed the clean_gw_routes_for_wireguard branch from 48cf995 to dd00ccd Compare March 21, 2022 15:02
@xliuxu xliuxu changed the title Delete IPv6 routes to peer gateway if WireGuard is enabled Delete IPv6 routes to peer gateway if necessary Mar 21, 2022
}

if podCIDR.IP.To4() == nil {
if !routeToNodeGwIPv6Replaced && utilnet.IsIPv6(nodeGwIP) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not only routeToNodeGwIPNetv6, we also need to remove routeToPodCIDRReplaced when it's NoEncap mode in different subnet. And second thought about how to identify stale routes:

staleRouteDsts := sets.NewString(podCIDR.String())
if utilnet.IsIPv6(nodeGwIP) {
	staleRouteDsts.Insert(&net.IPNet{IP: nodeGwIP, Mask: net.CIDRMask(128, 128).String())
}
if c.networkConfig.TrafficEncryptionMode == config.TrafficEncryptionModeWireGuard {
	route.LinkIndex = c.nodeConfig.WireGuardConfig.LinkIndex
	route.Scope = netlink.SCOPE_LINK
	if podCIDR.IP.To4() != nil {
		route.Src = c.nodeConfig.GatewayConfig.IPv4
	} else {
		route.Src = c.nodeConfig.GatewayConfig.IPv6
	}
	staleRouteDsts.Delete(podCIDR.String())
} else if ... {
...
}
...
for dst := range staleRouteDsts {
        if err := netlink.RouteDel(&netlink.Route{Dst: dst}); err != nil {
	...
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unsure about the CIDR route for Pods in NoEncap mode & different subnets. Do you think we may delete the route added by users or other CNIs mistakenly if we only use the Pod CIDR as dst IP to identify a route?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I think you are right. antrea can work with kube-router, which may install route for PodCIDR for noEncap case.

routeToNodeGwIPNetv6 := &netlink.Route{
Dst: nodeGwIPNetv6,
}
if err := netlink.RouteDel(routeToNodeGwIPNetv6); err != nil && err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

err != nil && err == nil ?

pkg/agent/route/route_linux.go Outdated Show resolved Hide resolved
@xliuxu xliuxu force-pushed the clean_gw_routes_for_wireguard branch from dd00ccd to 0d124bf Compare March 22, 2022 01:23
}

if podCIDR.IP.To4() == nil {
if !routeToNodeGwIPv6Replaced && utilnet.IsIPv6(nodeGwIP) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I think you are right. antrea can work with kube-router, which may install route for PodCIDR for noEncap case.

nodeGwIPNetv6, nodeName, nodeIP, err)
}
}
if routeToPodCIDRReplaced && podCIDR.IP.To4() == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect the original code is wrong, because it seems the neighbor entry is only needed in encap case.
If you could verify and fix it in the PR it would be great. Or we could keep its current behavior for now, but the code could be simpler (sorry for the second thought):

requireNodeGwIPv6RouteAndNeigh := false

if c.networkConfig.TrafficEncryptionMode == config.TrafficEncryptionModeWireGuard {
...
} else if c.networkConfig.NeedsTunnelToPeer(nodeIP, nodeTransportIPAddr) {
	if podCIDR.IP.To4() == nil {
                requireNodeGwIPv6RouteAndNeigh = true
		// "on-link" is not identified in IPv6 route entries, so split the configuration into 2 entries.
		routes = []*netlink.Route{
			{
				Dst:       &net.IPNet{IP: nodeGwIP, Mask: net.IPMask{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}},
				LinkIndex: c.nodeConfig.GatewayConfig.LinkIndex,
			},
		}
	} else {
        ...
        }
}
if !requireNodeGwIPv6RouteAndNeigh && utilnet.IsIPv6(nodeGwIP) {
        err := netlink.RouteDel(...)
	...
}

if requireNodeGwIPv6RouteAndNeigh && utilnet.IsIPv6(nodeGwIP) {
	// Add IPv6 neighbor if the given podCIDR is using IPv6 address.
	neigh := &netlink.Neigh{...}
...

@tnqn
Copy link
Member

tnqn commented Mar 24, 2022

/test-ipv6-only-e2e
The test should fail if my comment in #3490 (comment) is correct.

@tnqn
Copy link
Member

tnqn commented Mar 25, 2022

/test-ipv6-only-e2e

@xliuxu xliuxu force-pushed the clean_gw_routes_for_wireguard branch from 0d124bf to 14ffad8 Compare March 25, 2022 01:02
@xliuxu xliuxu changed the title Delete IPv6 routes to peer gateway if necessary Delete IPv6 route and neigh to peer gateway if necessary Mar 25, 2022
@xliuxu xliuxu force-pushed the clean_gw_routes_for_wireguard branch from 14ffad8 to dbfe802 Compare March 25, 2022 01:14
@xliuxu
Copy link
Contributor Author

xliuxu commented Mar 25, 2022

/test-ipv6-only-e2e

@xliuxu xliuxu force-pushed the clean_gw_routes_for_wireguard branch from dbfe802 to 11ab140 Compare March 25, 2022 01:15
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's see if test passes

Dst: &net.IPNet{IP: nodeGwIP, Mask: net.CIDRMask(128, 128)},
}
if err := netlink.RouteDel(routeToNodeGwIPNetv6); err == nil {
klog.InfoS("Deleted route to peer gateway", "node", nodeName, "ip", nodeIP)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
klog.InfoS("Deleted route to peer gateway", "node", nodeName, "ip", nodeIP)
klog.InfoS("Deleted route to peer gateway", "node", nodeName, "nodeIP", nodeIP, "nodeGatewayIP", nodeGwIP)

as this route is mainly related to nodeGatewayIP.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

IP: nodeGwIP,
}
if err := netlink.NeighDel(neigh); err == nil {
klog.InfoS("Deleted neigh to peer gateway", "node", nodeName, "ip", nodeIP)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. thanks!

Check whether the route and neigh to peer gateway are still needed
and delete the route and neigh if necessary. This can happen when
the traffic encryption mode or traffic encapsulation mode changes.

Fixes antrea-io#3437

Signed-off-by: Xu Liu <xliu2@vmware.com>
@xliuxu xliuxu force-pushed the clean_gw_routes_for_wireguard branch from 11ab140 to 5973624 Compare March 25, 2022 04:12
@xliuxu
Copy link
Contributor Author

xliuxu commented Mar 25, 2022

/test-e2e
/test-ipv6-e2e
/test-ipv6-only-e2e

Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tnqn
Copy link
Member

tnqn commented Mar 25, 2022

/test-integration
/test-conformance
/test-networkpolicy

@tnqn tnqn merged commit 7facff8 into antrea-io:main Mar 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TestWireGuard/testServiceConnectivity failing in IPv6 cluster
3 participants