Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.9 backport: node: Remove check whether nextHop is in same L2 #14453

Merged
merged 2 commits into from
Jan 6, 2021

Conversation

brb
Copy link
Member

@brb brb commented Dec 18, 2020

This is a backport of #14354 for v1.9.

node: Remove check whether nextHop is in same L2

@brb brb added kind/backports This PR provides functionality previously merged into master. backport/1.9 labels Dec 18, 2020
@brb brb requested a review from a team as a code owner December 18, 2020 17:06
@brb
Copy link
Member Author

brb commented Dec 18, 2020

test-backport-1.9

pkg/datapath/linux/node.go Outdated Show resolved Hide resolved
pkg/datapath/linux/node.go Outdated Show resolved Hide resolved
@brb brb requested a review from aanm December 21, 2020 18:07
@brb brb force-pushed the pr/brb/v1.9-rm-arping-l2-addr-check branch from 7595d5d to fac4b87 Compare December 21, 2020 18:07
pkg/datapath/linux/node.go Outdated Show resolved Hide resolved
@brb brb requested a review from aanm January 4, 2021 08:36
It was reported that on a GKE cluster the neigh handling logged the
following error:

    level=error msg="IP is not L2 reachable" error="iface: 'eth0' can't
    reach ip: '10.164.0.1'" interface=eth0 ipAddr=10.164.0.1
    subsys=linux-datapath

The "ip route" and "ip addr" had the following relevant entries:

    default via 10.164.0.1 dev eth0 proto dhcp metric 1024
    10.164.0.1 dev eth0 proto dhcp scope link metric 1024

    eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc mq state UP
    group default qlen 1000 link/ether 42:01:0a:a4:00:4f brd ff:ff:ff:ff:ff:ff
    inet 10.164.0.79/32 scope global dynamic eth0

As we can see, eth0 had IP addr assigned from /32 subnet which made the
check for nexthop in the same L2 to fail regardless of the "10.164.0.1
dev eth0" route.

This commit removes the check, as nexthop is guaranteed to be L2-reachable
by the Linux kernel routing system.

Reported-by: Joe Stringer <joe@cilium.io>
Reported-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
@brb brb force-pushed the pr/brb/v1.9-rm-arping-l2-addr-check branch from fac4b87 to a2dd505 Compare January 4, 2021 08:38
The updated version contains the following change:

    arping: Make PingOverIface* to accept source IP addr

    Instead of trying to derive the src IP with FindIPInNetworkFromIface(),
    accept src IP addr as a param. The latter function is broken for the
    case when the iface has IP addr assigned from the /32 subnet and there
    exists an route to the gw via the iface (GKE).

Also, use the source IP addr for arping from the nexthop route.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
@brb
Copy link
Member Author

brb commented Jan 4, 2021

test-backport-1.9

@aanm aanm removed their assignment Jan 4, 2021
@brb
Copy link
Member Author

brb commented Jan 5, 2021

retest-net-next

@brb
Copy link
Member Author

brb commented Jan 5, 2021

retest-runtime

@brb
Copy link
Member Author

brb commented Jan 6, 2021

CI net-next failed with:

/home/jenkins/workspace/Cilium-PR-K8s-1.13-net-next/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:514
Can not connect to service "http://192.168.36.11:31362" from outside cluster
Expected command: kubectl exec -n kube-system log-gatherer-784hz -- curl --path-as-is -s -D /dev/stderr --fail --connect-timeout 5 --max-time 20 http://192.168.36.11:31362 -w "time-> DNS: '%{time_namelookup}(%{remote_ip})', Connect: '%{time_connect}',Transfer '%{time_starttransfer}', total '%{time_total}'" 
To succeed, but it failed:
Exitcode: 1 
Err: exit status 1
Stdout:
 	 
Stderr:
 	 error: unable to upgrade connection: Authorization error (user=kube-apiserver-kubelet-client, verb=create, resource=nodes, subresource=proxy)

@brb
Copy link
Member Author

brb commented Jan 6, 2021

retest-net-next

@brb brb added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jan 6, 2021
Copy link
Member

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two backported commits are missing references to the upstream commits. Those references are automatically added when using our cherry-pick script.

@brb If you want to fix that, I can merge without rerunning the whole tests after the update. Otherwise, I'll merge before end of day.

@brb
Copy link
Member Author

brb commented Jan 6, 2021

The two backported commits are missing references to the upstream commits. Those references are automatically added when using our cherry-pick script.

There are no upstream commits. The commits in this PR are only for v1.9.

@pchaigno
Copy link
Member

pchaigno commented Jan 6, 2021

The two backported commits are missing references to the upstream commits. Those references are automatically added when using our cherry-pick script.

There are no upstream commits. The commits in this PR are only for v1.9.

Oh, I missed that the "upstream" PR was closed and I had completely forgot our discussion. Merging both backport PRs 🚢

@pchaigno pchaigno merged commit c696ae4 into v1.9 Jan 6, 2021
@pchaigno pchaigno deleted the pr/brb/v1.9-rm-arping-l2-addr-check branch January 6, 2021 15:36
@aanm aanm added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Jan 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants