bpf: Reset Pod's queue mapping in host veth to fix phys dev mq selection #18388

Merged (1 commit into master on Jan 6, 2022)

Conversation

@borkmann (Member) commented on Jan 5, 2022

Fix TX queue selection problem on the phys device as reported by Laurent.
At high throughput, they noticed a significant amount of TCP retransmissions
that they tracked back to qdisc drops (fq_codel was used).

Suspicion is that kernel commit edbea9220251 ("veth: Store queue_mapping
independently of XDP prog presence") caused this due to its unconditional
skb_record_rx_queue(), which sets the queue mapping to 1; this value then
gets propagated all the way to the physical device, hitting only a single
queue on an mq device.

Let's have bpf_lxc reset it as a workaround until we have a kernel fix.
Doing this unconditionally is good anyway, as it prevents Pods from
messing with TX queue selection.

Related: #18311
Reported-by: Laurent Bernaille laurent.bernaille@datadoghq.com
Signed-off-by: Daniel Borkmann daniel@iogearbox.net
Link: torvalds/linux@edbea92
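
For illustration, below is a minimal sketch of the kind of reset the commit message describes, written as a standalone tc-BPF program. The program and section names are placeholders, not Cilium's actual bpf_lxc code, and it assumes a kernel where tc programs are allowed to write skb->queue_mapping:

```c
// SPDX-License-Identifier: GPL-2.0
/* Illustrative sketch only; not Cilium's actual bpf_lxc implementation. */
#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>

SEC("tc")
int reset_pod_queue_mapping(struct __sk_buff *skb)
{
	/* The veth driver's unconditional skb_record_rx_queue() stores
	 * rx_queue + 1 in skb->queue_mapping, so traffic received on veth
	 * queue 0 carries a queue mapping of 1. Resetting it to 0 ("no
	 * queue recorded") lets the stack pick the TX queue on the
	 * physical multi-queue device instead of inheriting this value.
	 */
	skb->queue_mapping = 0;

	return TC_ACT_OK;
}

char __license[] SEC("license") = "GPL";
```

Attaching such a program on the host-side veth and resetting unconditionally, as the commit message notes, also keeps Pods from influencing TX queue selection on the underlying device.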

@borkmann borkmann added release-note/misc This PR makes changes that have no direct user impact. needs-backport/1.9 labels Jan 5, 2022
@borkmann borkmann requested review from a team as code owners January 5, 2022 21:37
@borkmann borkmann requested a review from ldelossa January 5, 2022 21:37
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.10.7 Jan 5, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.9.12 Jan 5, 2022
@borkmann (Member, Author) commented on Jan 5, 2022

/test

Job 'Cilium-PR-K8s-1.22-kernel-4.19' failed and has not been observed before, so may be related to your PR:


Test Name

K8sServicesTest Checks graceful termination of service endpoints Checks client terminates gracefully on service endpoint deletion

Failure Output

FAIL: Timed out after 60.000s.

If it is a flake, comment /mlh new-flake Cilium-PR-K8s-1.22-kernel-4.19 so I can create a new GitHub issue to track it.

Job 'Cilium-PR-K8s-GKE' hit: #17628 (96.35% similarity)

@lbernail (Contributor) commented on Jan 6, 2022

I tested this patch on a test cluster and it worked great (details on the test here: #18311 (comment)).

@borkmann borkmann merged commit ecdff12 into master Jan 6, 2022
@borkmann borkmann deleted the pr/fix-qm branch January 6, 2022 19:26
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.9 in 1.9.12 Jan 10, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.10 in 1.10.7 Jan 14, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.9 to Backport done to v1.9 in 1.9.12 Jan 18, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.10 to Backport done to v1.10 in 1.10.7 Jan 18, 2022
@joestringer joestringer added backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. and removed backport-pending/1.11 labels Jan 18, 2022