Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arp: Add retries to arping #14601

Merged
merged 1 commit into from
Jan 18, 2021
Merged

arp: Add retries to arping #14601

merged 1 commit into from
Jan 18, 2021

Conversation

brb
Copy link
Member

@brb brb commented Jan 13, 2021

It has been observed that sometimes arping fails with "i/o timeout".
Further investigation [1] has shown that this happen due to the kernel
not sending packets. Therefore, to mitigate the issue, try multiple times
to send the request if the timeout error is encountered.

[1]: #14125 (comment)


Currently, I'm not labeling it for backporting to v1.{8,9}, as I've opened a discussion whether we could backport the arp library too which would significantly reduce complexity of arping-related backports.

@brb brb added pending-review area/daemon Impacts operation of the Cilium daemon. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. labels Jan 13, 2021
@brb brb requested a review from a team January 13, 2021 15:07
@maintainer-s-little-helper maintainer-s-little-helper bot added this to In progress in 1.10.0 Jan 13, 2021
@brb brb requested a review from fristonio January 13, 2021 15:07
@brb
Copy link
Member Author

brb commented Jan 13, 2021

test-me-please

@brb
Copy link
Member Author

brb commented Jan 14, 2021

retest-4.19

Copy link
Member

@aanm aanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After fixing the requested changes I would suggest to just run 1 more time the runtime CI since the test was faulty in there.

pkg/datapath/linux/node_linux_test.go Outdated Show resolved Hide resolved
It has been observed that sometimes arping fails with "i/o timeout".
Further investigation [1] has shown that this happen due to the kernel
not sending packets. Therefore, to mitigate the issue,try multiple times
to send the request if the timeout error is encountered.

[1]: #14125 (comment)

Signed-off-by: Martynas Pumputis <m@lambda.lt>
@brb
Copy link
Member Author

brb commented Jan 14, 2021

Previously all tests were green. Pushed only a comment change. Retrying the runtime suite to see whether the flake is not reintroduced.

@brb
Copy link
Member Author

brb commented Jan 14, 2021

test-runtime

1 similar comment
@brb
Copy link
Member Author

brb commented Jan 17, 2021

test-runtime

@brb
Copy link
Member Author

brb commented Jan 18, 2021

@aanm Ran the test 3 times - all passed.

@aanm aanm merged commit 2284c76 into master Jan 18, 2021
@aanm aanm deleted the pr/brb/retry-arping branch January 18, 2021 09:00
@brb
Copy link
Member Author

brb commented Jan 18, 2021

Once cilium/arping#6 has been merged, I will manually backport to v1.8 and v1.9.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/daemon Impacts operation of the Cilium daemon. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

4 participants