ci/K8sHubble: Retry failed requests on hubble-relay #11708

Merged
christarazi merged 1 commit into master from pr/gandro/ci-hubble-relay-retry on May 27, 2020

Member

@gandro gandro commented May 27, 2020

The current version of hubble-relay is intentionally not very aggressive
when reconnecting to failed nodes. It will only try to reconnect once it
receives a new request.

In CI, this behavior causes flakes. The first request to hubble-relay
may fail if cilium-agent was not yet ready when hubble-relay was
deployed (see the referenced issue below for examples).

This PR introduces a retry mechanism for CI: we retry the
request to hubble-relay after 5 seconds, which allows it to re-establish
the connection to each node for subsequent requests.

Fixes: #11707

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
@gandro gandro added area/CI Continuous Integration testing issue or flake release-note/ci This PR makes changes to the CI. labels May 27, 2020
@gandro gandro requested a review from rolinh May 27, 2020 11:01
@gandro gandro requested a review from a team as a code owner May 27, 2020 11:01
@maintainer-s-little-helper maintainer-s-little-helper bot added this to In progress in 1.8.0 May 27, 2020
@gandro
Member Author

gandro commented May 27, 2020

test-me-please

@coveralls

Coverage Status

Coverage remained the same at 36.885% when pulling 43c8621 on pr/gandro/ci-hubble-relay-retry into 9ba07e3 on master.

@gandro
Member Author

gandro commented May 27, 2020

@gandro gandro added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 27, 2020
@christarazi christarazi merged commit 40320d6 into master May 27, 2020
1.8.0 automation moved this from In progress to Merged May 27, 2020
@christarazi christarazi deleted the pr/gandro/ci-hubble-relay-retry branch May 27, 2020 17:58

Successfully merging this pull request may close these issues.

CI: Suite-k8s-1.17.K8sHubbleTest Hubble Observe Test L3/L4 Flow with hubble-relay
6 participants