ci-multicluster: Fix post-test information gathering #16712

gandro · 2021-06-30T16:16:07Z

This fixes two issues with the ci-multicluster log gathering:

1. `cilium clustermesh status` can unfortunately block even when not using `--wait` (see cilium/cilium-cli#384, "cilium clustermesh status blocks even when not using --wait"). This causes a failing job to be cancelled, which in turn causes any post-failure steps to be skipped as well. This commit works around the issue by manually adding a timeout to the post-failure `cilium clustermesh status` invocation.
2. Sysdumps were not collected for both clusters. This is fixed by manually switching to each cluster context using `kubectl`.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
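For readers unfamiliar with the timeout workaround, here is a minimal sketch of the idea, not the actual workflow change. It assumes `timeout` from GNU coreutils is available on the runner; the helper name `run_with_timeout` is hypothetical and only illustrates how a hanging command can be bounded so that later post-failure steps still run.

```shell
# Hypothetical helper: run a command under a time limit given in seconds.
# Relies on `timeout` from GNU coreutils, which exits with status 124
# when the limit is reached and otherwise passes through the command's status.
run_with_timeout() {
  rwt_limit="$1"
  shift
  rwt_rc=0
  timeout "$rwt_limit" "$@" || rwt_rc=$?
  if [ "$rwt_rc" -eq 124 ]; then
    echo "timed out after ${rwt_limit}s: $*" >&2
  fi
  return "$rwt_rc"
}
```

In a post-failure step this could be used as `run_with_timeout 20 cilium clustermesh status --context "$CONTEXT1" || true`, where the 20 second limit and the `CONTEXT1` variable are assumptions chosen for illustration.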

gandro · 2021-06-30T16:16:28Z

I have successfully tested this in #16705 - the artificial failure caused the correct double sysdump to be created: https://github.com/cilium/cilium/pull/16705/checks?check_run_id=2954156565
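As a rough illustration of how one sysdump per cluster can be produced, the sketch below switches the `kubectl` context before each `cilium sysdump` call. The function name `collect_sysdumps` and the output file naming are assumptions made for this example and are not taken from the workflow itself.

```shell
# Hypothetical helper: collect one sysdump per kubectl context.
# Continues with the remaining contexts even if one of them fails,
# and returns non-zero if any step failed.
collect_sysdumps() {
  cs_rc=0
  for cs_ctx in "$@"; do
    # Switch the active context so the sysdump targets this cluster.
    if kubectl config use-context "$cs_ctx"; then
      # The output file name is an assumption made for this sketch.
      cilium sysdump --output-filename "cilium-sysdump-${cs_ctx}" || cs_rc=1
    else
      cs_rc=1
    fi
  done
  return "$cs_rc"
}
```

Calling `collect_sysdumps "$CONTEXT1" "$CONTEXT2"` (both variable names are placeholders) would then leave one archive per cluster for the artifact upload step.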


nbusseneau

Thanks LGTM. Do you want to make changes in cilium-cli (https://github.com/cilium/cilium-cli/blob/daed305be87be2ba609a077d766ddd8b404c5623/.github/workflows/multicluster.yaml#L131-L138) accordingly for retrieving sysdump for both clusters? Otherwise I'll make a note of doing that later.

nbusseneau · 2021-07-01T16:06:41Z

We should probably also make a note of removing the workarounds once we bump up the CLI version to a release with the fix for cilium/cilium-cli#384.

gandro · 2021-07-05T09:12:54Z

> Thanks LGTM. Do you want to make changes in cilium-cli (https://github.com/cilium/cilium-cli/blob/daed305be87be2ba609a077d766ddd8b404c5623/.github/workflows/multicluster.yaml#L131-L138) accordingly for retrieving sysdump for both clusters? Otherwise I'll make a note of doing that later.

Oh, good catch. I will open a PR on cilium/cilium-cli as well then.

gandro added the area/CI-improvement (Topic or proposal to improve the Continuous Integration workflow) and release-note/ci (This PR makes changes to the CI.) labels Jun 30, 2021

gandro requested review from a team as code owners June 30, 2021 16:16

gandro requested review from christarazi and nbusseneau June 30, 2021 16:16

maintainer-s-little-helper bot assigned christarazi and nbusseneau Jun 30, 2021

christarazi approved these changes Jun 30, 2021


maintainer-s-little-helper bot unassigned christarazi Jun 30, 2021

nbusseneau approved these changes Jul 1, 2021


maintainer-s-little-helper bot unassigned nbusseneau Jul 1, 2021

aanm approved these changes Jul 2, 2021


aanm merged commit 2641808 into cilium:master Jul 2, 2021

gandro mentioned this pull request Jul 5, 2021

ci/multicluster: Run sysdump in both clusters cilium/cilium-cli#396

Merged
