fix(ci): apply cr in nested cluster #2319

Merged
universal-itengineer merged 1 commit into main from fix/ci/e2e-nested-apply-cr
May 5, 2026

Conversation

@universal-itengineer universal-itengineer commented May 5, 2026

Description

Add retry logic around kubectl apply calls in the nested E2E Configure Virtualization step.

The workflow now writes the manifest received from stdin to a temporary file and retries the same kubectl apply command several times before failing. This covers transient Kubernetes API or DNS failures while applying ModuleSource, ModuleConfig, and ModulePullOverride resources.
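
A minimal sketch of the pattern described above, not the actual workflow code. The function name `apply_with_retry`, the `KUBECTL`, `APPLY_ATTEMPTS`, and `APPLY_DELAY` variables, and their default values are all assumptions made for illustration:

```shell
# Hypothetical helper: names, attempt count, and delay are illustrative assumptions.
KUBECTL="${KUBECTL:-kubectl}"            # command to run; overridable for testing
APPLY_ATTEMPTS="${APPLY_ATTEMPTS:-5}"    # total number of tries before giving up
APPLY_DELAY="${APPLY_DELAY:-10}"         # seconds to wait between tries

apply_with_retry() {
  local manifest attempt
  manifest="$(mktemp)"
  # Save stdin once: a pipe or heredoc cannot be re-read on a later attempt.
  cat > "$manifest"

  attempt=1
  while [ "$attempt" -le "$APPLY_ATTEMPTS" ]; do
    # Validation stays enabled; the same command is simply run again.
    if "$KUBECTL" apply -f "$manifest"; then
      rm -f "$manifest"
      return 0
    fi
    echo "[WARN] kubectl apply failed (attempt ${attempt}/${APPLY_ATTEMPTS})" >&2
    if [ "$attempt" -lt "$APPLY_ATTEMPTS" ]; then
      sleep "$APPLY_DELAY"
    fi
    attempt=$((attempt + 1))
  done

  rm -f "$manifest"
  echo "[ERROR] kubectl apply failed after ${APPLY_ATTEMPTS} attempts" >&2
  return 1
}
```

A step would then pipe its manifest into the helper, for example `cat manifest.yaml | apply_with_retry`, where `manifest.yaml` is a placeholder. Writing stdin to a temporary file first is what makes the retry possible, since the input stream is consumed by the first attempt.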

Why do we need it, and what problem does it solve?

The nested E2E pipeline can fail during Virtualization configuration when kubectl apply cannot reach the cluster API to download the OpenAPI schema it uses to validate the manifest. This is a transient infrastructure or connectivity issue, yet it currently fails the whole CI run immediately.

Example failure:

[INFO] Apply ModuleSource dev config
error: error validating "STDIN": error validating data: failed to download openapi: Get "https://api.nightly-e2e-replicated-077b199-b827.e2e.virtlab.flant.com/openapi/v2?timeout=32s": dial tcp: lookup api.nightly-e2e-replicated-077b199-b827.e2e.virtlab.flant.com on 127.0.0.53:53: read udp 127.0.0.1:46600->127.0.0.53:53: i/o timeout; if you choose to ignore these errors, turn validation off with --validate=false
Error: Process completed with exit code 1.

Retrying the apply operation gives the nested cluster a chance to recover from short DNS/API hiccups without disabling manifest validation.

What is the expected result?

  1. Run the nested E2E workflow.
  2. Trigger or encounter a short DNS/API outage during the Configure Virtualization step.
  3. Verify that the workflow retries kubectl apply instead of failing on the first transient error.
  4. Confirm the job continues once the apply command succeeds.
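
Step 2 is hard to trigger on demand. One possible way to exercise the retry path locally is a stub `kubectl` that fails a fixed number of times before succeeding. The sketch below is an assumption-laden test aid, not part of this PR: `make_flaky_kubectl`, the `calls` counter file, and the simulated messages are all hypothetical.

```shell
# Hypothetical test aid: writes a fake `kubectl` into DIR that fails the first
# FAILURES calls and succeeds afterwards. It never contacts a real cluster.
make_flaky_kubectl() {
  dir="$1"
  failures="$2"
  mkdir -p "$dir"
  echo 0 > "$dir/calls"    # invocation counter shared across calls

  # Unescaped variables are expanded now; escaped ones are evaluated by the stub.
  cat > "$dir/kubectl" <<EOF
#!/bin/sh
calls=\$(( \$(cat "$dir/calls") + 1 ))
echo "\$calls" > "$dir/calls"
if [ "\$calls" -le "$failures" ]; then
  echo 'error: failed to download openapi: i/o timeout (simulated)' >&2
  exit 1
fi
echo "applied (simulated) on call \$calls"
EOF
  chmod +x "$dir/kubectl"
}
```

Putting the stub directory at the front of `PATH` (for example `make_flaky_kubectl /tmp/stub 2; PATH="/tmp/stub:$PATH"`) makes any script that calls `kubectl` see two simulated failures followed by a success, which is enough to check steps 3 and 4 without a real outage.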

Checklist

  • The code is covered by unit tests.
  • e2e tests passed.
  • Documentation updated according to the changes.
  • Changes were tested in the Kubernetes cluster manually.

Changelog entries

section: ci
type: fix
summary: Retry applying Virtualization configuration in nested E2E workflow.
impact_level: low

Signed-off-by: Nikita Korolev <nikita.korolev@flant.com>
@universal-itengineer universal-itengineer added this to the v1.9.0 milestone May 5, 2026
@universal-itengineer universal-itengineer marked this pull request as ready for review May 5, 2026 10:50
@universal-itengineer universal-itengineer force-pushed the fix/ci/e2e-nested-apply-cr branch from 1235732 to 08bfd52 Compare May 5, 2026 11:07
@universal-itengineer universal-itengineer merged commit c92489f into main May 5, 2026
55 of 58 checks passed
@universal-itengineer universal-itengineer deleted the fix/ci/e2e-nested-apply-cr branch May 5, 2026 11:11

2 participants