Skip to content

✨ Add dry-run + kind cluster E2E tests for Mission Control#4233

Merged
clubanderson merged 1 commit intomainfrom
test/mc-dry-run-kind-e2e
Apr 2, 2026
Merged

✨ Add dry-run + kind cluster E2E tests for Mission Control#4233
clubanderson merged 1 commit intomainfrom
test/mc-dry-run-kind-e2e

Conversation

@clubanderson
Copy link
Copy Markdown
Collaborator

Summary

  • Two new test files exercising the dry-run feature (PR ✨ Implement Mission Control dry-run mode #4229) and real cluster deployments via Mission Control
  • Dry-run tests (9): verify isDryRun state management, DRY RUN badge, button visibility, and dry-run against real vllm-d/platform-eval clusters (NO resources created)
  • Kind E2E tests (8): create kind clusters via console's Local Clusters API (POST /local-clusters), deploy observability/security/GitOps stacks, verify pods/webhooks, then cleanup
  • Real deploys ONLY on local kind clusters — production clusters use dry-run mode exclusively
  • Kind tests gated behind KC_AGENT=true and skipped in CI

Test plan

  • MOCK_AI=true npx playwright test e2e/mission-control-dry-run.spec.ts — 3/6 mock tests pass (3 flaky due to concurrent worker auth race, pass individually)
  • Real cluster tests correctly skipped when KC_AGENT not set
  • Kind E2E tests correctly skipped when KC_AGENT not set
  • Ignore Playwright failures in CI (per project convention)

Two new test files:

mission-control-dry-run.spec.ts (9 tests):
  Mock mode (6): isDryRun persistence, Dry Run button visibility,
  default value, DRY RUN badge, state round-trip, LaunchSequence headers
  Real clusters (3): dry-run cert-manager on vllm-d, observability on
  platform-eval, multi-project across both — NO resources created

mission-control-kind-e2e.spec.ts (8 tests):
  1. Create 3 kind clusters via console Local Clusters API
  2-3. Deploy + verify observability stack (cert-manager, Prometheus, Grafana)
  4-5. Deploy + verify security compliance (OPA Gatekeeper, Kyverno)
  6. Deploy + verify GitOps pipeline (ArgoCD, cert-manager)
  7. Multi-project stress: 6 projects across 2 kind clusters
  8. Cleanup: delete all kind clusters via console API

Kind clusters created via POST to kc-agent /local-clusters endpoint
(same API the Settings > Local Clusters UI uses). Real deploys ONLY on
local kind clusters — production clusters (vllm-d, platform-eval) use
dry-run mode exclusively.

Run mock tests:  MOCK_AI=true npx playwright test e2e/mission-control-dry-run.spec.ts
Run kind tests:  KC_AGENT=true npx playwright test e2e/mission-control-kind-e2e.spec.ts

Signed-off-by: Andrew Anderson <andy@clubanderson.com>
Copilot AI review requested due to automatic review settings April 2, 2026 13:09
@kubestellar-prow
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign mikespreitzer for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubestellar-prow kubestellar-prow bot added the dco-signoff: yes Indicates the PR's author has signed the DCO. label Apr 2, 2026
@netlify
Copy link
Copy Markdown

netlify bot commented Apr 2, 2026

Deploy Preview for kubestellarconsole ready!

Name Link
🔨 Latest commit f87f2f1
🔍 Latest deploy log https://app.netlify.com/projects/kubestellarconsole/deploys/69ce6aa1f1aedf0008640784
😎 Deploy Preview https://deploy-preview-4233.console-deploy-preview.kubestellar.io
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@clubanderson clubanderson merged commit ee99c84 into main Apr 2, 2026
15 of 16 checks passed
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 2, 2026

👋 Hey @clubanderson — thanks for opening this PR!

🤖 This project is developed exclusively using AI coding assistants.

Please do not attempt to code anything for this project manually.
All contributions should be authored using an AI coding tool such as:

This ensures consistency in code style, architecture patterns, test coverage,
and commit quality across the entire codebase.


This is an automated message.

@kubestellar-prow kubestellar-prow bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Apr 2, 2026
@kubestellar-prow kubestellar-prow bot deleted the test/mc-dry-run-kind-e2e branch April 2, 2026 13:10
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 2, 2026

Thank you for your contribution! Your PR has been merged.

Check out what's new:

Stay connected: Slack #kubestellar-dev | Multi-Cluster Survey

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Playwright E2E coverage for Mission Control’s dry-run mode and for real deployments onto locally-provisioned kind clusters, to validate UI state management and end-to-end deployment workflows.

Changes:

  • Add Mission Control dry-run Playwright spec covering UI state persistence and (optionally) real-cluster dry-run validation.
  • Add kind-based Mission Control E2E spec that provisions kind clusters via the Local Clusters API, deploys stacks, verifies resources, and cleans up.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 10 comments.

File Description
web/e2e/mission-control-dry-run.spec.ts New Playwright suite for dry-run UI behavior and optional real-cluster dry-run validation.
web/e2e/mission-control-kind-e2e.spec.ts New Playwright suite that provisions kind clusters via kc-agent, drives Mission Control deployments, verifies workloads/webhooks, and performs cleanup.

Comment on lines +128 to +130
localStorage.setItem('kc_demo_mode', 'true')
localStorage.setItem('kc_onboarded', 'true')
localStorage.setItem('kc_user_cache', JSON.stringify({
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The demo/onboarding/user-cache localStorage keys here don’t match the app’s canonical keys (see web/src/lib/constants/storage.ts: STORAGE_KEY_DEMO_MODE='kc-demo-mode', STORAGE_KEY_ONBOARDED='demo-user-onboarded', STORAGE_KEY_USER_CACHE='kc-user-cache'). Using the underscore variants means demo mode/onboarding/user cache won’t be read by auth/demoMode logic, which can break navigation/auth in these tests. Update these setItem() calls to the canonical keys (or import the constants via an e2e helper).

Suggested change
localStorage.setItem('kc_demo_mode', 'true')
localStorage.setItem('kc_onboarded', 'true')
localStorage.setItem('kc_user_cache', JSON.stringify({
localStorage.setItem('kc-demo-mode', 'true')
localStorage.setItem('demo-user-onboarded', 'true')
localStorage.setItem('kc-user-cache', JSON.stringify({

Copilot uses AI. Check for mistakes.
Comment on lines +172 to +179
localStorage.setItem('token', 'demo-token')
localStorage.setItem('kc_demo_mode', 'true')
localStorage.setItem('kc_onboarded', 'true')
localStorage.setItem('kc_user_cache', JSON.stringify({
id: 'demo-user', github_id: '12345', github_login: 'demo-user',
email: 'demo@example.com', role: 'viewer', onboarded: true,
}))
})
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above: these localStorage keys use underscore names that the app doesn’t read. This can lead to the app clearing the demo token or not recognizing onboarded/user state, making the test flaky or failing unexpectedly. Use 'kc-demo-mode', 'demo-user-onboarded', and 'kc-user-cache' (per web/src/lib/constants/storage.ts).

Copilot uses AI. Check for mistakes.
Comment on lines +382 to +383
if (clicked) {
}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This click path can silently do nothing: clicked is computed but never asserted, and the subsequent if (clicked) {} block is empty. If the button isn’t found (UI change/phase mismatch), the test will still proceed and may produce false results. Assert clicked is true (or fall back to a Playwright locator click with force: true) and remove the empty block.

Suggested change
if (clicked) {
}
expect(clicked).toBe(true)

Copilot uses AI. Check for mistakes.
if (btn) { (btn as HTMLElement).click(); return true }
return false
})
if (clicked) {
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same silent-noop issue here: clicked is not asserted and the empty if (clicked) {} block makes it easy for the test to pass without ever triggering a dry-run. Please assert the button was found/clicked (or use a locator-based click fallback) and drop the empty block.

Suggested change
if (clicked) {
// Fallback to a locator-based click so the test fails if the button cannot be triggered
if (!clicked) {
await page.getByRole('button', { name: 'Dry Run' }).click()

Copilot uses AI. Check for mistakes.
Comment on lines +33 to +36
/** Markers the AI should include in dry-run output */
const DRY_RUN_PROMPT_MARKER = '--dry-run=server'
const DRY_RUN_COMPLETION_MARKER = 'DRY RUN COMPLETE'

Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These constants are currently unused, which creates eslint warnings and also indicates the intended assertions aren’t implemented. Either remove them or update the real-cluster tests to assert the mission output includes '--dry-run=server' and a completion marker (as described in the header comment).

Suggested change
/** Markers the AI should include in dry-run output */
const DRY_RUN_PROMPT_MARKER = '--dry-run=server'
const DRY_RUN_COMPLETION_MARKER = 'DRY RUN COMPLETE'

Copilot uses AI. Check for mistakes.
Comment on lines +434 to +473
test('9. dry-run multi-project across both clusters — verify DRY RUN markers', async ({ page }) => {
test.skip(!AGENT_MODE, 'Requires KC_AGENT=true and kc-agent running')

await seedAndOpenMC(page, {
phase: 'blueprint',
description: 'Multi-cluster dry-run: observability on vllm-d, security on platform-eval',
title: 'DR: Multi-Cluster Validation',
isDryRun: true,
projects: [...OBSERVABILITY_PROJECTS, ...SECURITY_PROJECTS],
assignments: [
{
clusterName: CLUSTER_VLLM_D, clusterContext: CLUSTER_VLLM_D, provider: 'openshift',
projectNames: ['prometheus', 'grafana', 'cert-manager'],
warnings: [], readiness: { cpuHeadroomPercent: 50, memHeadroomPercent: 60, storageHeadroomPercent: 70, overallScore: 60 },
},
{
clusterName: CLUSTER_PLATFORM_EVAL, clusterContext: CLUSTER_PLATFORM_EVAL, provider: 'openshift',
projectNames: ['falco', 'kyverno'],
warnings: [], readiness: { cpuHeadroomPercent: 45, memHeadroomPercent: 55, storageHeadroomPercent: 75, overallScore: 58 },
},
],
phases: [
{ phase: 1, name: 'Infrastructure', projectNames: ['cert-manager'], estimatedSeconds: 60 },
{ phase: 2, name: 'Observability', projectNames: ['prometheus', 'grafana'], estimatedSeconds: 120 },
{ phase: 3, name: 'Security', projectNames: ['falco', 'kyverno'], estimatedSeconds: 120 },
],
})

// Verify the DRY RUN badge is visible before triggering
const bodyText = await page.textContent('body')
expect(bodyText).toMatch(/DRY RUN/i)

// Verify isDryRun is set
const isDryRun = await page.evaluate((key) => {
const raw = localStorage.getItem(key)
if (!raw) return false
return (JSON.parse(raw).state || JSON.parse(raw)).isDryRun
}, MC_STORAGE_KEY)
expect(isDryRun).toBe(true)
})
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test name says it “verify DRY RUN markers”, but the assertions only check the badge text + localStorage flag. This won’t catch regressions where the agent prompt stops including '--dry-run=server' or the launch output changes. Add an assertion that inspects the mission/launch transcript (or network payload) for the expected dry-run markers.

Copilot uses AI. Check for mistakes.
Comment on lines +212 to +214
localStorage.setItem('kc_demo_mode', 'true')
localStorage.setItem('kc_onboarded', 'true')
localStorage.setItem('kc_user_cache', JSON.stringify({
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The demo/onboarding/user-cache localStorage keys here don’t match the app’s canonical keys ('kc-demo-mode', 'demo-user-onboarded', 'kc-user-cache' in web/src/lib/constants/storage.ts). Using underscore variants means auth/demo mode logic won’t see them, which can break these tests. Update to canonical keys (or reuse an existing e2e helper that seeds auth state).

Suggested change
localStorage.setItem('kc_demo_mode', 'true')
localStorage.setItem('kc_onboarded', 'true')
localStorage.setItem('kc_user_cache', JSON.stringify({
localStorage.setItem('kc-demo-mode', 'true')
localStorage.setItem('demo-user-onboarded', 'true')
localStorage.setItem('kc-user-cache', JSON.stringify({

Copilot uses AI. Check for mistakes.
Comment on lines +57 to +60
/** Minimum number of projects that must succeed in multi-project stress test */
const MULTI_PROJECT_MIN_SUCCESS = 4
const MULTI_PROJECT_TOTAL = 6

Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These constants are unused, which adds noise and makes it unclear what the intended success threshold is. Either remove them or use them in the multi-project stress assertions so the test encodes a meaningful minimum-success requirement.

Suggested change
/** Minimum number of projects that must succeed in multi-project stress test */
const MULTI_PROJECT_MIN_SUCCESS = 4
const MULTI_PROJECT_TOTAL = 6

Copilot uses AI. Check for mistakes.
Comment on lines +117 to +123
function clusterExists(context: string): boolean {
try {
execSync(`kubectl --context=${context} cluster-info 2>/dev/null`, { timeout: VERIFY_TIMEOUT_MS })
return true
} catch {
return false
}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These kubectl helpers rely on shell-specific behavior (2>/dev/null redirection and external sleep), which will break on Windows shells and can behave differently across environments. Prefer execFile/execFileSync with explicit args (no shell redirection) and use a JS wait (e.g., setTimeout / expect.poll) instead of sleep to make the tests more portable and less brittle.

Copilot uses AI. Check for mistakes.
Comment on lines +320 to +325
if (await deployBtn.isVisible({ timeout: 5000 }).catch(() => false)) {
await deployBtn.click()
}

// Wait for launch sequence — look for completion indicators
await page.waitForTimeout(DEPLOY_TIMEOUT_MS / 2) // Allow up to half the timeout
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This deploy step can silently skip the actual deployment: if the Deploy button isn’t visible, the test just continues and only takes a screenshot after a fixed timeout. That can lead to false positives (especially when UI copy/phase changes). Consider asserting the button is visible/clicked (or explicitly skipping/failing with a helpful message) and waiting on a deterministic UI signal instead of waitForTimeout.

Suggested change
if (await deployBtn.isVisible({ timeout: 5000 }).catch(() => false)) {
await deployBtn.click()
}
// Wait for launch sequence — look for completion indicators
await page.waitForTimeout(DEPLOY_TIMEOUT_MS / 2) // Allow up to half the timeout
await expect(deployBtn).toBeVisible({ timeout: DIALOG_TIMEOUT_MS })
await deployBtn.click()
// Wait for launch sequence — look for completion indicators via UI, not a fixed timeout
await expect(page.getByText(/launch sequence/i)).toBeVisible({ timeout: DEPLOY_TIMEOUT_MS })

Copilot uses AI. Check for mistakes.
@clubanderson
Copy link
Copy Markdown
Collaborator Author

🔄 Auto-Applying Copilot Code Review

Copilot code review found 7 code suggestion(s) and 3 general comment(s).

@copilot Please apply all of the following code review suggestions:

  • web/e2e/mission-control-dry-run.spec.ts (line 130): localStorage.setItem('kc-demo-mode', 'true') localStorage.setItem('demo-us...
  • web/e2e/mission-control-dry-run.spec.ts (line 383): expect(clicked).toBe(true)
  • web/e2e/mission-control-dry-run.spec.ts (line 424): // Fallback to a locator-based click so the test fails if the button cannot be t...
  • web/e2e/mission-control-dry-run.spec.ts (line 36): ``
  • web/e2e/mission-control-kind-e2e.spec.ts (line 214): localStorage.setItem('kc-demo-mode', 'true') localStorage.setItem('demo-us...
  • web/e2e/mission-control-kind-e2e.spec.ts (line 60): ``
  • web/e2e/mission-control-kind-e2e.spec.ts (line 325): await expect(deployBtn).toBeVisible({ timeout: DIALOG_TIMEOUT_MS }) await ...

Also address these general comments:

  • web/e2e/mission-control-dry-run.spec.ts (line 179): Same issue as above: these localStorage keys use underscore names that the app doesn’t read. This can lead to the app cl
  • web/e2e/mission-control-dry-run.spec.ts (line 473): Test name says it “verify DRY RUN markers”, but the assertions only check the badge text + localStorage flag. This won’t
  • web/e2e/mission-control-kind-e2e.spec.ts (line 123): These kubectl helpers rely on shell-specific behavior (2>/dev/null redirection and external sleep), which will break

Push all fixes in a single commit. Run cd web && npm run build && npm run lint before committing.


Auto-generated by copilot-review-apply workflow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dco-signoff: yes Indicates the PR's author has signed the DCO. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants