ci(live): run cli/test/live/ on every PR against a freshly-booted backend#99
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds a GitHub Actions workflow that boots the API+MySQL backend, seeds MySQL, mints a short-lived API key, runs the live test suite ( ChangesLive Integration Test Suite CI Automation and Documentation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies" Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
cli/pkg/client/client_test.go (1)
1800-1806: ⚡ Quick winLock
instance_idscontract in all stack bulk success stubs.
TestBulkStop_Success,TestBulkClean_Success, andTestBulkDelete_Successdon’t decode/assert the request body, so a payload-key regression could pass these tests while onlyTestBulkDeploy_Successcatches it.Suggested patch
func TestBulkStop_Success(t *testing.T) { t.Parallel() server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { assert.Equal(t, http.MethodPost, r.Method) assert.Equal(t, "/api/v1/stack-instances/bulk/stop", r.URL.Path) + var body types.BulkInstancesRequest + require.NoError(t, json.NewDecoder(r.Body).Decode(&body)) + assert.Equal(t, []string{"1"}, body.InstanceIDs) w.WriteHeader(http.StatusOK) json.NewEncoder(w).Encode(types.BulkResponse{ Results: []types.BulkOperationResult{ {InstanceID: "1", Status: "success"}, }, @@ func TestBulkClean_Success(t *testing.T) { t.Parallel() server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { assert.Equal(t, http.MethodPost, r.Method) assert.Equal(t, "/api/v1/stack-instances/bulk/clean", r.URL.Path) + var body types.BulkInstancesRequest + require.NoError(t, json.NewDecoder(r.Body).Decode(&body)) + assert.Equal(t, []string{"5", "6"}, body.InstanceIDs) w.WriteHeader(http.StatusOK) json.NewEncoder(w).Encode(types.BulkResponse{ Results: []types.BulkOperationResult{ {InstanceID: "5", Status: "success"}, {InstanceID: "6", Status: "success"}, @@ func TestBulkDelete_Success(t *testing.T) { t.Parallel() server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { assert.Equal(t, http.MethodPost, r.Method) assert.Equal(t, "/api/v1/stack-instances/bulk/delete", r.URL.Path) + var body types.BulkInstancesRequest + require.NoError(t, json.NewDecoder(r.Body).Decode(&body)) + assert.Equal(t, []string{"10"}, body.InstanceIDs) w.WriteHeader(http.StatusOK) json.NewEncoder(w).Encode(types.BulkResponse{ Results: []types.BulkOperationResult{ {InstanceID: "10", Status: "success"}, },Also applies to: 1835-1841, 1870-1876
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cli/pkg/client/client_test.go` around lines 1800 - 1806, The three tests TestBulkStop_Success, TestBulkClean_Success, and TestBulkDelete_Success need to assert the request body contract (ensure the JSON contains "instance_ids") like TestBulkDeploy_Success does: in each httptest handler for those tests decode the request body (e.g., via json.NewDecoder(r.Body).Decode into a struct or map) and assert that the "instance_ids" key (or the expected slice) is present and correct before writing the success response; update the handlers inside TestBulkStop_Success, TestBulkClean_Success, and TestBulkDelete_Success to perform this decode/assert to lock the payload-key contract.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/live-tests.yml:
- Around line 86-94: The failure path currently echoes the full login JSON
(login) including sensitive JWTs and user IDs; change the log to redact or
remove sensitive fields instead of printing the raw payload—replace the echo
"$login" | jq . >&2 call with a sanitized jq transformation that deletes or
masks .token, .access_token, and .user.id (or sets them to "REDACTED") so you
still get useful debug output without exposing credentials; locate the
login/jwt/admin_id usage in the login block where BACKEND_URL, ADMIN_USERNAME,
ADMIN_PASSWORD, jwt, admin_id, and login are referenced and apply the jq
redaction there.
- Around line 36-40: Replace floating action tags and leaking logs: pin
actions/checkout@v6 and actions/setup-go@v6 to specific commit SHAs (replace the
`@v6` references with the corresponding full SHA), add persist-credentials: false
to the checkout step (the step currently using actions/checkout) to avoid
leaving credentials in the workspace, and stop printing raw auth JSON by
removing or replacing the echo "$login" | jq . >&2 and echo "$mint" | jq . >&2
lines—instead log only non-sensitive fields or a redacted summary (e.g., status
and expiry) so tokens/JWTs are never emitted.
In `@cli/cmd/bulk_test.go`:
- Around line 25-31: The template bulk tests are using the instance-scoped
fixture sampleBulkResponse instead of a template-scoped fixture, so swap those
usages to use sampleBulkTemplateResponse; locate the template
publish/unpublish/delete test cases that call sampleBulkResponse (mentions
around the blocks that assert instance_id) and replace those calls with
sampleBulkTemplateResponse, ensuring the assertions check template_id in the
returned types.BulkResponse.Results entries (and that sampleBulkTemplateResponse
returns TemplateID/populates the template-scoped fields expected by the tests).
In `@cli/test/live/notification_live_test.go`:
- Around line 63-66: The test is order-dependent because it compares prefs[i] to
got[i]; change it to check by EventType instead: assert the lengths match, build
a map from got items keyed by EventType (e.g., gotByType :=
map[string]NotificationPreference{...}), then iterate over prefs and for each
prefs[j].EventType look up the corresponding got entry and assert its Enabled
equals prefs[j].Enabled (and that the lookup exists). Update the loop using
prefs and gotByType rather than index-based comparisons to avoid ordering
issues.
---
Nitpick comments:
In `@cli/pkg/client/client_test.go`:
- Around line 1800-1806: The three tests TestBulkStop_Success,
TestBulkClean_Success, and TestBulkDelete_Success need to assert the request
body contract (ensure the JSON contains "instance_ids") like
TestBulkDeploy_Success does: in each httptest handler for those tests decode the
request body (e.g., via json.NewDecoder(r.Body).Decode into a struct or map) and
assert that the "instance_ids" key (or the expected slice) is present and
correct before writing the success response; update the handlers inside
TestBulkStop_Success, TestBulkClean_Success, and TestBulkDelete_Success to
perform this decode/assert to lock the payload-key contract.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: de82d5da-0556-4bd4-8f3e-58b1cb8c3bcc
⛔ Files ignored due to path filters (1)
cli/test/live/testdata/ci-seed.sqlis excluded by!**/testdata/**
📒 Files selected for processing (18)
.github/workflows/live-tests.ymlcli/cmd/bulk.gocli/cmd/bulk_test.gocli/pkg/client/client.gocli/pkg/client/client_test.gocli/pkg/types/types.gocli/pkg/types/types_test.gocli/test/e2e/cli_e2e_test.gocli/test/live/audit_live_test.gocli/test/live/bulk_live_test.gocli/test/live/cluster_live_test.gocli/test/live/definition_live_test.gocli/test/live/doc.gocli/test/live/favorite_live_test.gocli/test/live/helpers_test.gocli/test/live/live_test.gocli/test/live/notification_live_test.gocli/test/live/template_live_test.go
| - name: Check out stackctl | ||
| uses: actions/checkout@v6 | ||
|
|
||
| - name: Set up Go | ||
| uses: actions/setup-go@v6 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Verify all GitHub actions are pinned by commit SHA and checkout disables persisted credentials.
fd -a '\.ya?ml$' .github/workflows | while read -r wf; do
echo "== $wf =="
rg -n '^\s*uses:\s*[^@]+@' "$wf"
rg -n 'actions/checkout@' "$wf" -A3 -B1
doneRepository: omattsson/stackctl
Length of output: 112
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Show the exact relevant sections in live-tests.yml (with line numbers)
echo "---- live-tests.yml (lines 1-140) ----"
nl -ba .github/workflows/live-tests.yml | sed -n '1,140p'
# 2) Check whether persist-credentials is set anywhere in live-tests.yml
echo "---- persist-credentials occurrences ----"
rg -n "persist-credentials" .github/workflows/live-tests.yml || true
# 3) Inspect .github/instructions for rules related to pinned actions + secret/log handling
echo "---- .github/instructions/*.md files ----"
ls -la .github/instructions || true
echo "---- instruction keyword matches ----"
rg -n "(pin(ned)?|commit sha|checkout|persist-credentials|JWT|token|secret|logs?|supply chain)" .github/instructions/*.md || true
# 4) Also find any login/echo/jq payload printing patterns in live-tests.yml
echo "---- login/echo/jq printing patterns ----"
rg -n "(name:\s*Login|login\b|mint\b|jq\b|echo\s+\"\\$login\"|echo\s+\"\\$mint\"|validation failure|fail(ed)?)" .github/workflows/live-tests.yml || trueRepository: omattsson/stackctl
Length of output: 162
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Basic file existence
echo "---- file checks ----"
test -f .github/workflows/live-tests.yml && echo "live-tests.yml: exists" || echo "live-tests.yml: missing"
test -d .github/instructions && echo "instructions/: exists" || echo "instructions/: missing"
# Show live-tests.yml with line numbers (1-180)
echo "---- live-tests.yml (lines 1-180) ----"
awk '{printf "%5d %s\n", NR, $0}' .github/workflows/live-tests.yml | sed -n '1,180p'
# Search action usage lines
echo "---- uses: actions/checkout@ / setup-go@ ----"
rg -n 'uses:\s*actions/(checkout|setup-go)@' .github/workflows/live-tests.yml || true
# Check persist-credentials
echo "---- persist-credentials occurrences ----"
rg -n 'persist-credentials' .github/workflows/live-tests.yml || true
# Check for any jq+echo of login/mint patterns
echo "---- login/mint/jq/echo sensitive payload printing patterns ----"
rg -n '(name:\s*Login|login\b|mint\b|jq\b|echo\s+"\\$login"|echo\s+"\\$mint"|echo\s+["'\'']\\$login|echo\s+["'\'']\\$mint|validation failure|fail(ed|ing)?|Authorization|Bearer|JWT|token)' .github/workflows/live-tests.yml || true
# Instructions directory listing
echo "---- .github/instructions listing ----"
ls -la .github/instructions || true
# Keyword matches in instructions
echo "---- instruction keyword matches ----"
rg -n '(pin(ned)?|commit sha|checkout|persist-credentials|JWT|token|secret|log(s)?|supply chain|credential)' .github/instructions/*.md || trueRepository: omattsson/stackctl
Length of output: 9369
Pin GitHub Actions to commit SHAs; redact auth payloads in CI logs.
.github/workflows/live-tests.ymluses floating tags (actions/checkout@v6,actions/setup-go@v6) and doesn’t setpersist-credentials: falsefor checkout, weakening CI supply-chain/security posture.- On auth failures, the workflow prints full JSON responses (
echo "$login" | jq . >&2andecho "$mint" | jq . >&2), which can include the JWT/token fields—avoid dumping raw auth payloads; redact or print only non-sensitive fields.
Suggested hardening
- - name: Check out stackctl
- uses: actions/checkout@<pinned-commit-sha>
- with:
- persist-credentials: false
+ - name: Check out stackctl
+ uses: actions/checkout@<pinned-commit-sha>
+ with:
+ persist-credentials: false
- - name: Set up Go
- uses: actions/setup-go@<pinned-commit-sha>
+ - name: Set up Go
+ uses: actions/setup-go@<pinned-commit-sha>🧰 Tools
🪛 zizmor (1.25.2)
[warning] 36-37: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false
(artipacked)
[error] 37-37: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
[error] 40-40: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/live-tests.yml around lines 36 - 40, Replace floating
action tags and leaking logs: pin actions/checkout@v6 and actions/setup-go@v6 to
specific commit SHAs (replace the `@v6` references with the corresponding full
SHA), add persist-credentials: false to the checkout step (the step currently
using actions/checkout) to avoid leaving credentials in the workspace, and stop
printing raw auth JSON by removing or replacing the echo "$login" | jq . >&2 and
echo "$mint" | jq . >&2 lines—instead log only non-sensitive fields or a
redacted summary (e.g., status and expiry) so tokens/JWTs are never emitted.
| func sampleBulkResponse() types.BulkResponse { | ||
| return types.BulkResponse{ | ||
| Results: []types.BulkOperationResult{ | ||
| {ID: "1", Success: true}, | ||
| {ID: "2", Success: true}, | ||
| {ID: "3", Success: false, Error: "not found"}, | ||
| {InstanceID: "1", Status: "success"}, | ||
| {InstanceID: "2", Status: "success"}, | ||
| {InstanceID: "3", Status: "error", Error: "not found"}, | ||
| }, |
There was a problem hiding this comment.
Template bulk tests are asserting the wrong result field (instance_id).
Template bulk responses should be template-scoped. Reusing an instance-scoped fixture here means these tests won’t catch regressions where template_id is broken or missing.
Suggested fix
-func sampleBulkResponse() types.BulkResponse {
+func sampleBulkInstanceResponse() types.BulkResponse {
return types.BulkResponse{
Results: []types.BulkOperationResult{
{InstanceID: "1", Status: "success"},
{InstanceID: "2", Status: "success"},
{InstanceID: "3", Status: "error", Error: "not found"},
},
}
}
+
+func sampleBulkTemplateResponse() types.BulkResponse {
+ return types.BulkResponse{
+ Results: []types.BulkOperationResult{
+ {TemplateID: "1", Status: "success"},
+ {TemplateID: "2", Status: "success"},
+ {TemplateID: "3", Status: "error", Error: "not found"},
+ },
+ }
+}-assert.Contains(t, out, "instance_id: \"1\"")
+assert.Contains(t, out, "template_id: \"1\"")Use sampleBulkTemplateResponse() across template publish/unpublish/delete output tests.
Also applies to: 892-894, 1022-1024, 1228-1230
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cli/cmd/bulk_test.go` around lines 25 - 31, The template bulk tests are using
the instance-scoped fixture sampleBulkResponse instead of a template-scoped
fixture, so swap those usages to use sampleBulkTemplateResponse; locate the
template publish/unpublish/delete test cases that call sampleBulkResponse
(mentions around the blocks that assert instance_id) and replace those calls
with sampleBulkTemplateResponse, ensuring the assertions check template_id in
the returned types.BulkResponse.Results entries (and that
sampleBulkTemplateResponse returns TemplateID/populates the template-scoped
fields expected by the tests).
| for i := range prefs { | ||
| assert.Equal(t, prefs[i].EventType, got[i].EventType, "event_type[%d]", i) | ||
| assert.Equal(t, prefs[i].Enabled, got[i].Enabled, "enabled[%d]", i) | ||
| } |
There was a problem hiding this comment.
Avoid order-dependent preference assertions.
At Line 63, comparing prefs[i] with got[i] assumes stable ordering from the API. If ordering changes, this test can fail even when payload content is correct.
Proposed fix
got, err := c.UpdateNotificationPreferences(prefs)
require.NoError(t, err, "update notification preferences (echo)")
require.Len(t, got, len(prefs), "response length must match input")
- for i := range prefs {
- assert.Equal(t, prefs[i].EventType, got[i].EventType, "event_type[%d]", i)
- assert.Equal(t, prefs[i].Enabled, got[i].Enabled, "enabled[%d]", i)
- }
+ assert.ElementsMatch(t, prefs, got, "response must match input regardless of order")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for i := range prefs { | |
| assert.Equal(t, prefs[i].EventType, got[i].EventType, "event_type[%d]", i) | |
| assert.Equal(t, prefs[i].Enabled, got[i].Enabled, "enabled[%d]", i) | |
| } | |
| got, err := c.UpdateNotificationPreferences(prefs) | |
| require.NoError(t, err, "update notification preferences (echo)") | |
| require.Len(t, got, len(prefs), "response length must match input") | |
| assert.ElementsMatch(t, prefs, got, "response must match input regardless of order") |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cli/test/live/notification_live_test.go` around lines 63 - 66, The test is
order-dependent because it compares prefs[i] to got[i]; change it to check by
EventType instead: assert the lengths match, build a map from got items keyed by
EventType (e.g., gotByType := map[string]NotificationPreference{...}), then
iterate over prefs and for each prefs[j].EventType look up the corresponding got
entry and assert its Enabled equals prefs[j].Enabled (and that the lookup
exists). Update the loop using prefs and gotByType rather than index-based
comparisons to avoid ordering issues.
|
Actionable comments posted: 0 |
…kend Closes #96. Boots k8s-stack-manager (origin/main, api-only docker-compose profile) in the same job, applies a SQL fixture so require* helpers don't skip, mints a 1-day API key via the public auth endpoints, then runs the live suite. PR-time signal that the wire contracts still match — the class of bug stub-based unit tests can't see (recently surfaced in PR #98). What ships: .github/workflows/live-tests.yml - Triggers on push to main + every PR targeting main. - Clones k8s-sm main, `docker compose up backend` (no --profile → api-only, since frontend is gated to "full"). - Polls /health/live for up to 60s. - Pipes cli/test/live/testdata/ci-seed.sql into the mysql container. - Logs in as the env-seeded admin, mints a key via POST /api/v1/users/<id>/api-keys with expires_in_days=1, masks it in logs, exports as STACKCTL_LIVE_API_KEY. - `go test -tags live -timeout 10m ./test/live/...` - Dumps backend logs on failure. cli/test/live/testdata/ci-seed.sql - 1 cluster (no kubeconfig — test-connection will fail, tests expect that), 1 stack_definition + chart_config, 1 published stack_template + template_chart_config. - owner_id is looked up via SELECT … WHERE username='admin' so we don't need to inject the non-deterministic admin UUID. cli/test/live/doc.go - Documents the env vars + how to reproduce the CI flow locally (clone k8s-sm, compose up, apply seed, mint key, run suite). Design decisions (from the open questions in #96): - Bootstrap: backend SQL seed for fixtures, login-then-mint for the API key. Avoids hardcoding bcrypt/SHA-256 hashes in test data. - Backend pin: track origin/main. Surfaces cross-repo contract breaks immediately — that's exactly the signal #96 wants. - Workflow location: separate file. Easy to mark non-required if it ever gets flaky without blocking the rest of CI. Two contract-drift gotchas surfaced while validating against rancher-desktop k8s-sm (the value this workflow will catch on PRs): - POST /api/v1/users/<id>/api-keys requires expires_at OR expires_in_days. Initial draft sent neither and got 400. - The minted-key response field is `raw_key`, not `key`. Both fixed in the workflow + doc.go before pushing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
k8s-stack-manager's docker-compose.yml bind-mounts ./backend:/app for hot-reload during dev. With target: production, that overlays the image's baked-in /app/main binary with the host source tree, which has no compiled binary — container exits immediately with "stat ./main: no such file or directory". Ships testdata/docker-compose.ci.yml as a compose override that resets the volumes list (Compose ≥1.28 !reset semantics), and wires both -f files into every compose call in the workflow. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
If login or api-key mint shapes drift, the workflow dumps the raw response to stderr for debugging. Both responses carry credentials (JWT in login, raw_key in mint) that shouldn't end up in CI logs even on the failure branch — masked outputs still surface the field names and status, which is the actual diagnostic value. Addresses CodeRabbit on PR #99. The sister findings about pinning actions/checkout@v6 and actions/setup-go@v6 to commit SHAs and adding persist-credentials: false are deliberately skipped — none of the four existing workflows in this repo follow that policy, and the right place to harden is a dedicated repo-wide change, not a one-workflow special case. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
f9faebf to
a38daf1
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
.github/workflows/live-tests.yml (1)
36-40:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPin action SHAs and disable checkout credential persistence.
Line 37 and Line 40 use floating action tags, and Line 36 does not set
persist-credentials: false. This is still a supply-chain/security hardening gap.Suggested patch
- name: Check out stackctl - uses: actions/checkout@v6 + uses: actions/checkout@<full_commit_sha_for_v6> + with: + persist-credentials: false - name: Set up Go - uses: actions/setup-go@v6 + uses: actions/setup-go@<full_commit_sha_for_v6>#!/bin/bash set -euo pipefail # Verify actions are pinned to SHAs and checkout disables credential persistence. nl -ba .github/workflows/live-tests.yml | sed -n '30,55p' echo rg -n '^\s*uses:\s*actions/(checkout|setup-go)@' .github/workflows/live-tests.yml echo rg -n 'persist-credentials:\s*false' .github/workflows/live-tests.yml🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/live-tests.yml around lines 36 - 40, Update the workflow to pin actions to specific SHAs and disable credential persistence: replace floating tags actions/checkout@v6 and actions/setup-go@v6 with their respective full SHA-pinned refs, and add persist-credentials: false under the checkout step (referencing the uses: actions/checkout and uses: actions/setup-go entries and the persist-credentials key) so the checkout does not leave credentials in the runner.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In @.github/workflows/live-tests.yml:
- Around line 36-40: Update the workflow to pin actions to specific SHAs and
disable credential persistence: replace floating tags actions/checkout@v6 and
actions/setup-go@v6 with their respective full SHA-pinned refs, and add
persist-credentials: false under the checkout step (referencing the uses:
actions/checkout and uses: actions/setup-go entries and the persist-credentials
key) so the checkout does not leave credentials in the runner.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 1a6c76bb-420c-487a-b2a2-435fc701e1ce
⛔ Files ignored due to path filters (2)
cli/test/live/testdata/ci-seed.sqlis excluded by!**/testdata/**cli/test/live/testdata/docker-compose.ci.ymlis excluded by!**/testdata/**
📒 Files selected for processing (2)
.github/workflows/live-tests.ymlcli/test/live/doc.go
✅ Files skipped from review due to trivial changes (1)
- cli/test/live/doc.go
Upstream backend healthcheck runs `curl -f http://localhost:8081/health/live` inside the container, but the production alpine image only ships ./main + helm + ca-certificates — no curl. The healthcheck fails forever and `docker compose up --wait` hits its timeout. We already poll /health/live from the host in the next step, so the in-container check is redundant. Override with `test: ["NONE"]` and keep the host-side gate. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Last attempt set `test: ["NONE"]` to disable the broken curl-based healthcheck, but `docker compose up --wait` errors out with "no healthcheck configured" instead of treating "running" as ready. Alpine's base image ships busybox `wget` (curl is not installed), so override the healthcheck command to use that. Shorter interval + retries chosen so a healthy backend is flagged ready within ~10s rather than upstream's 75s start_period. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The CI live-tests workflow runs against a stub cluster (no kubeconfig, no real kube-apiserver). The backend correctly returns 500 with "Failed to connect to cluster" for diagnostic endpoints in that state, so two subtests of TestLiveCluster_HealthAndTest were failing in CI while passing locally against rancher-desktop. Two changes: 1. The `health` subtest now skips on the same "cluster unreachable" conditions the `nodes` subtest already handles (added in #98 via CodeRabbit's tightening). Symmetric with `nodes` so a wire-shape regression on either endpoint still surfaces as a failure. 2. The substring set is extracted into isClusterUnreachable() and widened to include "failed to connect to cluster" — the actual literal in the backend's 500 response when kubeconfig is empty or the cluster is down. Verified against rancher-desktop k8s-sm: all three subtests still pass (real cluster, real kubeconfig). The CI api-only path will now skip the two diagnostic subtests cleanly instead of failing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cli/test/live/cluster_live_test.go (1)
64-112: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winBring this test in line with required table-driven + parallel test conventions.
This still uses ad-hoc subtests and is missing
t.Parallel()on the parent and subtests, which violates the repo test rules for this path.As per coding guidelines, "
**/*_test.go: Usetestify/assertwith table-driven tests andt.Parallel()on parent and subtests" and "cli/**/*_test.go: ... with table-driven test patterns".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cli/test/live/cluster_live_test.go` around lines 64 - 112, The TestLiveCluster_HealthAndTest function must be converted to a table-driven test with t.Parallel() on the parent and each subtest: replace the ad-hoc t.Run calls with a slice of test cases (e.g., names "health", "test_connection", "nodes") and iterate calling t.Run(tc.name, func(t *testing.T){ t.Parallel(); ... }), keeping the existing test logic inside the corresponding case bodies (use the same calls GetClusterHealth, TestClusterConnection, GetClusterNodes and preserve isClusterUnreachable handling and assertions); ensure the top-level TestLiveCluster_HealthAndTest begins with t.Parallel() and use the table-driven pattern required by the repo.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cli/test/live/cluster_live_test.go`:
- Around line 121-124: The skip-token list used in the test loop (the []string
literal iterated by for _, needle := range ...) is too broad because it includes
"unavailable"; remove that generic token and replace it with a cluster-specific
token such as "unreachable" (if that was the intended wording), leaving only
cluster-related phrases like "not reachable", "unreachable", and "connection
refused" so unrelated backend errors still surface as test failures.
---
Outside diff comments:
In `@cli/test/live/cluster_live_test.go`:
- Around line 64-112: The TestLiveCluster_HealthAndTest function must be
converted to a table-driven test with t.Parallel() on the parent and each
subtest: replace the ad-hoc t.Run calls with a slice of test cases (e.g., names
"health", "test_connection", "nodes") and iterate calling t.Run(tc.name, func(t
*testing.T){ t.Parallel(); ... }), keeping the existing test logic inside the
corresponding case bodies (use the same calls GetClusterHealth,
TestClusterConnection, GetClusterNodes and preserve isClusterUnreachable
handling and assertions); ensure the top-level TestLiveCluster_HealthAndTest
begins with t.Parallel() and use the table-driven pattern required by the repo.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 3363f8fc-b532-4f79-82fc-0af4ae935c82
⛔ Files ignored due to path filters (1)
cli/test/live/testdata/docker-compose.ci.ymlis excluded by!**/testdata/**
📒 Files selected for processing (1)
cli/test/live/cluster_live_test.go
Drop "unavailable" — too generic, could mask a legitimate 503 from a wire-shape regression. Add "unreachable" instead, which matches the backend's models.ClusterUnreachable status constant and is the more accurate token for a cluster-down condition. Addresses CodeRabbit on PR #99. The companion suggestion to convert TestLiveCluster_HealthAndTest to a table-driven + t.Parallel() form is deliberately skipped: none of the four sibling *_live_test.go files in this package follow that pattern (the convention applies to unit tests; live tests against a shared backend run serially by design), and a mass-refactor of the live suite is out of scope for a workflow PR. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
Actionable comments posted: 0 |
…okups Two correctness fixes from CodeRabbit on PR #100: - apikey_live_test.go: promote the raw_key length check from assert.Truef to require.Truef. A failing length check followed by created.RawKey[:3] would panic instead of failing cleanly. - user_live_test.go: after DisableUser/EnableUser, look up the created user in the list response first (find-first pattern, matched by the other live tests) and require.NotNil before asserting on the flag. The previous range-and-skip would silently pass if the user was missing from the response. The companion table-driven + t.Parallel() refactor suggestion for cleanup_policy_live_test.go is deliberately skipped — same reason as on PR #99: every other *_live_test.go file in this package uses ad-hoc subtests and runs serially against a shared backend by design. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…olicies, cluster CRUD (#100) * test(live): expand suite — apikey CRUD, user lifecycle, template versions, cleanup policies, cluster CRUD Adds five new endpoint-group live tests covering the highest-blast-radius surfaces still missing from cli/test/live/. All tests are wire-shape focused (no real workloads created), follow the existing helpers/cleanup conventions in this package, and run cleanly under the CI api-only flow introduced in #99. New files: apikey_live_test.go - Create → list → revoke cycle against the calling user (whoami). - Locks the raw_key contract: sk_-prefixed, returned once, never in list. Was implicitly relied on by the CI bootstrap but never asserted. user_live_test.go - Register a throwaway user, list (admin-only path), disable, enable, reset-password, delete. Never operates on admin — locking out the caller would break the rest of the suite. template_versions_live_test.go - Publishes the same template twice (description-only change in between) to materialise two version snapshots, then exercises list → get → diff with shape assertions on left/right/chart_diffs. cleanup_policy_live_test.go - Full admin CRUD plus a dry-run execution. The condition "idle_days:9999" deliberately matches nothing so the run never mutates a real instance. cluster_lifecycle_live_test.go - Stub-cluster create → get → update → delete. IsDefault stays false so the test never disrupts requireCluster() for other tests. Exercises registry_* + image_pull_secret_name fields (the registry_password drift in PR #95 is the canonical example of why this surface needs a live test). Bonus findings surfaced during local validation against rancher-desktop (left as commented follow-ups, not blockers for this PR): - stackctl's UpdateTemplateRequest.Name is `omitempty` but the backend rejects PUT with "name is required" when omitted. Either drop the omitempty or relax the backend. - stackctl's CreateTemplateRequest has no `version` field, so the template-level Version is unsettable through the CLI — the version snapshot's `version` round-trips empty as a result. - Backend rejects kubeconfig_data unless KUBECONFIG_ENCRYPTION_KEY is configured (the CI compose env doesn't set it); kubeconfig_path works without that prerequisite. Verified locally against rancher-desktop k8s-stack-manager: full live suite passes (21 passed, 2 skipped by design, 0 failed). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(live): address review — guard raw_key slice + find-first user lookups Two correctness fixes from CodeRabbit on PR #100: - apikey_live_test.go: promote the raw_key length check from assert.Truef to require.Truef. A failing length check followed by created.RawKey[:3] would panic instead of failing cleanly. - user_live_test.go: after DisableUser/EnableUser, look up the created user in the list response first (find-first pattern, matched by the other live tests) and require.NotNil before asserting on the flag. The previous range-and-skip would silently pass if the user was missing from the response. The companion table-driven + t.Parallel() refactor suggestion for cleanup_policy_live_test.go is deliberately skipped — same reason as on PR #99: every other *_live_test.go file in this package uses ad-hoc subtests and runs serially against a shared backend by design. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Olof Mattsson <olof.mattsson@klaravik.se> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Closes #96.
Adds a CI job that boots
k8s-stack-manager(origin/main, api-only docker-compose profile) in the same runner, applies a SQL fixture, mints a 1-day API key, then runscli/test/live/...against it. PR-time signal that the wire contracts still match between the two repos — the class of bug stub-based unit tests can't see, and that surfaced four times in the past week alone (#95, k8s-sm#264, #98).Design decisions
Three open questions in #96 were resolved before writing the workflow:
origin/main.github/workflows/live-tests.ymlValidation
Bootstrap flow validated against the rancher-desktop k8s-sm before pushing:
SHOW COLUMNSconfirms the SQL fixture matches the GORM-migrated schema forclusters,stack_definitions,chart_configs,stack_templates,template_chart_configs,users./api/v1/auth/mereturns{id, username, role}as expected.POST /api/v1/users/<id>/api-keysreturns.raw_key(validated key minted + revoked).Two contract-drift gotchas surfaced during that validation (precisely the kind of bug this workflow will surface on PRs):
POST /api/v1/users/<id>/api-keysrequiresexpires_atORexpires_in_days. Initial draft sent neither → 400.raw_key, notkey.Both fixed in the workflow +
cli/test/live/doc.gobefore pushing.Depends on
#98 (live suite expansion + bulk wire-contract fix). The live tests this job runs were introduced there — needs to merge first.
Test plan
livejob passes)cli/test/live/doc.goproduces the same result🤖 Generated with Claude Code
Summary by CodeRabbit
Chores
Documentation
Bug Fixes