ci: add post-deploy smoke test workflow#18
Conversation
|
Hey! Your PR title Please update it to start with one of:
Where See CONTRIBUTING.md for details. |
|
The following comment was made by an LLM, it may be inaccurate: |
Claude Single-Pass ReviewSummaryThis PR adds a workflow_run-triggered smoke test workflow and corrects the runner for build-push.yml from an unavailable Blacksmith runner to ubuntu-latest. The workflow structure is sound, but the smoke test script has two bugs that will cause spurious test failures in CI. Findings[FINDING-1] issue: P1 | scripts/ci/test-opencode-integration.sh:46 | Test 3 comparison has a trailing space in the string literal — comparing against "200 " (with space) instead of "200" means this test will always fail. Verified via Read: if [ "$HTTP_CODE" = "200 " ]; then. Fix: Remove the trailing space, change "200 " to "200". [FINDING-2] issue: P1 | scripts/ci/test-opencode-integration.sh:60 | Typo HTTP_CDE (missing P) in the Test 4 failure message — the variable is unset and will expand to empty or cause an error under set -u. Verified via Read: fail "Session creation expected 200, got ${HTTP_CDE}". Fix: Change HTTP_CDE to HTTP_CODE. [FINDING-3] nit: P2 | .github/workflows/build-push.yml:14 | Runner changed to ubuntu-latest but project convention calls for ubicloud-standard-2. Consider switching once Ubicloud is available for this fork, or document the exception. [FINDING-4] nit: P2 | .github/workflows/smoke-test.yml:6 | workflow_run fires from the default branch workflow definition, not the PR branch. Changes here will not take effect until merged — worth noting for future contributors. Code Quality
Recommendation[ ] Approve | [ ] Approve with changes | [x] Request changes |
Claude Review - Out-of-Diff FindingsThe following findings are on lines outside the PR diff: scripts/ci/test-opencode-integration.sh:46 scripts/ci/test-opencode-integration.sh:60 .github/workflows/build-push.yml:14 .github/workflows/smoke-test.yml:6 |
Claude Single-Pass ReviewSummaryThis PR adds a Findings[FINDING-1] issue: P1 | .github/workflows/build-push.yml:14 | Runner regresses to [FINDING-2] nit: P2 | .github/workflows/smoke-test.yml:20 | Code Quality
Recommendation[ ] Approve | [ ] Approve with changes | [x] Request changes |
andreiships-bot
left a comment
There was a problem hiding this comment.
Claude Review
See inline comments for details.
303c7ba to
696147b
Compare
Codex ReviewSummaryPR scope is small and focused, but the workflow currently validates post-build state rather than post-deploy state. That mismatch can produce false-positive smoke results for the intended goal. Findings[FINDING-1] issue: P1 | Code Quality
Architecture
PR MetadataSuggested PR Title: Recommendation
Escalate to Gemini?
|
Gemini Deep ReviewSummaryThe PR introduces a post-deploy smoke test workflow that validates the headless OpenCode server after successful Docker image builds. While the workflow follows project runner conventions, there are risks regarding deployment synchronization and authentication that could lead to false negatives or test failures in production. Findings[gemini-1] issue: P1 | .github/workflows/smoke-test.yml:16 | Synchronization Risk: The smoke test triggers immediately after the Docker image is pushed to the registry. However, [gemini-2] issue: P1 | .github/workflows/smoke-test.yml:19 | Authentication Gap: The [gemini-3] nit: P2 | .github/workflows/smoke-test.yml:18 | Hardcoded URL: The PR MetadataSuggested PR Title: Questions
Recommendation[ ] Approve | [x] Approve with changes | [ ] Request changes Note: Verify the auth and deployment sync before merging to avoid noisy CI failures on every release. |
Resolution SummaryResolving findings from Codex Review, Gemini Review, Claude Review:
|
Codex ReviewSummaryThe previous blocker is addressed: deployment now occurs inside FindingsNone. Code Quality
Architecture
PR MetadataSuggested PR Title: Recommendation
Escalate to Gemini?
|
Gemini Deep ReviewSummaryThe PR introduces a deployment step to Fly.io and a post-deployment smoke test. While the integration of these steps is a positive step for CI/CD, the smoke test is currently fragile due to the lack of retries for cold-starting Fly machines and fails to verify that the correctly tagged version was actually deployed. Findings[gemini-1] blocking: P0 | scripts/ci/test-opencode-integration.sh:30 | No retries for initial health check. [gemini-2] issue: P1 | scripts/ci/test-opencode-integration.sh:31 | Health check doesn't verify deployment version. [gemini-3] issue: P1 | .github/workflows/smoke-test.yml:19 | Smoke test URL is hardcoded. [gemini-4] nit: P2 | scripts/ci/test-opencode-integration.sh:30 | Suppressing curl errors makes debugging difficult. [gemini-5] nit: P2 | .github/workflows/build-push.yml:52 | Unpinned PR MetadataSuggested PR Title: Questions
Recommendation[ ] Approve | [ ] Approve with changes | [x] Request changes |
Resolution SummaryResolving findings from Gemini Review:
|
Codex ReviewSummaryThis PR adds automated deploy + smoke coverage for tagged releases, but there is one critical CI security risk and one release-validation reliability gap that should be fixed before merge. Findings[FINDING-1][codex-1] blocking: P0 | .github/workflows/build-push.yml:53 | [FINDING-2][codex-2] issue: P1 | .github/workflows/smoke-test.yml:4 | Code Quality
Architecture
PR MetadataSuggested PR Title: Recommendation[ ] Approve | [ ] Approve with changes | [x] Request changes Escalate to Gemini?[x] Yes - security-sensitive CI/deploy workflow changes (privileged token + mutable action ref) | [ ] No |
Gemini Deep ReviewSummaryThe PR introduces a post-deploy smoke test for the Fly.io deployment, but it contains a critical authentication gap that will cause the tests to fail on a secured server, and a trigger limitation that prevents verification within the PR itself. Findings[gemini-1] blocking: P0 | HTTP_CODE=$(curl -sf -o "${TMPDIR_RUN}/health-response.json" -w "%{http_code}" "${BASE_URL}/global/health" 2>/dev/null || echo "000")Fix: Update the script to accept [gemini-2] blocking: P0 | [gemini-3] issue: P1 | [gemini-4] issue: P1 | [gemini-5] suggestion: P2 | [gemini-6] nit: P2 | PR MetadataSuggested PR Title: Questions
Recommendation[ ] Approve | [ ] Approve with changes | [x] Request changes |
Resolution SummaryResolving findings from Codex Review, Gemini Review:
|
Codex ReviewSummaryPR #18 is focused and appropriately scoped for CI/deploy automation. I found one must-fix determinism issue in the new Findings[codex-1][FINDING-1] issue: P1 | .github/workflows/smoke-test.yml:14 | - uses: actions/checkout@v4
with:
ref: ${{ github.event.workflow_run.head_sha }}Code Quality
Architecture
PR MetadataSuggested PR Title: Recommendation[ ] Approve | [ ] Approve with changes | [x] Request changes Escalate to Gemini?[ ] Yes - [reason] | [x] No |
Gemini Deep ReviewSummaryThe PR implements a post-deploy smoke test workflow that triggers after a successful Docker build and push (on version tags). It adds a Fly.io deployment step to the Findings[gemini-1] nit: P2 | .github/workflows/smoke-test.yml:15 | Style inconsistency in action reference - uses: actions/checkout @v4Verified via context: Other workflows like [gemini-2] issue: P1 | scripts/ci/test-opencode-integration.sh:38 | Brittle JSON parsing in smoke test script HEALTHY=$(grep -o '"healthy":true' "${TMPDIR_RUN}/health-response.json" || true)Fix: Use HEALTHY=$(jq -r '.healthy' "${TMPDIR_RUN}/health-response.json")
if [ "$HEALTHY" = "true" ]; then ...[gemini-3] issue: P1 | scripts/ci/test-opencode-integration.sh:78 | Brittle regex for session ID extraction SESSION_ID=$(grep -o '"id":"[^"]*"' "${TMPDIR_RUN}/session-create.json" | head -1 | cut -d'"' -f4 || true)Fix: Use SESSION_ID=$(jq -r '.id' "${TMPDIR_RUN}/session-create.json")[gemini-4] question: P2 | scripts/ci/test-opencode-integration.sh:69 | Missing Authentication in Smoke Test [gemini-5] suggestion: P2 | scripts/ci/test-opencode-integration.sh | Lack of session cleanup PR MetadataSuggested PR Title: Questions
Recommendation[ ] Approve | [x] Approve with changes | [ ] Request changes |
Resolution SummaryResolving findings from Codex Review, Gemini Review:
|
Dual-Review Summary (Round 5)
Convergence: ❌ Not achieved (3 blocking findings) Rounds: 5 Finding Classification: NEW: 0 | RE-RAISED: 1 Findings[Codex #1] P1 -
[Gemini #1] P2 -
[Gemini #2] P1 -
[Gemini #3] P1 -
[Gemini #4] P2 -
[Gemini #5] P2 -
|
Summary
smoke-test.ymlworkflow triggered byworkflow_runon "Build and Push Docker Image"scripts/ci/test-opencode-integration.shagainst the live Fly.io endpoint after successful deploysconclusion == 'success')pistachiorama-opencode.fly.dev)Related
Spec 135 (OpenCode fork CI QoL improvements), Phase 3
Test plan