[chore] Wire GitHub Actions runner env vars + secrets for the CI suite #19

@dogkeeper886

Spec: docs/feature-requests/FR-005-github-actions-env-vars-and-secrets.md (FR-005)

Context

All .github/workflows/*.yml are currently workflow_dispatch only (set during issue #5). Running them on a GitHub-hosted runner today would mostly work — TL_PORT, TL_URL, and TL_DEV_KEY all have correct defaults in cicd/scripts/ and cicd/tests/src/ — except for the LLM judge:

  • LLM_JUDGE_URL defaults to http://localhost:11434, where nothing is listening on a clean runner. Depending on the code path, the judge either blocks for its 5-minute timeout on each test or falls back to "not available".
  • LLM_JUDGE_MODEL defaults to llama3:8b, which won't match whatever model we actually point at.
  • Other secrets (e.g., a rotated TL_DEV_KEY) are not currently plumbed.

Locally, these values live in cicd/tests/.env (gitignored). On a runner, there's no .env. The current shell scripts source .env if present and fall back to env-var defaults otherwise, so the plumbing is in place — what's missing is a decision on how the runner gets the values.
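
For reference, the "source .env if present, otherwise fall back to defaults" pattern described above could look roughly like the sketch below. This is an illustration, not the code in cicd/scripts/: the function name load_env is hypothetical, and only the two judge variables and their defaults are taken from this issue.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the loading pattern; the real scripts may differ.

load_env() {
  # Path to the optional .env file (assumed default: cicd/tests/.env).
  local env_file="${1:-cicd/tests/.env}"

  if [ -f "$env_file" ]; then
    # set -a exports every variable assigned while the file is sourced.
    set -a
    . "$env_file"
    set +a
  fi

  # Apply the documented defaults only where nothing has been set yet,
  # so values exported by a workflow "env:" block are left untouched.
  : "${LLM_JUDGE_URL:=http://localhost:11434}"
  : "${LLM_JUDGE_MODEL:=llama3:8b}"
  export LLM_JUDGE_URL LLM_JUDGE_MODEL
}
```

With this shape, a runner without a .env file simply inherits whatever the workflow exports, which is why option B below needs no script changes.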

Decisions needed

  1. Where does the LLM judge live? Candidates:
    • Self-hosted runner with network access to an Ollama instance
    • A hosted model endpoint (Anthropic/OpenAI) via API key
    • A cloud Ollama VM behind a secret-gated URL
    • Skip LLM judging in CI (--no-llm) and rely on the simple judge
  2. How are secrets passed to the runner? Two viable shapes:
    • A. Materialize .env. Workflow step writes cicd/tests/.env from ${{ secrets.* }} / ${{ vars.* }} before run-tests.sh runs. Matches local dev exactly. Downside: secrets touch disk briefly.
    • B. Export via workflow env:. Workflow sets env vars directly; scripts pick them up since .env sourcing is optional and defaults exist. No file written. Recommended starting point — simpler, secrets stay in process env.
  3. Trigger scope. Stay workflow_dispatch only, or add push/PR triggers to the main branch? Automated triggers make cost/runtime a live concern if a self-hosted runner isn't available.
  4. Rotating TL_DEV_KEY. The hardcoded seed (a1b2c3d4...) is fine for ephemeral CI containers; a rotated value via secret would prove the env-var path works end-to-end. Low value unless paired with a separate test that uses the key.
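
If option A were chosen for decision 2, the workflow step that materializes the file might call something like the sketch below. It assumes the workflow has already mapped the relevant secrets and variables into the step's environment; the function name write_env_file is hypothetical, and values are assumed to contain no whitespace or shell metacharacters.

```shell
#!/usr/bin/env bash
# Hypothetical helper for option A: write cicd/tests/.env from the step's
# environment. Not part of the repository; a sketch under the assumptions above.

write_env_file() {
  local target="${1:-cicd/tests/.env}"
  (
    # Fail before touching the file if a required value is missing.
    : "${LLM_JUDGE_URL:?LLM_JUDGE_URL is not set}"
    : "${LLM_JUDGE_MODEL:?LLM_JUDGE_MODEL is not set}"

    # Create the file readable by the current user only, since it holds secrets.
    umask 077
    {
      printf 'LLM_JUDGE_URL=%s\n' "$LLM_JUDGE_URL"
      printf 'LLM_JUDGE_MODEL=%s\n' "$LLM_JUDGE_MODEL"
      # TL_DEV_KEY is optional: omit the line so the hardcoded seed still applies.
      if [ -n "${TL_DEV_KEY:-}" ]; then
        printf 'TL_DEV_KEY=%s\n' "$TL_DEV_KEY"
      fi
    } > "$target"
  )
}
```

Under option B none of this is needed: the same variables are set in the workflow's env: block and the scripts read them from the process environment.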

Scope

Must

  • Decide item 1 (LLM judge location) and item 2 (secret transport) from "Decisions needed". Document the decision in CLAUDE.md or a new doc so the runner setup is reproducible.
  • Update .github/workflows/test-pipeline.yml and test-suite.yml to set the required env vars (LLM_JUDGE_URL, LLM_JUDGE_MODEL, and TL_DEV_KEY if rotated) via the chosen transport.
  • Confirm the full suite runs green end-to-end on a runner — or that --no-llm is the deliberate default and the simple judge carries CI.
  • Document how a developer adds or rotates a secret — single paragraph in CLAUDE.md or a new cicd/CI_SETUP.md.

Nice to have

  • Optional re-enablement of automatic triggers (push-to-main or PR-to-main), gated on runner cost/availability.
  • A smoke-only CI variant that skips the build+plan+execution suites (cheaper, runs on every PR) while the full suite stays manual.

Out of scope

  • Replacing the LLM judge. Separate FR if the decision is to move to a hosted API.
  • Refactoring the env-var plumbing inside the test framework — it already handles external env vars cleanly. The only gap is the runner-side wiring.

Acceptance criteria

  • A fresh run of the test-pipeline workflow on GitHub Actions completes end-to-end (simple judge at minimum; LLM judge if decision 1 provides a reachable endpoint).
  • cicd/tests/.env remains the local-dev source of truth and is not referenced by workflows directly.
  • Documentation names the secrets/variables a developer must add to the repo settings, with example values.

Spec

FR doc to follow alongside this issue.
