
feat(template-drift): nightly diff vs service-template (Task 33)#8

Merged
ensarkovankaya merged 3 commits into main from feat/template-drift on May 10, 2026
@ensarkovankaya
Contributor

What

Nightly drift detector. Diffs each backend service repo against an init.sh-rendered copy of paper-board/service-template@main. On drift, files (or appends a comment to) a GitHub issue in this repo.
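
A rough sketch of the find-or-append step, assuming the gh CLI is available on the runner and authenticated (for example through GH_TOKEN). The function name `file_or_update_drift_issue` and the search query are illustrative and not taken from the workflow; the title follows the "drift: <svc> diverges from service-template" pattern.

```bash
# Hypothetical helper, not the workflow's actual step.
# Assumes `gh` is installed and authenticated.
file_or_update_drift_issue() {
  local repo="$1"
  local svc="$2"
  local report_file="$3"
  local title="drift: ${svc} diverges from service-template"
  local number

  # Look for an open issue with the expected title; prints nothing if none.
  number=$(gh issue list --repo "$repo" --state open \
    --search "\"${title}\" in:title" \
    --json number --jq '.[0].number // empty')

  if [ -n "$number" ]; then
    # An issue is already open: append the new report as a comment.
    gh issue comment "$number" --repo "$repo" --body-file "$report_file"
  else
    # No open issue yet: create one.
    gh issue create --repo "$repo" --title "$title" --body-file "$report_file"
  fi
}
```

Usage would be along the lines of `file_or_update_drift_issue paper-board/.github agents drift-report.txt`. The sketch trusts the search result; a stricter version could compare the returned title for an exact match before commenting.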

Why

Service-template (Task 32) gives new services a clean baseline. Without drift CI, services slide off the baseline silently — middleware.go grows custom imports, release.yml loses the cosign-warning comment, helm chart structure diverges. Drift CI surfaces this nightly so we catch it before it compounds.

Files

| Path | Purpose |
| --- | --- |
| .github/workflows/template-drift.yml | Nightly cron (06:00 UTC) + workflow_dispatch. Matrix per service. |
| scripts/compare-template.sh | 2-tier comparator (exact + presence). Exit 0 = clean, 1 = drift. |
| template-drift-allowlist.md | Comparator semantics doc, per-service overrides, known drift, how to update. |

Comparator semantics

Tier 1 — exact content match (drift = required fix):

  • internal/middleware/middleware.go
  • migrations/embed.go
  • Dockerfile
  • .github/workflows/release.yml
  • .github/workflows/ci.yml (after enable-e2e flip)

Tier 2 — presence-only (path must exist; content may diverge): 19 skeleton paths covering Makefile, .golangci.yml, docker-compose.yaml, sqlc.yaml, cmd/, internal/{api,core,config,store}, helm/<svc>/, scripts/cover-check.sh, migrations/000001_*.{up,down}.sql (glob).

Tier 3 — implicit (everything else): unchecked. Services own their domain code.
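
The tiers above can be sketched as two small checks plus a driver. This is a minimal illustration, not the contents of scripts/compare-template.sh: the function names, the status words (OK, DIVERGED, MISSING, PRESENT, ABSENT), and the newline-separated lists are assumptions, and the Tier 2 list is an abbreviated subset of the 19 paths.

```bash
# Hypothetical sketch of a two-tier comparator.
TIER1_PATHS="internal/middleware/middleware.go
migrations/embed.go
Dockerfile
.github/workflows/release.yml
.github/workflows/ci.yml"

# Abbreviated subset of the Tier 2 skeleton paths (plain globs, no brace expansion).
TIER2_PATTERNS="Makefile
.golangci.yml
docker-compose.yaml
sqlc.yaml
scripts/cover-check.sh
migrations/000001_*.up.sql
migrations/000001_*.down.sql"

# Tier 1: the service file must exist and be byte-identical to the template's.
check_exact() {
  local template_dir="$1"
  local service_dir="$2"
  local path="$3"
  if [ ! -f "${service_dir}/${path}" ]; then
    echo "MISSING: ${path}"
    return 1
  fi
  if ! cmp -s "${template_dir}/${path}" "${service_dir}/${path}"; then
    echo "DIVERGED: ${path}"
    return 1
  fi
  echo "OK: ${path}"
}

# Tier 2: at least one path matching the glob must exist in the service repo.
check_presence() {
  local service_dir="$1"
  local pattern="$2"
  local match
  # The directory stays quoted; the unquoted pattern lets the shell expand the glob.
  for match in "${service_dir}"/$pattern; do
    if [ -e "$match" ]; then
      echo "PRESENT: ${pattern}"
      return 0
    fi
  done
  echo "ABSENT: ${pattern}"
  return 1
}

# Driver: returns 0 when clean, 1 when any Tier 1 or Tier 2 check fails.
compare_service() {
  local template_dir="$1"
  local service_dir="$2"
  local drift=0
  local entry
  while IFS= read -r entry; do
    check_exact "$template_dir" "$service_dir" "$entry" || drift=1
  done <<EOF
$TIER1_PATHS
EOF
  while IFS= read -r entry; do
    check_presence "$service_dir" "$entry" || drift=1
  done <<EOF
$TIER2_PATTERNS
EOF
  return "$drift"
}
```

Anything not named in either list is never inspected, which is how the implicit Tier 3 falls out of this shape.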

Per-service overrides

Today only agents has an override: enable_e2e: true. For that service the workflow uses sed to flip enable-e2e: false to enable-e2e: true in the rendered ci.yml before the diff runs.
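
A minimal sketch of that override step. The function name `apply_enable_e2e_override` and the exit code 2 for a missing pattern are assumptions for illustration; the workflow's actual sed invocation may differ.

```bash
# Hypothetical helper: flip the e2e toggle in a rendered ci.yml in place.
apply_enable_e2e_override() {
  local ci_file="$1"
  # Fail loudly if the template no longer contains the expected line,
  # rather than letting sed silently do nothing.
  if ! grep -q 'enable-e2e: false' "$ci_file"; then
    echo "setup error: 'enable-e2e: false' not found in ${ci_file}" >&2
    return 2
  fi
  # -i.bak works with both GNU and BSD sed; the backup is removed afterwards.
  sed -i.bak 's/enable-e2e: false/enable-e2e: true/' "$ci_file"
  rm -f "${ci_file}.bak"
}
```

The grep guard is a design choice: if the template ever renames the flag, the override would otherwise become a no-op and show up as unexplained Tier 1 drift on ci.yml.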

Setup requirement

secrets.DRIFT_PAT must be added at the org or repo level: a PAT with read access to the paper-board/* private repos (the repo scope on a classic PAT, or read-only Contents permission on a fine-grained one; neither PAT type has a scope literally named read:repo). Without it, every matrix job fails at the first cross-repo actions/checkout with an auth error. An admin needs to set this up before the first dispatch.

Local dry-run (agents)

✓ Tier 1: middleware.go, embed.go, Dockerfile, ci.yml all match.
✗ Tier 1: release.yml DIVERGED — agents lacks the cosign+SBOM WARNING
          comment block that template ships post-Task 31.
✓ Tier 2: all 19 skeleton paths present (migrations/000001_minimal vs
          000001_init both glob-match).

The 1 known drift is documented in template-drift-allowlist.md under "Known divergences". First nightly run (or first manual workflow_dispatch) will auto-file the corresponding issue. Resolution: hygiene PR adds the WARNING comment to agents (out of scope for this PR).

Validation

  • actionlint clean (docker rhysd/actionlint:latest, exit 0)
  • compare-template.sh shellcheck-friendly (set -euo pipefail; arg parser; clear exit codes)
  • ✓ Local dry-run vs agents detects exactly the 1 expected drift item

Spec

agent-manager/tasks/2026-05-02-adr-0010-gap-analysis.md §33.

🤖 Generated with Claude Code

Adds .github/workflows/template-drift.yml + scripts/compare-template.sh +
template-drift-allowlist.md.

Cron 06:00 UTC + workflow_dispatch. Per-service matrix entry; today only
agents (advisory_lock=3, port=8080, enable_e2e=true). Identity/billing/
platform commented as placeholders for when their repos land.

Comparator: 2 tiers. Tier 1 (exact match after init.sh render +
per-service overrides): middleware.go, embed.go, Dockerfile,
ci.yml, release.yml. Tier 2 (presence-only): 19 skeleton paths
covering Makefile, .golangci.yml, docker-compose.yaml, sqlc.yaml,
cmd/{server,migrator}/main.go, internal/{api,core,config,store}/*,
helm/<svc>/{Chart,values,templates/*}, scripts/cover-check.sh,
migrations/000001_*.{up,down}.sql (glob; descriptive suffix is service-local).

On drift: searches for an open issue titled
"drift: <svc> diverges from service-template" in paper-board/.github;
appends a comment if found, opens a new issue otherwise. issues:write
permission scoped at workflow level.

Cross-repo private read requires secrets.DRIFT_PAT (PAT with read:repo
on paper-board/*). Workflow header documents setup.

Local dry-run vs paper-board/agents reveals 1 expected drift item:
agents .github/workflows/release.yml lacks the cosign+SBOM WARNING
comment block that template ships (added post-Task 31). Tracked as
"open drift" in template-drift-allowlist.md; first nightly run will
auto-file an issue. Resolution: hygiene PR adds the comment to agents
in a follow-up.

Spec: agent-manager/tasks/2026-05-02-adr-0010-gap-analysis.md §33.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@greptile-apps

greptile-apps Bot commented May 9, 2026

Greptile Summary

  • Adds a nightly (cron: 0 6 * * *) drift detection workflow that renders paper-board/service-template@main via init.sh, applies per-service overrides, and diffs each matrix service repo using a two-tier comparator (exact match for skeleton files, presence-only for structural paths).
  • compare-template.sh guards Tier 1 checks for missing service-side files but not for missing template-side files; a template rendering gap falls through to the DIVERGED branch and emits a diff error inside a fence rather than a clear diagnostic. The issue-filing trigger fires on any non-zero exit (including exit code 2 setup errors), which can create misleading issues in the tracker.
  • Requires secrets.DRIFT_PAT to be provisioned by an admin before first run; documented in both the workflow header and PR description.

Confidence Score: 4/5

Safe to merge with two P2 edge-case fixes recommended before the first nightly run.

No P0 or P1 findings. Previous review comments (nested fence, SC2086, matrix injection) are all correctly addressed. Two P2s remain: the Tier 1 loop's missing template-side guard produces confusing diff output on a template rendering gap, and the issue-filing condition triggering on exit code 2 (setup error) can create misleading tracker noise. Both are low-probability edge cases that don't affect the happy path.

scripts/compare-template.sh (Tier 1 template-side guard) and .github/workflows/template-drift.yml (exit code 2 trigger condition)
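
One possible shape for the two P2 fixes, sketched under assumptions: the helper names `require_template_file` and `classify_exit` are hypothetical, and exit code 2 is used for setup errors as described in the summary above.

```bash
# Hypothetical Tier 1 guard: report a template rendering gap as a setup error
# (exit 2) instead of letting it fall through to the DIVERGED branch.
require_template_file() {
  local template_dir="$1"
  local path="$2"
  if [ ! -f "${template_dir}/${path}" ]; then
    echo "setup error: rendered template is missing ${path}" >&2
    return 2
  fi
}

# Hypothetical mapping from the comparator's exit code to a workflow action.
# Only "drift" should reach the issue-filing step.
classify_exit() {
  case "$1" in
    0) echo "clean" ;;
    1) echo "drift" ;;
    *) echo "setup-error" ;;
  esac
}
```

With this split, the workflow could file or update an issue only when `classify_exit` prints drift, and fail the job outright on setup-error so the tracker stays free of misleading reports.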

Important Files Changed

| Filename | Overview |
| --- | --- |
| .github/workflows/template-drift.yml | New nightly cron + workflow_dispatch that checks out service-template, renders it, then diffs each matrix service. Previous P1 (nested fence) and P2 (matrix injection) concerns addressed; issue-filing step triggers on any non-zero exit code including setup errors (exit 2). |
| scripts/compare-template.sh | Two-tier drift comparator with clean arg parsing and exit codes. Tier 1 loop guards for missing service-side files but not missing template-side files, which produces a misleading error inside a diff fence instead of a clear diagnostic. |
| template-drift-allowlist.md | Documentation-only companion file. Accurately mirrors the two-tier model, per-service overrides, and known divergences. No issues. |

Sequence Diagram

sequenceDiagram
    participant Cron as Cron or workflow_dispatch
    participant WF as template-drift.yml
    participant ST as service-template repo
    participant SVC as service repo
    participant Script as compare-template.sh
    participant GH as GitHub Issues API

    Cron->>WF: trigger 06:00 UTC or manual
    WF->>ST: "checkout@main via DRIFT_PAT"
    WF->>WF: render init.sh with SVC/PORT/LOCK params
    WF->>WF: apply per-service overrides
    WF->>SVC: checkout via DRIFT_PAT
    WF->>Script: --template rendered --service svc
    Script->>Script: Tier 1 exact diff middleware.go Dockerfile etc
    Script->>Script: Tier 2 presence check helm cmd internal
    Script-->>WF: exit 0 clean or exit 1 drift
    alt "exit_code != 0"
        WF->>GH: list open issues matching drift title
        alt open issue exists
            WF->>GH: append comment with drift report
        else no open issue
            WF->>GH: create new drift issue
        end
    end

Reviews (3): Last reviewed commit: "fix(template-drift): pipe matrix values ..."

P1: drop outer markdown fence around drift-report.txt in issue body. The
report already emits per-file ```diff fences from compare-template.sh;
the outer ``` block was prematurely closed at the first inner ```,
corrupting every filed issue body.

P2: quote $SERVICE_DIR in the migration glob ls call. The shellcheck
disable=SC2086 is for the glob portion, not the path; defensive against
service paths containing whitespace.

Greptile P2#2 (matrix value centralization) dismissed — that's the
standard YAML matrix pattern; per-service tuples are the right unit
when identity/billing/platform onboard.

Validation:
- actionlint clean
- Local dry-run vs agents: exit 1 (1 expected drift), report well-formed
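
The P1 fix can be illustrated with a small body builder. This is a sketch, not the workflow step itself: the function name `build_issue_body` and the summary text are assumptions. The point is that the report is emitted verbatim, so its own per-file diff fences are the only fences in the body.

```bash
# Hypothetical helper: wrap the drift report in a collapsible <details> block
# without adding an outer code fence around it.
build_issue_body() {
  local svc="$1"
  local report_file="$2"
  printf '<details>\n<summary>Drift report for %s</summary>\n\n' "$svc"
  # The report already contains its own diff fences; pass it through as-is.
  cat "$report_file"
  printf '\n</details>\n'
}
```

The blank line after the summary matters on GitHub: without it, the Markdown inside the details block is not rendered.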
@greptile-apps

greptile-apps Bot commented May 10, 2026

All three items handled cleanly in 486c5d0.

P1: the <details> + cat drift-report.txt pattern is exactly right. The comment in the workflow explaining why the outer fence was dropped is a nice touch for future maintainers.

P2 #1: "${SERVICE_DIR}"/migrations/schema/000001_*."${direction}".sql is the correct idiom. The inline comment pinpointing that SC2086 targets the glob, not the path variable, makes the suppression self-documenting.

P2 #2: agreed, the matrix block as registry is the right call. The How to add a new service section in the allowlist doc already covers the onboarding steps, so the indirection would be pure overhead.

LGTM — good to merge.


Direct ${{ matrix.* }} interpolation in run: shell strings becomes a
shell injection vector the moment matrix values come from a user-facing
source (workflow_dispatch input, issue-ops trigger). Pre-empt by piping
each value through a step-level env: block and dereferencing as a shell
variable. Hardcoded values today, but the pattern is now safe under
any future input source.

Also drops shellcheck disable=SC2086 on the migration glob — verified
with koalaman/shellcheck:stable that the disable was unnecessary once
${SERVICE_DIR} got quoted.

Validation: actionlint clean, shellcheck clean, local dry-run vs agents
exit 1 (1 expected drift).
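
The difference between the two patterns can be reproduced outside Actions. The sketch below is a simulation under assumptions, not workflow code: `run_interpolated` stands in for a run: string containing ${{ matrix.* }} (the value is pasted into the script text before the shell parses it), and `run_via_env` stands in for a step-level env: entry dereferenced as a shell variable. Both function names and the SVC variable are hypothetical.

```bash
# Simulates direct interpolation: the value becomes part of the script source.
run_interpolated() {
  local value="$1"
  sh -c "echo service=${value}"
}

# Simulates the env: indirection: the script text is fixed, and the value
# only ever appears as the contents of a variable.
run_via_env() {
  local value="$1"
  SVC="$value" sh -c 'echo "service=${SVC}"'
}
```

Calling both with a value such as `agents; echo INJECTED` shows the gap: the interpolated variant runs the second command, while the env variant prints the whole string as data.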
@ensarkovankaya ensarkovankaya merged commit 5a75332 into main May 10, 2026
@ensarkovankaya ensarkovankaya deleted the feat/template-drift branch May 10, 2026 11:53