Webhook endpoint for hot-reload on push to the published data branch #65

@themightychris

What

Add an HTTP endpoint on the API that pulls the latest commit on the configured CFP_DATA_BRANCH and atomically swaps the in-memory state — no pod restart. Then add a GitHub Actions workflow on the codeforphilly-data repo that fires on push to published and calls the endpoint.

Why

Picking up new data currently requires a pod restart: the entrypoint's git fetch + reconcile runs once at boot, and everything is in-memory after that. For the routine "merge legacy-import into published" flow, that's heavy:

  • Readiness drops while the pod re-boots
  • In-flight requests are dropped
  • The full record set is re-parsed + indexed + FTS-rebuilt on every restart — adds tens of seconds even for unchanged data
  • Restarts mid-write are dangerous; a hot reload can wait on the write mutex

A webhook-driven hot-reload makes data updates feel like cache invalidations.

Sketch

API side

New route, hidden from the public OpenAPI spec:

  • POST /api/_internal/reload-data
  • Auth: shared secret via Authorization: Bearer <key> or X-CFP-Webhook-Key: <key>. Secret in env (CFP_DATA_RELOAD_SECRET) sourced from the sealed env Secret.
  • Body: optional — could carry { branch?: string, commitHash?: string } for sanity checks (refuse if remote HEAD doesn't match).
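A minimal sketch of the key check, assuming Node's built-in crypto module and lowercased header names (as Node and Fastify present them). The helper names extractKey and keyMatches are hypothetical, not existing code in the API. Both sides are hashed before comparison because timingSafeEqual requires equal-length inputs.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper: accept the key from either header named in the sketch.
export function extractKey(
  headers: Record<string, string | undefined>,
): string | undefined {
  const auth = headers["authorization"];
  if (auth !== undefined && auth.startsWith("Bearer ")) {
    return auth.slice("Bearer ".length).trim();
  }
  return headers["x-cfp-webhook-key"]?.trim();
}

// Hypothetical helper: constant-time comparison against CFP_DATA_RELOAD_SECRET.
// Hashing first gives two 32-byte buffers regardless of input length.
export function keyMatches(
  presented: string | undefined,
  expected: string,
): boolean {
  if (!presented || !expected) return false; // missing key or unset secret: refuse
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}
```

A route handler would presumably call keyMatches(extractKey(request.headers), secret) and answer 401 on false, before touching the mutex.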

Handler:

  1. Acquire the write mutex (serialize against API mutations; we don't want to pull mid-transact).
  2. git fetch --prune origin <branch> + git merge --ff-only origin/<branch> on /app/data. Refuse non-fast-forward (operator must reconcile manually).
  3. Re-open the gitsheets store + rebuild InMemoryState + rebuild FTS off the new tree.
  4. Atomic swap: point fastify.inMemoryState (and FTS handle) at the new value. The old value is GC'd.
  5. Release the mutex. Respond { ok: true, oldCommit, newCommit, durationMs }.

Every failure mode must leave the running state intact: swap only after the rebuild succeeds end-to-end. On rebuild error, return 5xx, keep serving from the old state, and page the operator.
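The handler steps and the failure rule can be sketched roughly as follows. Everything here is illustrative: Mutex, State, and StateHolder are hypothetical stand-ins for the gitsheets write mutex, InMemoryState, and the fastify decoration, and the fastForward and build callbacks stand in for the git commands and the services.boot() rebuild.

```typescript
// Hypothetical promise-chain mutex; the real code would reuse the write mutex.
class Mutex {
  private tail: Promise<void> = Promise.resolve();

  runExclusive<T>(fn: () => Promise<T>): Promise<T> {
    const result = this.tail.then(fn);
    // Keep the queue usable even when fn rejects.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

// Hypothetical shape standing in for InMemoryState plus the FTS handle.
interface State {
  commit: string;
  records: ReadonlyMap<string, unknown>;
}

interface ReloadResult {
  ok: true;
  oldCommit: string;
  newCommit: string;
  durationMs: number;
}

class StateHolder {
  private current: State;
  private readonly mutex = new Mutex();

  constructor(initial: State) {
    this.current = initial;
  }

  get state(): State {
    return this.current;
  }

  reload(
    fastForward: () => Promise<void>, // assumed to throw on non-fast-forward
    build: () => Promise<State>, // assumed to build a fresh object, not mutate the old one
  ): Promise<ReloadResult> {
    return this.mutex.runExclusive(async (): Promise<ReloadResult> => {
      const started = Date.now();
      const old = this.current;
      await fastForward();
      const next = await build(); // if this throws, this.current is untouched
      this.current = next; // the swap is a single reference assignment
      return {
        ok: true,
        oldCommit: old.commit,
        newCommit: next.commit,
        durationMs: Date.now() - started,
      };
    });
  }
}
```

One gap this sketch leaves open: if the fast-forward succeeds and the rebuild then fails, the checkout on disk is ahead of the state in memory, so the real handler may need to decide whether to roll the working tree back.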

Repo side (codeforphilly-data)

New workflow .github/workflows/notify-deployments.yml:

on:
  push:
    branches: [published]

jobs:
  notify:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        target:
          - { name: sandbox, url: 'https://next-v2.codeforphilly.org/api/_internal/reload-data' }
          # - { name: prod, url: '...' }  when prod stands up
    steps:
      - name: Trigger reload on ${{ matrix.target.name }}
        run: |
          curl -fsS -X POST "${{ matrix.target.url }}" \
            -H "Authorization: Bearer ${{ secrets.CFP_DATA_RELOAD_SECRET }}" \
            -H "Content-Type: application/json" \
            -d '{"branch":"published","commitHash":"${{ github.sha }}"}'

Secret stored once per environment.

Open questions

  • Auth shape. Bearer is simplest. HMAC of payload (like GitHub's own webhooks) is more robust against secret leakage in transit — but we're HTTPS-only, so bearer is fine here. Defer HMAC to "later if we ever go off HTTPS."
  • Concurrency. One reload at a time. Reuse the gitsheets write mutex or a dedicated reload mutex? Reusing the write mutex is simpler and correct (a reload conflicts with writes the same way two writes do).
  • Rebuild atomicity. Need to make sure the in-memory swap is a single pointer assignment, not a piecemeal update. The current InMemoryState is mostly built by services.boot() — that boot is what we'd call to build the new state object. Verify it can run to completion without mutating the existing instance.
  • Push-daemon interaction. The push daemon writes commits the API creates out to origin. If a webhook fires while the daemon is mid-push, that's fine — they're independent directions. But if published has API commits ahead of origin (push daemon hasn't pushed yet) and a webhook arrives, the --ff-only will refuse. Acceptable; the daemon will push, the next webhook will succeed.
  • Pre-cutover only. Once we're in production with high write volume, hot-reload becomes routine. For now, it's mostly relevant to the importer-merge flow.
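For the deferred HMAC option in the auth-shape question, a rough sketch of what verification could look like, modelled on the format GitHub uses for its X-Hub-Signature-256 header ("sha256=" followed by the hex HMAC-SHA256 of the raw body). The function names sign and verify are hypothetical, and this assumes the handler has access to the unparsed request body.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: what the sender (the GitHub Actions step) would compute.
export function sign(secret: string, rawBody: string): string {
  const digest = createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
  return "sha256=" + digest;
}

// Hypothetical helper: what the API would run before accepting the reload.
export function verify(
  secret: string,
  rawBody: string,
  signatureHeader: string | undefined,
): boolean {
  if (!signatureHeader) return false;
  const expected = Buffer.from(sign(secret, rawBody), "utf8");
  const presented = Buffer.from(signatureHeader, "utf8");
  // timingSafeEqual throws on length mismatch, so check length first.
  return expected.length === presented.length && timingSafeEqual(expected, presented);
}
```

Compared with the bearer scheme, the secret itself never travels in the request, at the cost of the workflow having to compute the signature before calling curl.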

Out of scope

  • Push-side validation in the GH Action (e.g., "did this commit pass schema CI?"). Separate concern; current model trusts whatever's on published.
  • Multi-pod fanout. We're single-replica by architectural constraint (specs/architecture.md); one webhook = one reload.
