Problem
Today, shipping a new `ghcr.io/tracebloc/ingestor` image requires a full `client` chart release:
- Push new image to GHCR (get digest)
- Bump `images.ingestor.digest` in `client/values.yaml`
- Bump `client/Chart.yaml` version
- PR → develop → sync to main → cut release tag
- autoUpgrade picks up the new chart at next `:23` tick → jobs-manager redeploys with new `INGESTOR_IMAGE_DIGEST`
That is hours of overhead per ingestor-image bump, versus roughly 15 minutes for jobs-manager (which uses image-refresh on a floating tag). The asymmetry hurts when the ingestor changes frequently.
Proposed solution
Extend the existing image-refresh CronJob (shipped in #155 / v1.4.0) to handle the ingestor image as a third entry in its image list, but with different semantics from jobs-manager/pods-monitor:
- jobs-manager / pods-monitor: registry-HEAD vs annotation; on change → `kubectl rollout restart` + update annotation.
- ingestor: registry-HEAD vs annotation; on change → `kubectl set env deployment/-jobs-manager -c api INGESTOR_IMAGE_DIGEST=` + update annotation. The env change triggers a natural rollout (deployment spec mutates, new ReplicaSet rolls out).
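The two semantics above can be sketched as a small planning function. This is an illustration only, not the existing CronJob's code: the `ImageEntry` type, the `plan_refresh` name, the `restart` / `set-env` mode labels, the `example-jobs-manager` deployment name (the release prefix is not given in this ticket), and the idea that the annotation is written with `kubectl annotate` on the same deployment are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed annotation pattern; only the ingestor key appears in this ticket.
ANNOTATION_FMT = "tracebloc.io/last-refreshed-{name}-digest"


@dataclass(frozen=True)
class ImageEntry:
    """One hypothetical entry in the image-refresh list."""
    name: str                        # e.g. "jobs-manager", "ingestor"
    deployment: str                  # deployment the action targets
    mode: str                        # "restart" or "set-env" (assumed labels)
    container: Optional[str] = None  # only used by "set-env"
    env_var: Optional[str] = None    # only used by "set-env"


def plan_refresh(entry: ImageEntry, registry_digest: str,
                 annotated_digest: Optional[str]) -> List[List[str]]:
    """Return the kubectl argv lists to run, or [] when nothing changed."""
    if registry_digest == annotated_digest:
        return []  # registry HEAD matches the annotation: no-op tick

    target = f"deployment/{entry.deployment}"
    if entry.mode == "restart":
        action = ["kubectl", "rollout", "restart", target]
    elif entry.mode == "set-env":
        # Mutating the pod template env triggers a normal rollout.
        action = ["kubectl", "set", "env", target, "-c", entry.container,
                  f"{entry.env_var}={registry_digest}"]
    else:
        raise ValueError(f"unknown mode: {entry.mode}")

    annotate = ["kubectl", "annotate", target,
                f"{ANNOTATION_FMT.format(name=entry.name)}={registry_digest}",
                "--overwrite"]
    return [action, annotate]


if __name__ == "__main__":
    ingestor = ImageEntry(name="ingestor", deployment="example-jobs-manager",
                          mode="set-env", container="api",
                          env_var="INGESTOR_IMAGE_DIGEST")
    for argv in plan_refresh(ingestor, "sha256:new", "sha256:old"):
        print(" ".join(argv))
```

Keeping the decision separate from execution makes the "annotation is only updated after a change" rule easy to unit-test without a cluster.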
Audit trail is preserved: the digest still flows through the deployment's env (still inspectable via `kubectl get deployment -o yaml`), and the `tracebloc.io/last-refreshed-ingestor-digest` annotation records every successful refresh — same auditability as the current pin-in-values approach, just sourced from the registry instead of the chart.
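As a rough illustration of that audit path, the sketch below reads both values back from the JSON form of the deployment (`kubectl get deployment -o json`). The function name is hypothetical, and it assumes the annotation lives in the deployment's own `metadata.annotations`, which this ticket does not state explicitly.

```python
import json
from typing import Optional, Tuple


def audit_ingestor_digest(
    deployment_json: str,
    container: str = "api",
    env_var: str = "INGESTOR_IMAGE_DIGEST",
    annotation: str = "tracebloc.io/last-refreshed-ingestor-digest",
) -> Tuple[Optional[str], Optional[str]]:
    """Return (digest in the container env, digest recorded in the annotation)."""
    obj = json.loads(deployment_json)

    # Assumption: the refresh annotation is set on the deployment itself.
    annotations = obj.get("metadata", {}).get("annotations") or {}

    env_value = None
    for c in obj["spec"]["template"]["spec"]["containers"]:
        if c.get("name") != container:
            continue
        for e in c.get("env") or []:
            if e.get("name") == env_var:
                env_value = e.get("value")

    return env_value, annotations.get(annotation)


if __name__ == "__main__":
    sample = json.dumps({
        "metadata": {"annotations": {
            "tracebloc.io/last-refreshed-ingestor-digest": "sha256:abc"}},
        "spec": {"template": {"spec": {"containers": [
            {"name": "api", "env": [
                {"name": "INGESTOR_IMAGE_DIGEST", "value": "sha256:abc"}]}]}}},
    })
    print(audit_ingestor_digest(sample))
```

If the two values ever differ, the last refresh presumably patched the env but failed before recording the annotation, or the reverse.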
Why this is a separate ticket from #154
#154 closed with the deliberate "ingestor is out of scope" decision because the ingestor's post-install Job hook couldn't be `rollout restart`ed. This ticket sidesteps that — we're refreshing the IMAGE the spawned ingestion Jobs USE, not the hook itself. The hook stays untouched; the parent jobs-manager deployment gets the env-var patch.
Acceptance criteria
Notes
- RBAC stays unchanged — `kubectl set env` is a patch on the deployment, which we already have.
- Steady-state rate-limit usage adds 1 HEAD per tick to GHCR (well under any anonymous limits).
- If chart-template changes to the ingestor subchart itself are also frequent, that's a separate follow-up; this ticket only addresses the image-bump case.
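For the "1 HEAD per tick" note, here is a minimal sketch of what a registry HEAD against the standard `/v2/<repo>/manifests/<ref>` endpoint could look like. The helper names and the `latest` tag in the usage example are hypothetical. How the existing CronJob authenticates is not described in this ticket, so the bearer token is an optional parameter, and the sketch only builds the request rather than sending it.

```python
import urllib.request
from typing import Mapping, Optional

# Manifest media types commonly sent so the registry returns the digest
# of the index or manifest the tag points at.
ACCEPT = ", ".join([
    "application/vnd.oci.image.index.v1+json",
    "application/vnd.oci.image.manifest.v1+json",
    "application/vnd.docker.distribution.manifest.list.v2+json",
    "application/vnd.docker.distribution.manifest.v2+json",
])


def build_head_request(repo: str, ref: str,
                       token: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a HEAD request for a tag's manifest."""
    url = f"https://ghcr.io/v2/{repo}/manifests/{ref}"
    headers = {"Accept": ACCEPT}
    if token:  # assumption: a pull token obtained elsewhere, if needed
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers, method="HEAD")


def digest_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Extract the digest the registry reports for the manifest."""
    return headers.get("Docker-Content-Digest")


if __name__ == "__main__":
    req = build_head_request("tracebloc/ingestor", "latest")
    print(req.get_method(), req.full_url)
```

If the `:23` tick is hourly, this adds about 24 HEAD requests per day for the ingestor image.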
Follow-up to #154.