From c072d2aacdf60cf782f5efc73c8af28eb7b8c9a1 Mon Sep 17 00:00:00 2001 From: Asad Iqbal Date: Wed, 20 May 2026 18:26:57 +0500 Subject: [PATCH 1/3] fix(#139): preserve idempotency key across helm upgrade MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The ingestor.idempotencyKey helper defaulted to "-" and re-stamped on every render. `helm upgrade --reuse-values` preserves the stored value "" (not the previously-rendered key), so the template re-evaluated `now | unixEpoch` and produced a NEW key each upgrade — accidentally creating duplicate ingestion runs from what customers expected to be no-op upgrades. Contradicts the documented behavior in ingestor/README.md added in #138. Look up the existing post-install hook ConfigMap from the previous render and reuse its idempotency_key. On fresh install (or after uninstall) the lookup returns empty and we fall through to the now-based default. `helm template` (no cluster connection) returns empty for lookup too, so local previews still get a fresh key per render — matches the in-cluster install path the first time. Caught by bugbot on PR #137 review. Closes #139. Co-Authored-By: Claude Opus 4.7 (1M context) --- ingestor/templates/_helpers.tpl | 43 +++++++++++++++++++++++++++------ 1 file changed, 36 insertions(+), 7 deletions(-) diff --git a/ingestor/templates/_helpers.tpl b/ingestor/templates/_helpers.tpl index c47062f..5a8edf7 100644 --- a/ingestor/templates/_helpers.tpl +++ b/ingestor/templates/_helpers.tpl @@ -31,21 +31,50 @@ Naming: {{- end -}} {{- /* -Resolved idempotency key. Defaults to "-" so each -install is a fresh run — including reinstalls under the same release -name, where Helm restarts revisions at 1 and a revision-derived key -would collide with the previous attempt and trip jobs-manager's -"already used with a different image_digest or table" guard. Explicit -override is honored verbatim; set it to a stable UUID only when you -want at-most-once semantics across reinstalls. +Resolved idempotency key. + +Default behavior: + - First helm install: stamp a fresh "-" key. + - helm upgrade of the same release: REUSE the existing key by looking + up the post-install hook ConfigMap from the previous render. This + preserves replay semantics — jobs-manager sees the same key on + upgrade and returns 200 (replay) rather than spawning a new run. + - helm install after uninstall: lookup misses (ConfigMap was deleted + on uninstall), so we fall through to a fresh now-based key. No + collision with the previous run because the epoch differs. + +Earlier versions defaulted to `now | unixEpoch` on every render. That +worked for installs but accidentally created a NEW key on +`helm upgrade --reuse-values` (Helm preserves the stored value `""`, +not the previously-rendered key, so the template re-evaluates `now`). +The result: customers running `helm upgrade` thinking it was a no-op +got duplicate ingestion runs. Bugbot caught it on PR #137. See #139. + +Helm template (no cluster connection) returns empty for lookup, so +local previews always re-stamp with a fresh key — matches the +in-cluster install path the first time around. + +Explicit override is honored verbatim; set `idempotencyKey` to a +stable UUID when you want strict at-most-once semantics across +uninstall/reinstall cycles. */ -}} {{- define "ingestor.idempotencyKey" -}} {{- if .Values.idempotencyKey -}} {{ .Values.idempotencyKey }} {{- else -}} +{{- $existing := lookup "v1" "ConfigMap" .Release.Namespace (include "ingestor.configMapName" .) -}} +{{- /* The ConfigMap key is literally "body.json" (a single key with a dot in + its name, not a nested path), so use `index` rather than dot-access. + The fromJson call then parses the JSON body and we read its + idempotency_key field. Guards against missing data map (e.g. an + in-flight create) by defaulting through `dict`. */ -}} +{{- if and $existing (hasKey ($existing.data | default dict) "body.json") -}} +{{- (fromJson (index $existing.data "body.json")).idempotency_key -}} +{{- else -}} {{ printf "%s-%s" .Release.Name (now | unixEpoch) }} {{- end -}} {{- end -}} +{{- end -}} {{- define "ingestor.labels" -}} app.kubernetes.io/name: ingestor From 15bcda96fbb6b1ecf3c33f9bb65efcf5a9521366 Mon Sep 17 00:00:00 2001 From: Asad Iqbal Date: Wed, 20 May 2026 18:26:57 +0500 Subject: [PATCH 2/3] fix(#140): read requests-proxy resources from values MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The requests-proxy deployment hardcoded its container resources, ignoring the resources.requestsProxy schema entry that values.schema.json has defined since the requests-proxy was added. Every other component (jobsManager, podsMonitor, mysql) reads from .Values.resources..* with defaults — bring requestsProxy in line with that pattern. Adds the resources.requestsProxy block to values.yaml with the existing hardcoded defaults so behavior on a fresh install is unchanged. The template uses the default-through-dict nil-guard idiom so `helm upgrade --reuse-values` from a pre-1.3.6 release (where the value didn't exist) still renders cleanly without crashing on a nil parent. Caught by bugbot on PR #137 review. Closes #140. Co-Authored-By: Claude Opus 4.7 (1M context) --- .../templates/requests-proxy-deployment.yaml | 23 +++++++++++++++---- client/values.yaml | 11 +++++++++ 2 files changed, 30 insertions(+), 4 deletions(-) diff --git a/client/templates/requests-proxy-deployment.yaml b/client/templates/requests-proxy-deployment.yaml index b0e2346..cde610c 100644 --- a/client/templates/requests-proxy-deployment.yaml +++ b/client/templates/requests-proxy-deployment.yaml @@ -48,13 +48,28 @@ spec: capabilities: drop: ["ALL"] readOnlyRootFilesystem: true + {{- /* Nil-guard the whole resources.requestsProxy chain. A + helm upgrade --reuse-values from a release predating this + field (any 1.3.x ≤ 1.3.5) won't have it in stored values, + and the direct path .Values.resources.requestsProxy... on + a nil parent dict crashes with "nil pointer evaluating + interface{}.requests". Default-through-dict idiom lets + missing values fall through to the inline literals. + The trailing dash on `{{-` strips the newline AFTER + readOnlyRootFilesystem; we deliberately do NOT use `-}}` + on the last `:=` so the newline before `resources:` is + preserved (an earlier version of this guard ate the + newline and rendered `readOnlyRootFilesystem: trueresources:`). */ -}} + {{- $rp := default dict (default dict .Values.resources).requestsProxy }} + {{- $rpReq := default dict $rp.requests }} + {{- $rpLim := default dict $rp.limits }} resources: requests: - cpu: 100m - memory: 256Mi + cpu: {{ $rpReq.cpu | default "100m" | quote }} + memory: {{ $rpReq.memory | default "256Mi" | quote }} limits: - cpu: 1000m - memory: 512Mi + cpu: {{ $rpLim.cpu | default "1000m" | quote }} + memory: {{ $rpLim.memory | default "512Mi" | quote }} env: - name: MYSQL_HOST value: "mysql-client" diff --git a/client/values.yaml b/client/values.yaml index df4903a..28de382 100644 --- a/client/values.yaml +++ b/client/values.yaml @@ -249,6 +249,17 @@ resources: limits: cpu: "500m" memory: "512Mi" + # requests-proxy serves the Service Bus / backend communication path and + # is mostly idle (a few req/min). 100m/256Mi requests with headroom for + # the occasional burst; revisit if heap profiling shows growth. Schema + # at values.schema.json#resources.requestsProxy validates overrides. + requestsProxy: + requests: + cpu: "100m" + memory: "256Mi" + limits: + cpu: "1000m" + memory: "512Mi" # -- PriorityClass for the data-plane (mysql). # Cluster-scoped resource. Created with helm.sh/resource-policy: keep so a From 174cd51efbb65e46be47a207483fa1ec42c7e173 Mon Sep 17 00:00:00 2001 From: Asad Iqbal Date: Wed, 20 May 2026 18:26:57 +0500 Subject: [PATCH 3/3] fix(#141): add images.ingestor entry to values.schema.json MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit values.yaml has had images.ingestor.digest since #126, and the jobs-manager template surfaces it as INGESTOR_IMAGE_DIGEST, but the schema didn't validate it — every other image (jobsManager, podsMonitor, resourceMonitor, requestsProxy, mysqlClient, busybox) has an entry. An operator setting --set images.ingestor.digest=foo (not the canonical sha256:<64-hex>) bypassed schema validation and failed only later inside submit_ingestion_run.py. Add the missing entry mirroring the other image entries' shape. helm template now rejects malformed digests at chart-template time ("values don't meet the specifications of the schema(s)... Does not match pattern '^(sha256:[a-f0-9]{64})?$'") rather than waiting for runtime. Caught by bugbot on PR #137 review. Closes #141. Co-Authored-By: Claude Opus 4.7 (1M context) --- client/values.schema.json | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/client/values.schema.json b/client/values.schema.json index 5d2a835..d2efd22 100644 --- a/client/values.schema.json +++ b/client/values.schema.json @@ -293,6 +293,16 @@ } } }, + "ingestor": { + "type": "object", + "properties": { + "digest": { + "type": "string", + "pattern": "^(sha256:[a-f0-9]{64})?$", + "description": "Optional canonical ghcr.io/tracebloc/ingestor digest (sha256:<64 hex>). Surfaced into jobs-manager as the INGESTOR_IMAGE_DIGEST env so the customer-facing tracebloc/ingestor subchart can pick it up without per-install overrides. Empty disables the default and forces customers to --set image.digest on each ingestor subchart install." + } + } + }, "mysqlClient": { "type": "object", "properties": {