Skip to content

fix(helm)!: replace dockerize initContainer with bash TCP wait#40425

Open
rusackas wants to merge 1 commit into
masterfrom
fix/helm-replace-dockerize
Open

fix(helm)!: replace dockerize initContainer with bash TCP wait#40425
rusackas wants to merge 1 commit into
masterfrom
fix/helm-replace-dockerize

Conversation

@rusackas
Copy link
Copy Markdown
Member

SUMMARY

Drops apache/superset:dockerize from the chart entirely. The five initContainers that gate startup on Postgres / Redis now run from the same apache/superset image we're already pulling, using bash's built-in /dev/tcp/host/port redirect for the readiness probe — no external dockerize, nc, or busybox needed.

A trivy scan of the current published apache/superset:dockerize (image created 2024-05-09, base alpine 3.19.1 EOSL) found:

Severity Count Fix available
CRITICAL 3 yes
HIGH 25 yes
MEDIUM 71 yes
LOW 24 yes
Total 123 123/123

64 of the CVEs are in the bundled dockerize Go binary itself (stale Go stdlib + golang.org/x/{net,crypto}); the rest are in the alpine base packages (libcrypto3, libssl3, busybox, musl). Rebuilding the image on a fresher base would just defer the same problem; removing the dependency eliminates it.

Why bash /dev/tcp over the alternatives

  • vs busybox + nc -z: introduces an external image dependency, which is what we're trying to escape.
  • vs a Chainguard image: external registry dependency that hurts air-gapped users, and we don't need a hardened build of a non-trivial tool — we just need a TCP connect, which bash does for free.
  • vs rebuilding apache/superset:dockerize on a fresher base: kicks the can; we'd be back here in another year.

Verified /bin/bash 5.2.15 is present in apache/superset:latest and supports the /dev/tcp redirect (the image's /bin/sh is dash, which does not — hence the explicit /bin/bash).

Changes

  • helm/superset/values.yaml:
    • Removed the initImage: block.
    • Rewrote all 5 initContainers blocks (init, supersetNode, supersetWorker, supersetCeleryBeat, supersetCeleryFlower) to use {{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }} and a bash /dev/tcp loop. The 120s timeout from dockerize -timeout 120s is preserved via a SECONDS-based deadline. Two-port waits factor out a small wait_for helper.
  • helm/superset/Chart.yaml: chart version 0.15.5 → 0.16.0 (minor bump per the chart's documented versioning policy — default behaviour changed; a documented value was removed).
  • helm/superset/README.md: auto-regenerated by helm-docs pre-commit hook (the three initImage.* rows are gone; version badge bumped).

Compatibility

User profile Impact
Default chart user None visible. Same wait behaviour, different binary.
Overrides .Values.initImage.repository/tag/pullPolicy Breaking. The value is no longer read. Overrides should be removed; the chart will pull from .Values.image automatically.
Fully overrides .Values.supersetNode.initContainers (etc.) with their own block None — the override still wins.
Pinned apache/superset:dockerize in network policies / image allowlists Can be removed once on chart 0.16.0.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A — chart-only change.

TESTING INSTRUCTIONS

cd helm/superset
helm dependency build
helm template . --skip-tests --skip-schema-validation \
  --set supersetCeleryBeat.enabled=true \
  --set supersetCeleryFlower.enabled=true \
  | grep -B1 -A3 'name: wait-for'

Expect 5 wait-for-* blocks, all using apache/superset:5.0.0 (or whatever appVersion resolves to), all running /bin/bash with a /dev/tcp probe.

To exercise the actual deploy:

helm install superset . --create-namespace --namespace superset
kubectl -n superset logs -l app=superset -c wait-for-postgres -f
# expect: "waiting for postgres at ... (elapsed Ns)" lines then "postgres at ... is up"

ADDITIONAL INFORMATION

Out of scope (good follow-ups)

  • Stop publishing apache/superset:dockerize from .github/workflows/docker.yml and tag-release.yml. Worth one release of overlap so users on older chart versions still get fresh builds while they upgrade, then a separate PR to drop the matrix entry.

🤖 Generated with Claude Code

Drops `apache/superset:dockerize` from the chart entirely. The five
initContainers that gate startup on Postgres / Redis now run from the
same `apache/superset` image we're already pulling, using bash's
built-in `/dev/tcp/host/port` redirect for the readiness probe — no
external `dockerize`, `nc`, or busybox needed.

A trivy scan of the current published `apache/superset:dockerize`
(image created 2024-05-09, alpine 3.19.1 EOSL) found 3 CRITICAL,
25 HIGH, 71 MEDIUM, and 24 LOW CVEs — 64 of them in the bundled
`dockerize` Go binary itself (stale Go stdlib + golang.org/x/{net,
crypto}); the rest in the alpine base. Rebuilding the image on a
fresher base would just defer the same problem; removing the
dependency eliminates it.

Verified `/bin/bash` 5.2.15 is present in `apache/superset:latest`
and supports the `/dev/tcp` redirect (the image's `/bin/sh` is dash,
which does not — hence the explicit `/bin/bash` invocation).
Rendered the chart with `helm template` and confirmed all five
initContainers (supersetNode, init, supersetWorker,
supersetCeleryBeat, supersetCeleryFlower) emit the expected
bash-based probe and pull the main superset image.

The 120s timeout from `dockerize -timeout 120s` is preserved via a
SECONDS-based deadline in the bash loop. Two-port waits (postgres
+ redis) factor out a small `wait_for` helper to keep the script
readable.

BREAKING CHANGE: chart `values.yaml` no longer defines `initImage`.
Operators who customised `.Values.initImage.repository/tag/pullPolicy`
must remove those overrides — they are silently ignored. Operators
who fully overrode `.Values.supersetNode.initContainers` (etc.) are
unaffected; their override still wins. Chart bumped 0.15.5 → 0.16.0.

Closes #40424

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@bito-code-review bito-code-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Agent Run #29c792

Actionable Suggestions - 1
  • helm/superset/values.yaml - 1
Additional Suggestions - 2
  • helm/superset/values.yaml - 2
    • Stale comment references removed section · Line 646-646
      The comment `# See supersetNode.initContainers for the rationale.` is now stale. The supersetNode.initContainers (line 299) serves as the authoritative reference, but its comment (lines 310-312) is more detailed. Either copy that explanation inline here or remove this backward reference since these sections are now identical.
    • Stale comment references removed section · Line 832-832
      The comment `# See supersetNode.initContainers for the rationale.` is now stale. Since this init section now uses the identical bash implementation as supersetNode, this backward reference creates confusion. Either copy the detailed rationale inline or remove the comment.
Review Details
  • Files reviewed - 2 · Commit Range: dd216d6..dd216d6
    • helm/superset/Chart.yaml
    • helm/superset/values.yaml
  • Files skipped - 1
    • helm/superset/README.md - Reason: Filter setting
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Comment thread helm/superset/values.yaml
Comment on lines 298 to 325
# @default -- a container waiting for postgres
initContainers:
- name: wait-for-postgres
image: "{{ .Values.initImage.repository }}:{{ .Values.initImage.tag }}"
imagePullPolicy: "{{ .Values.initImage.pullPolicy }}"
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: "{{ .Values.image.pullPolicy }}"
envFrom:
- secretRef:
name: "{{ tpl .Values.envFromSecret . }}"
command:
- /bin/sh
- /bin/bash
- -c
- dockerize -wait "tcp://$DB_HOST:$DB_PORT" -timeout 120s
- |
# bash's /dev/tcp redirect performs a TCP connect; no external
# `dockerize`, `nc`, or busybox needed. SECONDS-based deadline
# mirrors the prior `dockerize -timeout 120s` behaviour.
SECONDS=0
until (echo > /dev/tcp/"$DB_HOST"/"$DB_PORT") 2>/dev/null; do
if [ "$SECONDS" -ge 120 ]; then
echo "timeout waiting for postgres at $DB_HOST:$DB_PORT after 120s" >&2
exit 1
fi
echo "waiting for postgres at $DB_HOST:$DB_PORT (elapsed ${SECONDS}s)"
sleep 2
done
echo "postgres at $DB_HOST:$DB_PORT is up"
resources:
limits:
memory: "256Mi"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Semantic duplication in diff

Extract the duplicated wait_for function into a shared Helm template partial (e.g., _wait_for.tpl) and reference it in each initContainers section to avoid semantic duplication and reduce maintenance overhead.

Code Review Run #29c792


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion, but I'd push back here. The duplication is real (~15 lines of shared bash across 5 blocks), but the reason these initContainers live in values.yaml rather than in composed template partials is that they're an override surface — operators are expected to read and replace them. Three reasons to keep the diff as-is:

  1. values.yaml can't directly {{ include }} partials. It's plain YAML, parsed before templates render. The only way to share the bash via a _wait_for.tpl partial would be to write command: {{ include "superset.waitForScript" . }} in values.yaml and rely on the templates' second-pass tpl (toYaml .Values.supersetNode.initContainers) rendering. That works mechanically, but…

  2. It breaks the readability contract of values.yaml. A user reading the chart defaults today sees the actual bash that will run. Under the partial pattern, they'd see an opaque {{ include ... }} and have to grep _helpers.tpl (or wherever the partial lives) to learn what their init container actually does. That's a strictly worse experience for the audience values.yaml is meant to serve.

  3. No real reduction in maintenance overhead. The duplicated bash is 15 lines that haven't changed in years on the dockerize side, and is unlikely to. Operators who want to customise the script today copy/paste the literal block; under the partial pattern they'd either still copy/paste (no gain) or have to learn the partial indirection mechanism (lose).

The tradeoff in this chart has consistently favoured explicit, override-friendly defaults in values.yaml over DRY-via-partials, and I think that's the right call here too.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suggestion to extract the duplicated wait_for function into a shared Helm template partial aims to reduce semantic duplication and maintenance overhead. However, the user argues that keeping the duplication in values.yaml is intentional for several reasons:

  1. Mechanical Limitations: values.yaml is plain YAML and cannot directly use {{ include }} partials, requiring a workaround that may not be ideal.
  2. Readability: The current setup allows users to see the actual bash commands directly in values.yaml, which would be obscured by using a partial.
  3. Maintenance Overhead: The duplicated bash code is stable and unlikely to change, so the benefit of refactoring is minimal.

The user concludes that the current approach of explicit, override-friendly defaults in values.yaml is preferable for this specific case.

helm/superset/values.yaml

-      image: "{{ .Values.initImage.repository }}:{{ .Values.initImage.tag }}"
-      imagePullPolicy: "{{ .Values.initImage.pullPolicy }}"
+      image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
+      imagePullPolicy: "{{ .Values.image.pullPolicy }}"
        envFrom:
          - secretRef:
              name: "{{ tpl .Values.envFromSecret . }}"
        command:
-        - /bin/sh
+        - /bin/bash
          - -c
-        - dockerize -wait "tcp://$DB_HOST:$DB_PORT" -timeout 120s
+        - |
+          # bash's /dev/tcp redirect performs a TCP connect; no external
+          # `dockerize`, `nc`, or busybox needed. SECONDS-based deadline
+          # mirrors the prior `dockerize -timeout 120s` behaviour.
+          SECONDS=0
+          until (echo > /dev/tcp/"$DB_HOST"/"$DB_PORT") 2>/dev/null; do
+            if [ "$SECONDS" -ge 120 ]; then
+              echo "timeout waiting for postgres at $DB_HOST:$DB_PORT after 120s" >&2
+              exit 1
+            fi
+            echo "waiting for postgres at $DB_HOST:$DB_PORT (elapsed ${SECONDS}s)"
+            sleep 2
+          done
+          echo "postgres at $DB_HOST:$DB_PORT is up"
        resources:
          limits:
            memory: "256Mi"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

deploy:helm risk:breaking-change Issues or PRs that will introduce breaking changes size/L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants