Skip to content

clusterd: SIGTERM handler + in-process gRPC FQDN (Variant A)#36100

Closed
jasonhernandez wants to merge 1 commit into
mainfrom
jason/distroless-process-lifecycle
Closed

clusterd: SIGTERM handler + in-process gRPC FQDN (Variant A)#36100
jasonhernandez wants to merge 1 commit into
mainfrom
jason/distroless-process-lifecycle

Conversation

@jasonhernandez
Copy link
Copy Markdown
Contributor

@jasonhernandez jasonhernandez commented Apr 15, 2026

Design question: where should clusterd's gRPC FQDN come from?

This is Variant A of a two-version pair (see #36872 for Variant B). Both add the SIGTERM handler that distroless clusterd needs as PID 1. They differ only in how CLUSTERD_GRPC_HOST is set once entrypoint.sh is removed by the distroless migration (#36099).

entrypoint.sh previously did, in Kubernetes only:

export CLUSTERD_GRPC_HOST=${CLUSTERD_GRPC_HOST:-$(hostname --fqdn)}
Variant A (this PR) Variant B (#36872)
Who resolves the FQDN clusterd, at startup orchestrator-kubernetes, at pod-spec build
Mechanism in-process getaddrinfo(AI_CANONNAME) (resolve_fqdn) downward API MZ_POD_NAME + env interpolation
Blocking DNS at startup yesgetaddrinfo has no timeout; if CoreDNS is briefly down at pod start, clusterd's main thread blocks no — kubelet substitutes the value
clusterd main() SIGTERM handler + resolve_fqdn + CLUSTERD_PROCESS SIGTERM handler + CLUSTERD_PROCESS only
Where pod identity lives split between binary and orchestrator with the orchestrator
Footprint self-contained; works under any orchestrator k8s-specific; +EnvVars in the pod spec

Why Variant A

Keeps the change confined to the clusterd binary — no orchestrator changes, and it works under any orchestrator (not just k8s) that gives the pod a resolvable hostname. It directly mirrors what entrypoint.sh did (hostname --fqdn).

Known downside

resolve_fqdn calls getaddrinfo with no timeout on the main thread at startup (flagged in the code comment). If DNS is unavailable at pod start, clusterd blocks. This is the same behavior the old shell had, so it's not a regression — but Variant B avoids it entirely by injecting the value from the orchestrator, which already computes the same FQDN.

Part of the distroless migration (#36099 image, this/#36872 lifecycle, #36101 UID/GID).

Test plan

  • cargo check -p mz-clusterd (rustc 1.96.0)
  • Verify in k8s that resolve_fqdn yields the pod FQDN and gRPC host validation passes
  • Decide A vs B before merge — close the other

🤖 Generated with Claude Code

@github-actions
Copy link
Copy Markdown
Contributor

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

  • Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
  • Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
  • Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

Distroless images have no shell, so there's no init process to forward
signals. Add a direct SIGTERM handler in clusterd (using nix::signal)
and environmentd so graceful shutdown works without a wrapper script.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jasonhernandez jasonhernandez force-pushed the jason/distroless-process-lifecycle branch from 6aaafcb to c23ae38 Compare June 2, 2026 16:57
jasonhernandez added a commit that referenced this pull request Jun 2, 2026
Distroless images run as nonroot (UID 65534) instead of root. Add
version-gating so orchestratord sets the correct runAsUser/runAsGroup
based on the Materialize version, avoiding UID mismatches during
rolling upgrades from Debian-based to distroless images.

Gate versions (verified against release history, 2026-06):
- balancerd: V26_18_0. Its ci/Dockerfile switched to distroless-prod-base
  in v26.18.0 (prod-base in v26.17.x). The original V26_19_0 was off by
  one and would have forced UID 999 onto v26.18.x balancerd pods that
  actually run as 65534.
- environmentd/clusterd: V26_28_0, matching the release that ships their
  distroless migration (#36099). The original V26_20_0 predated the actual
  landing by ~8 releases (main is now 26.28-dev) and would have applied
  UID 65534 to v26.20-v26.27 images that still run as UID 999.

NOTE: the env/clusterd gate assumes #36099 lands in the 26.28 cycle. If it
slips, bump V26_28_0 to the actual release. The three distroless PRs
(#36099 image, #36100 SIGTERM, #36101 this) must ship in the same release.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jasonhernandez
Copy link
Copy Markdown
Contributor Author

jasonhernandez commented Jun 2, 2026

🔗 Distroless migration — coordination note

One of three PRs that ship together: #36099 (distroless image), #36100 (this — SIGTERM handler), #36101 (UID/GID gating).

Merge this before or with #36099: distroless runs the binary as PID 1, which silently ignores SIGTERM without an explicit handler. Safe to merge first (harmless on current images). environmentd already handles termination signals, so this change is correctly clusterd-only despite the commit subject.

@jasonhernandez jasonhernandez changed the title clusterd,environmentd: add SIGTERM handler for distroless containers clusterd: SIGTERM handler + in-process gRPC FQDN (Variant A) Jun 2, 2026
jasonhernandez added a commit that referenced this pull request Jun 2, 2026
Distroless containers run the binary directly as PID 1 (no tini/shell), so
two things from the removed entrypoint.sh are handled here:

1. PID 1 ignores signals with a SIG_DFL disposition, so SIGTERM from
   Kubernetes is silently dropped. Add an explicit termination-signal
   handler in clusterd (environmentd already has one). The StatefulSet
   ordinal (CLUSTERD_PROCESS) is derived in-process from the hostname.

2. entrypoint.sh set CLUSTERD_GRPC_HOST via `hostname --fqdn`. That value
   fed the CTP handshake's optional `server_fqdn` check: clusterd advertised
   its FQDN and the controller compared it against the address it dialed
   (transport.rs::handshake). The check only fired when the value was set,
   is unrelated to gRPC despite the name, and guards only against reaching a
   misrouted replica. Rather than re-plumb it for distroless, remove it:
   drop `--grpc-host`/`CLUSTERD_GRPC_HOST` and the `server_fqdn` field from
   the CTP `Hello`, along with the now-unused `host_from_address` helper and
   the `test_handshake_fqdn_mismatch` test. test_metrics byte-count bounds
   loosened since the handshake shrank.

This is the "rip it out" alternative to the FQDN-handling PRs (#36100
in-process resolve, #36872 orchestrator-injected). Pick one.

Part of SEC-236 distroless migration.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jasonhernandez
Copy link
Copy Markdown
Contributor Author

Closing in favor of the #36872#36876 stack.

The SIGTERM handler from this PR now lives in #36872 (minimal). The FQDN question this PR solved with an in-process resolve_fqdn is instead handled by #36876, which removes the CTP server-FQDN validation entirely — it turned out to be an optional, misnamed (it's CTP, not gRPC) check that isn't worth re-plumbing for distroless.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant