[Proposal] ${env:...} / ${file:...} interpolation in gateway config.toml
#1983
renuka-fernando
started this conversation in
General
Summary
- Add `${env:NAME}` / `${file:/path}` interpolation to `config.toml` for the gateway-controller and policy-engine, with bash-style modifiers (`:-default`, `:?required-or-fail`).
- `APIP_GW_*` env overrides cannot index into `[[array]]` elements, and Helm cannot read Secret contents at render time.
- `${file:...}` is recommended for secrets; `${env:...}` for non-sensitive deployment values.
- `${file:...}` paths are gated by an allowlist configured via `--config-file-source-allowlist` / `APIP_GW_CONFIG_FILE_SOURCE_ALLOWLIST` (bootstrap; see Implementation Details).

Motivation
- Operators deploy via `gateway-helm-chart`, and frequently need to inject sensitive values (passwords, client secrets, encryption keys, OAuth tokens) from k8s Secrets into specific fields of `config.toml`.
- `APIP_GW_*` env overrides cannot index into array elements. A flat env var like `APIP_GW_CONTROLLER_AUTH_BASIC_USERS_0_PASSWORD` does not splice into the nth element of `[[controller.auth.basic.users]]`. This is a known gap shared with Viper.
- Helm's `lookup` exists but it bakes plaintext into a ConfigMap, defeating the purpose of using a Secret.
- `values.yaml`-only arrays work, but require committing secret values to Git, which is unacceptable.
- `[[controller.auth.basic.users]]`, `[[controller.encryption.providers]]`, and its nested `[[controller.encryption.providers.keys]]`.
- `envsubst` workaround, which leaves docker-compose, the all-in-one distribution, and bare-metal deployments unsolved.

Proposal
- The config loader finds `${source:ref[:modifier]}` tokens and substitutes resolved values in place, before unmarshal into the typed struct.
- Sources: `env` and `file`. (`fileRaw` deferred -- see Implementation Details.)
- `${file:/path}` trims trailing whitespace from file contents; `${env:NAME}` reads an environment variable.
- Modifiers: `${env:NAME:-default}` (default if unset/empty) and `${env:NAME:?msg}` (required, fails startup).
- Escape: `$${` -> literal `${`. One level only.
- Multiple tokens per string are supported (e.g. `"https://${env:HOST}:${env:PORT}/v1"`).
- `${file:...}` paths are gated by an allowlist set via `--config-file-source-allowlist` / `APIP_GW_CONFIG_FILE_SOURCE_ALLOWLIST` (bootstrap config -- cannot live in `config.toml` itself). Defaults and enforcement rule live in Implementation Details.
- Only references (e.g. `${file:/secrets/gateway-controller/admin-password}`) appear in diagnostic surfaces, never resolved values.

Why `${file:...}` is preferred over `${env:...}` for secrets

A core reason this proposal adds `file:` at the same time as `env:` is so we can recommend file projection for any sensitive value.

Detailed comparison -- env var risks vs mounted file advantages
Env var risks:
- Readable via `/proc/<pid>/environ`.
- Can leak through `/debug/vars`-style endpoints.
- Visible in `kubectl describe pod`, `docker inspect`, audit logs -- especially if a chart misconfigures `value:` instead of `valueFrom:`.

Mounted file advantages:
- tmpfs-backed in Kubernetes when the source is a Secret -- never hits disk.
- Permissions can be restricted (`defaultMode: 0400`).

Industry alignment: Kubernetes docs, OWASP Secrets Management Cheat Sheet, AWS Secrets Manager, GCP Secret Manager, and Vault Agent all default to file projection for secrets.
Recommendation: use `${file:...}` for any secret; reserve `${env:...}` for non-sensitive deployment-specific values such as cluster domain, log level, feature flags.

Configuration Changes
Companion Kubernetes wiring (illustrative)
Lives in the user's manifests / `values.yaml`:

Before / after example -- moving a secret out of Git

Before -- the only way to put a secret into an array-of-tables element today is to commit it to `values.yaml`:

After -- the value lives in a Kubernetes Secret; `config.toml` references it:

Open Debate: drop `APIP_GW_*` Koanf env overrides entirely?

Once `${env:NAME}` interpolation exists, the platform has two ways to set a value from env: Koanf's `APIP_GW_*` provider and `${env:...}` inside `config.toml`. A live design question is whether to delete the Koanf env path and standardise on `${env:...}` alone.

For dropping it:
- The switch table at `gateway/gateway-controller/pkg/config/config.go:472-505` (short-name aliases + underscore-collision fixes) is deleted outright -- a class of silent bugs and per-field maintenance disappears.
- The "env overrides cannot index into `[[arrays]]`" gap evaporates rather than being worked around; there is only one mechanism and it works inside arrays naturally.
- `grep '${env:' configs/config.toml` is the complete list of env knobs. Today operators have to read struct tags + the switch table.
- `level = "${env:APIP_GW_LOG_LEVEL:-info}"` puts default and override on one line.

Against dropping it:
- Deployments that set `APIP_GW_*` env vars (docker-compose, all-in-one, custom k8s manifests, helm `extraEnv`) silently get defaults. Worst case: a secret env var is set, the override is ignored, the gateway boots with a placeholder default and the failure mode is delayed.
- Operators relying on `extraEnv` lose that path; they must switch to chart values or `${file:...}`.

Recommendation: drop it. Pre-1.0 / SNAPSHOT versions are exactly when this cut is cheapest, and the simplification compounds (no switch table, no array-indexing gap, one mechanism to document). Conditions:
- Confirm that `APIP_GW_*` usage is bounded to the dozen entries in the switch table plus a handful in compose files. If the count blows up, reconsider scope.
- Update `gateway/configs/config.toml` to expose every name currently in the switch table as `field = "${env:APIP_GW_SAME_NAME:-default}"` before deleting the Koanf env loader. Operators keep the same env-var names; the resolution path changes silently. This converts "breaking change" into "behaves the same in practice."
- Keep the `APIP_GW_` prefix as a documentation convention in shipped defaults so the platform's env footprint stays grep-able in `kubectl describe pod`.
- Provide a `${env:...}` migration recipe. No deprecation dance on SNAPSHOT.

If accepted, the proposal reframes from "additive feature" to "replacement": the Motivation loses the Koanf-array-gap argument, Drawbacks loses the two-ways bullet and gains a breaking-change bullet, Alternative 5 is rejected on stronger grounds, and the "two distinct env-var mechanisms" bullet in Implementation Details is deleted.
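The migration condition above depends on `${env:NAME:-default}` keeping today's env-var names working. A minimal Go sketch of that resolution follows, assuming the modifier and escape semantics listed in the Proposal. `expandEnv`, `resolveToken`, and `lookupFunc` are hypothetical names, only the `env` source is handled, and treating a bare unset `${env:NAME}` as an error is an assumption rather than something the proposal pins down.

```go
package main

import (
	"fmt"
	"strings"
)

// lookupFunc has the same shape as os.LookupEnv, so a map can stand in for the
// process environment in examples and tests.
type lookupFunc func(name string) (string, bool)

// expandEnv is a hypothetical helper, not the proposal's actual loader.
// It resolves ${env:NAME}, ${env:NAME:-default} and ${env:NAME:?msg},
// and applies the one-level escape $${ -> ${.
func expandEnv(s string, lookup lookupFunc) (string, error) {
	var out strings.Builder
	for i := 0; i < len(s); {
		rest := s[i:]
		switch {
		case strings.HasPrefix(rest, "$${"):
			// Escape: emit a literal "${" and copy what follows verbatim.
			out.WriteString("${")
			i += 3
		case strings.HasPrefix(rest, "${"):
			end := strings.IndexByte(rest, '}')
			if end < 0 {
				return "", fmt.Errorf("unterminated token at offset %d", i)
			}
			val, err := resolveToken(rest[2:end], lookup)
			if err != nil {
				return "", err
			}
			out.WriteString(val)
			i += end + 1
		default:
			out.WriteByte(s[i])
			i++
		}
	}
	return out.String(), nil
}

// resolveToken handles the text between "${" and "}".
func resolveToken(body string, lookup lookupFunc) (string, error) {
	if !strings.HasPrefix(body, "env:") {
		return "", fmt.Errorf("unsupported source in ${%s}", body)
	}
	name, mod, hasMod := strings.Cut(strings.TrimPrefix(body, "env:"), ":")
	val, set := lookup(name)
	switch {
	case !hasMod:
		// Assumption: a bare reference to an unset variable fails.
		if !set {
			return "", fmt.Errorf("${env:%s} not set", name)
		}
		return val, nil
	case strings.HasPrefix(mod, "-"):
		// Default applies when the variable is unset or empty.
		if set && val != "" {
			return val, nil
		}
		return mod[1:], nil
	case strings.HasPrefix(mod, "?"):
		// Required: unset or empty is an error carrying the operator's message.
		if set && val != "" {
			return val, nil
		}
		return "", fmt.Errorf("${env:%s}: %s", name, mod[1:])
	default:
		return "", fmt.Errorf("unknown modifier in ${%s}", body)
	}
}

func main() {
	env := map[string]string{"APIP_GW_LOG_LEVEL": "debug"}
	lookup := func(k string) (string, bool) { v, ok := env[k]; return v, ok }
	for _, in := range []string{
		`${env:APIP_GW_LOG_LEVEL:-info}`,
		`${env:APIP_GW_UNSET:-info}`,
		`$${env:LITERAL}`,
		`${env:CP_CLIENT_ID:?client id is required}`,
	} {
		out, err := expandEnv(in, lookup)
		fmt.Printf("%s -> %q, err=%v\n", in, out, err)
	}
}
```

With a shipped default such as `level = "${env:APIP_GW_LOG_LEVEL:-info}"`, an operator who already exports `APIP_GW_LOG_LEVEL` would see the same effective value through this path, which is the "behaves the same in practice" property the condition relies on.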
Implementation Details
Loader module placement. The expansion pass lives in a new shared module `common/configinterpolate/`, imported by both the gateway-controller and policy-engine config loaders. Keeping it in `common/` (rather than duplicating per binary) ensures the grammar, escape rules, and allowlist semantics stay identical across the two binaries, both of which read their own `config.toml` independently at startup (no `config.toml` content is forwarded between binaries via xDS).

Where the expansion pass slots into Koanf. Koanf is built to load and merge multiple providers (TOML file + `APIP_GW_*` env) into a single map in one pass; the expansion step runs on the already-merged map, not between providers. Concrete flow:

1. Load `config.toml` via the TOML provider.
2. Merge env overrides (`APIP_GW_*`) on top -- env values override file values.
3. `common/configinterpolate/` walks `k.Raw()`, substitutes `${...}` in every string-valued leaf, and writes the substituted map back into the same Koanf instance via `confmap.Provider(expanded, ".")`. Koanf does not mutate values in place, so an explicit reload is required for `UnmarshalWithConf` to see the resolved values.

`RawConfig` snapshot for policies (policy-engine only). Today `cfg.PolicyEngine.RawConfig = k.Raw()` is captured after env merge and passed to every policy via `reg.SetConfig(...)` for CEL `${config}` resolution in `systemParameters`. With this proposal, that snapshot is taken after the expansion pass, so policies that reference config (e.g. an Advanced Rate Limit policy reading a Redis password via `${config.controller...}`) get the resolved value directly rather than the literal `${file:...}` token. Trade-off: every loaded policy is in the trust boundary for any secret materialised into `config.toml`. Policy authors must not log values pulled from `${config}`; this becomes part of the policy-authoring guidance and the policy-review checklist.

Env-override values are part of the interpolation input. Because expansion runs after Koanf's merge, an env override may itself contain `${...}` (e.g. `APIP_GW_CONTROLLER_LOGGING_LEVEL="${env:DEBUG_LEVEL:-info}"`) and will be expanded. There is no "do not re-expand env overrides" carve-out; the path allowlist is the sole gate against `${file:/etc/passwd}`-style abuse from a hostile env var.

Two distinct env-var mechanisms -- do not conflate.
`APIP_GW_*` env overrides and `${env:NAME}` interpolation are separate features and interact only via the merge order above:

- `APIP_GW_*` overrides are matched by Koanf's env provider, stripped of the prefix, and mapped to a dotted Koanf key path (e.g. `APIP_GW_CONTROLLER_SERVER_API_PORT` → `controller.server.api_port`). The gateway-controller additionally carries a hand-maintained switch table (`pkg/config/config.go`) that hard-maps a small set of names (`controlplane_host`, `gateway_registration_token`, `controller_controlplane_insecure_skip_verify`, etc.) to their dotted paths, because the default `_` → `.` rule would split fields that contain literal underscores (e.g. `insecure_skip_verify`). The policy-engine uses only the generic `__` → literal-`_` escape rule, no custom table.
- `${env:NAME}` interpolation reads an environment variable by its exact name and substitutes the resulting string into the TOML position where the token appears. It does not go through the prefix, the lowercasing, the `_` → `.` rule, or the controller's switch table. `NAME` is whatever the operator wrote: `CP_CLIENT_ID`, `CLUSTER_DOMAIN`, etc.
- Operators do not need to learn the `APIP_GW_*` naming rules to use `${env:...}`, and there is no risk of an interpolation name accidentally colliding with a Koanf override path.

Allowlist enforcement rule for `${file:...}`. The check applies to the input path, not the resolved symlink target:

1. Apply `filepath.Clean()` to the path inside `${file:...}`; reject if the cleaned path contains `..`.
2. Match the cleaned path against the allowlist on directory boundaries (`/secrets` does not match `/secrets-other`).
3. Read with `os.ReadFile` (or equivalent); the OS resolves any symlinks transparently.

Why no re-check. Kubernetes Secret mounts symlink `/secrets/foo → ..data/foo → /var/lib/kubelet/pods/<uuid>/...`; the realpath is kubelet-managed, unpredictable, and changes across pod restarts. Re-checking the resolved target would make `${file:...}` unusable in the canonical target deployment (k8s). Vault Agent, the OpenTelemetry Collector file source, and Kubernetes' own Secret-to-env projection all treat the mounted path as the trust boundary, not the realpath. The operator owns the contents of allowlisted directories; if an untrusted actor can create symlinks inside `/secrets/`, the threat is broader than this feature can address.

What this protects against. Typos and hostile env-overrides that walk into sensitive paths (`${file:/proc/self/environ}`, `${file:/etc/passwd}`, `${file:../../etc/shadow}`) -- those fail at step 1 or 2 without ever opening anything.

Per-binary allowlist defaults and configuration. Defaults: gateway-controller -> `/etc/gateway-controller`, `/secrets/gateway-controller`; policy-engine -> `/etc/gateway-runtime`, `/secrets/gateway-runtime`. Per-binary (rather than shared) so the controller can't read from `/secrets/gateway-runtime` and vice versa -- catches wrong-pod mount mistakes and keeps blast radius tight. Runtime defaults use `gateway-runtime` (the container/deployment name operators see in Helm and `kubectl`) rather than `policy-engine` (the binary inside the container). The broad `/secrets` default is intentionally not included.

- The allowlist is needed to resolve `${file:...}` references, so it cannot live inside `config.toml` itself (chicken-and-egg). Three sources, standard precedence (CLI flag > env var > defaults):
  - CLI flag: `--config-file-source-allowlist=/etc/gateway-controller,/secrets/gateway-controller,/custom/path`
  - Env var: `APIP_GW_CONFIG_FILE_SOURCE_ALLOWLIST` (comma-separated)
  - Helm value `gateway.config.fileSourceAllowlist: []` that renders the env var, so operators don't have to memorise the variable name. Falls back to chart defaults if unset.
- The allowlist only limits which paths the `${file:...}` interpolation feature will resolve. It is not an OS-level access control: the gateway process can still open any file the kernel permits, the same as any process in the container. The allowlist exists to (a) catch accidental leakage of mounted paths through log lines / analytics / upstream headers and (b) prevent path-traversal-style typos (`${file:/proc/self/environ}`, `${file:../../etc/shadow}`).
- `/var/run/secrets` is excluded by default. It's the standard mount point for the auto-mounted Kubernetes service account token (`/var/run/secrets/kubernetes.io/serviceaccount/token`) and is therefore readable on every pod by default. Excluding the parent dir keeps casual `${file:/var/run/secrets/...}` access to the SA token off the table. Operators using secret-injector sidecars (Vault Agent, external-secrets, cert-manager) that mount under `/var/run/secrets/<vendor>/` extend the allowlist explicitly as part of wiring those sidecars.

`fileRaw` deferred from v1.
A `fileRaw` variant (verbatim, no trimming) was considered for use cases like multi-line PEM blocks but rejected -- gateway PEM material is fed via path-valued fields (`cert_file`, `key_file`), not inlined into `config.toml`. Adding `fileRaw` later if a concrete need emerges is easy; removing shipped syntax is not.

Why escape is one level only (`$${...}` -> `${...}`, no `$$$` counting rule).

- Tokens stay matchable by a one-line regex (`\$\{[^}]+\}`); a counting rule breaks one-liners.
- The "I need a literal `$${X}` in my output" case is handled cleanly by `${file:/path/to/snippet}`.
- No meaning is assigned to `$$$` today, so the syntax space stays reserved if a real use case appears.

Interpolating non-string fields. To interpolate into a non-string field (int, bool, duration), wrap the placeholder in quotes, e.g. `listener_port = "${env:PORT:-8080}"`. The quoted form makes it a TOML string the interpolator can substitute; Koanf then coerces the resolved string to the declared field type at unmarshal time, so the field still ends up as the correct type.

Startup summary log line. Emit a single info-level line at boot with counts only, e.g. `config interpolation: resolved 12 references (env=8, file=4)`. Counts are not sensitive; values and references are deliberately omitted. Helps operators confirm overrides are wired correctly without exposing structure or contents. Counting unit: one reference = one `${...}` token (so `"https://${env:H}:${env:P}"` counts as 2).

Per-key debug log line. At debug level, log one entry per resolved reference: the destination Koanf key path, the source (`env` / `file`), and the reference text (env name or file path) -- never the resolved value. Example: `config interpolation: controller.controlplane.apim_oauth2_client_secret <- file:/secrets/gateway-controller/cp/client-secret`. Lets operators debug "did my Secret mount land in the right field?" without needing to dump config or risk leaking the value. Disabled at info and above.

Drawbacks
- Two ways to set a value from env: `APIP_GW_*` Koanf env overrides and `${env:...}`. We keep both because Koanf overrides are convenient for top-level scalars; this proposal documents when to reach for which.

Alternatives Considered
Alternative 1: Kustomize overlays
Pros / cons
- The operator generates its manifests itself (`internal/helmgateway/deploy.go`), so a Kustomize layer would have to wrap the operator's output. Two templating layers; the operator no longer owns its artifacts. And Kustomize cannot read Secret contents at render time either -- same fundamental constraint as Helm.

Alternative 2: initContainer + `envsubst`

- Treat `config.toml` as a template, run `envsubst` (or a sed pass) in an initContainer that writes the rendered file to an `emptyDir`, then run the gateway against the rendered file.

Pros / cons
- `envsubst` is fragile around `$` characters in passwords / random tokens.
- `deployment.yaml`.
- `envsubst` only does env.

Alternative 3: Inline-table marker form (`password = { from_env = "X" }` / `password = { from_file = "/y" }`)

Pros / cons
- Every field becomes a union type (`string | inline-table`) -- schemas, code generators, and IDE plugins all stumble.
- Cannot express mid-string interpolation such as `"https://${env:HOST}"`.
- Non-string fields (`int`, `bool`) need their own handling.

Alternative 4: Go template syntax (`{{ env "X" }}`, `{{ file "Y" }}`)

This is the most important alternative because it's the obvious suggestion for a Go shop.
- Render `config.toml` through `text/template` with a small allowed function set (`env`, `file`, maybe `default`, `required`).
- `${...}` matches the surface area perfectly without becoming a programming language inside `config.toml`.

Pros / cons -- full rejection rationale (a) through (f) and middle-ground considered
- (a) `config.toml` is already rendered by Helm's Go-template engine inside `gateway-helm-chart/templates/gateway/gateway-config.yaml`. If the runtime also uses Go-template syntax, every `{{ env "X" }}` literal that needs to survive into the final file has to be escaped against Helm first using ugly constructs like `{{ printf "{{ env %q }}" "ADMIN_PASSWORD" }}` or `{{}}` literal escapes. Users will get this wrong constantly, and Helm template errors are some of the worst messages in the ecosystem. `${...}` has no collision because Helm does not interpret it.
- (b) Go templates bring `if`, `range`, `with`, `define`, pipelines, and (by convention) sprig. Once available, someone will put logic in `config.toml`. Config files that are also programs are hard to audit, hard to diff in PRs, hard to reason about for security.
- (c) Errors like `template: :3:14: executing "" at <env "X">` are notoriously bad. `${...}` parser errors can be pointed at the exact source location: `config.toml:42: ${env:CP_CLIENT_ID} not set`.
- (d) Whitespace trimming markers (`{{-` / `-}}`) -- another foot-gun, especially in TOML where whitespace inside strings matters.
- (e) `${source:ref}` is what Envoy, Loki, Grafana Agent, Tempo, Promtail, OpenTelemetry Collector, Vector, and Fluent Bit all use for runtime config interpolation. Go templates are conventional for generating whole files (Helm, Hugo, kubectl) -- not for embedded substitution inside config that the runtime reads at startup.
- (f) `${...}` is a scanner in ~80 lines.
- Middle ground considered: Go templates with custom delimiters (`<< env "X" >>`) and a tiny allowed function set, to avoid the Helm collision in (a).

Alternative 5: Extend `APIP_GW_*` env overrides to support array indices

- Parse names like `APIP_GW_CONTROLLER_AUTH_BASIC_USERS_0_PASSWORD` and splice into element 0.

Pros / cons
- `config.toml`.

Compatibility
- Configs without `${...}` are unchanged; existing `APIP_GW_*` env overrides continue to override TOML values exactly as before (Koanf still merges env on top of file; the new expansion pass runs on the already-merged map).
- An existing value containing a literal `${...}` substring would now be interpreted. The escape `$${...}` covers the corner case, and a one-time grep across our shipped samples will catch anything we own.

References
- `${ENV_VAR}` in a control-plane-adjacent config.
- `${env:NAME}` and `${env:NAME:-default}` style.
- `${ENV}` / `${file:...}` pattern.
- `${VAR:-default}` and `${VAR:?err}` semantics.
- `extraEnv` / `extraEnvFrom` / `extraVolumes` hooks: `kubernetes/helm/gateway-helm-chart/templates/gateway/controller/deployment.yaml` and `.../gateway-runtime/deployment.yaml`.
- `kubernetes/gateway-operator/internal/helmgateway/deploy.go`.