fix(endpoint-exposer): make install usable and spec tenant-agnostic #11
Merged
jcastiarena merged 7 commits into main on Apr 21, 2026
- Remove the hardcoded enum on publicDomain/privateDomain (was
["hello.idp.poc.nullapps.io"], an example FQDN from another tenant's
POC). The enum bound every tenant to that FQDN through the UI
dropdown; the field is now free-text so each tenant provides its own.
A description is added on both fields so developers know what kind of
value is expected.
- Remove the {{ env.Getenv "SERVICE_NAME" }} and {{ env.Getenv "NRN" }}
templates. These are never rendered: the current
nullplatform/tofu-modules service_definition module reads the spec
via `data "http"` + `jsondecode()` (no gomplate) and resolves `name`
and `visible_to` from TF variables (var.service_name,
concat([var.nrn], ...)). So the templates "worked" only because other
fields happened to be overridden at the TF layer. Leaving them in
place invites future authors to add their own {{ env.Getenv }} for
non-overridden fields, which would silently fail.
The runtime workflow is unaffected — the Istio manifests are assembled
from the scope instance's attributes at execution time, not from the
spec's example values.
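Since nothing in the pipeline renders the spec, a guard against leftover template expressions could look like the sketch below. The helper name `check_no_templates` is hypothetical and not part of this PR; it only assumes that any literal `{{` in the spec is an unrendered gomplate expression.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): fail when a spec file still
# contains "{{ ... }}" expressions, because no template engine will ever
# render them before the spec reaches the nullplatform API.
check_no_templates() {
  local spec_file="$1"
  # -F treats '{{' as a fixed string; -n prints the offending line numbers.
  if grep -nF '{{' "$spec_file"; then
    echo "unrendered template expression found in $spec_file" >&2
    return 1
  fi
  return 0
}
```

It could be run as `check_no_templates install/specs/service-spec.json.tpl` before `tofu plan`.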
install/tofu/main.tf referenced the old variable names of nullplatform/tofu-modules' service_definition module:
- git_repo, git_ref, git_service_path, use_tpl_files, git_password

Those were removed in tofu-modules v1.52.x. Running `tofu plan` against the current main of tofu-modules (what installation.md tells tenants to clone) fails with "Unsupported argument" on five lines, blocking any tenant that follows the guide verbatim.

Alongside, install/tofu was passing `agent_command` and `workflow_override_path` to service_definition_agent_association; neither is accepted by the module. The module builds the cmdline automatically from base_clone_path + repository_service_spec_repo + service_path + "/entrypoint/entrypoint", and exposes `agent_arguments` for passing flags to that entrypoint.

Changes:
- main.tf: use repository_org/name/branch/service_path + repository_token on service_definition; drop the unsupported agent_command and workflow_override_path on the association module; forward --overrides-path via agent_arguments.
- variables.tf: split git_repo into repository_org + repository_name; rename git_branch → repository_branch and git_service_path → spec_path; add agent_service_path so the specs path (which includes "install/" for this service) and the runtime path (just "endpoint-exposer") can differ. Make github_token optional (nullplatform/services is public; the token is only needed for private forks).
- terraform.tfvars.example: update to match.
- installation.md: update the variables table, note that github_token is optional for the default (public) spec repo, fix the cmdline sample (was missing the trailing /entrypoint), and add a short "Domains" section explaining the free-text contract introduced in the spec cleanup commit.
- prerequisites.md: update path-override guidance to the new variable names; clarify github_token is optional.
- .terraform.lock.hcl: regenerated; hashicorp/external and hashicorp/null are no longer pulled in (the old module used them; the new one doesn't).

Tested: `tofu fmt -check` clean; `tofu init -backend=false && tofu validate` succeeds against nullplatform/tofu-modules main.
Discovered when applying the migrated install/tofu module against a clean nullplatform account: `tofu plan` fails with

  Error: Error in function call
  ...
  Call to function "jsondecode" failed: extraneous data after JSON object.

nullplatform/tofu-modules' `service_definition` module defaults `available_links = ["connect"]`, which makes it attempt to fetch `<service_path>/specs/links/connect.json.tpl` via HTTP. Endpoint Exposer is a `type = "dependency"` service that ships no link spec (no `install/specs/links/` directory), so the fetch returns a 404 HTML page. `jsondecode()` then aborts the whole plan with the generic "extraneous data" error, which is hard to map back to the root cause on first encounter.

Passing an explicit empty list resolves it cleanly. The same override is now required in any tenant's own Terraform that registers this service without going through install/tofu; see the galicia-banco POC for the downstream mirror.

A deeper fix belongs in tofu-modules' `service_definition` (defaulting `available_links` to `[]` when the spec itself doesn't declare any, or deriving it from `spec.available_links`); left out of this PR to keep the scope focused on the endpoint-exposer install flow.
…uired

Each route item's JSON Schema listed "environment" in `required` but did NOT declare it under `properties`, which only defines `method`, `path`, `scope`, `visibility`, and `groups`. Since the creation form appears to render only the declared properties, a required field that is not declared can never be supplied, so every submission fails validation and the UI bounces service creation with:

  /routes/0 must have required property 'environment'

…even when the user has filled in every visible field.

`environment` is already a top-level property of the service (populated from the associated scopes' dimensions); the per-route duplicate is a leftover, not intended functionality.

Removes "environment" from `routes[].items.required` in all four schemas that duplicate the routes definition (main service attributes + the two action specifications, each with parameters/results schemas).

Discovered during the Galicia POC smoke test: registration succeeded end-to-end, but service creation was blocked for every tenant until this is fixed.
Two runtime bugs that made the service unusable out-of-the-box:
1) `INGRESS_TYPE` defaulted to `alb`, but the repo only ships
`workflows/istio/`. Any tenant that didn't explicitly set the env
var to `istio` saw:
failed to read workflow file: open /workflows/alb/create.yaml:
no such file or directory
There is no `workflows/alb/` to fall back to — the default was
effectively dead. Switching the default to `istio` matches what the
repo actually contains; tenants that later add `workflows/alb/`
and want ALB can export `INGRESS_TYPE=alb` explicitly.
2) `SERVICE_PATH` was only populated if the agent passed
`--service-path=<abs-path>` as an argument. Without it, the path
starts empty and every derived path becomes absolute
(`/workflows/…`, `/values.yaml`), missing the service-root prefix
entirely. The script already computes `WORKING_DIRECTORY` from its
own location; the service root is `WORKING_DIRECTORY/..`. Using
that as a fallback keeps `--service-path` as an allowed override
but removes it as a silent requirement for basic operation.
Both fixes are backward-compatible: tenants currently passing
`INGRESS_TYPE=alb` + `--service-path=…` keep the same behaviour.
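The two fallbacks described above can be sketched roughly as follows. This is not the actual entrypoint code: the function name `resolve_create_workflow` and its argument layout are assumptions made for illustration; only `INGRESS_TYPE`, `SERVICE_PATH`, `WORKING_DIRECTORY`, and `--service-path` come from the commit message.

```shell
#!/usr/bin/env bash
# Sketch only. resolve_create_workflow is a hypothetical name; the real
# entrypoint script is not shown in this PR text.
# $1: directory containing the entrypoint script (WORKING_DIRECTORY)
# remaining args: flags passed by the agent
resolve_create_workflow() {
  local working_directory="$1"; shift
  local service_path=""
  local arg
  for arg in "$@"; do
    case "$arg" in
      --service-path=*) service_path="${arg#--service-path=}" ;;
    esac
  done
  # Fallback: the service root is the parent of the script's own directory.
  if [ -z "$service_path" ]; then
    service_path="$(cd "$working_directory/.." && pwd)"
  fi
  # Default to istio, the only workflow directory the repo ships.
  local ingress_type="${INGRESS_TYPE:-istio}"
  printf '%s\n' "$service_path/workflows/$ingress_type/create.yaml"
}
```

With neither the env var nor the flag set, the resolved path points under the service root at `workflows/istio/create.yaml`; an explicit `INGRESS_TYPE=alb` or `--service-path=…` still takes precedence.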
The Kubernetes Gateway API rejects non-absolute values for `Exact` and `PathPrefix` match types:

  spec.rules[N].matches[M].path: Invalid value: "object": value must be an absolute path and start with '/' when type one of ['Exact', 'PathPrefix']

Developers entering "health" in the UI reasonably expect it to be treated as "/health": the UI doesn't document the requirement, and the scope UI for other nullplatform services is forgiving on this. The rejected route surfaces as a failed kubectl apply deep in the agent workflow, with no hint that the fix is "add a slash".

`detect_path_type` now normalizes the value for the two absolute-only types (`Exact`, `PathPrefix`), leaving `RegularExpression` untouched since regex paths are free-form. Also handles the pre-existing edge case where wildcard inputs like `users/*` stripped to `users` (no leading slash); now normalized to `/users`.
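A minimal sketch of the normalization rule, assuming the match type has already been detected and any wildcard suffix already stripped. `normalize_path` is a hypothetical stand-in; the real `detect_path_type` is not shown in this PR text and may be structured differently.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the normalization step described above.
# $1: Gateway API match type (Exact, PathPrefix, RegularExpression)
# $2: path value as entered by the developer
normalize_path() {
  local match_type="$1" path="$2"
  case "$match_type" in
    Exact|PathPrefix)
      # These two types must be absolute, so prepend '/' when it is missing.
      case "$path" in
        /*) ;;
        *) path="/$path" ;;
      esac
      ;;
    # RegularExpression (and anything else) is passed through untouched.
  esac
  printf '%s\n' "$path"
}
```

For example, `normalize_path Exact health` yields `/health`, while a `RegularExpression` value is returned unchanged.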
javi-null
reviewed
Apr 20, 2026
Clarify that `name` and `visible_to` in service-spec.json.tpl are ignored
at apply time because the service_definition module overrides them from
TF variables. Prevents future authors from adding `{{ env.Getenv ... }}`
template expressions to non-overridden fields, which would reach the
nullplatform API as literals (no template engine in the pipeline).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
javi-null
approved these changes
Apr 21, 2026
sebastiancorrea81
approved these changes
Apr 21, 2026
Summary
Three related fixes so any tenant can register and use `endpoint-exposer` by following `install/installation.md` verbatim. Today the install flow is broken in two independent ways and the spec locks every tenant to an example FQDN. Details below.

Current broken state
1. `install/tofu/main.tf` doesn't compile against current tofu-modules

The module call uses input names that were removed from `nullplatform/tofu-modules` in v1.52.x:

| Removed input | Current input |
| --- | --- |
| `git_repo` | `repository_org` + `repository_name` |
| `git_ref` | `repository_branch` |
| `git_service_path` | `service_path` |
| `use_tpl_files` | (none listed) |
| `git_password` | `repository_token` |

Reproduction (exactly what `installation.md` tells tenants to do):

Result:

…repeated for `git_ref`, `git_service_path`, `use_tpl_files`, `git_password`. Five errors, zero successful plans.

Additionally, `install/tofu/main.tf` passed `agent_command` and `workflow_override_path` to `service_definition_agent_association`; neither is accepted by that module today. The module constructs the cmdline automatically as `${base_clone_path}/${repository_service_spec_repo}/${service_path}/entrypoint/entrypoint` and exposes `agent_arguments` for flags.

2. Hardcoded domain enums in the spec lock every tenant to an example FQDN
`install/specs/service-spec.json.tpl` had:

…repeated for `privateDomain` and across the duplicated schemas in each action specification (10 enum blocks total). This FQDN is a leftover example from another tenant's POC. With the enum present, the nullplatform UI offered it as the only selectable value, so any other tenant (us for galicia, or anyone else) cannot create a scope through the UI because their real FQDN isn't in the enum.

The reorganization of `endpoint-exposer` under `install/` (#3) carried this over unchanged from the previous branch.

3. `{{ env.Getenv }}` in the spec is dead code

The spec ships with:

suggesting those two fields are parametrizable at `tofu apply` via env vars. They are not. The current `service_definition` module reads the spec with `data "http"` + `jsondecode()`; there is no template engine anywhere in the pipeline. These fields "work" only because the module overrides them explicitly from TF variables:

Any future author who adds a `{{ env.Getenv "MY_FIELD" }}` to a field that is not overridden at the TF layer will ship a spec with the literal template string as its value, silently, until a developer opens the scope creation form and sees `{{ env.Getenv "MY_FIELD" }}` in the UI.

Changes
Spec (`install/specs/service-spec.json.tpl`)
- Remove the hardcoded `enum` on `publicDomain`/`privateDomain` and add a `"description"` on both domain fields so developers know what kind of value to provide. Fields are now `"type": "string"` free-text; the FQDN is typed at scope creation time.
- Replace the `{{ env.Getenv "SERVICE_NAME" ... }}` template with the literal `"Endpoint Exposer"` (the TF layer overrides it regardless).
- Replace `"visible_to": ["{{ env.Getenv \"NRN\" }}"]` with `"visible_to": []` (same reasoning; the TF layer sets it from `var.nrn`).

Install (`install/tofu/`)
- Migrate `main.tf` to the current `nullplatform/tofu-modules` API (`repository_org`/`_name`/`_branch`/`service_path`/`repository_token`).
- Drop the unsupported arguments (`agent_command`, `workflow_override_path`, `service_description`) from the agent association module call. Forward `--overrides-path=<path>` via `agent_arguments` when overrides are enabled.
- Split `git_repo` into `repository_org` + `repository_name`, rename `git_branch` → `repository_branch` and `git_service_path` → `spec_path`. Add `agent_service_path` so the specs path (includes `install/` for this service) and runtime path (just `endpoint-exposer`) can differ.
- Make `github_token` optional (default `null`): `nullplatform/services` is public, so the token is only needed when pointing at a private fork.
- Update `terraform.tfvars.example` to match.
- Regenerate `.terraform.lock.hcl`: `hashicorp/external` and `hashicorp/null` are no longer transitive dependencies (the old module used them; the new one doesn't).

Docs
- `installation.md`: new variables table, note on optional `github_token`, corrected cmdline example (was missing the trailing `/entrypoint`), new "Domains" section explaining the free-text contract.
- `prerequisites.md`: updated path-override guidance to the new variable names; clarify `github_token` is optional for the default spec repo.

Test plan
- `tofu fmt -check` clean
- `tofu init -backend=false && tofu validate` succeeds with local clone of current `nullplatform/tofu-modules` main
- `jq -e . install/specs/service-spec.json.tpl` parses (no unclosed template braces)
- `gomplate -f install/specs/service-spec.json.tpl | jq .` produces byte-identical output (idempotent; no templates remain)

Notes for reviewers
- … `nullplatform/tofu-modules` generation that is no longer current, and nobody has exercised the install guide end-to-end since the API changed.

🤖 Generated with Claude Code