feat(ingress): per-workload CF ingress + simplify apps to web-nvidia-smi demo #134
Merged
Agent VMs can now declare `expose: {hostname_label, port}` on individual
boot workloads; dd-agent forwards those on /register, CP prepends them
to the cloudflared ingress alongside the default dashboard rule, and
CF provisions matching CNAMEs. Each entry becomes a public hostname
`<label>.<agent-hostname>` → `localhost:<port>`.
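The mapping described above can be sketched in Rust as follows. This is a minimal illustration, not the code from this PR: the `Expose` and `IngressRule` types, the `build_ingress` function, the `dashboard_port` parameter, and the `http://` scheme on the origin are all assumptions made for the example.

```rust
/// One `expose` entry declared on a boot workload.
#[derive(Debug, Clone, PartialEq)]
struct Expose {
    hostname_label: String,
    port: u16,
}

/// A single ingress rule: public hostname -> local origin service.
/// (Hypothetical type for this sketch.)
#[derive(Debug, Clone, PartialEq)]
struct IngressRule {
    hostname: String,
    service: String,
}

/// Build the rule list: exposed workloads first, then the default
/// dashboard rule for the bare agent hostname.
/// `dashboard_port` and the `http://` scheme are assumptions.
fn build_ingress(agent_hostname: &str, extras: &[Expose], dashboard_port: u16) -> Vec<IngressRule> {
    let mut rules: Vec<IngressRule> = extras
        .iter()
        .map(|e| IngressRule {
            hostname: format!("{}.{}", e.hostname_label, agent_hostname),
            service: format!("http://localhost:{}", e.port),
        })
        .collect();
    // The default dashboard rule goes after the per-workload rules.
    rules.push(IngressRule {
        hostname: agent_hostname.to_string(),
        service: format!("http://localhost:{}", dashboard_port),
    });
    rules
}
```

With a single entry `{hostname_label: "gpu", port: 8081}` and an agent hostname of `agent-1.example.com`, the first rule routes `gpu.agent-1.example.com` to `http://localhost:8081` and the last rule is the dashboard.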
dd's apps/ example collapses from podman+ollama+openclaw down to
podman+web-nvidia-smi — one focused demo that proves podman, GPU
passthrough, and the new ingress path end-to-end. Ollama and openclaw
move out of this repo; they'll land in slopandmop as a self-contained
example where they belong.
Preview agent VM keeps its role as the registration smoke test against
per-PR CPs but drops its CPU-ollama workload. Prod agent VM serves
`gpu.<agent-host>.devopsdefender.com` with the container's
`nvidia-smi` output.
Boot-time exposure only in this PR. Runtime /deploy exposure for
POSTed workloads (e.g. anything slopandmop ships at runtime) is a
follow-up — those workloads still run, they're just not auto-routed
to a public hostname yet.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DD preview ready
URL: https://pr-134.devopsdefender.com
Browser login: paste
CLI / curl:
Register endpoint for a local agent:
posix4e added a commit that referenced this pull request Apr 18, 2026
The prior shape — a JSON array substituted into
`"DD_EXTRA_INGRESS=${DD_EXTRA_INGRESS}"` — closed the outer env
string at the first embedded `"`, producing invalid JSON that broke
`jq -c .`:
jq: parse error: Invalid numeric literal at line 21, column 40
Seen on the dd-local-prod relaunch pipeline immediately after #134
merged (the failing job was in main's Release cascade).
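To make the failure mode concrete, here is a small Rust sketch of what a JSON parser sees after naive substitution. The function names and the field layout of the example array are hypothetical; the actual template and pipeline code are not shown here.

```rust
/// Naive template substitution: places `value` between the quotes unescaped,
/// mirroring `"DD_EXTRA_INGRESS=${DD_EXTRA_INGRESS}"`.
fn render_env_entry(value: &str) -> String {
    format!("\"DD_EXTRA_INGRESS={}\"", value)
}

/// Return the contents of the first JSON string token as a parser would
/// read it: everything between the opening quote and the next unescaped quote.
/// Assumes `rendered` starts with a double quote.
fn first_string_token(rendered: &str) -> &str {
    let body = &rendered[1..]; // skip the opening quote
    let mut escaped = false;
    for (i, c) in body.char_indices() {
        match c {
            '\\' if !escaped => escaped = true,
            '"' if !escaped => return &body[..i],
            _ => escaped = false,
        }
    }
    body
}
```

Rendering a JSON array such as `[{"hostname_label":"gpu","port":8081}]` ends the string token at `DD_EXTRA_INGRESS=[{`, leaving the remainder as stray tokens, whereas a quote-free value like `gpu:8081` survives intact.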
Switches the wire format to comma-separated `label:port` pairs
(`gpu:8081` or `gpu:8081,web:9000`) and adds unit tests covering the
parser edge cases. HTTP request body from agent → CP /register still
carries the structured JSON shape — only the env-var-to-env-var hop
changes.
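A parser for the `label:port` format could look roughly like the sketch below. The function name `parse_extra_ingress`, the error strings, and the choice to skip empty items are assumptions for illustration; the edge-case behavior of the real parser is defined by its unit tests, which are not reproduced here.

```rust
/// Parse comma-separated `label:port` pairs, e.g. `gpu:8081,web:9000`.
/// Empty input (or only whitespace and commas) yields an empty list.
fn parse_extra_ingress(raw: &str) -> Result<Vec<(String, u16)>, String> {
    let mut out = Vec::new();
    for item in raw.split(',') {
        let item = item.trim();
        if item.is_empty() {
            continue; // tolerate empty input and stray commas (assumed policy)
        }
        let (label, port) = item
            .split_once(':')
            .ok_or_else(|| format!("missing ':' in {:?}", item))?;
        let label = label.trim();
        if label.is_empty() {
            return Err(format!("empty label in {:?}", item));
        }
        // u16 parsing rejects non-numeric and out-of-range ports.
        let port: u16 = port
            .trim()
            .parse()
            .map_err(|_| format!("invalid port in {:?}", item))?;
        out.push((label.to_string(), port));
    }
    Ok(out)
}
```

Because neither the separator nor the values contain a double quote, the string can pass through a quoted env entry without escaping.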
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
posix4e added a commit that referenced this pull request Apr 18, 2026
PR #134 wired `expose` at boot time: the ingress rules baked into the agent VM's config.iso got published when CP first created the tunnel. This extends that to the runtime path. When a workload POSTed to dd-agent's /deploy declares `expose: {hostname_label, port}`, the agent now calls the CP's new /ingress/replace endpoint with the merged (boot + runtime) extras list, and CP re-PUTs the tunnel config + upserts a CNAME for the new hostname.

Wire-level summary:
- cf.rs: extract `apply_ingress()` used by both `create()` (at register) and a new public `update_ingress()` (at runtime). The existing tunnel id + token stay stable.
- cp.rs: new endpoint POST /ingress/replace. PAT-authenticated, looks up the agent in the store by agent_id, re-PUTs the tunnel config, updates the store's `extras` field for the agent.
- collector::Agent: gains `tunnel_id` + `extras` fields, preserved across /health scrapes so the collector doesn't clobber them.
- agent.rs: stores `agent_id` from the register bootstrap, holds a live `Arc<RwLock<Vec<(String, u16)>>>` for the merged extras, hooks into /deploy to push updates. Soft-fails: the workload stays running even if the ingress update fails; only public reachability is affected.

Opens the runtime path slopandmop needs: POST openclaw to an agent, the agent asks CP to route `openclaw.<agent-host>` → localhost:port, CF picks up the ingress config within seconds, browser hits the URL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
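The agent-side merge and soft-fail behavior described in this commit can be sketched as below. Only the `Arc<RwLock<Vec<(String, u16)>>>` shape comes from the commit message; the `Extras` alias, `merge_extra`, `expose_at_runtime`, and the `push` closure (standing in for the HTTP call to /ingress/replace) are hypothetical names for illustration.

```rust
use std::sync::{Arc, RwLock};

/// Shared list of (hostname_label, port) pairs, boot + runtime merged.
type Extras = Arc<RwLock<Vec<(String, u16)>>>;

/// Insert or update `label` in the shared list and return a snapshot of
/// the full merged list to send to the control plane.
fn merge_extra(extras: &Extras, label: &str, port: u16) -> Vec<(String, u16)> {
    let mut guard = extras.write().expect("extras lock poisoned");
    let pos = guard.iter().position(|e| e.0 == label);
    match pos {
        Some(i) => guard[i].1 = port,
        None => guard.push((label.to_string(), port)),
    }
    guard.clone()
}

/// Record the exposure and try to publish it. `push` is a stand-in for the
/// request to the control plane. Returns whether the route was published;
/// a failure is logged and otherwise ignored, so the workload keeps running.
fn expose_at_runtime<F>(extras: &Extras, label: &str, port: u16, push: F) -> bool
where
    F: FnOnce(&[(String, u16)]) -> Result<(), String>,
{
    let merged = merge_extra(extras, label, port);
    match push(&merged) {
        Ok(()) => true,
        Err(err) => {
            eprintln!("ingress update failed (workload keeps running): {}", err);
            false
        }
    }
}
```

In this sketch a failed push leaves the entry in the local list, so a later successful push would carry it along; whether the real agent retries or rolls back is not stated in the commit message.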