Implement RFD-72: Ephemeral deployments#745
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughRPCs for build and deploy now accept Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
api/deployment/rpc.yml (1)
174-212:⚠️ Potential issue | 🟡 MinorUpdate the
DeployVersioncontract for ephemeral mode.The method doc and
deploymentresult doc still say this call creates a deployment record and activates the version. Withephemeral_label, that is no longer true, so the generated API contract is wrong for the new path.As per coding guidelines, "RPC interfaces defined in schemas are generated into implementations using `pkg/rpc/cmd/rpcgen`."Suggested doc fix
- doc: Deploy an existing app version (used for rollback and redeploy). Server handles everything — creates deployment record, activates version, marks previous deployment. + doc: Deploy an existing app version (used for rollback and redeploy). Normal deploys create/update deployment records and activate the version; ephemeral deploys skip activation and deployment-record creation. @@ - - name: deployment - type: DeploymentInfo - doc: The newly created deployment record + - name: deployment + type: DeploymentInfo + doc: Deployment info for the requested version. For ephemeral deploys this may be non-persisted/synthetic because no deployment record is created.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@api/deployment/rpc.yml` around lines 174 - 212, The DeployVersion RPC contract needs its docs updated to reflect ephemeral mode: change the method description for DeployVersion to state that when ephemeral_label is set the server creates a transient/ephemeral deployment record and does NOT activate the version (and uses ephemeral_ttl default "24h" if not provided), and update the deployment result doc to clarify that the returned deployment may be ephemeral/non-activated when ephemeral_label is present; keep other parameter docs (is_rollback, env_vars, ephemeral_ttl) but ensure ephemeral_label and ephemeral_ttl behavior is explicitly described and that lock_info/access_info remain accurate for both normal and ephemeral flows.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cli/commands/deploy.go`:
- Around line 200-212: The current ephemeral branch prefixes ephemeralLabel to
each value from result.AccessInfo().Hostnames(), producing duplicates; instead,
when result.AccessInfo().HasHostnames() use the hostnames exactly as returned
(print "https://<hostname>") and do not prefix ephemeralLabel, and if
Hostnames() is empty but AccessInfo().ClusterHostname() is present, fall back to
constructing the URL as "https://<ephemeralLabel>.<clusterHostname>". Update the
block around result.AccessInfo(), Hostnames(), and ClusterHostname() (where
ephemeralLabel, ephemeralTTL, versionDisplay and ctx.Printf are used) to
implement this behavior.
In `@pkg/ephemeral/manage.go`:
- Around line 130-155: The deleteEphemeralVersion function currently proceeds to
delete the AppVersion even if listing or deleting sandbox pools failed; change
it to abort and return an error when pool cleanup fails. Specifically, in
deleteEphemeralVersion, after calling eac.List(entity.Ref(...)) return a
descriptive error instead of only logging when List fails, and inside the loop
over poolResp.Values() return an error (propagating the underlying err) if any
eac.Delete(poolID) fails rather than continuing; keep the existing logs for
context and only call eac.Delete for the AppVersion and log.Info("Deleted
ephemeral version", ...) when all pool operations succeeded.
In `@servers/build/build.go`:
- Around line 1449-1459: After parsing ephemeral.ttl with time.ParseDuration in
the isEphemeral branch, validate that ttlDuration is strictly positive; if
ttlDuration <= 0 return an error (e.g. "invalid ephemeral TTL %q: must be > 0")
instead of proceeding. Keep the existing ephemeralx.ValidateLabel check and only
set mrv.EphemeralLabel, mrv.EphemeralTtl and mrv.EphemeralExpiresAt after the
TTL passes this positive-duration validation so expired/zero TTLs are rejected
server-side.
- Around line 1461-1469: The calls to ephemeralx.ReplaceExisting and
ephemeralx.EnforceLimit currently only log warnings and allow the deploy to
continue; instead, fail the deploy when either returns an error by
propagating/returning a wrapped error from the surrounding function (do not
swallow it). Locate the ReplaceExisting and EnforceLimit calls (they use
ephemeralx.ReplaceExisting(ctx, b.ec.EAC(), appRec.ID, ephemeral.label, b.Log)
and ephemeralx.EnforceLimit(ctx, b.ec.EAC(), appRec.ID,
ephemeralx.DefaultMaxEphemeral, b.Log)) and change the handling so that on
non-nil err you return fmt.Errorf (or errors.Wrap) with contextual text like
"replace existing ephemeral failed" or "enforce ephemeral limit failed"
including err, rather than calling b.Log.Warn and continuing. Ensure callers
handle the returned error path so the deploy aborts.
In `@servers/deployment/server.go`:
- Around line 821-828: The code currently ignores errors from d.EAC.Get and
proceeds as if the revision was set; modify the block handling the Get in the
deployment flow so that if getErr != nil you log the error (use d.Log.Error with
the error and context including appVersion.ID), call results.SetError with a
descriptive message (e.g., "failed to read current version for ephemeral
update") and return the error/early return instead of continuing; keep the
existing SetRevision/Put logic when Get succeeds (i.e., in the branch where
getErr == nil call updateEntity.SetRevision(current.Entity().Revision()) and
d.EAC.Put and handle putErr as already done).
---
Outside diff comments:
In `@api/deployment/rpc.yml`:
- Around line 174-212: The DeployVersion RPC contract needs its docs updated to
reflect ephemeral mode: change the method description for DeployVersion to state
that when ephemeral_label is set the server creates a transient/ephemeral
deployment record and does NOT activate the version (and uses ephemeral_ttl
default "24h" if not provided), and update the deployment result doc to clarify
that the returned deployment may be ephemeral/non-activated when ephemeral_label
is present; keep other parameter docs (is_rollback, env_vars, ephemeral_ttl) but
ensure ephemeral_label and ephemeral_ttl behavior is explicitly described and
that lock_info/access_info remain accurate for both normal and ephemeral flows.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5c0ac918-207d-42e9-bc9f-a0d509456bda
📒 Files selected for processing (24)
api/build/build_v1alpha/rpc.gen.goapi/build/rpc.ymlapi/core/core_v1alpha/schema.gen.goapi/core/schema.ymlapi/deployment/deployment_v1alpha/rpc.gen.goapi/deployment/rpc.ymlapi/ingress/client.goapi/ingress/client_test.goblackbox/ephemeral_test.gocli/commands/app_rollback.gocli/commands/app_versions.gocli/commands/commands.gocli/commands/deploy.gocomponents/coordinate/coordinate.gocontrollers/ephemeral/gc.gocontrollers/ephemeral/gc_test.gopkg/ephemeral/label.gopkg/ephemeral/label_test.gopkg/ephemeral/manage.gopkg/ephemeral/manage_test.goservers/build/build.goservers/deployment/server.goservers/deployment/server_test.goservers/httpingress/httpingress.go
There was a problem hiding this comment.
🧹 Nitpick comments (2)
cli/commands/deploy.go (2)
205-215: Possible duplicate URL display when both Hostnames and ClusterHostname are present.If the server returns both resolved hostnames (already containing the ephemeral label) and a ClusterHostname, the CLI will print both:
- The resolved hostnames from
info.Hostnames()(correct)- A fallback URL constructed as
ephemeralLabel.ClusterHostname(line 213)When routes exist, the server-resolved hostnames are authoritative. The ClusterHostname fallback should only display when no hostnames are returned. Consider guarding the ClusterHostname display:
♻️ Proposed fix to avoid redundant URL display
if result.HasAccessInfo() && result.AccessInfo() != nil { info := result.AccessInfo() if info.HasHostnames() { for _, h := range *info.Hostnames() { ctx.Printf(" URL: https://%s\n", h) } - } - if info.HasClusterHostname() && info.ClusterHostname() != "" { + } else if info.HasClusterHostname() && info.ClusterHostname() != "" { + // Fallback: construct ephemeral URL from cluster hostname when no routes exist ctx.Printf(" URL: https://%s.%s\n", ephemeralLabel, info.ClusterHostname()) } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cli/commands/deploy.go` around lines 205 - 215, The CLI currently prints both resolved hostnames and a constructed cluster fallback URL; update the logic in the access-info block (use result.HasAccessInfo(), result.AccessInfo(), info.HasHostnames(), info.Hostnames(), info.HasClusterHostname(), info.ClusterHostname(), ephemeralLabel, and ctx.Printf) so that the ClusterHostname fallback is only printed when there are no hostnames returned by info.Hostnames() (i.e., if info.HasHostnames() is false or info.Hostnames() is empty), otherwise skip the ClusterHostname print to avoid duplicate URLs. Ensure you keep existing nil/Has* checks and only fall back to printing the ephemeralLabel.ClusterHostname when hostnames are absent.
875-885: Same redundant URL display concern in build path.This block has the same pattern as lines 205-215 where both resolved hostnames and ClusterHostname-based URLs may be displayed simultaneously. Consider applying the same fix to use ClusterHostname only as a fallback.
♻️ Proposed fix
if results.HasAccessInfo() && results.AccessInfo() != nil { info := results.AccessInfo() if info.HasHostnames() { for _, h := range *info.Hostnames() { ctx.Printf(" URL: https://%s\n", h) } - } - if info.HasClusterHostname() && info.ClusterHostname() != "" { + } else if info.HasClusterHostname() && info.ClusterHostname() != "" { ctx.Printf(" URL: https://%s.%s\n", ephemeralLabel, info.ClusterHostname()) } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cli/commands/deploy.go` around lines 875 - 885, The URL printing currently shows both resolved hostnames and a ClusterHostname-based URL which can duplicate outputs; update the block that uses results.HasAccessInfo()/results.AccessInfo() and info.HasHostnames()/info.HasClusterHostname() so that you prefer printing hostnames from info.Hostnames() when present, and only use ctx.Printf(" URL: https://%s.%s\n", ephemeralLabel, info.ClusterHostname()) as a fallback when info.HasHostnames() is false or the hostnames slice is empty (also ensure ClusterHostname is non-empty); keep the existing ctx.Printf usage and ephemeralLabel reference but guard the ClusterHostname path so it does not print alongside explicit hostnames.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@cli/commands/deploy.go`:
- Around line 205-215: The CLI currently prints both resolved hostnames and a
constructed cluster fallback URL; update the logic in the access-info block (use
result.HasAccessInfo(), result.AccessInfo(), info.HasHostnames(),
info.Hostnames(), info.HasClusterHostname(), info.ClusterHostname(),
ephemeralLabel, and ctx.Printf) so that the ClusterHostname fallback is only
printed when there are no hostnames returned by info.Hostnames() (i.e., if
info.HasHostnames() is false or info.Hostnames() is empty), otherwise skip the
ClusterHostname print to avoid duplicate URLs. Ensure you keep existing nil/Has*
checks and only fall back to printing the ephemeralLabel.ClusterHostname when
hostnames are absent.
- Around line 875-885: The URL printing currently shows both resolved hostnames
and a ClusterHostname-based URL which can duplicate outputs; update the block
that uses results.HasAccessInfo()/results.AccessInfo() and
info.HasHostnames()/info.HasClusterHostname() so that you prefer printing
hostnames from info.Hostnames() when present, and only use ctx.Printf(" URL:
https://%s.%s\n", ephemeralLabel, info.ClusterHostname()) as a fallback when
info.HasHostnames() is false or the hostnames slice is empty (also ensure
ClusterHostname is non-empty); keep the existing ctx.Printf usage and
ephemeralLabel reference but guard the ClusterHostname path so it does not print
alongside explicit hostnames.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 39bdc33a-f758-4ca5-b7dd-847cfc156e72
📒 Files selected for processing (9)
cli/commands/deploy.godocs/command-sidebar.jsondocs/docs/command/app-versions.mddocs/docs/command/app.mddocs/docs/command/deploy.mddocs/docs/commands.mdpkg/ephemeral/manage.goservers/build/build.goservers/deployment/server.go
✅ Files skipped from review due to trivial changes (5)
- docs/docs/command/app.md
- docs/command-sidebar.json
- docs/docs/commands.md
- docs/docs/command/app-versions.md
- docs/docs/command/deploy.md
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/ephemeral/manage.go
phinze
left a comment
There was a problem hiding this comment.
Clegon and Phenkman here. We spent some quality time with these ephemeral apparitions, tracking their movements from manifestation (deploy) through haunting (ingress routing) to exorcism (GC). The spectral containment system is well-built. A few field notes from our investigation.
CodeRabbit's earlier sightings (pool cleanup ordering, TTL validation, error propagation) were all banished in follow-up commits. Our notes focus on phenomena the automated detectors missed.
👻 Who you gonna call? miren deploy --ephemeral
| "miren.dev/runtime/pkg/entity" | ||
| ) | ||
|
|
||
| const DefaultMaxEphemeral = 10 |
There was a problem hiding this comment.
This package is part ghost taxonomy (label normalization, pure strings) and part ghost wrangling (entity CRUD with an injected client). Slightly split personality, but not worth restructuring now. MIR-441 will naturally clarify its identity by pulling the orchestration into saga actions.
b6115f6 to
73c3452
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/ephemeral/manage.go`:
- Around line 145-152: The loop currently deletes every pool via eac.Delete(ctx,
poolID) which will remove pools still referenced by other versions; change it to
first inspect the pool's referenced list (e.g., p.ReferencedByVersions() or
similar) and only call eac.Delete when that list contains only this versionID;
if the pool is referenced by multiple versions, remove this versionID from the
pool's ReferencedByVersions and persist that change via the appropriate API
(e.g., an update/patch call on the pool) instead of deleting; keep using
pool.Id(), eac.Delete, and versionID as the identifying symbols when
implementing the conditional delete vs. reference-removal flow.
- Around line 186-191: The current loop in manage.go returns nil when it
encounters an expired ephemeral version (checking av.EphemeralExpiresAt against
now), which lets an expired row shadow a live one; change the logic inside the
loop that currently does "return nil, nil" for expired matches to instead skip
that record and continue scanning (e.g., use continue) so the loop can still
find and return a live &av later; ensure the function still returns nil only if
no valid non-expired match is found after the loop completes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0b3e507c-dad3-45af-a6cf-dfc613c8ac67
📒 Files selected for processing (2)
pkg/ephemeral/manage.gopkg/ephemeral/manage_test.go
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
servers/httpingress/httpingress.go (1)
526-542:⚠️ Potential issue | 🟠 MajorSurface
lookupEphemeralRoutefailures instead of falling through.If
lookupEphemeralRoutereturns an error here, theelse ifcondition drops it and continues into the default-route/404 branch. That can misroute an ephemeral hostname to the default app instead of returning a 500.🩹 Suggested handling
- } else if label, baseRoute, err := h.lookupEphemeralRoute(ctx, onlyHost); err == nil && baseRoute != nil { - // No exact or wildcard match, but stripping the first subdomain label - // matched an existing route — this is an ephemeral subdomain request. - route = baseRoute - targetAppId = baseRoute.App - routeType = "route" - ephemeralLabel = label - h.Log.Debug("detected ephemeral subdomain", "host", onlyHost, "label", label, "app", targetAppId) + } else { + label, baseRoute, err := h.lookupEphemeralRoute(ctx, onlyHost) + if err != nil { + h.Log.Error("error looking up ephemeral route", "error", err, "host", onlyHost) + http.Error(w, fmt.Sprintf("error looking up http route: %s", onlyHost), http.StatusInternalServerError) + return + } + if baseRoute != nil { + // No exact or wildcard match, but stripping the first subdomain label + // matched an existing route — this is an ephemeral subdomain request. + route = baseRoute + targetAppId = baseRoute.App + routeType = "route" + ephemeralLabel = label + h.Log.Debug("detected ephemeral subdomain", "host", onlyHost, "label", label, "app", targetAppId) - if !entity.Empty(baseRoute.OidcProvider) { - oidcWrapped := h.oidcMiddleware(route, func(w http.ResponseWriter, r *http.Request) { - h.serveAuthenticatedRequest(w, r, targetAppId, routeType, ephemeralLabel, appName) - }) - oidcWrapped(w, req) - return - } - } else { + if !entity.Empty(baseRoute.OidcProvider) { + oidcWrapped := h.oidcMiddleware(route, func(w http.ResponseWriter, r *http.Request) { + h.serveAuthenticatedRequest(w, r, targetAppId, routeType, ephemeralLabel, appName) + }) + oidcWrapped(w, req) + return + } + } else { // No route found, try to find a default route h.Log.Debug("no http route found, checking for default route", "host", onlyHost) + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@servers/httpingress/httpingress.go` around lines 526 - 542, Call h.lookupEphemeralRoute(ctx, onlyHost) separately and if it returns a non-nil error surface it (log and write HTTP 500 / return) instead of letting the else-if drop the error; only proceed to set route, targetAppId, routeType, ephemeralLabel and run oidcMiddleware/serveAuthenticatedRequest when err == nil and baseRoute != nil. Ensure you reference the same variables used now (lookupEphemeralRoute, baseRoute, route, targetAppId, oidcMiddleware, serveAuthenticatedRequest) and return immediately after writing the 500 to avoid falling through to the default-route/404 branch.
♻️ Duplicate comments (4)
cli/commands/deploy.go (1)
200-215:⚠️ Potential issue | 🟠 MajorUse the server-resolved ephemeral hostnames as-is here.
Both branches already have authoritative URLs in
AccessInfo.Hostnames(). Appendinghttps://<label>.<clusterHostname>unconditionally can invent an invalid URL for apps that only have explicit host routes, and in the build path Line 925 then prints a second access block on top of this one. Please centralize this rendering and only fall back to<label>.<clusterHostname>when no hostnames were returned for a default-route case.Also applies to: 867-885
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cli/commands/deploy.go` around lines 200 - 215, The code prints an extra constructed URL using ephemeralLabel and AccessInfo.ClusterHostname() even when the server already returned authoritative hostnames via result.AccessInfo().Hostnames(), causing duplicate/invalid URLs; update the rendering in the ephemeral branch (the block that checks ephemeralLabel, ephemeralTTL and result.HasAccessInfo()) to: prefer and print each hostname from result.AccessInfo().Hostnames() exactly as returned, and only if no hostnames are present but AccessInfo().ClusterHostname() is non-empty, then print the fallback https://<ephemeralLabel>.<clusterHostname> URL; centralize this logic so both places (the ephemeral branch and the other access-printing block) use the same rule and avoid printing duplicate access blocks.servers/httpingress/httpingress.go (1)
503-514:⚠️ Potential issue | 🟠 MajorDon't infer ephemeral labels from matched wildcard routes.
Line 509 makes every matched
*.hostroute behave like an ephemeral namespace. If no ephemeral version exists for that first label, normal wildcard traffic now 404s instead of reaching the route's active version. The strip-first-label fallback below already implements the intended “normal route + wildcard DNS” behavior, so this extra extraction regresses existing wildcard routes.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@servers/httpingress/httpingress.go` around lines 503 - 514, The code is incorrectly treating any matched wildcard route as an ephemeral subdomain by calling ingress.ExtractSubdomainLabel and setting ephemeralLabel (in the route handling block for variable route and onlyHost); remove that behavior so matched wildcard routes are treated as normal routes: delete or skip the ephemeralLabel extraction and associated log branch inside the route != nil block (the lines using ephemeralLabel, ingress.ExtractSubdomainLabel, and the Debug that declares "detected ephemeral subdomain via wildcard route"), and rely on the existing strip-first-label fallback logic to detect ephemeral namespaces instead.pkg/ephemeral/manage.go (2)
186-189:⚠️ Potential issue | 🟠 MajorSkip expired matches instead of returning immediately.
Ephemeral deploys bypass locking, so duplicate
appID+ label rows are possible. Returningnilon the first expired match lets an expired row shadow a live one later in the result set and turns a valid hostname into a false 404.🩹 Suggested fix
// Reject expired versions immediately rather than waiting for GC if !av.EphemeralExpiresAt.IsZero() && now.After(av.EphemeralExpiresAt) { - return nil, nil + continue }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/ephemeral/manage.go` around lines 186 - 189, The code currently returns nil when encountering an expired ephemeral row (checking av.EphemeralExpiresAt and now), which can prematurely hide later valid matches; instead, inside the loop that iterates matches (the block referencing av and now), skip the expired entry by replacing the early "return nil, nil" with "continue" so the loop keeps searching for a non-expired row; ensure the function still returns the appropriate result after the loop if no valid match is found.
145-152:⚠️ Potential issue | 🟠 MajorDon't delete pools that are still referenced by other versions.
SandboxPoolReferencedByVersionsis plural. This loop deletes every pool that mentionsversionID, so a reused pool can be torn down while another version still references it. Only delete pools exclusively owned by this version; otherwise remove this version's reference and keep the pool. Based on learnings: incomponents/activator/activator.go, reused pools can temporarily remain referenced by multiple versions viaReferencedByVersions.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/ephemeral/manage.go` around lines 145 - 152, The loop currently deletes every pool that lists versionID; instead, check each pool's SandboxPoolReferencedByVersions (or equivalent field) for multiple references: for each p from poolResp.Values() get poolID and referenced := p.SandboxPoolReferencedByVersions() (or its accessor) and if the slice/set has only this versionID call eac.Delete(ctx, poolID) and append errors to poolErrs as before; if the pool is referenced by other versions, remove this versionID from the pool's reference list and update the pool (do not call eac.Delete), logging the reference-removal path and preserving pools shared by other versions. Ensure you still log and append errors from either the Delete or the update operation.
🧹 Nitpick comments (1)
servers/deployment/server_test.go (1)
1150-1285: Assert the contracts that distinguish ephemeral deploys from normal deploys.This test covers TTL persistence, but it would still pass if the server started creating real deployment records again or stopped replacing same-label previews. Please add assertions for both behaviors: deployment count should stay unchanged across an ephemeral deploy, and a second deploy with the same label should leave exactly one matching ephemeral version (or otherwise prove the first one was replaced).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@servers/deployment/server_test.go` around lines 1150 - 1285, Update TestDeployVersionEphemeral to assert that ephemeral deploys do not create real Deployment records and that repeated ephemeral deploys with the same label replace the previous preview: before the first client.DeployVersion call capture the current deployment count via inmem.Client.List (or the in-memory store list API) and assert the count is unchanged after the ephemeral deploy (use server/client deployment types from deployment_v1alpha and the DeployVersion result.Deployment when present); then perform a second ephemeral client.DeployVersion with the same ephemeral label and TTL and assert there is exactly one AppVersion with EphemeralLabel == "feat-preview" (or query versions via inmem.Client.List/Get) to prove the prior preview was replaced rather than duplicated.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@servers/deployment/server.go`:
- Around line 779-839: The code currently mutates the existing AppVersion
(appVersion) in place to set EphemeralLabel/EphemeralTtl/EphemeralExpiresAt;
instead, clone the source AppVersion into a new AppVersion entity (e.g., copy
struct, clear ID/revision), set the ephemeral fields on the clone, persist the
clone via the entity API (create a new entity with SetAttrs from clone and call
d.EAC.Put/Create), and use that new version ID for ephemeral lifecycle calls
(ephemeralx.ReplaceExisting, ephemeralx.EnforceLimit) and results, leaving the
original appVersion unchanged; ensure you do not reuse
updateEntity.SetId(string(appVersion.ID)) or set the revision from current —
create a new entity id/revision for the cloned version so the active version is
not modified.
In `@servers/httpingress/httpingress.go`:
- Around line 679-719: The lease cache must be scoped to the resolved version
ID, not just the app ID: after resolving the AppVersion into av (via
ephemeralx.LookupByLabel or h.eac.Get and av.Decode), compute a lease key
derived from av.ID (e.g., targetAppId.String()+":"+string(av.ID) or similar) and
replace any earlier calls that used useLease(targetAppId.String()) /
retainLease(targetAppId.String()) with the new version-scoped key; ensure
release/invalidate paths also use the same version-scoped key so ephemeral and
active versions never share the same lease.
---
Outside diff comments:
In `@servers/httpingress/httpingress.go`:
- Around line 526-542: Call h.lookupEphemeralRoute(ctx, onlyHost) separately and
if it returns a non-nil error surface it (log and write HTTP 500 / return)
instead of letting the else-if drop the error; only proceed to set route,
targetAppId, routeType, ephemeralLabel and run
oidcMiddleware/serveAuthenticatedRequest when err == nil and baseRoute != nil.
Ensure you reference the same variables used now (lookupEphemeralRoute,
baseRoute, route, targetAppId, oidcMiddleware, serveAuthenticatedRequest) and
return immediately after writing the 500 to avoid falling through to the
default-route/404 branch.
---
Duplicate comments:
In `@cli/commands/deploy.go`:
- Around line 200-215: The code prints an extra constructed URL using
ephemeralLabel and AccessInfo.ClusterHostname() even when the server already
returned authoritative hostnames via result.AccessInfo().Hostnames(), causing
duplicate/invalid URLs; update the rendering in the ephemeral branch (the block
that checks ephemeralLabel, ephemeralTTL and result.HasAccessInfo()) to: prefer
and print each hostname from result.AccessInfo().Hostnames() exactly as
returned, and only if no hostnames are present but
AccessInfo().ClusterHostname() is non-empty, then print the fallback
https://<ephemeralLabel>.<clusterHostname> URL; centralize this logic so both
places (the ephemeral branch and the other access-printing block) use the same
rule and avoid printing duplicate access blocks.
In `@pkg/ephemeral/manage.go`:
- Around line 186-189: The code currently returns nil when encountering an
expired ephemeral row (checking av.EphemeralExpiresAt and now), which can
prematurely hide later valid matches; instead, inside the loop that iterates
matches (the block referencing av and now), skip the expired entry by replacing
the early "return nil, nil" with "continue" so the loop keeps searching for a
non-expired row; ensure the function still returns the appropriate result after
the loop if no valid match is found.
- Around line 145-152: The loop currently deletes every pool that lists
versionID; instead, check each pool's SandboxPoolReferencedByVersions (or
equivalent field) for multiple references: for each p from poolResp.Values() get
poolID and referenced := p.SandboxPoolReferencedByVersions() (or its accessor)
and if the slice/set has only this versionID call eac.Delete(ctx, poolID) and
append errors to poolErrs as before; if the pool is referenced by other
versions, remove this versionID from the pool's reference list and update the
pool (do not call eac.Delete), logging the reference-removal path and preserving
pools shared by other versions. Ensure you still log and append errors from
either the Delete or the update operation.
In `@servers/httpingress/httpingress.go`:
- Around line 503-514: The code is incorrectly treating any matched wildcard
route as an ephemeral subdomain by calling ingress.ExtractSubdomainLabel and
setting ephemeralLabel (in the route handling block for variable route and
onlyHost); remove that behavior so matched wildcard routes are treated as normal
routes: delete or skip the ephemeralLabel extraction and associated log branch
inside the route != nil block (the lines using ephemeralLabel,
ingress.ExtractSubdomainLabel, and the Debug that declares "detected ephemeral
subdomain via wildcard route"), and rely on the existing strip-first-label
fallback logic to detect ephemeral namespaces instead.
---
Nitpick comments:
In `@servers/deployment/server_test.go`:
- Around line 1150-1285: Update TestDeployVersionEphemeral to assert that
ephemeral deploys do not create real Deployment records and that repeated
ephemeral deploys with the same label replace the previous preview: before the
first client.DeployVersion call capture the current deployment count via
inmem.Client.List (or the in-memory store list API) and assert the count is
unchanged after the ephemeral deploy (use server/client deployment types from
deployment_v1alpha and the DeployVersion result.Deployment when present); then
perform a second ephemeral client.DeployVersion with the same ephemeral label
and TTL and assert there is exactly one AppVersion with EphemeralLabel ==
"feat-preview" (or query versions via inmem.Client.List/Get) to prove the prior
preview was replaced rather than duplicated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3a17c4b1-45b7-49b6-b107-48ef7a870817
📒 Files selected for processing (29)
api/build/build_v1alpha/rpc.gen.goapi/build/rpc.ymlapi/core/core_v1alpha/schema.gen.goapi/core/schema.ymlapi/deployment/deployment_v1alpha/rpc.gen.goapi/deployment/rpc.ymlapi/ingress/client.goapi/ingress/client_test.goblackbox/ephemeral_test.gocli/commands/app_rollback.gocli/commands/app_versions.gocli/commands/commands.gocli/commands/deploy.gocomponents/coordinate/coordinate.gocontrollers/ephemeral/gc.gocontrollers/ephemeral/gc_test.godocs/command-sidebar.jsondocs/docs/command/app-versions.mddocs/docs/command/app.mddocs/docs/command/deploy.mddocs/docs/commands.mdpkg/ephemeral/label.gopkg/ephemeral/label_test.gopkg/ephemeral/manage.gopkg/ephemeral/manage_test.goservers/build/build.goservers/deployment/server.goservers/deployment/server_test.goservers/httpingress/httpingress.go
✅ Files skipped from review due to trivial changes (9)
- docs/docs/commands.md
- docs/command-sidebar.json
- docs/docs/command/app.md
- api/ingress/client_test.go
- docs/docs/command/app-versions.md
- pkg/ephemeral/label.go
- pkg/ephemeral/label_test.go
- docs/docs/command/deploy.md
- api/core/core_v1alpha/schema.gen.go
🚧 Files skipped from review as they are similar to previous changes (8)
- api/deployment/rpc.yml
- cli/commands/commands.go
- api/core/schema.yml
- api/ingress/client.go
- controllers/ephemeral/gc_test.go
- cli/commands/app_versions.go
- api/deployment/deployment_v1alpha/rpc.gen.go
- servers/build/build.go
Add ephemeral deployments — app versions tagged with a label and TTL that are never set as the active version. Each ephemeral version is accessed via a subdomain prefix on the app's existing route (e.g., feat-x.app.example.com). Pools spin up on demand and spin down when idle. Versions are automatically deleted when their TTL expires. Key changes: Schema: Add ephemeral_label (indexed), ephemeral_ttl, and ephemeral_expires_at (indexed) fields to the AppVersion entity. CLI: Add --ephemeral and --ttl flags to `miren deploy`. Add new `miren app versions` command with --ephemeral filter and JSON output. Ephemeral deploys skip deployment record creation and locking. Build & deployment servers: When ephemeral, set version fields, skip SetActiveVersion, handle replace-on-same-label, enforce per-app limit (max 10), and validate labels server-side. HTTP ingress: New lookupEphemeralRoute strips the first subdomain label and matches against existing routes — no wildcard route entity required. Resolves ephemeral AppVersion by label and passes it to AcquireLease (activator needs zero changes). GC controller: Periodic cleanup (every 5m) of expired ephemeral versions and their associated sandbox pools. Wired into coordinator. Closes MIR-992
- Fix double-prefixing of ephemeral hostnames in --version deploy path: deployment server's getAccessInfo now resolves wildcard/normal routes into concrete ephemeral hostnames when given a label, so the CLI can print them as-is. Adds ClusterHostname fallback. - Abort ephemeral version deletion when sandbox pool cleanup fails so pools aren't leaked. The version entity is retained for GC retry. - Reject non-positive ephemeral TTLs (ttl <= 0) in both build and deployment server paths — prevents versions that expire immediately. - Fail the deploy when ReplaceExisting or EnforceLimit fails rather than logging and continuing. Silent failures could violate label uniqueness or the per-app limit. - Fail loudly when fetching the app version revision fails during ephemeral update. Previously the Put was silently skipped and the user saw success while the ephemeral fields were never persisted.
LookupByLabel now checks EphemeralExpiresAt and returns nil for versions past their TTL. This stops routing traffic to expired versions immediately rather than waiting up to 5 minutes for the next GC sweep.
- LookupByLabel: change `return nil` to `continue` on expired match so later valid entries aren't shadowed by a stale one in the result set. - deleteEphemeralVersion: before deleting a sandbox pool, check if it's shared (ReferencedByVersions has multiple entries). For shared pools, remove only the current version's reference instead of deleting the entire pool, preventing disruption to other versions sharing it.
- DeployVersion: clone the source AppVersion into a new entity instead of mutating it in place. Deploying an active version as ephemeral no longer corrupts the live version with ephemeral fields. - HTTP ingress: scope the lease cache key by ephemeral label (appID:eph:label) so ephemeral and active versions never share cached leases. Previously all versions of an app shared one cache entry, which could proxy ephemeral traffic to active sandboxes or vice versa.
The DeploymentLauncher creates pools when App.ActiveVersion changes, but ephemeral versions skip SetActiveVersion so no pool exists when the first request arrives. Add a PoolCreator interface to the activator. When requestPoolCapacity can't find a pool after retries and a PoolCreator is registered, it calls CreatePoolForVersion to create one on demand. The Launcher implements this interface, reusing its existing ensurePoolForService logic. The coordinator wires the Launcher into the activator via SetPoolCreator at startup.
d0dcc25 to
006f069
Compare
phinze
left a comment
There was a problem hiding this comment.
Clegon and Phenkman checking back in. We dug through the delta and ended up doing some git archaeology, because the on-demand pool creation turned out to have a richer backstory than the diff suggests.
Most of the delta is tight. The clone-instead-of-mutate fix (09b799bb) and the lease-cache scoping are genuinely important, both were silent correctness failures for --version <active> --ephemeral. The shared-pool cleanup (d8bcb91b) is the right call. The LookupByLabel expiry check fully addresses our earlier "apparition past its expiry" note.
The piece Evan flagged for re-review (pool creation on the activation path, 006f0690) is where we ended up spending most of our time.
The backstory: bf97ec7d (Oct 2025) introduced a sentinel pattern in poolState specifically to prevent duplicate pool creation under concurrent requests. Evan reviewed and refined it in b630ea32, the commit that added the concurrency doc block still sitting at the top of activator.go. 040329dd (RFD-39 Phase 4) then moved pool creation out to DeploymentLauncher; the sentinel fields stayed but stopped guarding anything. This PR's on-demand branch reintroduces pool creation in the activator without re-wiring the sentinel, so concurrent requests to a cold ephemeral URL can both fall through to CreatePoolForVersion. Since findMatchingPool is list-then-create (not atomic), both can succeed.
Suggested pragmatic fix: re-animate the sentinel for the on-demand branch. The fields are still there, just unused. Mark inProgress: true and open a done channel before releasing the lock, close it after creation finishes. One related cleanup either way: the top-of-file doc block at lines 21-25 still claims requestPoolCapacity uses the Sentinel Pattern, which hasn't been true since 040329dd. Restore the sentinel and the doc is correct again; skip the sentinel and the doc should lose that section.
The bigger frustration, which Paul has been sitting with for a while: the activator's coordination story keeps slowly eroding as features thread through it. MIR-596 tracks the direction already, and we added a comment there pointing at this PR as fresh evidence. That's the refactor pile, not a blocker.
Two concurrent requests to a cold ephemeral URL could both fall out of the findPoolInStore retry loop and both call CreatePoolForVersion, creating duplicate pools for the same (version, service) because findMatchingPool is list-then-create (not atomic). Reuse the existing inProgress/done/err sentinel infrastructure in poolState: under the write lock, install the sentinel before releasing to do the slow creation. Concurrent callers find the sentinel, wait on done, then loop back to re-check state. The sentinel is replaced with the real pool state on success, or deleted on failure. Also update the top-of-file doc block to describe the sentinel pattern accurately for the on-demand creation case, and clarify the error message when findPoolInStore returns nil after ensurePoolForService reused an existing pool (poolID empty).
Summary
Implements RFD-72: Ephemeral Deployments — app versions tagged with a label and TTL that are accessed via subdomain routing without affecting the active version.
miren deploy --ephemeral <label> --ttl <duration>creates an ephemeral version accessible at<label>.app.example.commiren app versionslists all versions with status (active/ephemeral/replaced), supports--ephemeralfilter and--format jsonAcquireLeaseKey design decisions
lookupEphemeralRoute)getAccessInforesolves ephemeral hostnames server-side so the CLI displays the correct URLTest plan
app versionsoutput, verify HTTP activation via subdomain, verify replace-on-same-label