Add per-service port_wait_timeout to app.toml #792

jcasimir wants to merge 4 commits into mirendev:main
Conversation
The sandbox port-wait health check has been hardcoded to 15s for all
user apps. SandboxSpec.PortWaitTimeout already exists and is honored by
resolvePortWaitTimeout, but only the addon framework sets it — there
was no path from app.toml.
This bites apps with slow first-boot. Deploying LiteLLM with the
miren-postgresql addon trips it: Prisma migrations on first DB connect
take >15s, the sandbox dies before 4000 binds, and the pool retries
forever.
Surface the existing knob as a per-service field:

```toml
[services.web]
port_wait_timeout = "120s"
```
Per-service rather than app-level so web and worker can have different
profiles. Config rather than CLI flag because the slow-boot
characteristic is a property of the service, not the deploy.
Validate at parse time. An invalid duration ("120" missing the unit)
would otherwise fall back silently to 15s — exactly the failure the
field exists to prevent. Mirrors the existing scale_down_delay /
shutdown_timeout validation.
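A minimal sketch of the parse-time check described above, assuming the real check lives in `AppConfig.Validate()` and uses `time.ParseDuration`; the function name and error text here are hypothetical:

```go
package main

import (
	"fmt"
	"time"
)

// validatePortWaitTimeout sketches the parse-time rule: empty means
// "use the default", anything else must be a valid Go duration string.
// Hypothetical standalone form of the check in AppConfig.Validate().
func validatePortWaitTimeout(v string) error {
	if v == "" {
		return nil // empty falls back to the 15s default at resolution time
	}
	if _, err := time.ParseDuration(v); err != nil {
		// "120" (missing the unit) is rejected here instead of
		// silently defaulting to 15s later.
		return fmt.Errorf("services.*.port_wait_timeout: invalid duration %q: %w", v, err)
	}
	return nil
}

func main() {
	fmt.Println(validatePortWaitTimeout("120s")) // <nil>
	fmt.Println(validatePortWaitTimeout("2m"))   // <nil>
	fmt.Println(validatePortWaitTimeout("120") != nil)
}
```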
Wired only through the ConfigSpec path. The legacy inline Config is
written empty for new versions, so there's no need to plumb the field
through ConfigSpecFromConfig.
Tests: TOML round-trip, parse-time validation, build conversion to
ConfigSpecServices, launcher propagation into SandboxSpec, and
positive-path resolution above the default.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@appconfig/appconfig.go`:
- Around line 75-78: The field comment for PortWaitTimeout is incorrect: it
claims "Empty or invalid values fall back to the default" while the Validate()
method actually accepts empty values (falls back to 15s) but returns a
ValidationError for invalid duration strings; update the comment on the
PortWaitTimeout field to state that empty values fall back to the 15s default
and that invalid duration strings will be rejected by Validate() (refer to the
Validate() method and the ValidationError returned for invalid durations) so the
doc matches the actual behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9a185b03-e315-4cbf-bfa7-1f6cc87e3f83
📒 Files selected for processing (9)
api/core/core_v1alpha/schema.gen.go, api/core/schema.yml, appconfig/appconfig.go, appconfig/appconfig_test.go, controllers/deployment/launcher.go, controllers/deployment/launcher_test.go, controllers/sandbox/port_wait_timeout_test.go, servers/build/build.go, servers/build/build_test.go
After parse-time validation was added, invalid durations are rejected rather than silently defaulted. Update the field comment to match.
evanphx left a comment:

Minor change, let's have it just be `port_timeout`. Otherwise looks good!
```go
// its port during startup. Accepts a Go duration string (e.g. "60s", "2m").
// Empty falls back to the default; invalid duration strings are rejected
// at parse time by Validate.
PortWaitTimeout string `toml:"port_wait_timeout,omitempty"`
```

Suggested change:

```diff
-PortWaitTimeout string `toml:"port_wait_timeout,omitempty"`
+PortWaitTimeout string `toml:"port_timeout,omitempty"`
```
Great!
Per review feedback from @evanphx: shorter name on the user-facing TOML key. Also renames the matching Go fields on ServiceConfig and ConfigSpecServices for consistency. SandboxSpec.PortWaitTimeout and resolvePortWaitTimeout stay as-is — pre-existing and out of scope. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🧹 Nitpick comments (1)
servers/build/build_test.go (1)
722-731: ⚡ Quick win — Tighten this test with an exact service-count assertion.

Please add `require.Len(t, services, 2)` before building `byName`; this prevents false positives if unexpected services are emitted while still containing `web` and `worker`.

Suggested diff:

```diff
 validateServices: func(t *testing.T, services []core_v1alpha.ConfigSpecServices) {
+	require.Len(t, services, 2)
 	byName := make(map[string]core_v1alpha.ConfigSpecServices, len(services))
 	for _, s := range services {
 		byName[s.Name] = s
 	}
```
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@servers/build/build_test.go`:
- Around line 722-731: Inside the validateServices closure (which receives t *testing.T and services []core_v1alpha.ConfigSpecServices), insert require.Len(t, services, 2) before constructing the byName map so the test fails if extra services are present; keep the subsequent mapping and assertions for byName["web"] and byName["worker"] unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 46639662-81b4-4f62-b630-6f2232661b13
📒 Files selected for processing (8)
api/core/core_v1alpha/schema.gen.go, api/core/schema.yml, appconfig/appconfig.go, appconfig/appconfig_test.go, controllers/deployment/launcher.go, controllers/deployment/launcher_test.go, servers/build/build.go, servers/build/build_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
- servers/build/build.go
- api/core/schema.yml
- appconfig/appconfig_test.go
- controllers/deployment/launcher_test.go
- appconfig/appconfig.go
- api/core/core_v1alpha/schema.gen.go
Per CodeRabbit nitpick: add require.Len(t, services, 2) so an unexpected extra service emitted by buildServicesConfig fails the test instead of being silently ignored by the byName lookup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Why
`SandboxSpec.PortWaitTimeout` already exists and is honored by `resolvePortWaitTimeout` in `controllers/sandbox/sandbox.go`, but the only callers are addons (`pkg/addon/postgresql`, `pkg/addon/rabbitmq`). Regular apps are stuck with the 15s default constant.

Concrete failure: LiteLLM with the `miren-postgresql:small` addon. Prisma migrations on first DB connect take >15s, the sandbox dies before the app binds 4000, and the pool retries forever. We were running a local-only fork bumping the constant globally to 60s — fine as a stopgap, ugly as a fix.

What

Surface the existing `SandboxSpec.PortWaitTimeout` knob to user `app.toml` as a per-service field.

Design choices worth flagging

- Wired only through the `ConfigSpec` path, not the legacy inline `Config`. New versions write `mrv.Config = core_v1alpha.Config{}` (build.go:1507) — the legacy inline path is empty for all live deploys, so plumbing the field through `ConfigSpecFromConfig` would be dead code.
- Validated at parse time. `port_wait_timeout = "120"` (missing the unit) would otherwise silently fall back to the 15s default — exactly the failure mode the field exists to prevent. Validation in `AppConfig.Validate()` mirrors `scale_down_delay`/`shutdown_timeout`.
Touch points

- `api/core/schema.yml`: `port_wait_timeout` attr under `config_spec.services`
- `api/core/core_v1alpha/schema.gen.go`
- `appconfig/appconfig.go`: `ServiceConfig.PortWaitTimeout` + parse-time `time.ParseDuration` validation
- `servers/build/build.go`: `buildServicesConfig`
- `controllers/deployment/launcher.go`: `sbSpec.PortWaitTimeout` in the existing service-match loop in `buildSandboxSpec`

Tests

- `appconfig`: TOML round-trip; parse-time rejection of `"120"` (no unit); accepts `"2m"`
- `servers/build`: appconfig → `ConfigSpecServices` carries the field; sibling service stays empty
- `controllers/deployment`: `buildSandboxSpec` propagates into `SandboxSpec` for the matching service only
- `controllers/sandbox`: positive-path cases (`120s`, `2m`) for `resolvePortWaitTimeout`, complementing the existing fallback cases

All six touched packages green locally (sandbox tests run via the `golang:1.25` container since the package is `//go:build linux`).