Skip to content

Add per-service port_wait_timeout to app.toml#792

Open
jcasimir wants to merge 4 commits intomirendev:mainfrom
jcasimir:jcasimir/port-wait-timeout
Open

Add per-service port_wait_timeout to app.toml#792
jcasimir wants to merge 4 commits intomirendev:mainfrom
jcasimir:jcasimir/port-wait-timeout

Conversation

@jcasimir
Copy link
Copy Markdown

@jcasimir jcasimir commented May 7, 2026

Why

SandboxSpec.PortWaitTimeout already exists and is honored by resolvePortWaitTimeout in controllers/sandbox/sandbox.go, but the only callers are addons (pkg/addon/postgresql, pkg/addon/rabbitmq). Regular apps are stuck with the 15s default constant.

Concrete failure: LiteLLM with the miren-postgresql:small addon. Prisma migrations on first DB connect take >15s, the sandbox dies before the app binds 4000, the pool retries forever. We were running a local-only fork bumping the constant globally to 60s — fine as a stopgap, ugly as a fix.

What

Surface the existing SandboxSpec.PortWaitTimeout knob to user app.toml as a per-service field:

[services.web]
command = "..."
port = 4000
port_wait_timeout = "120s"

Design choices worth flagging

  • Per-service, not app-level. Web and worker have different startup profiles. A worker doing migrations needs 120s; the web process next to it should still fail fast at 15s.
  • Config, not CLI flag. The slow-boot characteristic is a property of the service, not the deploy — it would be set the same value every push. Belongs in the manifest.
  • ConfigSpec only, not legacy Config. New versions write mrv.Config = core_v1alpha.Config{} (build.go:1507) — the legacy inline path is empty for all live deploys, so plumbing the field through ConfigSpecFromConfig would be dead code.
  • Validate at parse time. A typo like port_wait_timeout = "120" (missing the unit) would silently fall back to the 15s default — exactly the failure mode the field exists to prevent. Validation in AppConfig.Validate() mirrors scale_down_delay / shutdown_timeout.

Touch points

File Change
api/core/schema.yml New port_wait_timeout attr under config_spec.services
api/core/core_v1alpha/schema.gen.go Regenerated
appconfig/appconfig.go ServiceConfig.PortWaitTimeout + parse-time time.ParseDuration validation
servers/build/build.go Copy through in buildServicesConfig
controllers/deployment/launcher.go Set sbSpec.PortWaitTimeout in the existing service-match loop in buildSandboxSpec

Tests

  • appconfig: TOML round-trip; parse-time rejection of "120" (no unit); accepts "2m"
  • servers/build: appconfig → ConfigSpecServices carries the field; sibling stays empty
  • controllers/deployment: buildSandboxSpec propagates into SandboxSpec for the matching service only
  • controllers/sandbox: positive-path cases (120s, 2m) for resolvePortWaitTimeout, complementing the existing fallback cases

All six touched packages green locally (sandbox tests run via golang:1.25 container since the package is //go:build linux).

The sandbox port-wait health check has been hardcoded to 15s for all
user apps. SandboxSpec.PortWaitTimeout already exists and is honored by
resolvePortWaitTimeout, but only the addon framework sets it — there
was no path from app.toml.

This bites apps with slow first-boot. Deploying LiteLLM with the
miren-postgresql addon trips it: Prisma migrations on first DB connect
take >15s, the sandbox dies before 4000 binds, and the pool retries
forever.

Surface the existing knob as a per-service field:

  [services.web]
  port_wait_timeout = "120s"

Per-service rather than app-level so web and worker can have different
profiles. Config rather than CLI flag because the slow-boot
characteristic is a property of the service, not the deploy.

Validate at parse time. An invalid duration ("120" missing the unit)
would otherwise fall back silently to 15s — exactly the failure the
field exists to prevent. Mirrors the existing scale_down_delay /
shutdown_timeout validation.

Wired only through the ConfigSpec path. The legacy inline Config is
written empty for new versions, so there's no need to plumb the field
through ConfigSpecFromConfig.

Tests: TOML round-trip, parse-time validation, build conversion to
ConfigSpecServices, launcher propagation into SandboxSpec, and
positive-path resolution above the default.
@jcasimir jcasimir requested a review from a team as a code owner May 7, 2026 18:04
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 7, 2026

Review Change Stack

Warning

Rate limit exceeded

@jcasimir has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 40 minutes and 53 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 23a5e923-3181-42ef-8fdd-2aa1cc22b237

📥 Commits

Reviewing files that changed from the base of the PR and between 7562f8f and 20a07f5.

📒 Files selected for processing (1)
  • servers/build/build_test.go
📝 Walkthrough

Walkthrough

This PR adds support for per-service port_wait_timeout configuration, defined in the API schema and added to generated ConfigSpecServices. It introduces ServiceConfig.PortTimeout with duration-string validation, maps the validated value into core service specs during build, and propagates it into SandboxSpec.PortWaitTimeout during launcher sandbox construction. Tests cover parsing, validation, mapping, and propagation across the layers.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@appconfig/appconfig.go`:
- Around line 75-78: The field comment for PortWaitTimeout is incorrect: it
claims "Empty or invalid values fall back to the default" while the Validate()
method actually accepts empty values (falls back to 15s) but returns a
ValidationError for invalid duration strings; update the comment on the
PortWaitTimeout field to state that empty values fall back to the 15s default
and that invalid duration strings will be rejected by Validate() (refer to the
Validate() method and the ValidationError returned for invalid durations) so the
doc matches the actual behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9a185b03-e315-4cbf-bfa7-1f6cc87e3f83

📥 Commits

Reviewing files that changed from the base of the PR and between af4f87c and 104bae5.

📒 Files selected for processing (9)
  • api/core/core_v1alpha/schema.gen.go
  • api/core/schema.yml
  • appconfig/appconfig.go
  • appconfig/appconfig_test.go
  • controllers/deployment/launcher.go
  • controllers/deployment/launcher_test.go
  • controllers/sandbox/port_wait_timeout_test.go
  • servers/build/build.go
  • servers/build/build_test.go

Comment thread appconfig/appconfig.go Outdated
After parse-time validation was added, invalid durations are rejected
rather than silently defaulted. Update the field comment to match.
Copy link
Copy Markdown
Contributor

@evanphx evanphx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor change, let's have it just be port_timeout. Otherwise looks good!

Comment thread appconfig/appconfig.go Outdated
// its port during startup. Accepts a Go duration string (e.g. "60s", "2m").
// Empty falls back to the default; invalid duration strings are rejected
// at parse time by Validate.
PortWaitTimeout string `toml:"port_wait_timeout,omitempty"`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
PortWaitTimeout string `toml:"port_wait_timeout,omitempty"`
PortWaitTimeout string `toml:"port_timeout,omitempty"`

@jcasimir
Copy link
Copy Markdown
Author

jcasimir commented May 8, 2026

Great!

Per review feedback from @evanphx: shorter name on the user-facing
TOML key. Also renames the matching Go fields on ServiceConfig and
ConfigSpecServices for consistency. SandboxSpec.PortWaitTimeout and
resolvePortWaitTimeout stay as-is — pre-existing and out of scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
servers/build/build_test.go (1)

722-731: ⚡ Quick win

Tighten this test with an exact service-count assertion.

Please add require.Len(t, services, 2) before building byName; this prevents false positives if unexpected services are emitted while still containing web and worker.

Suggested diff
 			validateServices: func(t *testing.T, services []core_v1alpha.ConfigSpecServices) {
+				require.Len(t, services, 2)
 				byName := make(map[string]core_v1alpha.ConfigSpecServices, len(services))
 				for _, s := range services {
 					byName[s.Name] = s
 				}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@servers/build/build_test.go` around lines 722 - 731, Add a strict length
assertion to the validateServices test: inside the validateServices func (the
closure that receives t *testing.T and services
[]core_v1alpha.ConfigSpecServices) insert require.Len(t, services, 2) before
constructing the byName map so the test fails if extra services are present;
keep the subsequent mapping and assertions for byName["web"] and
byName["worker"] unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@servers/build/build_test.go`:
- Around line 722-731: Add a strict length assertion to the validateServices
test: inside the validateServices func (the closure that receives t *testing.T
and services []core_v1alpha.ConfigSpecServices) insert require.Len(t, services,
2) before constructing the byName map so the test fails if extra services are
present; keep the subsequent mapping and assertions for byName["web"] and
byName["worker"] unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 46639662-81b4-4f62-b630-6f2232661b13

📥 Commits

Reviewing files that changed from the base of the PR and between 104bae5 and 7562f8f.

📒 Files selected for processing (8)
  • api/core/core_v1alpha/schema.gen.go
  • api/core/schema.yml
  • appconfig/appconfig.go
  • appconfig/appconfig_test.go
  • controllers/deployment/launcher.go
  • controllers/deployment/launcher_test.go
  • servers/build/build.go
  • servers/build/build_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • servers/build/build.go
  • api/core/schema.yml
  • appconfig/appconfig_test.go
  • controllers/deployment/launcher_test.go
  • appconfig/appconfig.go
  • api/core/core_v1alpha/schema.gen.go

Per CodeRabbit nitpick: add require.Len(t, services, 2) so an
unexpected extra service emitted by buildServicesConfig fails the
test instead of being silently ignored by the byName lookup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants