Fix startupProbe ignoring its own timing configuration #976
Merged
nightfury1204 merged 1 commit into master on Mar 31, 2026
The startupProbe template was reading all timing parameters (grace, interval, timeout, successThreshold, failureThreshold) from the Liveness struct instead of the StartupProbe struct. This meant any values set directly on startupProbe in convox.yml were silently ignored, and the liveness defaults were used instead.

This also adds default handling for StartupProbe in ApplyDefaults(), inheriting from liveness values when not explicitly set, preserving backward compatibility for configs that relied on the inheritance behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
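The inheritance rule in the commit message (an unset startupProbe timing field, represented as 0, falls back to the liveness value) can be sketched roughly as follows. This is an illustration only: `ProbeTiming`, `inherit`, and `applyStartupDefaults` are hypothetical names and are not the actual Convox manifest types or the real `ApplyDefaults()` code. The liveness `Timeout` and `SuccessThreshold` values are made up for the example.

```go
package main

import "fmt"

// ProbeTiming is a hypothetical stand-in for the timing fields shared by the
// liveness and startupProbe sections of convox.yml (not the real Convox struct).
type ProbeTiming struct {
	Grace            int
	Interval         int
	Timeout          int
	SuccessThreshold int
	FailureThreshold int
}

// inherit returns v unless it is 0 (treated as "not set"), in which case it
// returns the fallback value.
func inherit(v, fallback int) int {
	if v == 0 {
		return fallback
	}
	return v
}

// applyStartupDefaults fills every unset startupProbe timing field from the
// corresponding liveness field and leaves explicitly set fields untouched.
func applyStartupDefaults(startup, liveness ProbeTiming) ProbeTiming {
	return ProbeTiming{
		Grace:            inherit(startup.Grace, liveness.Grace),
		Interval:         inherit(startup.Interval, liveness.Interval),
		Timeout:          inherit(startup.Timeout, liveness.Timeout),
		SuccessThreshold: inherit(startup.SuccessThreshold, liveness.SuccessThreshold),
		FailureThreshold: inherit(startup.FailureThreshold, liveness.FailureThreshold),
	}
}

func main() {
	// Timeout and SuccessThreshold here are illustrative values only.
	liveness := ProbeTiming{Grace: 10, Interval: 5, Timeout: 5, SuccessThreshold: 1, FailureThreshold: 3}

	// Only grace and failureThreshold are set on the startup probe.
	startup := ProbeTiming{Grace: 60, FailureThreshold: 10}

	fmt.Printf("%+v\n", applyStartupDefaults(startup, liveness))
}
```

One consequence of using 0 as the "unset" marker in a sketch like this is that an explicit 0 cannot be told apart from a missing value, so it would also inherit from liveness.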
What is the feature/update/fix?
Fix: startupProbe Ignoring Its Own Timing Configuration
The `startupProbe` in the Kubernetes service deployment template was reading its timing parameters from `$.Service.Liveness.*` instead of `$.Service.StartupProbe.*`. This meant any timing values set directly on `startupProbe` in `convox.yml` were silently ignored: the liveness probe values were always used instead.

This bug was introduced in PR #848 (v3.19.7) when the startupProbe feature was first added. The template was wired to the wrong struct from the start.
What was affected:
| `convox.yml` field | Kubernetes field | Before (read from) | After (reads from) |
| --- | --- | --- | --- |
| `startupProbe.grace` | `initialDelaySeconds` | `liveness.grace` | `startupProbe.grace` |
| `startupProbe.interval` | `periodSeconds` | `liveness.interval` | `startupProbe.interval` |
| `startupProbe.timeout` | `timeoutSeconds` | `liveness.timeout` | `startupProbe.timeout` |
| `startupProbe.successThreshold` | `successThreshold` | `liveness.successThreshold` | `startupProbe.successThreshold` |
| `startupProbe.failureThreshold` | `failureThreshold` | `liveness.failureThreshold` | `startupProbe.failureThreshold` |

Default inheritance: When startupProbe timing fields are not explicitly set (value is `0`), they now correctly fall back to the corresponding liveness probe values. This preserves backward compatibility for existing configs that only set a `startupProbe.path` without custom timing.

Test coverage: `TestManifestStartupProbe` covers 3 scenarios, including:
- Only `path` set: inherits all timing from liveness (backward compat)
- Some timing fields set explicitly (`grace`, `failureThreshold`), others inherited from liveness; uses TCP socket check instead of HTTP

Why is this important?
The startupProbe is critical for services with slow initialization: database migrations, large model loading, cache warming, JVM startup, or GPU model initialization. When the timing values silently fell back to liveness probe values, services could experience unexpected restarts during startup because the startupProbe's `failureThreshold` or `grace` period didn't provide enough time for initialization to complete.

Example of the broken behavior:
Given a `convox.yml` whose liveness probe resolves to `grace: 10`, `interval: 5`, `failureThreshold: 3` and whose startupProbe sets `grace: 60`, `interval: 30`, `failureThreshold: 10`:

Before (broken): Kubernetes startupProbe gets `initialDelaySeconds: 10`, `periodSeconds: 5`, `failureThreshold: 3` (liveness values). The service has only ~25 seconds to start (3 failures × 5s interval + 10s grace).

After (fixed): Kubernetes startupProbe gets `initialDelaySeconds: 60`, `periodSeconds: 30`, `failureThreshold: 10` (startupProbe values). The service has ~360 seconds to start (10 failures × 30s interval + 60s grace).

Benefits:
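- Services that set startupProbe timing in `convox.yml` will now use those values instead of silently falling back to liveness values
- startupProbe also supports `tcpSocketPort` for non-HTTP services, and this fix applies to TCP checks as well

Supported startupProbe fields in `convox.yml`:

| Field | Kubernetes equivalent / notes |
| --- | --- |
| `path` | HTTP check path |
| `tcpSocketPort` | TCP socket check for non-HTTP services (used in place of `path`) |
| `grace` | `initialDelaySeconds` |
| `interval` | `periodSeconds` |
| `timeout` | `timeoutSeconds` |
| `successThreshold` | `successThreshold` |
| `failureThreshold` | `failureThreshold` |

Does it have a breaking change?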
Potentially — if your service relies on the broken behavior where startupProbe timing was always inherited from liveness regardless of explicit values, the fix will change the effective probe timing.
However, this is a bug fix: the documented and intended behavior was always for startupProbe to use its own timing values when explicitly configured. If your `convox.yml` specifies startupProbe timing values, those values will now actually be applied.

Services that only set a `startupProbe.path` without custom timing values are completely unaffected; they will continue to inherit timing from the liveness probe.

Requirements
This fix requires version `3.24.1` or later for the rack.

Update the Rack: Run `convox rack update 3.24.1 -r rackName` to update to this version.

Note that your rack must already be on at least version `3.23.0` before performing this update.

If you're unfamiliar with v3 rack versioning, we recommend reviewing the documentation on Updating a Rack before applying any updates.
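Before updating, it may help to estimate how the effective startup window of a service changes once its own startupProbe timing is applied. The sketch below reuses the arithmetic from the before/after example earlier in this description (grace plus failureThreshold times interval). `startupWindow` is a hypothetical helper written for illustration, not part of Convox or Kubernetes, and the result is an approximation that ignores probe timeouts and scheduling jitter.

```go
package main

import "fmt"

// startupWindow approximates how many seconds a container has to become ready
// before the startup probe gives up: the initial delay (grace) plus one probe
// period (interval) for each allowed failure.
func startupWindow(graceSeconds, intervalSeconds, failureThreshold int) int {
	return graceSeconds + failureThreshold*intervalSeconds
}

func main() {
	// Values taken from the example in this description.
	before := startupWindow(10, 5, 3)   // liveness values were applied (broken)
	after := startupWindow(60, 30, 10)  // startupProbe values are applied (fixed)

	fmt.Printf("before: ~%ds, after: ~%ds\n", before, after)
}
```

A service whose startup window grows or shrinks noticeably after the fix is one that had explicit startupProbe timing being ignored, which is the case the breaking-change note above describes.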