Conversation
When DO promotes a deployment from the InProgressDeployment slot to the ActiveDeployment slot, there is a window where InProgressDeployment is nil but ActiveDeployment.Phase is still transitioning (e.g. Deploying, PendingDeploy) before reaching Active. The previous appHealthResult returned "no deployment found" in that window, locking polling callers into a false-negative loop until timeout. Surfaced during core-dump staging deploy (run 25258558716): 1-second transition from "deployment in progress: DEPLOYING" to "no deployment found" → 10-minute timeout instead of healthy completion. Fix: when ActiveDeployment is populated but Phase != Active, classify based on the phase using the same priority order as the InProgress switch. Building/PendingDeploy/Deploying → in-progress (active slot); Error/Canceled/Superseded → failed (active slot). The "no deployment found" branch only fires when all three slots (Active, InProgress, Pending) are nil. Tests cover the four scenarios: - ActiveDeployment with Deploying phase → in-progress - ActiveDeployment with Error phase → failed - ActiveDeployment with Active phase → healthy (regression) - Active(Active) + InProgress(Deploying) priority → healthy Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes App Platform health-check classification during a DigitalOcean deployment slot transition where ActiveDeployment is populated before its phase becomes Active, preventing false "no deployment found" results and long health-check timeouts. It updates the shared deployment-slot health logic used by App Platform resources and adds regression tests around the transition window described in issue #48.
Changes:
- Teach
appHealthResultto classify non-ActiveActiveDeploymentphases as in-progress, failed, or unknown instead of falling through to"no deployment found". - Preserve the existing healthy-path precedence when
ActiveDeployment.Phase == Active, even ifInProgressDeploymentis also present. - Add App Platform regression tests covering active-slot deploying, error, healthy, and precedence scenarios.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
internal/drivers/app_platform.go |
Extends shared App Platform/API Gateway health evaluation to handle transitional ActiveDeployment phases. |
internal/drivers/app_platform_test.go |
Adds regression tests for active-slot transition behavior and priority handling. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+947
to
+963
| // TestAppHealthResult_ActiveSlotErrorPhase covers terminal failure when the | ||
| // ActiveDeployment slot is populated with a failed phase. | ||
| func TestAppHealthResult_ActiveSlotErrorPhase(t *testing.T) { | ||
| d := drivers.NewAppPlatformDriverWithClient(&mockAppClient{ | ||
| app: appWithPhases(phasePtr(godo.DeploymentPhase_Error), nil, nil), | ||
| }, "nyc3") | ||
| result, err := d.HealthCheck(context.Background(), interfaces.ResourceRef{Name: "phased-app", ProviderID: "f8b6200c-3bba-48a7-8bf1-7a3e3a885eb5"}) | ||
| if err != nil { | ||
| t.Fatalf("unexpected error: %v", err) | ||
| } | ||
| if result.Healthy { | ||
| t.Error("expected Healthy=false for ActiveDeployment.Phase=Error") | ||
| } | ||
| if !strings.Contains(result.Message, "failed") { | ||
| t.Errorf("message should contain 'failed', got: %q", result.Message) | ||
| } | ||
| } |
Comment on lines
+251
to
+275
| if dep := app.ActiveDeployment; dep != nil { | ||
| switch dep.Phase { | ||
| case godo.DeploymentPhase_PendingBuild, | ||
| godo.DeploymentPhase_Building, | ||
| godo.DeploymentPhase_PendingDeploy, | ||
| godo.DeploymentPhase_Deploying: | ||
| return &interfaces.HealthResult{ | ||
| Healthy: false, | ||
| Message: fmt.Sprintf("deployment in progress (active slot): %s", dep.Phase), | ||
| } | ||
| case godo.DeploymentPhase_Error, | ||
| godo.DeploymentPhase_Canceled, | ||
| godo.DeploymentPhase_Superseded: | ||
| return &interfaces.HealthResult{ | ||
| Healthy: false, | ||
| Message: fmt.Sprintf("deployment failed (active slot): %s", dep.Phase), | ||
| } | ||
| default: | ||
| // Forward-compat: a future godo release may add new phases. | ||
| // Report "unknown" rather than "failed" to avoid mislabeling. | ||
| return &interfaces.HealthResult{ | ||
| Healthy: false, | ||
| Message: fmt.Sprintf("unknown phase (active slot): %s", dep.Phase), | ||
| } | ||
| } |
Comment on lines
+246
to
+251
| // 2a. ActiveDeployment populated but Phase not yet Active — covers the | ||
| // post-promotion-pre-active window where DO has moved the deployment out of | ||
| // the InProgressDeployment slot but its Phase is still transitioning toward | ||
| // Active. Without this check, the polling loop falls through to "no | ||
| // deployment found" and loops until timeout (issue #48). | ||
| if dep := app.ActiveDeployment; dep != nil { |
Comment on lines
+915
to
+926
| // ── Active-slot phase-transition tests (issue #48) ─────────────────────────── | ||
| // | ||
| // When DO promotes a deployment from InProgressDeployment to ActiveDeployment, | ||
| // there is a window where InProgressDeployment is nil but | ||
| // ActiveDeployment.Phase is still transitioning (e.g. Deploying, PendingDeploy) | ||
| // before reaching Active. The previous appHealthResult returned "no deployment | ||
| // found" in that window, locking polling callers into a false-negative loop. | ||
|
|
||
| // TestAppHealthResult_ActiveSlotDeployingPhase covers the core bug: ActiveDeployment | ||
| // with Deploying phase and no InProgressDeployment must return in-progress, not | ||
| // "no deployment found". | ||
| func TestAppHealthResult_ActiveSlotDeployingPhase(t *testing.T) { |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #48. Fixes
appHealthResultfalse-negative when theActiveDeploymentslot is populated but itsPhasehas not yet transitioned toActive.Bug
Surfaced in core-dump staging deploy 25258558716: 1-second transition from "deployment in progress: DEPLOYING" to "no deployment found" → 10-minute timeout instead of detecting the eventually-healthy deployment.
When DO promotes a deployment from
InProgressDeploymenttoActiveDeployment, there is a window whereInProgressDeploymentisnilbutActiveDeployment.Phaseis stillDeploying(or another transitioning phase). The previous code fell through all checks and returned "no deployment found" in that window.Fix
When
ActiveDeploymentis populated butPhase != Active, classify the deployment by phase (mirrors the existingInProgressDeploymentswitch logic). The "no deployment found" terminal branch only fires when all three slots (ActiveDeployment,InProgressDeployment,PendingDeployment) arenil.Test plan
go test ./internal/drivers/ -run TestAppHealth— 4 new tests cover the active-slot transition states + priority-vs-InProgress regressiongo test ./...— full suite green (159 tests across 3 packages)🤖 Generated with Claude Code
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com