engine/direct: recover from a failed Create during Recreate#5173

Open
janniklasrose wants to merge 5 commits into main from janniklasrose/recreate-empty-id

@janniklasrose
Contributor

Changes

  • bundle/direct/apply.go: Recreate now drops the deployment state entry (db.DeleteState) between DoDelete and the follow-up Create, instead of db.SaveState(key, "", nil, nil).
  • bundle/direct/bundle_plan.go: Treat an existing state entry whose __id__ is empty as missing, so the next plan re-plans Create instead of erroring with invalid state: empty id. This covers state files written by pre-fix CLIs.
  • acceptance/bundle/resources/vector_search_endpoints/recreate/create-fails/: New test that triggers the failure path end-to-end by renaming my_endpoint onto a sibling endpoint's name and switching its endpoint_type. The first Recreate's Create 409s on the conflict; the next bundle plan recovers cleanly.
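The planner-side tolerance from the second bullet can be illustrated with a minimal sketch. It is not the code in bundle/direct/bundle_plan.go: the `entry` type, the `planAction` function, and the "keep" action are hypothetical names used only to show the idea that an entry with an empty id is handled the same way as a missing entry.

```go
package main

import "fmt"

// entry is a hypothetical, simplified state record. The real state file
// stores the remote id under the "__id__" field.
type entry struct {
	ID string
}

// planAction is a hypothetical planner step. It returns "create" when the
// resource has no state entry or when the entry exists with an empty id,
// which is what a pre-fix CLI could leave behind after a failed Create.
// "keep" is a placeholder for whatever the planner would normally decide
// for a resource with valid state.
func planAction(state map[string]entry, key string) string {
	e, ok := state[key]
	if !ok || e.ID == "" {
		return "create"
	}
	return "keep"
}

func main() {
	state := map[string]entry{
		"my_endpoint":      {ID: ""},    // left over from a failed Recreate
		"blocker_endpoint": {ID: "abc"}, // healthy entry
	}
	for _, key := range []string{"my_endpoint", "blocker_endpoint", "new_endpoint"} {
		fmt.Printf("%s: %s\n", key, planAction(state, key))
	}
}
```

In this sketch the empty-id entry and the absent entry both plan as create, so no separate error path is needed for the broken state.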

Why

A direct-engine Recreate was a DoDelete -> SaveState(key, "", nil, nil) -> Create sequence. If the follow-up Create failed for any reason (in our reproducer: a name collision against another bundle resource), Finalize persisted a state row with __id__ == "". Every subsequent bundle plan then refused to proceed (invalid state: empty id) and bundle destroy couldn't recover either, leaving the bundle in a broken state until the user hand-edited resources.json.

Dropping the state entry up front means a failed Create simply looks like "no state for this resource" on the next plan, which is the natural recovery path. The planner-side tolerance handles state files already written by older CLIs.
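The difference between the two orderings can be sketched with an in-memory stand-in for the state store. This is an illustration under assumptions, not the code in bundle/direct/apply.go: `stateDB`, `recreate`, `failingCreate`, and the `dropFirst` switch are hypothetical, and the remote delete is reduced to a comment.

```go
package main

import (
	"errors"
	"fmt"
)

// stateDB is a hypothetical in-memory stand-in for the deployment state:
// resource key -> remote id.
type stateDB map[string]string

// recreate mimics the sequence described above. With dropFirst the state
// entry is removed before Create (the ordering this PR introduces);
// otherwise an empty-id placeholder is written first (the old ordering).
func recreate(db stateDB, key string, dropFirst bool, create func() (string, error)) error {
	// The remote resource would be deleted here (DoDelete in the PR).
	if dropFirst {
		delete(db, key)
	} else {
		db[key] = ""
	}
	id, err := create()
	if err != nil {
		return fmt.Errorf("create %s: %w", key, err)
	}
	db[key] = id
	return nil
}

// failingCreate simulates the name collision from the reproducer.
func failingCreate() (string, error) {
	return "", errors.New("409 conflict")
}

func main() {
	oldDB := stateDB{"my_endpoint": "id-1"}
	_ = recreate(oldDB, "my_endpoint", false, failingCreate)
	id, ok := oldDB["my_endpoint"]
	fmt.Printf("old ordering: present=%v id=%q\n", ok, id)

	newDB := stateDB{"my_endpoint": "id-1"}
	_ = recreate(newDB, "my_endpoint", true, failingCreate)
	_, ok = newDB["my_endpoint"]
	fmt.Printf("new ordering: present=%v\n", ok)
}
```

With the old ordering the failed Create leaves an entry whose id is empty. With the new ordering the key is simply absent, which a later plan can treat as a resource that still needs to be created.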

Tests

  • New acceptance test bundle/resources/vector_search_endpoints/recreate/create-fails exercises the full path: initial deploy, Recreate triggered by endpoint_type change, Create 409 from a name collision with blocker_endpoint, then bundle plan showing create my_endpoint and bundle destroy cleaning up.
  • go test ./bundle/... passes.
  • ./task lint passes.
  • ./task test had unrelated local failures (Python databricks-bundles module not installed in the fresh worktree's venv, surfacing in pydabs/invariant tests); CI should not hit that.

PR description drafted with Claude Code.

@janniklasrose janniklasrose requested a review from denik May 4, 2026 12:40
@github-actions

github-actions Bot commented May 4, 2026

Approval status: pending

/acceptance/bundle/ - needs approval

5 files changed
Suggested: @denik
Also eligible: @shreyas-goenka, @pietern, @andrewnester, @lennartkats-db, @anton-107

/bundle/ - needs approval

Files: bundle/direct/apply.go, bundle/direct/bundle_plan.go
Suggested: @denik
Also eligible: @shreyas-goenka, @pietern, @andrewnester, @lennartkats-db, @anton-107

General files (require maintainer)

Files: NEXT_CHANGELOG.md
Based on git history:

  • @denik -- recent work in bundle/direct/, ./

Any maintainer (@andrewnester, @anton-107, @denik, @pietern, @shreyas-goenka, @simonfaltum, @renaudhartert-db) can approve all areas.
See OWNERS for ownership rules.

@janniklasrose janniklasrose force-pushed the janniklasrose/recreate-empty-id branch from d9ae6b7 to b94cf0e Compare May 4, 2026 12:41

Exit code: 1

=== Subsequent plan recovers: my_endpoint state was dropped, replan as Create
Contributor Author

Compare with 9cb1077 where this was a failure

janniklasrose added a commit that referenced this pull request May 4, 2026
The SaveState->DeleteState change in apply.Recreate and the empty-id
tolerance in bundle_plan.go were extracted to a separate PR (#5173).
Reverting them here so this branch and #5173 can land independently;
once #5173 merges, a rebase on main brings the same fix back in.

Co-authored-by: Isaac