Fix project creation delays caused by slow deletions#553

Merged
scotwells merged 2 commits into main from fix/non-blocking-project-deletion
Apr 1, 2026

@scotwells scotwells commented Apr 1, 2026

Note

CI shows 6 failing e2e tests — these are pre-existing failures unrelated to this PR. See #549 for the fix.

Summary

  • Project creation could stall for up to 10 minutes while another project was being deleted. The controller now handles deletions without blocking other work.
  • Deletion progress is visible via a new ResourceCleanup status condition on the project, so operators can see exactly what phase the cleanup is in.
  • The controller now processes up to 4 projects concurrently instead of 1.
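
To illustrate the concurrency change, here is a minimal stdlib-only sketch of a bounded worker pool. It is not the controller's code: the commit message says the change raises `MaxConcurrentReconciles` to 4, and this sketch only mimics that idea. `reconcileAll` and `demo` are hypothetical names introduced for the example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// reconcileAll is a hypothetical stand-in for a controller work queue. It
// calls reconcile once per project using at most maxConcurrent workers and
// returns the highest number of reconciles observed in flight at once.
func reconcileAll(projects []string, maxConcurrent int, reconcile func(string)) int32 {
	if maxConcurrent < 1 {
		maxConcurrent = 1 // always keep at least one worker
	}
	queue := make(chan string)
	var inFlight, peak int32
	var wg sync.WaitGroup

	for w := 0; w < maxConcurrent; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for name := range queue {
				n := atomic.AddInt32(&inFlight, 1)
				// Raise the recorded peak if this worker pushed it higher.
				for {
					p := atomic.LoadInt32(&peak)
					if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
						break
					}
				}
				reconcile(name)
				atomic.AddInt32(&inFlight, -1)
			}
		}()
	}

	for _, name := range projects {
		queue <- name
	}
	close(queue)
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

// demo reconciles n placeholder projects and reports how many reconciles
// ran and the peak concurrency that was observed.
func demo(n, maxConcurrent int) (int32, int32) {
	projects := make([]string, n)
	for i := range projects {
		projects[i] = fmt.Sprintf("project-%d", i)
	}
	var reconciled int32
	peak := reconcileAll(projects, maxConcurrent, func(string) {
		atomic.AddInt32(&reconciled, 1)
	})
	return reconciled, peak
}

func main() {
	reconciled, peak := demo(8, 4)
	fmt.Printf("reconciled=%d peak=%d\n", reconciled, peak)
}
```

The peak never exceeds the worker count, so one slow reconcile ties up only one of the four workers instead of the whole queue.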

What changed

The project controller previously ran a synchronous cleanup operation during deletion that could block for up to 10 minutes. During that time, no other project — including newly created ones — could be reconciled.

The cleanup is now split into two non-blocking steps: issuing delete commands (fast) and checking whether resources have drained (cheap poll). The controller tracks progress via a ResourceCleanup condition with three reasons:

  • CleanupStarted → delete commands are being issued
  • CleanupAwaitingCompletion → waiting for resources to be removed
  • CleanupComplete → all resources removed, finalizer can be released

If resources haven't drained after a check, the controller transitions back to CleanupStarted to automatically re-issue delete commands on the next reconcile.
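
The transitions above can be sketched as a small state machine. This is a simplified illustration under stated assumptions, not the code in this PR. The reason names come from the list above, and `StartPurge` and `IsPurgeComplete` are named in the commit message. Their signatures here, the `purger` interface, `fakePurger`, `runToCompletion`, and the handling of an unset condition as `CleanupStarted` are all assumptions made for the example.

```go
package main

import "fmt"

// Reasons for the ResourceCleanup condition, as listed in the description.
const (
	CleanupStarted            = "CleanupStarted"
	CleanupAwaitingCompletion = "CleanupAwaitingCompletion"
	CleanupComplete           = "CleanupComplete"
)

// purger is a hypothetical interface mirroring the StartPurge and
// IsPurgeComplete split; the real signatures may differ.
type purger interface {
	StartPurge(project string) error
	IsPurgeComplete(project string) (bool, error)
}

// reconcileDeletion advances cleanup by one non-blocking step. It takes the
// current condition reason ("" when unset) and returns the next reason and
// whether the project should be requeued.
func reconcileDeletion(p purger, project, reason string) (string, bool, error) {
	switch reason {
	case "", CleanupStarted:
		// Fast step: issue delete commands, then poll on the next reconcile.
		if err := p.StartPurge(project); err != nil {
			return reason, true, err
		}
		return CleanupAwaitingCompletion, true, nil
	case CleanupAwaitingCompletion:
		// Cheap poll: check whether resources have drained.
		done, err := p.IsPurgeComplete(project)
		if err != nil {
			return reason, true, err
		}
		if done {
			return CleanupComplete, false, nil
		}
		// Not drained yet: go back so delete commands are re-issued.
		return CleanupStarted, true, nil
	default:
		// CleanupComplete: nothing left to do, finalizer can be released.
		return CleanupComplete, false, nil
	}
}

// fakePurger reports completion after a fixed number of polls.
type fakePurger struct {
	pollsUntilDrained int
	starts            int
}

func (f *fakePurger) StartPurge(string) error { f.starts++; return nil }

func (f *fakePurger) IsPurgeComplete(string) (bool, error) {
	f.pollsUntilDrained--
	return f.pollsUntilDrained <= 0, nil
}

// runToCompletion drives reconcileDeletion until no requeue is requested and
// returns the sequence of reasons the condition moved through.
func runToCompletion(p purger, project string) ([]string, error) {
	var trace []string
	reason := ""
	for {
		next, requeue, err := reconcileDeletion(p, project, reason)
		if err != nil {
			return trace, err
		}
		trace = append(trace, next)
		reason = next
		if !requeue {
			return trace, nil
		}
	}
}

func main() {
	f := &fakePurger{pollsUntilDrained: 2}
	trace, err := runToCompletion(f, "demo-project")
	fmt.Println(trace, f.starts, err)
}
```

Because the reason is the only state carried between steps, a controller restart mid-deletion can resume from whatever reason was last written to the project status.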

Test plan

  • Existing unit tests pass (task test:unit)
  • Delete a project and verify the ResourceCleanup condition progresses through CleanupStarted → CleanupAwaitingCompletion → CleanupComplete
  • Create a new project while another project is being deleted — creation should not be delayed
  • Kill the controller pod mid-deletion and verify cleanup resumes after restart
  • End-to-end tests pass (task test:end-to-end) — new project-deletion test passes in 20.7s

🤖 Generated with Claude Code

joggrbot bot commented Apr 1, 2026

📝 Documentation Analysis

Joggr found 2 outdated docs in the pull request.

Autofix

Joggr opened 1 pull request to fix the outdated docs.

Outdated

| File | Reason | Confidence |
| --- | --- | --- |
| docs/api/resourcemanager.md | The API doc for the Project resource only lists a single 'Ready' status condition and reason, but the code adds a 'ResourceCleanup' condition type with new reasons. Users referencing only this document will not know about or be able to interpret the new condition types and reasons related to project deletion and cleanup. | 74.3% |
| docs/architecture/controllers/project-controller/README.md | The code changes introduce more granular project resource cleanup status conditions (ResourceCleanup, CleanupStarted, CleanupAwaitingCompletion, CleanupComplete) and a multi-phase project deletion workflow not described in the document, which still presents only a basic finalizer/removal pattern. The documentation's deletion and cleanup flow omits the nuanced status and multi-reconciliation process now present in the implementation. | 51.2% |

✅ Latest commit analyzed: 07a0193 | Powered by Joggr

The project controller reconciled all projects with a single worker
goroutine. When a project was deleted, the synchronous purge operation
(up to 10 minutes) blocked all other project reconciliation — including
creation of new projects.

Split the monolithic Purge() into StartPurge() (issues delete commands)
and IsPurgeComplete() (checks if resources have drained). The controller
now uses a condition-driven state machine (ResourceCleanup) that returns
from each reconcile in seconds and requeues to poll for completion.
Also increases MaxConcurrentReconciles to 4 for additional throughput.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@scotwells scotwells force-pushed the fix/non-blocking-project-deletion branch from 491dbde to 6fcb592 Compare April 1, 2026 16:55
Adds a Chainsaw e2e test that verifies:
- A project can be deleted after reaching Ready status
- The project is fully removed from both org and main cluster contexts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@scotwells scotwells requested a review from zachsmith1 April 1, 2026 17:19
@scotwells scotwells marked this pull request as ready for review April 1, 2026 17:19
@ecv ecv left a comment

that joggr thing is kinda cool.

@scotwells scotwells merged commit bc341e7 into main Apr 1, 2026
8 of 10 checks passed
@scotwells scotwells deleted the fix/non-blocking-project-deletion branch April 1, 2026 17:32