roachprod: centralized api - full cluster lifecycle management #156765

@golgeek

Problem: Currently, users must have cloud credentials locally to create, modify, or destroy clusters. This limits who can perform these operations and makes it difficult to enforce policies, track changes, or implement approval workflows. The centralized service can see cluster state but cannot manage it.

Solution: Extend the API to support full CRUD operations for clusters:

  • Create clusters by specifying cloud provider, machine types, cluster size, and configuration
  • Update cluster properties (resize, modify settings)
  • Destroy clusters when no longer needed
  • Execute these operations via a distributed task system for reliability

Operations should respect user permissions (scoped by cloud provider and account), queue as background tasks with progress tracking, and maintain audit logs.
This enables centralized enforcement of resource policies while removing the need for widespread cloud credential distribution.
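One possible reading of "scoped by cloud provider and account" combined with audit logging is sketched below. The `Scope`, `User`, `AuditLog`, and `Authorize` names, as well as the account strings, are illustrative assumptions and do not describe an existing roachprod permission model.

```go
package main

import (
	"fmt"
	"time"
)

// Scope identifies where an operation applies (hypothetical type).
type Scope struct {
	Provider string
	Account  string
}

// User carries the set of scopes the user may manage clusters in.
type User struct {
	Name    string
	Allowed map[Scope]bool
}

// AuditEntry records one authorization decision.
type AuditEntry struct {
	At      time.Time
	User    string
	Action  string
	Cluster string
	Scope   Scope
	Allowed bool
}

// AuditLog is an in-memory stand-in for a persistent audit trail.
type AuditLog struct {
	Entries []AuditEntry
}

// Authorize reports whether u may perform action in scope s, and records
// the decision in the audit log whether it was allowed or denied.
func Authorize(audit *AuditLog, u User, action, cluster string, s Scope) bool {
	ok := u.Allowed[s] // a missing scope yields false
	audit.Entries = append(audit.Entries, AuditEntry{
		At: time.Now(), User: u.Name, Action: action,
		Cluster: cluster, Scope: s, Allowed: ok,
	})
	return ok
}

func main() {
	audit := &AuditLog{}
	alice := User{
		Name:    "alice",
		Allowed: map[Scope]bool{{Provider: "gce", Account: "team-a"}: true},
	}
	fmt.Println(Authorize(audit, alice, "create", "demo", Scope{"gce", "team-a"}))
	fmt.Println(Authorize(audit, alice, "destroy", "demo", Scope{"aws", "team-a"}))
	fmt.Println(len(audit.Entries), "audit entries recorded")
}
```

Logging denied attempts as well as allowed ones is a design choice in this sketch; it keeps the audit trail useful for policy review.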

Jira issue: CRDB-56181

Epic CRDB-49123

Labels

A-testeng-infra
A-testing (Testing tools and infrastructure)
C-enhancement (Solution expected to add code/behavior + preserve backward-compat; pg compat issues are exception)
T-testeng (TestEng Team)
