docs: add workload status reasons enhancement#136

Open: scotwells wants to merge 1 commit into main from docs/enhancement-status-reasons
@scotwells
Contributor

When a workload isn't running, a user can tell at a glance why — one consistent, actionable reason and a plain-language message naming the specific blocker (a missing ConfigMap, a quota shortfall, an absent network) — surfaced the same way at the Instance, WorkloadDeployment, and Workload levels, and the same way across the CLI today and the web console later. No more digging through raw nested Kubernetes conditions across three resources to answer a single question.

What it covers

  • One readiness condition per resource (Ready on Instance, Available on WorkloadDeployment and Workload) that carries a stable reason and message whenever it isn't healthy.
  • A stable, shared reason vocabulary defined once in api/v1alpha and used consistently across all three resources — ReferencedDataNotReady, SourceNotFound, QuotaNotGranted, NetworkNotFound, NetworkProvisioning, InstancesProvisioning, and more.
  • Plain-language messages that name the specific object or constraint at issue.
  • Terminal vs. transient classification — telling "act" apart from "wait."
  • An Instance → WorkloadDeployment → Workload priority rollup so the most relevant blocker shows up at whatever level the user is looking, without drilling down.
  • How datumctl compute surfaces the reason directly off the resource it already fetches, with no special-case code.
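The reason vocabulary, the terminal vs. transient split, and the priority rollup listed above could be sketched roughly as follows. This is a minimal illustration, not the shipped controller code: only the reason strings come from this description. The `Blocker` type, the `IsTerminal` and `Rollup` helpers, which reasons count as terminal, and the priority order are all hypothetical and chosen for the example.

```go
package main

import "fmt"

// Reason strings as named in this PR description. The real constants live in
// api/v1alpha; the Go identifiers used here are assumptions.
const (
	ReasonReferencedDataNotReady = "ReferencedDataNotReady"
	ReasonSourceNotFound         = "SourceNotFound"
	ReasonQuotaNotGranted        = "QuotaNotGranted"
	ReasonNetworkNotFound        = "NetworkNotFound"
	ReasonNetworkProvisioning    = "NetworkProvisioning"
	ReasonInstancesProvisioning  = "InstancesProvisioning"
)

// Blocker is a hypothetical pairing of a stable reason with its
// plain-language message, as carried by a readiness condition.
type Blocker struct {
	Reason  string
	Message string
}

// terminalReasons is an ASSUMED classification: reasons where the user must
// act. Anything not listed is treated as transient ("wait").
var terminalReasons = map[string]bool{
	ReasonSourceNotFound:  true,
	ReasonQuotaNotGranted: true,
	ReasonNetworkNotFound: true,
}

// IsTerminal reports whether a reason calls for user action under the
// assumed classification above.
func IsTerminal(reason string) bool {
	return terminalReasons[reason]
}

// priority is an ASSUMED ordering; a lower number means a more relevant blocker.
var priority = map[string]int{
	ReasonSourceNotFound:         0,
	ReasonNetworkNotFound:        1,
	ReasonQuotaNotGranted:        2,
	ReasonReferencedDataNotReady: 3,
	ReasonNetworkProvisioning:    4,
	ReasonInstancesProvisioning:  5,
}

// rank returns the priority of a reason; unknown reasons sort last.
func rank(reason string) int {
	if p, ok := priority[reason]; ok {
		return p
	}
	return len(priority)
}

// Rollup picks the most relevant blocker among child resources so a parent
// can surface it directly. It returns false when there are no blockers.
func Rollup(children []Blocker) (Blocker, bool) {
	if len(children) == 0 {
		return Blocker{}, false
	}
	best := children[0]
	for _, b := range children[1:] {
		if rank(b.Reason) < rank(best.Reason) {
			best = b
		}
	}
	return best, true
}

func main() {
	children := []Blocker{
		{Reason: ReasonInstancesProvisioning, Message: "instances are still starting"},
		{Reason: ReasonQuotaNotGranted, Message: "quota request has not been granted"},
	}
	if b, ok := Rollup(children); ok {
		hint := "wait"
		if IsTerminal(b.Reason) {
			hint = "act"
		}
		fmt.Printf("%s (%s): %s\n", b.Reason, hint, b.Message)
	}
}
```

Keeping the ordering in one table means every level that rolls up child status would apply the same rule, which is one way to get the consistency across Instance, WorkloadDeployment, and Workload that this contract describes.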

Why

This closes the platform-side half of the "why isn't it running" developer-experience gap named in docs/enhancements/datumctl-compute-dx.md. That doc promises the CLI explains a stuck workload in plain terms with a next step; this contract is the data underneath that makes it real. The contract is already implemented in the compute controllers — this enhancement describes the shipped behavior, not a proposal.

Notes for reviewers

This lifts the durable, product-facing contract out of the implementation-focused RFC in #129 (docs/compute/development/rfcs/status-blocking-reason-contract.md) into a product enhancement on main, matching the docs/enhancements/ convention. The step-by-step build sequence, file-by-file plan, and test-plan tail are intentionally left in the RFC and out of this doc. #129 will drop the original RFC. Uses "Instance" / "Workload" Datum nomenclature throughout.

🤖 Generated with Claude Code

When a workload isn't running, users get one consistent, actionable reason and a plain-language message naming the specific blocker, surfaced the same way at the Instance, WorkloadDeployment, and Workload levels — instead of digging through raw nested Kubernetes conditions across three resources. This enhancement captures the platform-side contract — a single readiness condition per resource, a stable shared reason vocabulary, and an Instance-to-Workload priority rollup — that makes the "why isn't it running" developer-experience promise real.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>