@ethanndickson ethanndickson commented Sep 9, 2025

Part of the previous query was simply wrong, though the distinction it missed is admittedly confusing. Here's the relevant part of the DB query for retrieving running workspaces:

-- Special case where the provisioner status and workspace status
-- differ. A workspace is "running" if the job is "succeeded" and
-- the transition is "start". This is because a workspace starts
-- running when a job is complete.
WHEN $4 = 'running' THEN
    latest_build.job_status = 'succeeded'::provisioner_job_status AND
    latest_build.transition = 'start'::workspace_transition
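
As a rough illustration of the rule in that SQL comment, the same predicate can be sketched in Python. The function name and the plain-string arguments are hypothetical and only mirror the two columns compared above; this is not code from coderd.

```python
def is_running(job_status: str, transition: str) -> bool:
    """Mirror of the SQL special case: a workspace counts as "running"
    only when its latest build's job succeeded AND that build was a start."""
    return job_status == "succeeded" and transition == "start"


# A job that is still in progress does not make the workspace "running",
# and neither does a successful stop.
print(is_running("succeeded", "start"))
print(is_running("running", "start"))
print(is_running("succeeded", "stop"))
```

The point of the sketch is that the provisioner job status "running" and the workspace status "running" are different things, which is what the previous query mixed up.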

The new PromQL query follows this logic, and also uses a max by to collapse duplicate metrics reported by multiple Coder replicas:

 count(kube_pod_status_ready{condition="true", namespace=`coder-workspaces`} == 1)
 or
-count(coderd_api_workspace_latest_build{status="running"})
+sum(max by (workspace_owner, template_name, template_version) (coderd_workspace_latest_build_status{status="succeeded", workspace_transition="start"}))
 or
 vector(0)
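
To see why the inner max by matters, here is a minimal Python sketch of the aggregation. The replica names, label values, and sample values are made up for illustration, and it assumes every replica exports the same series with the same value.

```python
# Hypothetical samples of the metric with status="succeeded" and
# workspace_transition="start", as scraped from two replicas:
# (replica, workspace_owner, template_name, template_version, value)
samples = [
    ("coder-0", "alice", "docker", "v1", 2),
    ("coder-1", "alice", "docker", "v1", 2),
    ("coder-0", "bob", "k8s", "v3", 1),
    ("coder-1", "bob", "k8s", "v3", 1),
]


def naive_sum(samples):
    """Plain sum over all series: every replica's copy is counted."""
    return sum(value for *_labels, value in samples)


def deduplicated_sum(samples):
    """Emulates sum(max by (owner, template, version) (...)):
    keep one value per label group, then add the groups up."""
    per_group = {}
    for _replica, owner, template, version, value in samples:
        key = (owner, template, version)
        per_group[key] = max(per_group.get(key, value), value)
    return sum(per_group.values())


print(naive_sum(samples))         # 6: doubled by the second replica
print(deduplicated_sum(samples))  # 3: each label group counted once
```

Without the max by step, the total would scale with the number of replicas rather than with the number of workspaces.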

Tested on dogfood and on my scaletest cluster.


@ethanndickson ethanndickson force-pushed the ethan/fix-running-workspaces-count branch from 06a34d9 to 97edcfc Compare September 9, 2025 04:19
@ethanndickson ethanndickson force-pushed the ethan/fix-running-workspaces-count branch from 97edcfc to 7e99438 Compare September 9, 2025 06:32
@ethanndickson ethanndickson changed the title fix: use succeeded provisioner jobs when determining running workspace count fix: properly determining running workspaces count Sep 9, 2025
@ethanndickson ethanndickson marked this pull request as ready for review September 9, 2025 06:40
@ethanndickson ethanndickson changed the title fix: properly determining running workspaces count fix: properly determine running workspaces count Sep 9, 2025

@dannykopping dannykopping left a comment

Good catch, thanks!
Can you cut a patch release pls? https://github.com/coder/observability/blob/main/PUBLISH.md

@ethanndickson ethanndickson merged commit 54cc333 into main Sep 9, 2025
2 checks passed