Skip to content

fix(plugin-circleci): populate workflow id for unfinished-job collection (#8907)#8912

Open
zahorniak wants to merge 2 commits into
apache:mainfrom
zahorniak:fix/circleci-collectjobs-empty-workflow-id
Open

fix(plugin-circleci): populate workflow id for unfinished-job collection (#8907)#8912
zahorniak wants to merge 2 commits into
apache:mainfrom
zahorniak:fix/circleci-collectjobs-empty-workflow-id

Conversation

@zahorniak
Copy link
Copy Markdown

@zahorniak zahorniak commented Jun 8, 2026

⚠️ Pre Checklist

  • I have read through the Contributing Documentation.
  • I have added relevant tests.
  • I have added relevant documentation.
  • I will add labels to the PR, such as pr-type/bug-fix.

Summary

The CircleCI collectJobs subtask runs two API collectors that share the URL template /v2/workflow/{{ .Input.Id }}/job:

  • The new-records collector selects id from _tool_circleci_workflows, so .Input.Id is populated and the URL is well-formed.
  • The "unfinished details" collector (which re-collects jobs for workflows that are still in flight) selected DISTINCT workflow_id from _tool_circleci_jobs and scanned each row into a models.CircleciJob. Since the selected column is workflow_id, the scan filled CircleciJob.WorkflowId and left CircleciJob.Id empty — so the shared template collapsed to /v2/workflow//job, which the CircleCI API answers with HTTP 500.

This triggers whenever a job is in a non-terminal status (running / not_running / queued / on_hold) — e.g. an approval-gated workflow sitting on_hold. The 500 aborts the whole CircleCI collection, so all CI/CD data (and the DORA metrics derived from it) goes stale.

Fix

Alias the projection to DISTINCT workflow_id AS id so the scanned CircleciJob.Id carries the workflow id, mirroring the new-records collector that already selects id. This is the minimal change and keeps both collectors consistent.

To make the behaviour testable, the "unfinished details" DAL clauses are extracted into an exported helper, tasks.UnfinishedJobsInputClauses(...), which both the collector and the new test call — so the regression test exercises the production query rather than a copy of it.

// before
dal.Select("DISTINCT workflow_id")
// after
dal.Select("DISTINCT workflow_id AS id")

Tests

Adds TestCircleciUnfinishedJobsInputIterator (e2e, real DB). It:

  • seeds non-terminal and terminal jobs across two connections (including two jobs of the same workflow, to assert DISTINCT, and rows that must be filtered out);
  • runs the production query (tasks.UnfinishedJobsInputClauses) through the real api.DalCursorIterator — the same path the collector uses;
  • asserts each yielded .Id equals the workflow id, that results are DISTINCT per workflow, and that the status/connection filters hold.

The test fails before the fix (.Id is empty → ["", ""]) and passes after. Because it calls the exported production helper, a future revert of the Select re-breaks the test.

Does this close any open issues?

Closes #8907

Screenshots

N/A — backend collector fix, no UI changes.

Other Information

  • Scope: one SQL projection string (workflow_idworkflow_id AS id) plus a test-enabling extraction. No domain-layer, schema, migration, or snapshot changes; existing CircleCI e2e (Extract/Convert) tests are unaffected.
  • DCO: all commits are signed off (git commit -s).
  • Verification: go build ./plugins/circleci/..., go vet ./plugins/circleci/..., and the full ./plugins/circleci/e2e/... suite pass against a real MySQL 8.
  • Real-world impact: in a live deployment, ~7,600 workflows matched the non-terminal filter on a single collection run — each one producing a /v2/workflow//job 500 — so the bug reliably breaks CircleCI collection at scale, not just in edge cases.

zahorniak added 2 commits June 8, 2026 10:44
… a helper

Signed-off-by: Volodymyr Zahorniak <v.zahorniak@gmail.com>
…ion (apache#8907)

The collectJobs 'unfinished details' collector built its URL from
'/v2/workflow/{{ .Input.Id }}/job' but its iterator selected 'DISTINCT
workflow_id' into a models.CircleciJob, leaving .Id empty and producing
'/v2/workflow//job' (HTTP 500) whenever a job was running/queued/on_hold.
Alias the projection to 'workflow_id AS id' so .Id carries the workflow id,
mirroring the new-records collector. Adds an e2e regression test.

Signed-off-by: Volodymyr Zahorniak <v.zahorniak@gmail.com>
@zahorniak zahorniak marked this pull request as ready for review June 8, 2026 11:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant