Skip to content

[Improve][CI] Optimize update build status workflow#10689

Merged
zhangshenghang merged 2 commits into
apache:devfrom
davidzollo:codex/optimize-update-build-status
Apr 10, 2026
Merged

[Improve][CI] Optimize update build status workflow#10689
zhangshenghang merged 2 commits into
apache:devfrom
davidzollo:codex/optimize-update-build-status

Conversation

@davidzollo
Copy link
Copy Markdown
Contributor

What changed

  • replaced the scheduled full open-PR scan with a GitHub Search API query limited to open PRs updated in the last 7 days
  • processed PR sync work in batches of 10 with concurrent per-PR execution to reduce serial API latency
  • skipped check-run PATCH requests when the upstream check already matches the fork workflow run status, conclusion, and details URL

Why

The previous workflow iterated over every open PR and then executed multiple nested REST calls serially. That makes the scheduled sync much slower than necessary and keeps issuing no-op updates for completed checks.

Impact

The workflow keeps the same build-status sync behavior, but reduces the amount of scheduled work and avoids unnecessary API traffic for inactive PRs and unchanged check runs.

Closes #10688

Validation

  • ./mvnw spotless:apply -nsu -T 3C
  • ./mvnw -q -DskipTests verify -nsu -T 3C failed in this local environment because /Volumes/disk2 ran out of space (No space left on device) while building E2E modules
  • ruby -e 'require \"yaml\"; YAML.load_file(\".github/workflows/update_build_status.yml\"); puts \"yaml-ok\"'\n- node -e 'const fs=require(\"fs\"); const yaml=fs.readFileSync(\".github/workflows/update_build_status.yml\",\"utf8\"); const match=yaml.match(/script: \\|\\n([\\s\\S]*)$/); if(!match) throw new Error(\"script block not found\"); const script=match[1].split(\"\\n\").map(line=>line.replace(/^ {12}/,\"\")).join(\"\\n\"); new Function(return (async () => {\n${script}\n}))(); console.log(\"script-parse-ok\");'

@davidzollo davidzollo marked this pull request as ready for review April 1, 2026 15:18
@github-actions github-actions Bot added the CI&CD label Apr 1, 2026
dybyte
dybyte previously approved these changes Apr 2, 2026
Copy link
Copy Markdown
Contributor

@dybyte dybyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if CI passes.

@DanielLeens
Copy link
Copy Markdown
Contributor

I walked the new scheduled path in .github/workflows/update_build_status.yml and found one behavioral regression.

The new search window at SEARCH_WINDOW_DAYS = 7 (update_build_status.yml:39-55) means the workflow now skips any open PR whose issue-level updated_at is older than seven days. That is not equivalent to the old behavior.

The build-status sync is driven by the fork workflow run state, not by PR conversation activity. A PR can easily be older than seven days while its mirrored Build check still needs syncing, for example when:

  • the fork workflow run is manually re-run on the same head SHA,
  • the run changes from in_progress to completed long after the last PR comment / push, or
  • maintainers are watching an old PR that is still open but quiet.

In those cases the PR will fall out of the Search API window even though the upstream check run still needs to be patched. The previous full open-PR scan did not have that blind spot.

I think we need to keep the candidate set tied to the status-sync problem itself rather than PR recency, or this workflow will silently stop updating older open PRs.

zhangshenghang
zhangshenghang previously approved these changes Apr 6, 2026
Copy link
Copy Markdown
Member

@zhangshenghang zhangshenghang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, if ci passes

Copy link
Copy Markdown
Contributor

@DanielLeens DanielLeens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I re-checked the latest HEAD after the new commit.

The blocking issue from my previous comment is still present. The workflow candidate set is still restricted to PRs updated in the last 7 days (SEARCH_WINDOW_DAYS = 7, searchActivePullRequests() in update_build_status.yml:39-52). So the sync path is now:

schedule -> searchActivePullRequests() -> processPullRequest() -> check-runs -> workflow run lookup -> patch

An open PR that is older than 7 days never even reaches processPullRequest(), even if its mirrored fork workflow run is re-run later or changes from in_progress to completed. That is a correctness regression from the previous full open-PR scan.

I also checked the latest follow-up commit 92c1c14ff9: it only changes the Kingbase testcontainer files and does not touch this workflow path, so the blind spot remains unchanged.

I think we need to restore candidate discovery that is independent of recent PR conversation activity, for example by keeping the full open-PR scan and retaining the new no-op PATCH skip, before this can merge.

@davidzollo
Copy link
Copy Markdown
Contributor Author

I re-checked the latest HEAD after the new commit.

The blocking issue from my previous comment is still present. The workflow candidate set is still restricted to PRs updated in the last 7 days (SEARCH_WINDOW_DAYS = 7, searchActivePullRequests() in update_build_status.yml:39-52). So the sync path is now:

schedule -> searchActivePullRequests() -> processPullRequest() -> check-runs -> workflow run lookup -> patch

An open PR that is older than 7 days never even reaches processPullRequest(), even if its mirrored fork workflow run is re-run later or changes from in_progress to completed. That is a correctness regression from the previous full open-PR scan.

I also checked the latest follow-up commit 92c1c14ff9: it only changes the Kingbase testcontainer files and does not touch this workflow path, so the blind spot remains unchanged.

I think we need to restore candidate discovery that is independent of recent PR conversation activity, for example by keeping the full open-PR scan and retaining the new no-op PATCH skip, before this can merge.

I totally understand your concern, but I believe this is currently the most effective way to reduce CI runtime. When a PR older than 7 days has its fork repository CI status out of sync with the main SeaTunnel repository, the PR author can trigger an empty commit or similar action to kick off a re-run and status check. This can serve as a trial run — if we find that a significant number of PRs are older than 7 days and affected, we can increase the SEARCH_WINDOW_DAYS value accordingly.

Copy link
Copy Markdown
Contributor

@dybyte dybyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 if CI passes

@DanielLeens
Copy link
Copy Markdown
Contributor

Thanks for the detailed reply. I re-checked the latest workflow code again, and I still think the 7-day search window remains a correctness blocker.

The current runtime path is now:

schedule -> searchActivePullRequests() -> processPullRequest() -> check-runs -> workflow run lookup -> patch

Because searchActivePullRequests() is restricted to updated:>=${since}, any open PR older than 7 days never even reaches processPullRequest(). That means a later rerun or state transition of the mirrored fork workflow on the same head SHA is invisible to this scheduler.

I do not think asking the PR author to push an empty commit closes that gap, because it changes the contract of this workflow:

  • the old implementation synced Build status for open PRs without requiring a head update
  • rerunning a workflow on the same SHA is a valid case that the old code handled
  • forcing synthetic commits to repair stale checks adds noise to PR history and turns an automatic sync job into a manual recovery step

Increasing SEARCH_WINDOW_DAYS would only move the blind spot, not remove it.

So I still think the candidate discovery needs to stay independent of recent PR conversation activity, for example by keeping the full open-PR scan and retaining the new no-op PATCH skip. Once that part is restored, the batching / skip optimizations look fine to me.

Copy link
Copy Markdown
Member

@zhangshenghang zhangshenghang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@zhangshenghang zhangshenghang merged commit fc2d209 into apache:dev Apr 10, 2026
10 of 11 checks passed
onceMisery pushed a commit to onceMisery/seatunnel that referenced this pull request Apr 14, 2026
Co-authored-by: Doyeon Kim <132787602+dybyte@users.noreply.github.com>
junjunclub pushed a commit to junjunclub/seatunnel that referenced this pull request Apr 20, 2026
Co-authored-by: Doyeon Kim <132787602+dybyte@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Optimize update build status workflow for active PRs only

5 participants