Run CodeQL only on languages changed in a PR #67972
Merged
PR-triggered CodeQL is by far the most frequent workflow in the repo (~1,300+ runs/week), and every run fans out one job per language (python, javascript, actions, go, java) regardless of what changed. The java job in particular runs a full Gradle build on every PR even though java-sdk files change in well under 1% of PRs. Gate the language matrix on the files actually changed in the PR: a docs-only PR now runs nothing, and the common python-only PR runs a single job instead of five. push-to-main and scheduled runs still scan every language, so coverage of the main branch is unchanged.
potiuk
approved these changes
Jun 3, 2026
bugraoz93
approved these changes
Jun 3, 2026
Contributor
I love this!
Contributor
Backport failed to create: v3-2-test. View the failure log. Run details
Note: As of Merging PRs targeted for Airflow 3.X. In matter of doubt please ask in the #release-management Slack channel.
You can attempt to backport this manually by running:
cherry_picker c91dc3b v3-2-test
This should apply the commit to the v3-2-test branch and leave the commit in conflict state marking
After you have resolved the conflicts, you can continue the backport process by running:
cherry_picker --continue
If you don't have cherry-picker installed, see the installation guide.
shahar1
added a commit
that referenced
this pull request
Jun 4, 2026
#67972) (#68024) PR-triggered CodeQL is by far the most frequent workflow in the repo (~1,300+ runs/week), and every run fans out one job per language (python, javascript, actions, go, java) regardless of what changed. The java job in particular runs a full Gradle build on every PR even though java-sdk files change in well under 1% of PRs. Gate the language matrix on the files actually changed in the PR: a docs-only PR now runs nothing, and the common python-only PR runs a single job instead of five. push-to-main and scheduled runs still scan every language, so coverage of the main branch is unchanged. (cherry picked from commit c91dc3b)
shahar1
added a commit
to shahar1/airflow
that referenced
this pull request
Jun 5, 2026
running all five languages unconditionally. Extend the detect-languages job to use the GitHub compare API (before…after) for push events, so a docs-only or single-language merge to main no longer fans out all five CodeQL jobs. schedule runs are unchanged — they still scan every language to maintain periodic full-branch coverage. Falls back to all languages when the compare API is unavailable or the before SHA is all zeros (branch creation). related: apache#67972
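The fallback rules in this follow-up commit can be sketched roughly as below. This is a minimal illustration, not the workflow's actual code: `languages_for_event`, `classify`, and the injected `fetch_compare` callable are hypothetical names, and the extension-based classification is an assumption. In a real workflow `fetch_compare` would wrap a call to the GitHub compare API for the `before` and `after` SHAs.

```python
ALL_LANGUAGES = ["python", "javascript", "actions", "go", "java"]
ZERO_SHA = "0" * 40  # "before" SHA reported on branch creation

# Hypothetical suffix rules, for illustration only.
SUFFIX_TO_LANGUAGE = {".py": "python", ".js": "javascript", ".go": "go", ".java": "java"}


def classify(changed_files):
    """Map changed paths to languages, preserving ALL_LANGUAGES order."""
    found = set()
    for path in changed_files:
        if path.startswith(".github/workflows/"):
            found.add("actions")
        for suffix, language in SUFFIX_TO_LANGUAGE.items():
            if path.endswith(suffix):
                found.add(language)
    return [language for language in ALL_LANGUAGES if language in found]


def languages_for_event(event_name, before, after, fetch_compare):
    """Pick languages to scan; fall back to all of them when unsure."""
    if event_name == "schedule":
        # Scheduled runs keep periodic full-branch coverage.
        return list(ALL_LANGUAGES)
    if not before or before == ZERO_SHA:
        # Branch creation: there is no previous commit to compare against.
        return list(ALL_LANGUAGES)
    try:
        changed_files = fetch_compare(before, after)
    except Exception:
        # Compare API unavailable: scan everything rather than skip coverage.
        return list(ALL_LANGUAGES)
    return classify(changed_files)
```

Passing the fetcher in as a parameter keeps the decision logic testable without network access; every uncertain case resolves to the full language list, so a failure can only cost runner time, never coverage.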
What
On pull_request, scan only the CodeQL languages whose files actually changed, instead of always running all five (python, javascript, actions, go, java). A small detect-languages job inspects the PR's changed files and builds the analysis matrix dynamically. push (to main) and schedule runs are unchanged: they always scan every language, so coverage of the main branch is identical to today.
Result per PR: a docs-only PR runs no CodeQL jobs, and the common python-only PR runs a single job instead of five.
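A minimal sketch of how a detect-languages step might turn a list of changed files into a dynamic matrix. The glob patterns in `LANGUAGE_PATTERNS` and the function names are assumptions made for illustration; the PR text does not show the actual path rules the workflow uses.

```python
import json
from fnmatch import fnmatchcase

ALL_LANGUAGES = ["python", "javascript", "actions", "go", "java"]

# Hypothetical path patterns per language (not taken from the PR).
# fnmatch's "*" also matches "/" so "*.py" covers nested directories.
LANGUAGE_PATTERNS = {
    "python": ["*.py"],
    "javascript": ["*.js", "*.jsx", "*.ts", "*.tsx"],
    "actions": [".github/workflows/*", ".github/actions/*"],
    "go": ["*.go"],
    "java": ["java-sdk/*"],
}


def detect_languages(event_name, changed_files):
    """Return the CodeQL languages to scan for this event."""
    if event_name != "pull_request":
        # push and schedule runs keep scanning every language.
        return list(ALL_LANGUAGES)
    return [
        language
        for language in ALL_LANGUAGES
        if any(
            fnmatchcase(path, pattern)
            for path in changed_files
            for pattern in LANGUAGE_PATTERNS[language]
        )
    ]


def matrix_json(languages):
    """Serialize the language list as a JSON matrix for the analysis job."""
    return json.dumps({"language": languages})
```

For a docs-only PR the list is empty, so the workflow would likely also need to skip the analysis job in that case rather than hand it an empty matrix.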
Why
CodeQL on PRs is by far the most frequently triggered workflow in the repo, at roughly 1,300+ runs/week (≈ 87% of all CodeQL runs are pull_request). Every one of those runs currently fans out one job per language regardless of what changed, so it is a constant, high-volume contributor to runner/concurrency pressure on the shared Actions pool.
Measuring a sample of recent PRs: javascript ~20%, actions ~2%, go ~1%, java ~0%.
So the large majority of the language jobs we run on PRs scan code that did not change. Gating the matrix cuts roughly 55–60% of CodeQL PR minutes and ~80% of CodeQL job-starts, while keeping full per-language coverage on main.
Relationship to #45541
#45541 ("CodeQL scanning can run always on all code") deliberately removed conditional CodeQL logic, on the basis that "CodeQL scanning is fast and having custom configuration … makes it unnecessarily complex." That was true at the time: CodeQL then scanned 3 fast languages (python, javascript, actions).
Two things have changed since:
go and especially java were added afterwards. java came in via the "Add Java SDK" change, which runs a full setup-java + ./gradlew classes testClasses Gradle build on every PR, even though java-sdk files change in well under 1% of PRs. That materially breaks the "CodeQL is fast" premise the always-on decision rested on.
The added complexity here is intentionally small and contained to one workflow (a single detect job + a dynamic matrix), and only affects PR runs; main scanning stays exactly as it is.
Was generative AI tooling used to co-author this PR?
Generated-by: Claude Code (Opus 4.8) following the guidelines