Summary
CodeQL Java/Kotlin analysis on a large multi-module Android Kotlin codebase regressed sharply between CodeQL CLI 2.24.0 and 2.24.1. The same workflow, codebase, runner, and query suite that completed in ~12-15 minutes on 2.24.0 now takes 3-5 hours when it converges at all (~37.5% success rate); the remaining 62.5% of runs hang and hit the GitHub-hosted runner's 6-hour cap.
The regression is reproducible by version bisection. Pinning the CLI back to 2.24.0 via the tools: parameter restores the historical ~12-minute behavior with 100% convergence on the same codebase and configuration.
Bisection result
Three parallel runs with tools: codeql-bundle-v2.24.0:
| Run | Duration | Conclusion |
|---|---|---|
| 1 | 12m 56s | Success |
| 2 | 15m 29s | Success |
| 3 | 13m 32s | Success |
Versus the same workflow without the pin (uses current @v4 bundle, CodeQL CLI 2.25.4):
| Suite | Runs | Successes | Failures | Success duration | Failure |
|---|---|---|---|---|---|
| security-and-quality | 5 | 2 (40%) | 3 (60%) | 3h12m, 5h15m | 6h cap |
| security-extended | 3 | 1 (33%) | 2 (67%) | 4h52m | 6h cap |
| Combined post-2.24.1 | 8 | 3 (37.5%) | 5 (62.5%) | 3-5 hours | 6h cap |
Only the CLI version differs between the pinned-2.24.0 set and the unpinned set. Same codebase, same workflow, same runner, same suite, sometimes within minutes of each other.
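The combined figures can be re-derived from the per-suite rows. The following is a minimal sketch that uses only the numbers in the two tables above; the names (Suite, to_seconds, combined_rates, mean_seconds) are illustrative and not part of any CodeQL tooling.

```python
import re
from dataclasses import dataclass

# Durations appear in the tables as e.g. "12m 56s" or "3h12m".
_DURATION_RE = re.compile(r"(?:(\d+)h)?\s*(?:(\d+)m)?\s*(?:(\d+)s)?")


def to_seconds(text: str) -> int:
    """Convert a duration such as '12m 56s' or '3h12m' to seconds."""
    match = _DURATION_RE.fullmatch(text.strip())
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognized duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


@dataclass(frozen=True)
class Suite:
    name: str
    runs: int
    successes: int


# Values copied from the tables in this report.
PINNED = ["12m 56s", "15m 29s", "13m 32s"]   # CLI 2.24.0, all runs
UNPINNED_OK = ["3h12m", "5h15m", "4h52m"]    # CLI 2.25.4, converging runs only
UNPINNED = [Suite("security-and-quality", 5, 2), Suite("security-extended", 3, 1)]


def mean_seconds(durations):
    """Mean duration in seconds over a list of duration strings."""
    return sum(to_seconds(d) for d in durations) / len(durations)


def combined_rates(suites):
    """Return (total runs, success %, failure %) across all suites."""
    runs = sum(s.runs for s in suites)
    ok = sum(s.successes for s in suites)
    return runs, 100.0 * ok / runs, 100.0 * (runs - ok) / runs


if __name__ == "__main__":
    runs, ok_pct, fail_pct = combined_rates(UNPINNED)
    print(f"{runs} runs: {ok_pct}% success, {fail_pct}% failure")
    slowdown = mean_seconds(UNPINNED_OK) / mean_seconds(PINNED)
    print(f"mean slowdown on converging runs: {slowdown:.1f}x")
```

The slowdown is computed over converging runs only, so it understates the regression: the five runs that hit the 6-hour cap have no completion time to include.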
Historical context
Pre-regression, this scan ran nightly for ~3 months with 100% green:
| Window | Outcome |
|---|---|
| Nov 2025 (30 runs) | All green |
| Dec 2025 (31 runs) | All green |
| Jan 2026 (31 runs) | All green |
| Feb 1-6, 2026 (6 runs) | All green |
| Feb 7, 2026 onward (~150 runs) | Nearly all fail at the 6h cap |
Historical successful run durations (sampled across the green period): 12-18 minutes consistently.
github/codeql-action v4.32.2, which bundled CodeQL CLI 2.24.1, shipped on Feb 5, 2026; the failures begin two days later, on Feb 7, 2026.
Suspected cause
Per the 2.24.1 changelog:
The SummarizedCallable.propagatesFlow predicate has been extended with columns Provenance p and boolean isExact. The predicates SummarizedCallable.hasProvenance and SummarizedCallable.hasExactModel have been removed as a consequence. This change affects C/C++, C#, Golang, Java/Kotlin, JavaScript/TypeScript, Python, Ruby, Swift, and Rust libraries.
SummarizedCallable.propagatesFlow is in the same data-flow framework where our hung runs' evaluator logs show repeated 13.5 GiB tuple-table evictions on TypeFlow<Location, TypeFlow::Input>::UnivFlow::ForAll<...> predicates. The eviction itself is normal memory management, but on 2.25.x it sometimes thrashes indefinitely; on 2.24.0 it either doesn't trigger or recovers quickly.
Environment
CodeQL CLI: 2.24.0 (green) vs. 2.25.4 (regressed); bisected, regression starts in 2.24.1
Runner: Linux 16-core (GHAS large runner), 64 GB RAM
Codebase: ~22 Gradle modules, Kotlin-only source (Java only in compiled deps: Android SDK, AndroidX, Compose, Retrofit, OkHttp). Extracted database has 3,645,995 expressions in TypeFlow::FlowStepsInput::TExpr.
Symptom on regressed versions
After the build phase (~12 min consistently), CodeQL init runs (~1 min) and the first ~146 syntactic queries finish (~5 min). Then:
Created relation ListOfConstantsSanitizer::FlowNode.asExpr/0#dispred#... with 3,640,831 rows
Starting to evaluate predicate TypeFlowImpl::TypeFlow<Location::Location,TypeFlow::Input>::UnivFlow::ForAll<FlowScc,...>::flowJoin/3 (iteration 1)
... delta sizes grow across iterations 1-3 ...
Pausing evaluation to evict 795.21MiB ARRAYS at sequence stamp o+0
Unpausing evaluation: 13.51GiB forgotten: 13.51GiB UNREACHABLE
Starting to evaluate predicate _TypeFlowImpl::TypeFlow<...>::antijoin_rhs#1/4@i3 (iteration 3)
... silence for the remaining 5+ hours ...
Job killed at 6-hour cap
The eviction happens within ~14 seconds of TypeFlow<Location, Input> starting on every regressed run. Sometimes the evaluator continues past it and converges in 3-5 hours; sometimes it never produces another Evaluation done event.
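For triaging a hung run, it helps to pull the eviction events and the last predicate the evaluator started out of the log. The sketch below matches only the message shapes quoted in the excerpt above; real evaluator logs may carry timestamps or other prefixes, and the helper names (LogSummary, summarize) are hypothetical rather than part of CodeQL. It also assumes predicate names contain no spaces, as in the excerpt.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Patterns mirror the three message shapes quoted in the log excerpt.
START_RE = re.compile(r"Starting to evaluate predicate (\S+)")
EVICT_RE = re.compile(r"Pausing evaluation to evict ([\d.]+)(MiB|GiB)")
FORGOTTEN_RE = re.compile(r"Unpausing evaluation: ([\d.]+)(MiB|GiB) forgotten")

_TO_GIB = {"MiB": 1.0 / 1024, "GiB": 1.0}


@dataclass
class LogSummary:
    evictions: int = 0                                        # "Pausing evaluation to evict" events
    forgotten_gib: List[float] = field(default_factory=list)  # sizes reported on unpause
    last_predicate: Optional[str] = None                      # most recently started predicate


def summarize(lines) -> LogSummary:
    """Scan log lines and record evictions and the last predicate started."""
    summary = LogSummary()
    for line in lines:
        if (m := START_RE.search(line)):
            summary.last_predicate = m.group(1)
        elif (m := EVICT_RE.search(line)):
            summary.evictions += 1
        elif (m := FORGOTTEN_RE.search(line)):
            summary.forgotten_gib.append(float(m.group(1)) * _TO_GIB[m.group(2)])
    return summary


# Lines copied from the excerpt in this report.
SAMPLE = [
    "Starting to evaluate predicate TypeFlowImpl::TypeFlow<Location::Location,TypeFlow::Input>::UnivFlow::ForAll<FlowScc,...>::flowJoin/3 (iteration 1)",
    "Pausing evaluation to evict 795.21MiB ARRAYS at sequence stamp o+0",
    "Unpausing evaluation: 13.51GiB forgotten: 13.51GiB UNREACHABLE",
    "Starting to evaluate predicate _TypeFlowImpl::TypeFlow<...>::antijoin_rhs#1/4@i3 (iteration 3)",
]

if __name__ == "__main__":
    result = summarize(SAMPLE)
    print(result.evictions, result.forgotten_gib, result.last_predicate)
```

If the last predicate started is still a TypeFlow one when the job is killed, the run matches the hang pattern described here rather than a slow individual query.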
Workaround
Pinning to 2.24.0 via tools: restores reliable nightly scanning. The cost is being frozen on the Jan 2026 QL pack (no new queries, no Kotlin 2.3+ support) until the regression is fixed.
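Because an accidentally dropped pin costs a 6-hour run, a cheap guard can fail the job early when the resolved CLI version is in the affected range. This is a minimal sketch under two assumptions: the boundary (2.24.1 and later) comes only from this report, and the version string is supplied by the caller. The names FIRST_AFFECTED, parse_version and is_affected are illustrative.

```python
import re
import sys

# Assumption from this report: the regression starts in 2.24.1 and no fixed
# release is known yet, so every later version is treated as affected.
FIRST_AFFECTED = (2, 24, 1)


def parse_version(text: str) -> tuple:
    """Extract (major, minor, patch) from e.g. '2.24.0' or 'codeql-bundle-v2.24.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version_text: str) -> bool:
    """True if the version is at or past the first release reported as regressed."""
    return parse_version(version_text) >= FIRST_AFFECTED


if __name__ == "__main__":
    # Usage: python check_cli_version.py 2.24.0
    if len(sys.argv) != 2:
        sys.exit("usage: check_cli_version.py <codeql-cli-version>")
    if is_affected(sys.argv[1]):
        sys.exit(f"CodeQL CLI {sys.argv[1]} is in the range reported as regressed")
    print(f"CodeQL CLI {sys.argv[1]} predates the reported regression")
```

Once a fixed release exists, the guard would need an upper bound as well; until then it only encodes the observation that 2.24.0 is the last known-good version.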
Related
Java CodeQL hangs for hazelcast repository #10765 (hazelcast, Oct 2022, acknowledged): similar Java taint-tracking hang pattern. Their workaround (per-query exclusion) doesn't help us because the trigger is in the shared framework, not in any specific query.
codeql-action#2378: reports of Java/Kotlin analysis taking 8× build time on large codebases. May be the same regression.
Additional environment details: codeql-action @v4 (unchanged across both); build mode manual (Gradle build, ./gradlew --no-daemon --no-build-cache clean assembleDebug).