fix(detection): C1 PR-3 — tighten batch detector (closes part 3 of #83) by AndresL230 · Pull Request #109 · recost-dev/extension

AndresL230 · 2026-05-13T23:59:26Z

Summary

Eliminates 8 of 9 corpus batch false positives (9 → 1) by gating the sequential-batching detectors on enclosing function. Calls in different functions of the same file execute on independent code paths and cannot be batched together.

Builds on #106 (per-detector measurement) and #108 (cache detector). Same investigative shape, same review pattern.

What changed

TypeScript detector (`src/ast/waste/batch-detector.ts`)

detectSequential() now buckets by (provider, enclosingFunction) instead of provider alone:

- Module-level calls bucket to "<module>" so they cluster only with other module-level calls.
- The bucket Map carries provider as a struct field ({provider, matches}) instead of encoding it in the key. This avoids a "::" delimiter footgun if a future provider id contains the separator. Per CodeRabbit-style code-quality review.
- Within each bucket, dedupe by line: cross-file resolver expansion can produce multiple AstCallMatch entries at the same source line for one user-written call (e.g., bedrock-raw-fetch/src/index.ts where one await handleApi(...) resolves through 2 internal wrappers).
- Evidence string lists unique sorted lines.
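
The bucketing and line-dedup steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `batch-detector.ts`: `CallMatchSketch` is a hypothetical, trimmed-down stand-in for `AstCallMatch`, the function names are invented for the example, and using `JSON.stringify` for the Map key is an assumption.

```typescript
// Hypothetical, trimmed-down stand-in for AstCallMatch (the real type has more fields).
interface CallMatchSketch {
  provider: string;
  enclosingFunction: string | null;
  line: number;
}

interface BucketSketch {
  provider: string; // carried as a field, never parsed back out of the key
  matches: CallMatchSketch[];
}

const MODULE_SCOPE = "<module>";

function bucketByProviderAndFunction(matches: CallMatchSketch[]): BucketSketch[] {
  const buckets = new Map<string, BucketSketch>();
  for (const m of matches) {
    // Module-level calls (null) share one "<module>" bucket per provider.
    const scope = m.enclosingFunction ?? MODULE_SCOPE;
    // ASSUMPTION: the key is an opaque lookup handle; JSON.stringify is one way
    // to build it without relying on a delimiter.
    const key = JSON.stringify([m.provider, scope]);
    let bucket = buckets.get(key);
    if (!bucket) {
      bucket = { provider: m.provider, matches: [] };
      buckets.set(key, bucket);
    }
    // Dedupe by line: resolver expansion can report one user-written call twice.
    if (!bucket.matches.some((seen) => seen.line === m.line)) {
      bucket.matches.push(m);
    }
  }
  return Array.from(buckets.values());
}

// Emits one evidence string per bucket that holds at least minCalls distinct lines.
function sequentialBatchEvidence(matches: CallMatchSketch[], minCalls = 2): string[] {
  return bucketByProviderAndFunction(matches)
    .filter((bucket) => bucket.matches.length >= minCalls)
    .map((bucket) => {
      const lines = bucket.matches.map((m) => m.line).sort((x, y) => x - y);
      return `${bucket.provider}: lines ${lines.join(", ")}`;
    });
}
```

With this shape, two calls to the same provider in different functions land in different buckets and produce no finding, while two distinct lines in the same function still do.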

Python detector (`src/scanner/python-waste-detector.ts`)

detectSequentialBatching() applies the same (providerKey, enclosingFunction) bucketing, with the same struct-field pattern. The cluster.length >= 3 proximity threshold is preserved (Python doesn't have explicit await sequencing as an N+1 indicator, so it warrants a stronger signal than TS's >= 2).
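
A rough sketch of how the same bucketing might combine with the `>= 3` cluster threshold. The PR text does not define "proximity", so modelling it as a maximum line gap (`maxGap`) is purely an assumption, as are the type and function names below.

```typescript
// Hypothetical shape for a Python call site as seen by the TS-side scanner.
interface PyCallSketch {
  providerKey: string;
  enclosingFunction: string | null;
  line: number;
}

// ASSUMPTION: "proximity" means each call sits within maxGap lines of the previous one.
function clusterByProximity(lines: number[], maxGap: number): number[][] {
  const sorted = Array.from(new Set(lines)).sort((a, b) => a - b);
  const clusters: number[][] = [];
  let current: number[] = [];
  let last = Number.NEGATIVE_INFINITY;
  for (const line of sorted) {
    if (current.length > 0 && line - last <= maxGap) {
      current.push(line);
    } else {
      current = [line];
      clusters.push(current);
    }
    last = line;
  }
  return clusters;
}

// Returns the line clusters that would be flagged as sequential batching candidates.
function pySequentialClusters(calls: PyCallSketch[], maxGap = 10, minCluster = 3): number[][] {
  const buckets = new Map<string, { providerKey: string; lines: number[] }>();
  for (const call of calls) {
    const key = JSON.stringify([call.providerKey, call.enclosingFunction ?? "<module>"]);
    const bucket = buckets.get(key) ?? { providerKey: call.providerKey, lines: [] };
    bucket.lines.push(call.line);
    buckets.set(key, bucket);
  }
  const flagged: number[][] = [];
  for (const bucket of Array.from(buckets.values())) {
    for (const cluster of clusterByProximity(bucket.lines, maxGap)) {
      // Python keeps the stricter >= 3 threshold described above.
      if (cluster.length >= minCluster) flagged.push(cluster);
    }
  }
  return flagged;
}
```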

Measurement (corpus v1, 7 fixtures)

| Metric | Baseline | After PR-3 | Δ |
| --- | --- | --- | --- |
| Detection precision | 36.26% | 36.26% | +0.00pp |
| Detection recall | 48.53% | 48.53% | +0.00pp |
| Provider attribution | 82.14% | 82.14% | +0.00pp |
| Finding precision | 9.09% | 33.33% | +24.24pp |
| Finding recall | 33.33% | 33.33% | +0.00pp |

Per-detector:

| Detector | Before (TP/FP/FN) | After (TP/FP/FN) | Note |
| --- | --- | --- | --- |
| `n_plus_one` | 1 / 0 / 0 | 1 / 0 / 0 | unchanged |
| `cache` | 0 / 0 / 0 | 0 / 0 / 0 | unchanged (PR-2 already cleared) |
| `batch` | 0 / 9 / 1 | 0 / 1 / 1 | TS detector: 7 → 1 FP. Python detector: 2 → 0 FP. |
| `rate_limit` | 0 / 1 / 0 | 0 / 1 / 0 | unchanged |
| `unbatched_parallel` | 0 / 0 / 1 | 0 / 0 / 1 | unchanged |
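
As a cross-check, the global finding precision and recall can be reproduced from the per-detector counts. The sketch below assumes the global figures are pooled (micro-averaged) across detectors, which is consistent with the reported numbers; the helper names are illustrative and not taken from the benchmark code.

```typescript
interface Counts {
  tp: number;
  fp: number;
  fn: number;
}

// Per-detector counts copied from the table above.
const beforeCounts: Counts[] = [
  { tp: 1, fp: 0, fn: 0 }, // n_plus_one
  { tp: 0, fp: 0, fn: 0 }, // cache
  { tp: 0, fp: 9, fn: 1 }, // batch
  { tp: 0, fp: 1, fn: 0 }, // rate_limit
  { tp: 0, fp: 0, fn: 1 }, // unbatched_parallel
];

const afterCounts: Counts[] = [
  { tp: 1, fp: 0, fn: 0 }, // n_plus_one
  { tp: 0, fp: 0, fn: 0 }, // cache
  { tp: 0, fp: 1, fn: 1 }, // batch
  { tp: 0, fp: 1, fn: 0 }, // rate_limit
  { tp: 0, fp: 0, fn: 1 }, // unbatched_parallel
];

function sumCounts(rows: Counts[]): Counts {
  return rows.reduce(
    (acc, row) => ({ tp: acc.tp + row.tp, fp: acc.fp + row.fp, fn: acc.fn + row.fn }),
    { tp: 0, fp: 0, fn: 0 },
  );
}

function asPercent(numerator: number, denominator: number): string {
  return denominator === 0 ? "n/a" : `${((100 * numerator) / denominator).toFixed(2)}%`;
}

// Pooled precision: TP / (TP + FP) over all detectors.
function findingPrecision(rows: Counts[]): string {
  const total = sumCounts(rows);
  return asPercent(total.tp, total.tp + total.fp);
}

// Pooled recall: TP / (TP + FN) over all detectors.
function findingRecall(rows: Counts[]): string {
  const total = sumCounts(rows);
  return asPercent(total.tp, total.tp + total.fn);
}

console.log(findingPrecision(beforeCounts), findingPrecision(afterCounts)); // 9.09% 33.33%
console.log(findingRecall(beforeCounts), findingRecall(afterCounts)); // 33.33% 33.33%
```

Before the change the pooled counts are 1 TP and 10 FP (1/11); afterwards they are 1 TP and 2 FP (1/3). Recall stays at 1/3 because the two FNs are untouched.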

The remaining batch FP is bedrock-raw-fetch/src/index.ts:5 — two sequential await handleApi(...) calls in the same main() function (lines 5 and 11). Arguably a true positive (Promise.all would parallelize them) that the corpus didn't label. Fixing it cleanly requires either control-flow-branch awareness or a corpus annotation update; out of scope for this PR.
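
For context, the pattern the detector still flags looks roughly like the first function below, and the `Promise.all` rewrite like the second. `handleApiSketch` is a hypothetical stub standing in for the fixture's `handleApi`, and the rewrite is only valid when the two calls do not depend on each other's results.

```typescript
// Hypothetical stub; the real handleApi resolves through provider wrappers.
async function handleApiSketch(prompt: string): Promise<string> {
  await Promise.resolve(); // stands in for network latency
  return `reply:${prompt}`;
}

// Sequential: the second request does not start until the first has resolved.
async function sequentialMain(): Promise<string[]> {
  const first = await handleApiSketch("a");
  const second = await handleApiSketch("b");
  return [first, second];
}

// Parallel: both requests are started before either is awaited.
// Promise.all preserves input order in its result array.
async function parallelMain(): Promise<string[]> {
  return Promise.all([handleApiSketch("a"), handleApiSketch("b")]);
}
```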

Sample size for batch (TP+FP = 1) is below the per-type gate's ≥3 threshold, so any future regression will need 3+ emissions before tripping the gate. This is an intentional gate design choice (PR-1) to avoid noise on small samples.
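
The skip behaviour can be sketched as below. Only the minimum-sample rule (TP+FP of at least 3) comes from this description; the fail condition, a drop in per-type precision against the baseline, is an assumption made for illustration, since PR-1's actual rule is not restated here.

```typescript
interface DetectorCounts {
  tp: number;
  fp: number;
  fn: number;
}

type GateResult = "pass" | "fail" | "skipped";

function precisionOf(counts: DetectorCounts): number {
  const emitted = counts.tp + counts.fp;
  return emitted === 0 ? 0 : counts.tp / emitted;
}

// Skips detectors whose current emission count (TP + FP) is below minSample.
// ASSUMPTION: "fail" means per-type precision dropped relative to the baseline.
function perTypeGate(
  baseline: DetectorCounts,
  current: DetectorCounts,
  minSample = 3,
): GateResult {
  if (current.tp + current.fp < minSample) return "skipped";
  return precisionOf(current) < precisionOf(baseline) ? "fail" : "pass";
}
```

Under these assumptions, the post-PR batch counts (0 TP, 1 FP) are skipped, and a regression only becomes visible to the gate once the detector emits three or more findings.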

Tests

4 new TDD-style fixtures + tests in src/test/c1-pr3-batch-tightening.test.ts and src/test/fixtures/c1-pr3/:

| # | Fixture | Direction | Asserts |
| --- | --- | --- | --- |
| 1 | `ts_diff_functions.ts` | negative | two openai calls in two different functions → 0 batch findings |
| 2 | `ts_same_function.ts` | positive (recall) | two openai calls in same function → ≥1 batch finding |
| 3 | `py_diff_functions.py` | negative | three anthropic calls in three different functions → 0 batch findings |
| 4 | `py_same_function.py` | positive (recall) | three anthropic calls in same function → ≥1 batch finding |

All 357 tests pass.

Acceptance criteria for issue #83 (part 3)

- TS detectSequential() emits ZERO findings on answer.ts, bedrock-client.ts, raw-fetch-client.ts, summarize.ts, tts-service.ts
- Python detectSequentialBatching() emits ZERO findings on anthropic_helper.py and chat_completions_basic.py
- Synthetic positive fixtures still produce batch findings (recall preserved)
- npm test passes (357/357)
- npm run benchmark exits 0; per-type gate doesn't fail
- Global findingPrecision rises (9.09% → 33.33%)
- benchmark/baseline.json updated; batch collapses to TP=0/FP=1/FN=1

Out of scope (next steps)

- Cross-file resolver line-dedup at the resolver level rather than the detector level. This would also handle the routes/api.ts-style mutually-exclusive-branch pattern if branches are followed.
- rate_limit detector: defer until the corpus has rate-limit positive cases.
- C2 finding dedupe (#84, "[Findings] Proper dedupe of AI + local-rule findings (C2)"): depends on B3 stable IDs.
- C3 confidence + derived severity (#85, "[Findings] Confidence everywhere; severity derived from signals (C3)"): depends on C1 being fully calibrated. With PR-3 merged, C1 is effectively done modulo the 1 borderline FP.

Notes for reviewers

- enclosingFunction is string | null on AstCallMatch. Module-level calls (null) bucket to "<module>" rather than getting their own per-call buckets. This is the desired behavior, since two top-level awaits in a script DO live on the same execution path.
- The line-dedup change reduces the displayed line count in the evidence string; existing ast-batch-detector.test.ts synthetic tests use enclosingFunction: null (module-level default) and don't assert on the exact line list, so they pass unchanged.
- Bucket Map structure: Map<string, {provider: string; matches: AstCallMatch[]}> instead of Map<string, AstCallMatch[]> + key-split-on-"::". The structural pattern is what PR-2 review settled on for similar code; reused here.
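
To make the delimiter footgun concrete, the sketch below contrasts the two Map shapes. The provider id `aws::bedrock` is a made-up example of an id containing the separator, and `KeyedMatch` is a hypothetical stand-in for `AstCallMatch`.

```typescript
// Hypothetical stand-in for AstCallMatch.
interface KeyedMatch {
  provider: string;
  enclosingFunction: string | null;
  line: number;
}

function compositeKey(m: KeyedMatch): string {
  return `${m.provider}::${m.enclosingFunction ?? "<module>"}`;
}

// Fragile shape: the provider is recovered by splitting the key on "::".
function providersViaKeySplit(matches: KeyedMatch[]): string[] {
  const buckets = new Map<string, KeyedMatch[]>();
  for (const m of matches) {
    const key = compositeKey(m);
    const list = buckets.get(key) ?? [];
    list.push(m);
    buckets.set(key, list);
  }
  // Everything before the first "::" is treated as the provider id.
  return Array.from(buckets.keys()).map((key) => key.slice(0, key.indexOf("::")));
}

// Struct-field shape: the provider is stored next to the matches and never parsed.
function providersViaStructField(matches: KeyedMatch[]): string[] {
  const buckets = new Map<string, { provider: string; matches: KeyedMatch[] }>();
  for (const m of matches) {
    const key = compositeKey(m);
    const bucket = buckets.get(key) ?? { provider: m.provider, matches: [] };
    bucket.matches.push(m);
    buckets.set(key, bucket);
  }
  return Array.from(buckets.values()).map((bucket) => bucket.provider);
}

const tricky: KeyedMatch[] = [{ provider: "aws::bedrock", enclosingFunction: "main", line: 5 }];
console.log(providersViaKeySplit(tricky)); // [ 'aws' ]
console.log(providersViaStructField(tricky)); // [ 'aws::bedrock' ]
```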

Test plan

- npm test: 357 PASS
- npm run benchmark: exit 0, no regressions
- Manual CLI scan against each corpus fixture: confirmed only bedrock-raw-fetch/src/index.ts:5 still emits

🤖 Generated with Claude Code

Eliminates 8 of 9 corpus batch false positives by gating both the TS AST sequential-batching detector and the Python detectSequentialBatching on enclosing function. Calls in different functions of the same file execute on independent code paths and cannot be batched together.

TypeScript (src/ast/waste/batch-detector.ts):
- detectSequential() now buckets by (provider, enclosingFunction) instead of provider alone. Module-level calls bucket to "<module>" so they cluster only with other module-level calls.
- Within a bucket, dedupe by line: cross-file resolver expansion can produce multiple AstCallMatch entries at the same source line for one user-written call (e.g., bedrock-raw-fetch/src/index.ts where one await handleApi(...) resolves through 2 internal wrapper functions).
- Bucket Map carries the provider as a struct field rather than encoding it in the key string, which avoids a "::" delimiter footgun if a future provider id contains the separator.
- Evidence string lists unique sorted lines.

Python (src/scanner/python-waste-detector.ts):
- detectSequentialBatching() applies the same (providerKey, enclosingFunction) bucketing. The cluster.length >= 3 proximity threshold is preserved (Python doesn't have explicit await sequencing as a clear N+1 indicator, so it warrants a stronger signal).

Measurement (benchmark/baseline.json, docs/accuracy/findings.md):
- batch: 9 FP / 0 TP / 1 FN → 1 FP / 0 TP / 1 FN
- finding precision: 9.09% → 33.33% globally (+24.24pp)
- All other per-detector metrics unchanged
- No detection or recall regressions
- Sample size for batch falls below the per-type gate's >=3 threshold, so the gate skips it (next regression on batch will need 3+ emissions before failing the build)

Tests (src/test/c1-pr3-batch-tightening.test.ts, fixtures/c1-pr3/):
- 4 new test cases pinning both regression directions for both languages: cross-function calls (-) and same-function calls (+).
- Total: 357 PASS.

Remaining 1 batch FP is bedrock-raw-fetch/src/index.ts:5: two sequential `await handleApi(...)` calls in `main()`. Arguably a true positive (Promise.all would parallelize them) that the corpus didn't label. Fixing it cleanly requires either control-flow-branch awareness in the detector or a corpus annotation update; both are scoped to a follow-up PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-13T23:59:33Z

Warning

Rate limit exceeded

@AndresL230 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 4 minutes and 34 seconds before requesting another review.

📥 Commits

Reviewing files that changed from the base of the PR and between 3b13ffe and dd657aa.

📒 Files selected for processing (11)

benchmark/baseline.json
docs/accuracy/findings.md
docs/superpowers/plans/2026-05-13-c1-pr3-batch-detector-tightening.md
package.json
src/ast/waste/batch-detector.ts
src/scanner/python-waste-detector.ts
src/test/c1-pr3-batch-tightening.test.ts
src/test/fixtures/c1-pr3/py_diff_functions.py
src/test/fixtures/c1-pr3/py_same_function.py
src/test/fixtures/c1-pr3/ts_diff_functions.ts
src/test/fixtures/c1-pr3/ts_same_function.ts

AndresL230 merged commit 08eb32f into main May 14, 2026
3 checks passed

AndresL230 mentioned this pull request May 15, 2026

[Findings] Tighten CACHE_GUARD / BATCH_GUARD bare-word leak #112

Open

4 tasks

AndresL230 deleted the claude/c1-pr3-batch-detector branch May 22, 2026 22:50
