
fix(extractor): gate member_expression callback args on callee allowlist #974

Merged
carlos-alm merged 2 commits into main from
fix/971-ts-callback-ref-false-positives
Apr 20, 2026

Conversation

@carlos-alm
Contributor

Summary

  • Adds a `CALLBACK_ACCEPTING_CALLEES` allowlist (router/middleware, promises, array iteration, event emitters, timers, commander) in `src/extractors/javascript.ts` and gates the `member_expression` branch of `extractCallbackReferenceCalls` on it. Identifier args remain unchanged.
  • Eliminates the `UserRepository.save@repository.ts -> User.id@types.ts` false positive caused by treating `user.id` in `store.set(user.id, user)` as a callback reference, which was the root cause of the TS resolution precision regression from 100% to 93.8% after PR #947 (feat(js-extractor): resolve named function references passed as arguments).
  • Updates `generated/benchmarks/BUILD-BENCHMARKS.md` with restored 3.9.4 TS resolution metrics (15 TP, 0 FP → precision 1.0).
  • Adds two tests in `tests/parsers/javascript.test.ts`:
    • negative: `store.set(user.id, user)` must not emit `id` dynamic call
    • positive: `app.use(auth.validate)` and `promise.then(handlers.onSuccess)` still emit member-expr dynamic calls

Fixes #971.

Why option 3 (allowlist)

  • Option 1 (identifier-only): regresses `app.use(auth.validate)` — whole member-expr callback class lost.
  • Option 2 (≤1-arg heuristic): regresses `addEventListener('click', h)`, `setTimeout(fn, 100)`.
  • Option 3 (allowlist): precise match to real callback APIs; no collateral damage. Chosen.
  • Option 4 (resolver-side): hides over-extraction instead of preventing it.
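
The allowlist gate described above can be sketched roughly as follows. This is a simplified model and not the PR's code: the `CallSite` and `Arg` shapes are hypothetical stand-ins for tree-sitter nodes, and the set shown is an abbreviated, illustrative subset of `CALLBACK_ACCEPTING_CALLEES`.

```typescript
// Hypothetical, simplified model of a call site; the real extractor walks tree-sitter nodes.
type Arg =
  | { kind: 'identifier'; name: string }
  | { kind: 'member_expression'; receiver: string; property: string }
  | { kind: 'other' };

interface CallSite {
  calleeName: string | null; // 'set' for store.set(...), 'use' for app.use(...)
  args: Arg[];
}

interface DynamicCall {
  name: string;
  receiver?: string;
}

// Abbreviated, illustrative subset; the set in the PR covers more APIs.
const CALLBACK_ACCEPTING_CALLEES: ReadonlySet<string> = new Set([
  'use', 'then', 'catch', 'map', 'forEach', 'on', 'addEventListener', 'setTimeout',
]);

function callbackReferences(call: CallSite): DynamicCall[] {
  // Member-expression args count as callbacks only for allowlisted callees.
  const memberExprArgsAllowed =
    call.calleeName !== null && CALLBACK_ACCEPTING_CALLEES.has(call.calleeName);

  const calls: DynamicCall[] = [];
  for (const arg of call.args) {
    if (arg.kind === 'identifier') {
      // Identifier args are unchanged by the fix: always emitted.
      calls.push({ name: arg.name });
    } else if (arg.kind === 'member_expression' && memberExprArgsAllowed) {
      calls.push({ name: arg.property, receiver: arg.receiver });
    }
  }
  return calls;
}

// store.set(user.id, user): 'set' is not allowlisted, so `user.id` is skipped.
const storeSet = callbackReferences({
  calleeName: 'set',
  args: [
    { kind: 'member_expression', receiver: 'user', property: 'id' },
    { kind: 'identifier', name: 'user' },
  ],
});

// app.use(auth.validate): 'use' is allowlisted, so `validate` is emitted with receiver `auth`.
const appUse = callbackReferences({
  calleeName: 'use',
  args: [{ kind: 'member_expression', receiver: 'auth', property: 'validate' }],
});
```

In this model `storeSet` contains only the identifier `user`, while `appUse` contains `validate` with receiver `auth`, which mirrors the negative and positive tests listed in the summary.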

Test plan

  • `tests/benchmarks/regression-guard.test.ts` — 17/17 (was 16/17)
  • TS resolution benchmark — precision 93.8% → 100%
  • `tests/parsers/javascript.test.ts` — 47/47 (+2 new)
  • `npm run lint` — clean
  • CI — verify full suite
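
As a quick check on the precision figures: the PR reports 15 true positives and 0 false positives after the fix. The sketch below assumes the regressed run had the same 15 true positives plus the single false-positive edge described in the summary; that count of 1 is an inference from the 93.8% figure, not a number stated in the PR.

```typescript
// Precision = TP / (TP + FP).
function precision(truePositives: number, falsePositives: number): number {
  return truePositives / (truePositives + falsePositives);
}

// Assumed regressed state: 15 TP plus the one `User.id` false positive.
const beforeFix = precision(15, 1); // 0.9375, reported as 93.8%

// Reported state after the fix: 15 TP, 0 FP.
const afterFix = precision(15, 0); // 1
```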

Remaining concerns

PR #947's extractCallbackReferenceCalls emitted a dynamic call edge for every
identifier or member_expression argument of every call expression. That was
correct for real callbacks (router.use, promise.then, items.map) but produced
false positives for plain property reads passed as data, e.g.
`store.set(user.id, user)` — the `user.id` arg wrongly resolved to
`User.id@types.ts`, dropping TS resolution precision from 100% → 93.8% and
tripping the regression guard.

Fix: gate member_expression args on a CALLBACK_ACCEPTING_CALLEES allowlist
covering router/middleware, promises, array methods, event emitters, and
scheduling APIs. Identifier args remain unchanged. Adds positive and
negative tests; updates the 3.9.4 resolution benchmark to reflect
restored precision.

Fixes #971

Impact: 2 functions changed, 7 affected
@claude

claude bot commented Apr 20, 2026

Claude finished @carlos-alm's task


Reviewing PR: fix(extractor): gate member_expression callback args on callee allowlist

Review Checklist:

  • Phase 1: Root cause analysis - understand the false positive problem
  • Phase 2: Configuration change detection - check for rule relaxations
  • Phase 3: Test integrity - verify tests are strengthened, not weakened
  • Phase 4: Code quality review - examine extractor logic and test coverage
  • Phase 5: Backlog compliance - rate against framework criteria
  • Final assessment and recommendation

Root Cause Analysis

Problem Being Solved: False positive in TypeScript callback reference extraction where user.id in store.set(user.id, user) was incorrectly treated as a callback reference to id, causing TypeScript resolution precision to drop from 100% to 93.8% after PR #947.

Why This Approach: The allowlist approach (option 3) is well-justified:

  • Option 1 (identifier-only) would lose legitimate member expressions like app.use(auth.validate)
  • Option 2 (≤1-arg heuristic) would break common patterns like addEventListener('click', handler), setTimeout(fn, 100)
  • Option 4 (resolver-side filtering) would hide over-extraction rather than prevent it
  • The allowlist is precise and targets actual callback-accepting APIs

Risk Assessment: Low risk. The change is surgical and well-scoped, with comprehensive test coverage for both negative and positive cases.

Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies
  • Foundation-aligned: ✓ — Improves accuracy of local extraction, no cloud dependencies
  • Problem-fit: 5 — Directly addresses core problem of AI agents getting false dependency information
  • Breaking: No — Pure accuracy improvement, no API changes
  • Tier: 1

Code Quality Assessment

Strengths:

  • Surgical fix: The CALLBACK_ACCEPTING_CALLEES allowlist is comprehensive and well-categorized (Express/router, Promises, Array iteration, Event emitters, Timers, CLI frameworks)
  • Excellent documentation: Clear comments explaining the rationale and examples
  • Strong test coverage: Two new tests cover both the negative case (store.set(user.id, user) should not emit id call) and positive regression guards (app.use(auth.validate), promise.then(handlers.onSuccess) should still work)
  • Measurable improvement: TypeScript resolution precision restored from 93.8% to 100% (15 TP, 0 FP)
  • Conservative allowlist: Easy to extend when new callback APIs are encountered

Technical correctness:

  • extractCalleeName correctly extracts method names from member_expression callees
  • The gating logic in extractCallbackReferenceCalls is sound: memberExprArgsAllowed = calleeName !== null && CALLBACK_ACCEPTING_CALLEES.has(calleeName)
  • Identifier args remain unchanged (good - they have lower FP risk)
  • No issues with edge cases or error handling

Test integrity:

  • Tests strengthen verification rather than weakening it
  • Regression guards ensure the fix doesn't break existing functionality
  • Clear test descriptions explaining the intent

Potential concerns:

  • Allowlist maintenance: Future callback APIs will need manual addition, but this is documented and the conservative approach is appropriate
  • Language coverage: Currently JS/TS focused, but the extractor is JS-specific so this is expected

Minor suggestions:

  1. Consider adding a comment in the allowlist about how to add new APIs when encountered
  2. The allowlist could potentially be externalized to config in the future, but current approach is fine for maintainability

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: This is a high-quality, well-tested fix that directly addresses a precision regression with minimal risk. The allowlist approach is sound, the implementation is clean, and the tests provide good coverage. The only reason for 4/5 instead of 5/5 is that it's a targeted bug fix rather than a transformative improvement.

@greptile-apps
Contributor

greptile-apps bot commented Apr 20, 2026

Greptile Summary

This PR adds a CALLBACK_ACCEPTING_CALLEES allowlist and an HTTP_VERB_CALLEES sub-allowlist with a firstArgIsStringLiteral guard to eliminate the false positive where store.set(user.id, user) was treated as a callback invocation of user.id, which regressed TS resolution precision from 100% to 93.8% after PR #947. Previous review concerns — HTTP-verb/Map-API overlap and optional-chaining callee shape — are both addressed in this commit (string-literal guard for HTTP verbs, JSDoc + test confirming optional chaining is transparent in tree-sitter).

Confidence Score: 5/5

Safe to merge — the fix is targeted, well-tested, and restores benchmark precision to 100% with no regressions.

All previous review concerns (HTTP-verb/Map-API overlap, optional-chaining false negative) are resolved in this commit. Five new tests cover negative, positive, HTTP-verb guard, and optional-chain cases. No P0 or P1 issues found in the implementation.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/extractors/javascript.ts | Adds CALLBACK_ACCEPTING_CALLEES and HTTP_VERB_CALLEES sets plus extractCalleeName/firstArgIsStringLiteral helpers; gates member_expression callback emission on the allowlist with a string-literal path guard for HTTP verbs. Logic is correct, well-documented, and the calleeName !== null re-check on line 1426 is redundant but harmless. |
| tests/parsers/javascript.test.ts | Adds five new test cases covering the non-allowlisted negative (store.set), allowlist positive regression guard (app.use, promise.then), HTTP-verb negative (cache.get, repo.put, map.delete), HTTP-verb positive with string path (router.get, app.post with template literal), and optional-chaining callee (emitter?.on). Coverage is comprehensive. |
| generated/benchmarks/BUILD-BENCHMARKS.md | Updates benchmark metrics to reflect restored 100% TS resolution precision (15 TP, 0 FP) after the false-positive fix. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[extractCallbackReferenceCalls] --> B[extractCalleeName]
    B --> C{calleeName in\nCALLBACK_ACCEPTING_CALLEES?}
    C -- No --> D[memberExprArgsAllowed = false]
    C -- Yes --> E{calleeName in\nHTTP_VERB_CALLEES?}
    E -- No --> F[memberExprArgsAllowed = true]
    E -- Yes --> G{firstArgIsStringLiteral?}
    G -- Yes: Express route --> F
    G -- No: Map/cache API --> D
    D --> H[Iterate args]
    F --> H
    H --> I{arg type?}
    I -- identifier --> J[always emit dynamic call]
    I -- member_expression --> K{memberExprArgsAllowed?}
    K -- Yes --> L[emit dynamic call with receiver]
    K -- No --> M[skip — data value, not callback]
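
The decision in the flowchart above can be approximated in a few lines. This is a sketch under assumptions, not the merged implementation: the `FirstArgKind` type is a hypothetical simplification of inspecting the first argument node, the callback allowlist is abbreviated, and treating a template literal as a string-literal path is inferred from the `app.post` template-literal test mentioned in this PR.

```typescript
// HTTP verbs that double as Map/cache/repository method names (list taken from the PR discussion).
const HTTP_VERB_CALLEES: ReadonlySet<string> = new Set([
  'get', 'post', 'put', 'delete', 'patch', 'options', 'head', 'all',
]);

// Abbreviated, illustrative allowlist: 'use' plus the HTTP verbs plus a few other callback APIs.
const CALLBACK_ACCEPTING_CALLEES: ReadonlySet<string> = new Set([
  'use', ...HTTP_VERB_CALLEES, 'then', 'map', 'on', 'setTimeout',
]);

// Hypothetical summary of the first argument's syntactic kind.
type FirstArgKind = 'string' | 'template_string' | 'other' | 'none';

function memberExprArgsAllowed(calleeName: string | null, firstArg: FirstArgKind): boolean {
  if (calleeName === null || !CALLBACK_ACCEPTING_CALLEES.has(calleeName)) {
    return false; // not a known callback-accepting API
  }
  if (HTTP_VERB_CALLEES.has(calleeName)) {
    // Express-style routes start with a path literal; Map/cache calls usually do not.
    return firstArg === 'string' || firstArg === 'template_string';
  }
  return true; // e.g. app.use(handler) needs no path, so 'use' is not path-gated
}

const routerGet = memberExprArgsAllowed('get', 'string'); // router.get('/users/:id', auth.check)
const cacheGet = memberExprArgsAllowed('get', 'other'); // cache.get(user.id)
const appUse = memberExprArgsAllowed('use', 'other'); // app.use(auth.validate)
const storeSet = memberExprArgsAllowed('set', 'other'); // store.set(user.id, user)
```

Here `routerGet` and `appUse` are `true` while `cacheGet` and `storeSet` are `false`. Identifier arguments are outside this check and are emitted regardless.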

Reviews (2): Last reviewed commit: "fix(extractor): require string-literal p..."

Comment on lines +1294 to +1302
'use',
'get',
'post',
'put',
'delete',
'patch',
'options',
'head',
'all',
Contributor

P2 get/HTTP-verb entries conflict with Map/cache patterns

get, post, put, delete, and patch are added for Express routes (router.get('/path', auth.check)), but they also match Map, LRU-cache, and repository APIs: cache.get(user.id) or repo.put(record.key, value) will still emit id / key as a dynamic callback call — exactly the same category of false positive the PR fixes for set. The set method isn't in the allowlist (correct), but its counterpart get is.

Consider whether the HTTP-verb entries should require a string-literal first argument (the route path) as an additional guard, or whether the get FP rate in practice is low enough to be acceptable. As-is, the remaining FP surface for Map-like patterns using these verb names isn't documented in the PR's "remaining concerns" section.

Contributor Author

Fixed in d51265d. Added an HTTP_VERB_CALLEES sub-allowlist gated by a string-literal first-arg check (firstArgIsStringLiteral): HTTP verbs (get, post, put, delete, patch, options, head, all) now only treat member-expr args as callbacks when the first argument is a string literal. This matches the Express route signature app.get('/path', handler) while dropping cache.get(user.id), repo.put(record.key, ...), map.delete(entry.id), etc. Added a negative test (cache.get/repo.put/map.delete must not emit dynamic calls for the property name) and a positive regression guard (router.get('/users/:id', auth.check) and the template-literal form app.post(`/api`, handlers.create) still work). use is intentionally excluded from the sub-allowlist because app.use(handler) without a path is a valid middleware registration.

Comment on lines +1349 to +1358
function extractCalleeName(callNode: TreeSitterNode): string | null {
  const fn = callNode.childForFieldName('function');
  if (!fn) return null;
  if (fn.type === 'identifier') return fn.text;
  if (fn.type === 'member_expression') {
    const prop = fn.childForFieldName('property');
    return prop ? prop.text : null;
  }
  return null;
}
Contributor

P2 Optional-chaining callees silently suppressed

extractCalleeName only handles identifier and member_expression callee shapes, returning null for anything else. Optional-chaining calls (obj?.on(event, handler.fn)) produce an optional_member_expression (or equivalent) in tree-sitter, so extractCalleeName returns null → memberExprArgsAllowed = false → the member-expr arg is silently dropped even though on is in the allowlist.

This is a pre-existing edge case, but the new gating logic makes it a new false-negative class. Worth adding optional_member_expression handling or at least noting it in the JSDoc.

Contributor Author

Verified this is a non-issue and added a regression test in d51265d. In tree-sitter-javascript and tree-sitter-typescript, obj?.on(event, handler.fn) is represented as a member_expression node (with an optional_chain child for the ?. token), not an optional_member_expression — so extractCalleeName already returns on and the allowlist gate works. Confirmed by dumping the AST for both grammars. Added test handles optional-chaining callees in allowlist (obj?.on) (emitter?.on('tick', handlers.fn) emits fn with receiver handlers) to lock in this behavior, plus a JSDoc note on extractCalleeName documenting why optional chaining is transparent here.
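
To make the reply above concrete, the sketch below runs the reviewed `extractCalleeName` logic against a hand-built node that mimics the shape the author reports for `emitter?.on('tick', handlers.fn)`: a `member_expression` callee with a `property` field. `MiniNode` and the `node` helper are hypothetical stand-ins for the tree-sitter binding, so this only illustrates the control flow; it does not itself verify how the grammars parse optional chaining.

```typescript
// Hypothetical minimal stand-in for the subset of the tree-sitter node API used here.
interface MiniNode {
  type: string;
  text: string;
  childForFieldName(field: string): MiniNode | null;
}

// Hypothetical helper that builds a node with named field children.
function node(type: string, text: string, fields: Record<string, MiniNode> = {}): MiniNode {
  return {
    type,
    text,
    childForFieldName: (field) => (field in fields ? fields[field] : null),
  };
}

// Same logic as the function under review, typed against MiniNode.
function extractCalleeName(callNode: MiniNode): string | null {
  const fn = callNode.childForFieldName('function');
  if (!fn) return null;
  if (fn.type === 'identifier') return fn.text;
  if (fn.type === 'member_expression') {
    const prop = fn.childForFieldName('property');
    return prop ? prop.text : null;
  }
  return null;
}

// Callee shaped as the author describes: a member_expression (the `?.` token is a
// child of that node and does not change the node type).
const optionalCallee = node('member_expression', 'emitter?.on', {
  object: node('identifier', 'emitter'),
  property: node('property_identifier', 'on'),
});
const optionalCall = node('call_expression', "emitter?.on('tick', handlers.fn)", {
  function: optionalCallee,
});

const calleeName = extractCalleeName(optionalCall); // 'on'
```

Because the callee type is still `member_expression`, the function returns `on` and the allowlist lookup proceeds as it would for `emitter.on(...)`.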

@github-actions
Contributor

github-actions bot commented Apr 20, 2026

Codegraph Impact Analysis

3 functions changed, 7 callers affected across 1 file

  • extractCalleeName in src/extractors/javascript.ts:1376 (5 transitive callers)
  • firstArgIsStringLiteral in src/extractors/javascript.ts:1392 (5 transitive callers)
  • extractCallbackReferenceCalls in src/extractors/javascript.ts:1420 (6 transitive callers)

…ting (#974)

Addresses Greptile review feedback on PR #974:

- HTTP-verb callees (get/post/put/delete/patch/options/head/all) double as
  Map/cache/repository method names. Require a string-literal first argument
  (Express route path) for member-expr args to be emitted as dynamic calls,
  so `cache.get(user.id)` and `repo.put(record.key, value)` no longer leak
  `id`/`key` as false-positive dynamic calls while `router.get('/path', h)`
  still works.
- Document that optional-chaining callees (`obj?.on(handlers.fn)`) are
  handled transparently: tree-sitter-javascript/typescript represent them
  as `member_expression` with an `optional_chain` child, so the existing
  extraction returns the property name correctly. Add a regression test.
- Tests: three new cases in `tests/parsers/javascript.test.ts`:
  - negative: `cache.get(user.id)`, `repo.put(record.key, ...)`, `map.delete(entry.id)`
  - positive: `router.get('/path', auth.check)`, `app.post(\`/api\`, handlers.create)`
  - optional-chaining: `emitter?.on('tick', handlers.fn)` still emits

All JS parser, regression-guard, and TS/JS resolution benchmarks stay green
(TS precision 1.0, JS precision 1.0).

Impact: 2 functions changed, 7 affected
@carlos-alm
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit fc5bfe9 into main Apr 20, 2026
29 checks passed
@carlos-alm carlos-alm deleted the fix/971-ts-callback-ref-false-positives branch April 20, 2026 03:30
@github-actions github-actions bot locked and limited conversation to collaborators Apr 20, 2026

Development

Successfully merging this pull request may close these issues.

TS resolution precision regression from PR #947: member_expression args mis-extracted as callback refs