Call-edge provenance mislabels CodeQL-resolved edges as "jedi"

## Summary

`jedi_call_graph_edges()` hardcodes `provenance=["jedi"]` on every edge it emits, regardless of which resolver actually filled the underlying `PyCallsite.callee_signature`. As a result, edges that were only resolvable because of CodeQL (or the constructor heuristic) are attributed to Jedi. CodeQL's true contribution to the call graph is systematically under-reported.

Observed at commit `7392fed`.

## Pipeline (as built in `core.py:379-398`)

0. `_build_symbol_table()` (`core.py:379`) — `SymbolTableBuilder` runs **Jedi** per file (`syntactic_analysis/symbol_table_builder.py:9-11,119`) and writes `callee_signature` into each `PyCallsite` during symbol-table construction.
1. `_get_call_graph(symbol_table, augment_sites=True)` (`core.py:395`) — **CodeQL** runs, filling `callee_signature` **in-place** for sites Jedi left empty, and separately emits an explicit edge list with `provenance=["codeql"]` (`semantic_analysis/codeql/codeql_analysis.py:321`).
2. `resolve_unresolved_constructors()` (`core.py:396`) — heuristic pass fills more `callee_signature`s in-place.
3. `jedi_call_graph_edges(symbol_table)` (`core.py:397`) — **reader only**: emits an edge for every site with a non-empty `callee_signature` (`semantic_analysis/call_graph.py:181-183`) and unconditionally tags it `provenance=["jedi"]` (`semantic_analysis/call_graph.py:186`).
4. `merge_edges(jedi_edges, codeql_edges)` (`core.py:398`) — unions provenance for shared `(source, target)` (`semantic_analysis/call_graph.py:263`).

## Root cause

Step 3 derives edges from a symbol table whose `callee_signature`s were filled by **three** different mechanisms (Jedi in step 0, CodeQL in step 1, constructor heuristic in step 2), but stamps all of them `["jedi"]`. An edge only surfaces `codeql` in its provenance if CodeQL *also* emitted it as a standalone object in step 1's `codeql_edges` list, which step 4 then unions in. CodeQL contributions that manifest *only* as in-place `callee_signature` fills are mislabeled `["jedi"]`.

This makes the `provenance` field answer "which backend emitted this exact edge object" rather than the intended "which backend's resolution made this edge possible."
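The mislabeling can be reproduced in isolation with a toy model. The sketch below is not the project's code: `Site`, `reader_edges`, and `merge` are hypothetical stand-ins for `PyCallsite`, `jedi_call_graph_edges`, and `merge_edges`, and the dict-shaped edges are an assumption made for illustration. It only mirrors the behaviour described in steps 0 to 4.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    """Hypothetical stand-in for PyCallsite; only the fields needed here."""
    caller: str
    callee_signature: Optional[str] = None


def reader_edges(sites):
    # Mirrors step 3 as described: every filled site is tagged "jedi",
    # no matter which pass filled it.
    return [
        {"source": s.caller, "target": s.callee_signature, "provenance": ["jedi"]}
        for s in sites
        if s.callee_signature
    ]


def merge(*edge_lists):
    # Mirrors step 4 as described: union provenance per (source, target).
    merged = {}
    for edges in edge_lists:
        for e in edges:
            key = (e["source"], e["target"])
            merged.setdefault(key, set()).update(e["provenance"])
    return {key: sorted(tags) for key, tags in merged.items()}


sites = [Site("a.f"), Site("a.g"), Site("a.h")]
sites[0].callee_signature = "b.x"           # step 0: filled by Jedi
sites[1].callee_signature = "b.y"           # step 1: filled in place by CodeQL
codeql_edges = []                           # assume CodeQL emitted no standalone edge
sites[2].callee_signature = "b.Z.__init__"  # step 2: filled by the heuristic

result = merge(reader_edges(sites), codeql_edges)
for key, tags in result.items():
    print(key, tags)
```

All three edges come out tagged `["jedi"]`, including the two that Jedi never resolved.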

## Reproduction

Running call-graph analysis with `using_codeql=True` over the `codellm-devkit/python-sdk` `cldk/` package (618 edges):

| provenance | edges |
|---|---:|
| `["jedi"]` | 532 |
| `["codeql"]` | 80 |
| `["codeql","jedi"]` | 6 |

The 86 `codeql`-tagged edges (80 + 6) are a **lower bound** on CodeQL's real contribution; an unknown share of the 532 `["jedi"]` edges is resolvable only because CodeQL (step 1) or the constructor heuristic (step 2) filled their `callee_signature`.
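A tally like the table above can be produced with a few lines. This is a minimal sketch that assumes each edge is a dict with a `provenance` list; the real edge type may differ, and `tally_provenance` is a hypothetical helper. The input here is synthetic, built only to match the reported counts.

```python
from collections import Counter


def tally_provenance(edges):
    """Count edges per provenance set, ignoring the order of tags."""
    counts = Counter()
    for edge in edges:
        counts[tuple(sorted(edge["provenance"]))] += 1
    return counts


# Synthetic edges that reproduce the reported distribution (618 total).
edges = (
    [{"provenance": ["jedi"]}] * 532
    + [{"provenance": ["codeql"]}] * 80
    + [{"provenance": ["jedi", "codeql"]}] * 6
)

counts = tally_provenance(edges)
codeql_tagged = sum(n for tags, n in counts.items() if "codeql" in tags)

for tags, n in sorted(counts.items()):
    print(list(tags), n)
print("codeql-tagged:", codeql_tagged)
```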

## Suggested fix

Track provenance at the point `callee_signature` is set, not at edge-emission time. Options:

- Record the resolving backend on `PyCallsite` when each pass fills `callee_signature` (Jedi in symbol-table build, CodeQL in `augment_sites`, heuristic in `resolve_unresolved_constructors`), and have `jedi_call_graph_edges` read provenance from the site instead of hardcoding `["jedi"]`.
- At minimum, rename `jedi_call_graph_edges` to reflect that it derives edges from the *combined* symbol table, so the hardcoded `["jedi"]` tag isn't mistaken for an actual Jedi attribution.
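The first option could look roughly like the following. This is a sketch under stated assumptions, not a patch: `Site` stands in for `PyCallsite`, and the `resolved_by` field, the `fill` helper, and `edges_from_sites` are hypothetical names that do not exist in the codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    """Hypothetical stand-in for PyCallsite."""
    caller: str
    callee_signature: Optional[str] = None
    resolved_by: Optional[str] = None  # hypothetical new field


def fill(site, signature, backend):
    """Set callee_signature only if still empty, recording which pass did it."""
    if site.callee_signature:
        return False  # an earlier pass already resolved this site
    site.callee_signature = signature
    site.resolved_by = backend
    return True


def edges_from_sites(sites):
    # Read provenance from the site instead of hardcoding ["jedi"].
    return [
        {
            "source": s.caller,
            "target": s.callee_signature,
            "provenance": [s.resolved_by or "unknown"],
        }
        for s in sites
        if s.callee_signature
    ]


sites = [Site("a.f"), Site("a.g"), Site("a.h"), Site("a.k")]
fill(sites[0], "b.x", "jedi")                 # symbol-table build
fill(sites[1], "b.y", "codeql")               # augment_sites
overwritten = fill(sites[0], "b.other", "codeql")  # no-op: Jedi got there first
fill(sites[2], "b.Z.__init__", "constructor_heuristic")

edges = edges_from_sites(sites)
for e in edges:
    print(e["source"], "->", e["target"], e["provenance"])
```

Because each pass only fills sites that are still empty, a single `resolved_by` value per site is enough here; `merge_edges` would still union in `codeql` for edges that CodeQL also emits as standalone objects.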

## Impact

Any downstream consumer using `provenance` to measure or compare resolver coverage (e.g. the Java/Python parity work in `python-sdk`) gets misleading numbers — CodeQL looks far less effective than it is.
