Cross-file INFERRED calls resolution lacks import evidence + ambiguous-candidate handling — short common names blow up god_nodes ranking #543

@AlphaMinusOne

Summary

The current cross-file calls resolution (extract.py around L3345-L3374) builds a global name map keyed by label.strip("()").lstrip(".").lower(), then for each unresolved call it adds an INFERRED calls edge to any node whose label matches that key. There's no import-evidence check, no multi-candidate disambiguation, and confidence is hard-coded to 0.8.
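
To make the failure mode concrete, here is a minimal sketch of the behaviour described above. It is not the actual extract.py code: the function names, the node and edge dict shapes, and the same-file skip are my assumptions for illustration. Only the normalization key and the hard-coded 0.8 come from the source.

```python
from collections import defaultdict


def normalize(label: str) -> str:
    # Key described above: drop surrounding parens, leading dots, then lowercase.
    return label.strip("()").lstrip(".").lower()


def resolve_globally(nodes, unresolved_calls):
    """Hypothetical stand-in for the resolution pass, not the real extract.py code.

    nodes: dicts with "id", "label", "file"
    unresolved_calls: tuples of (caller_id, caller_file, callee_name)
    """
    name_map = defaultdict(list)
    for node in nodes:
        name_map[normalize(node["label"])].append(node)

    edges = []
    for caller_id, caller_file, callee in unresolved_calls:
        for target in name_map.get(normalize(callee), []):
            if target["file"] == caller_file:
                continue  # assumption: same-file calls are resolved elsewhere
            edges.append({
                "source": caller_id,
                "target": target["id"],
                "relation": "calls",
                "confidence": "INFERRED",
                "confidence_score": 0.8,  # no import check, no disambiguation
            })
    return edges


# A console.log-style call in an unrelated script lands on the local helper.
nodes = [{"id": "sync_log", "label": "log()", "file": "openclaw/sync.mjs"}]
calls = [("check", "scripts/check-cipl-data.ts", ".log")]
edges = resolve_globally(nodes, calls)
```

Because both "log()" and ".log" normalize to "log", the unexported helper picks up an edge from a file that cannot call it.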

For corpora with common short names (log, execute, mutate, replace, find, optional, check, main), this systematically inflates the degree of harmless local helpers until they become the top god_nodes — even though the true call graph never reaches them.

god_nodes() itself already filters file-level hubs / method stubs / isolated stubs (_is_file_node, _is_concept_node), but it does not consult edge confidence or import evidence, so polluted nodes pass straight through.

This is not the same as the calibration issue in #540 — recalibrating the 0.8 score won't change degree, and god_nodes ranks by raw degree.

Reproducer (real corpus, 2026-04-25)

TypeScript ERP repo, ~3080 nodes, 4218 edges, refreshed via /graphify-refresh. Top 8 god_nodes:

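| Node | Total degree | Cross-file INFERRED calls | Same-file edges | EXTRACTED edges |
|---|---|---|---|---|
| log() | 150 | 144 | 1 | 6 |
| execute() | 80 | 79 | 1 | 1 |
| createRawMcpServer() | 78 | 76 | 2 | 2 |
| registerAllOriginalTools() | 76 | 74 | 2 | 2 |
| optional() | 69 | 67 | 2 | 2 |
| mutate() | 66 | 64 | 2 | 2 |
| replace() | 49 | 46 | 3 | 3 |
| find() | 30 | 28 | 1 | 2 |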

Concrete example: log() is a one-line arrow function at openclaw/sync.mjs:99:

const log = (msg) => process.stdout.write(`[sync] ${msg}\n`);

It has no export. It cannot be called by scripts/check-cipl-data.ts:check() or any of the other 130 distinct files the graph claims call it. All 144 cross-file INFERRED edges are spurious — the global name map matched log against console.log-style usage in unrelated scripts.

Why "audit, not cleanse" doesn't fully cover this

I read worked/httpx/review.md and the README; I understand the design philosophy is to expose INFERRED edges with verification questions rather than auto-clean them. That's a sound default. But:

  1. god_nodes is the canonical entry point for human reviewers. When the top ranks are systematically polluted, the ranking actively misleads.
  2. The Suggested Questions audit hint ("Are the 144 inferred relationships involving log() actually correct?") is honest but expensive to act on — a reviewer would need to verify each of 144 edges manually.
  3. The pollution scales with corpus size and naming entropy, not extraction quality.

Proposed directions (any one would help)

(a) Disambiguation at extraction time — when a call name resolves to multiple candidates with no import evidence, drop confidence to 0.4–0.5 instead of 0.8, or skip the edge.

(b) Import-evidence gate — only emit cross-file calls if from <other_file> import <name> (or analog) appears in the calling file. Use AST imports already collected upstream.

(c) Trusted-degree ranking in god_nodes — leave graph.json untouched (preserves audit), but compute god_nodes ranking with a trusted_degree that gives zero weight to cross-file INFERRED calls lacking import evidence. Add an audit section listing pollution candidates separately.

(d) Threshold by candidate count — if a name appears as a node in N+ files, refuse to globally resolve to any of them; require explicit import.
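
For discussion, (a), (b), and (d) could share one decision function at resolution time. The sketch below is only that, a sketch: the function name, the tuple shapes, the `imports` mapping, the `max_files` default, and the evidence labels are all hypothetical, and it assumes the upstream AST imports can already be resolved to a source file.

```python
def resolve_with_evidence(caller_file, callee, candidates, imports,
                          max_files=3, require_import=False):
    """Hypothetical gate combining (a), (b), and (d); all shapes are assumptions.

    candidates: list of (node_id, file) whose normalized label equals callee
    imports: dict mapping file -> set of (source_file, imported_name)
    Returns a list of (node_id, confidence_score, evidence).
    """
    cross_file = [(nid, f) for nid, f in candidates if f != caller_file]
    seen = imports.get(caller_file, set())

    # (b) import evidence: the caller imports this name from the candidate's file.
    backed = [(nid, f) for nid, f in cross_file if (f, callee) in seen]
    if backed:
        return [(nid, 0.8, "import") for nid, _ in backed]
    if require_import:
        return []

    # (d) too many files define the name: refuse to resolve globally.
    if len({f for _, f in cross_file}) >= max_files:
        return []

    # (a) several candidates, no evidence: keep the edges but lower confidence.
    if len(cross_file) > 1:
        return [(nid, 0.4, "ambiguous-name") for nid, _ in cross_file]

    # Single name-only match: same score as today, but labelled for audit.
    return [(nid, 0.8, "name-only") for nid, _ in cross_file]


caller = "scripts/check-cipl-data.ts"
candidates = [("sync_log", "openclaw/sync.mjs")]
strict = resolve_with_evidence(caller, "log", candidates, {}, require_import=True)
loose = resolve_with_evidence(caller, "log", candidates, {})
```

Note that the log() case has a single candidate, so (a) and (d) alone leave it at 0.8 (`loose`). Only the strict import gate from (b) drops it (`strict` is empty).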

(c) is what I'm planning to apply locally as a postprocess-god-nodes step, but the right home for (a)/(b) is upstream extract.py.
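
Roughly, the postprocess for (c) might look like the following. The edge field names and the `has_import_evidence` callback are assumptions of mine, not the graph.json schema, and the file names in the usage lines are made up.

```python
from collections import Counter


def trusted_degree(edges, has_import_evidence):
    """Hypothetical postprocess for (c); edge fields are assumed, not graph.json's schema.

    edges: dicts with "source", "target", "relation", "confidence",
           "source_file", "target_file"
    has_import_evidence: callable(edge) -> bool
    Returns (trusted, suspect) Counters keyed by node id.
    """
    trusted, suspect = Counter(), Counter()
    for e in edges:
        unbacked = (
            e["relation"] == "calls"
            and e["confidence"] == "INFERRED"
            and e["source_file"] != e["target_file"]
            and not has_import_evidence(e)
        )
        # Unbacked cross-file INFERRED calls get zero weight in the ranking
        # but are still counted separately so they can be listed for audit.
        bucket = suspect if unbacked else trusted
        bucket[e["source"]] += 1
        bucket[e["target"]] += 1
    return trusted, suspect


def edge(src, dst, conf, src_file, dst_file):
    return {"source": src, "target": dst, "relation": "calls",
            "confidence": conf, "source_file": src_file, "target_file": dst_file}


edges = [edge(f"caller{i}", "log", "INFERRED", f"scripts/s{i}.ts", "openclaw/sync.mjs")
         for i in range(5)]
edges += [edge("main", "run", "EXTRACTED", "app.ts", "app.ts"),
          edge("cli", "run", "EXTRACTED", "cli.ts", "app.ts")]

trusted, suspect = trusted_degree(edges, lambda e: False)
ranking = [node for node, _ in trusted.most_common()]
```

Here `ranking` is led by run, while log has a trusted degree of 0 and shows up only in `suspect`, which would feed the separate pollution-candidates section. graph.json itself is never modified.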

Related

Happy to send a PR for (c) if there's interest in landing it upstream rather than each consumer reinventing it.
