
feat(callgraph): C call graph builder #674

Merged
shivasurya merged 1 commit into main from shiva/cpp-c-call-graph
May 3, 2026

Conversation

@shivasurya
Owner

Summary

Adds BuildCCallGraph — a four-pass algorithm that produces a *core.CallGraph for C projects, ready to merge into the unified graph alongside Python/Go.

| Pass | Purpose |
|---|---|
| 1 | Index every C function_definition under "<relpath>::<name>" and ensure the FQN is also in registry.FunctionIndex |
| 2 | Register explicit return types (skipping void) and emit ParameterSymbol entries for every named parameter |
| 3 | Walk parser-emitted edges (function_definition → call_expression) to extract one CallSiteInternal per call, with no second AST traversal |
| 4 | Resolve targets in a definition-preferring order, then emit edges and CallSite records |
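
As a rough illustration of Pass 1, the sketch below builds the "<relpath>::<name>" key and appends it to a name-keyed index only when it is not already present. The helper names (cFQN, addUnique) and the plain map standing in for registry.FunctionIndex are assumptions made for this example, not the builder's actual API.

```go
package main

import "fmt"

// cFQN builds the "<relpath>::<name>" key used to index a C function.
// The helper name is hypothetical.
func cFQN(relPath, name string) string {
	return relPath + "::" + name
}

// addUnique appends fqn to index[name] unless it is already listed,
// so indexing the same function twice leaves the index unchanged.
// The map is a stand-in for registry.FunctionIndex (assumed shape).
func addUnique(index map[string][]string, name, fqn string) {
	for _, existing := range index[name] {
		if existing == fqn {
			return
		}
	}
	index[name] = append(index[name], fqn)
}

func main() {
	index := map[string][]string{}
	addUnique(index, "helper", cFQN("src/a.c", "helper"))
	addUnique(index, "helper", cFQN("src/b.c", "helper"))
	addUnique(index, "helper", cFQN("src/a.c", "helper")) // duplicate, ignored
	fmt.Println(index["helper"])
}
```

Because the file path is part of the key, two file-scope statics that share a name in different .c files end up under distinct FQNs.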

Resolution order (Pass 4)

  1. Same-file definition — common case (helper in same .c); deterministic and independent of include state.
  2. Global definition — scan registry.FunctionIndex[name] for an FQN whose call-graph entry is a definition. Handles cross-.c calls.
  3. Same-file declaration — accept a forward declaration when no definition exists project-wide.
  4. Declaration reachable through #include — last resort so externs handed off to another translation unit still surface as edges.

Calls that don't match any source produce a CallSite{Resolved: false, FailureReason: "external_or_unresolved"}, so stdlib calls (printf, malloc) and unknown function pointers remain visible to rule writers without polluting the edge set.
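
The resolution order can be sketched roughly as follows. The fn type, the resolve function, and the includes map are hypothetical simplifications for illustration; the real builder works against the registry and call-graph nodes rather than these stand-ins.

```go
package main

import "fmt"

// fn is a hypothetical stand-in for an indexed function node.
type fn struct {
	FQN           string
	File          string
	IsDeclaration bool
}

// resolve applies a definition-preferring order:
// same-file definition, global definition, same-file declaration,
// then a declaration in a file reachable through #include.
// includes lists the files the caller's file includes (assumed input).
func resolve(name, callerFile string, byName map[string][]fn, includes map[string]bool) (string, bool) {
	cands := byName[name]
	for _, c := range cands { // 1. same-file definition
		if !c.IsDeclaration && c.File == callerFile {
			return c.FQN, true
		}
	}
	for _, c := range cands { // 2. definition anywhere in the project
		if !c.IsDeclaration {
			return c.FQN, true
		}
	}
	for _, c := range cands { // 3. same-file (forward) declaration
		if c.IsDeclaration && c.File == callerFile {
			return c.FQN, true
		}
	}
	for _, c := range cands { // 4. declaration reachable via #include
		if c.IsDeclaration && includes[c.File] {
			return c.FQN, true
		}
	}
	return "", false // caller records an unresolved call site
}

func main() {
	byName := map[string][]fn{
		"add": {
			{FQN: "include/math.h::add", File: "include/math.h", IsDeclaration: true},
			{FQN: "src/math.c::add", File: "src/math.c"},
		},
		"log_msg": {
			{FQN: "include/log.h::log_msg", File: "include/log.h", IsDeclaration: true},
		},
	}
	includes := map[string]bool{"include/math.h": true, "include/log.h": true}

	for _, name := range []string{"add", "log_msg", "printf"} {
		fqn, ok := resolve(name, "src/main.c", byName, includes)
		fmt.Printf("%s -> %q resolved=%v\n", name, fqn, ok)
	}
}
```

In this sketch, add resolves to the .c definition even though the header declaration appears first in the candidate list, log_msg falls back to the included declaration, and printf stays unresolved.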

Design notes

  • Edges from the parser: every parseCCallExpression adds an edge from the enclosing function to the call node. The builder walks OutgoingEdges of each indexed function instead of doing byte-range containment, keeping Pass 3 deterministic and trivially testable.
  • Definition vs declaration: the parser sets Metadata["is_declaration"]=true on prototype/extern decls. isDeclaration() reads that key with a typed assertion so non-declaration nodes (no metadata) fall through correctly.
  • Recursion: self-edges (process → process) are emitted as-is; the call graph already deduplicates via AddEdge.
  • Static functions: same FQN-by-file mechanism — file-scope statics in different .c files map to disjoint FQNs.
  • Unique FunctionIndex entries: Pass 1 dedupes against the registry's existing FunctionIndex so calling BuildCCallGraph after BuildCModuleRegistry is idempotent.
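
The typed-assertion check from the "Definition vs declaration" note might look roughly like the sketch below. The node type and its fields are assumptions for illustration, and the function body is a guess at the behaviour described, not the code in c_builder.go.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a parsed graph node.
type node struct {
	Name     string
	Metadata map[string]any
}

// isDeclaration reports whether the parser marked n as a prototype or
// extern declaration. The comma-ok assertion returns false, without
// panicking, when Metadata is nil, the key is missing, or the value is
// not a bool.
func isDeclaration(n *node) bool {
	if n == nil {
		return false
	}
	v, ok := n.Metadata["is_declaration"].(bool)
	return ok && v
}

func main() {
	proto := &node{Name: "add", Metadata: map[string]any{"is_declaration": true}}
	def := &node{Name: "add"} // definitions carry no metadata
	odd := &node{Name: "x", Metadata: map[string]any{"is_declaration": "yes"}}

	fmt.Println(isDeclaration(proto), isDeclaration(def), isDeclaration(odd))
}
```

Reading from a nil map is safe in Go, which is why nodes without any metadata fall through to false.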

Test plan

  • go build ./...
  • go test ./... — full suite green
  • go vet ./...
  • golangci-lint run ./graph/callgraph/builder/ — 0 issues
  • Coverage on c_builder.go lines: ~89.6%
  • Spec scenarios covered:
    • Single-file main() → add() edge
    • Cross-file .c definition preferred over .h declaration
    • Header declaration via #include fallback
    • printf (stdlib) recorded as Resolved:false with failure reason
    • Recursive self-call
    • Same name in two .c files (file-scope statics)
    • Type engine populated; void returns dropped
    • Parameters indexed (anonymous params skipped)
    • Declarations skipped from Pass 2 type extraction
    • Merges cleanly into an empty unified graph
    • Non-C nodes ignored (mixed-language safety)
    • Anonymous / missing-file functions filtered
    • Empty-target call_expression produces no edge and no recorded site
    • Cross-.c global lookup works without an #include
    • Same-file forward declaration accepted when no definition exists

Stacked on

shiva/cpp-type-inference (#673)

@shivasurya shivasurya added enhancement New feature or request go Pull requests that update go code labels May 2, 2026
@shivasurya shivasurya self-assigned this May 2, 2026
@safedep

safedep Bot commented May 2, 2026

SafeDep Report Summary


No dependency changes detected. Nothing to scan.


This report is generated by SafeDep Github App

@github-actions

github-actions Bot commented May 2, 2026

Code Pathfinder Security Scan

Pass Critical High Medium Low Info

No security issues detected.

| Metric | Value |
|---|---|
| Files Scanned | 2 |
| Rules | 205 |

Powered by Code Pathfinder

@codecov

codecov Bot commented May 2, 2026

Codecov Report

❌ Patch coverage is 88.35616% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.28%. Comparing base (d3ea40b) to head (d55e6cc).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| sast-engine/graph/callgraph/builder/c_builder.go | 88.35% | 11 Missing and 6 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #674      +/-   ##
==========================================
- Coverage   85.28%   85.28%   -0.01%     
==========================================
  Files         182      183       +1     
  Lines       26399    26545     +146     
==========================================
+ Hits        22515    22639     +124     
- Misses       3023     3038      +15     
- Partials      861      868       +7     


Owner Author

shivasurya commented May 3, 2026

Merge activity

  • May 3, 1:15 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • May 3, 1:27 PM UTC: Graphite rebased this pull request as part of a merge.
  • May 3, 1:28 PM UTC: @shivasurya merged this pull request with Graphite.

@shivasurya shivasurya changed the base branch from shiva/cpp-type-inference to graphite-base/674 May 3, 2026 13:25
@shivasurya shivasurya changed the base branch from graphite-base/674 to main May 3, 2026 13:26
Add BuildCCallGraph — a four-pass algorithm that produces a
*core.CallGraph for C projects:

  Pass 1  Index every C function_definition under "<relpath>::<name>"
          and ensure the FQN appears in the module registry's
          FunctionIndex for cross-file resolution.
  Pass 2  Register explicit return types with the type engine
          (skipping void) and emit ParameterSymbol entries for every
          named parameter.
  Pass 3  Walk the parser-emitted edges (function_definition →
          call_expression) to extract one CallSiteInternal per call,
          deterministically and without a second AST traversal.
  Pass 4  Resolve targets in a definition-preferring order:
          same-file definition → global definition → same-file
          declaration → declaration reachable through #include "...".
          Resolved sites add an edge; unresolved sites are recorded
          as CallSite{Resolved:false} so external/stdlib calls remain
          visible to rule writers.

The result merges cleanly into a unified graph via the existing
MergeCallGraphs since C FQNs ("src/main.c::main") share no namespace
with Python, Go, or Java.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@shivasurya shivasurya force-pushed the shiva/cpp-c-call-graph branch from 3055d72 to d55e6cc Compare May 3, 2026 13:27
@shivasurya shivasurya merged commit 897e99d into main May 3, 2026
6 checks passed
@shivasurya shivasurya deleted the shiva/cpp-c-call-graph branch May 3, 2026 13:28
shivasurya added a commit that referenced this pull request May 3, 2026
## Summary

Adds `BuildCppCallGraph` — the four-pass C builder plus three C++-specific resolution paths exercised in Pass 4. Plain free-function calls fall through to `resolveCCallTarget`, making the C++ builder a strict superset of the C one.

| Step | Resolves |
|---|---|
| 1 | Qualified call (`ns::func`, `Class::staticMethod`) — direct `NamespaceIndex` lookup |
| 2 | `this->method()` — caller's enclosing class derived via byte-range containment |
| 3 | Method on typed receiver — receiver type from the type engine, method on that class |
| 4 | C-style fallthrough — definition-preferring lookup shared with the C builder |

### Receiver type normalisation

`obj.method()` calls drive resolution through the receiver's declared type. Before the lookup, the type string is normalised: `const`, `volatile`, and pointer/reference suffixes (`*`, `**`, `&`, `&&`) are stripped so `Dog*`, `const Dog&`, and `Dog **` all reduce to `Dog`.
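
A minimal sketch of this normalisation, assuming the type arrives as a plain string. The function name `normaliseReceiverType` is hypothetical, and the real builder may handle more shapes (templates, namespaces) than this example does.

```go
package main

import (
	"fmt"
	"strings"
)

// normaliseReceiverType strips cv-qualifiers and pointer/reference
// markers so that different spellings of a receiver type reduce to the
// bare class name. Hypothetical helper for illustration only.
func normaliseReceiverType(t string) string {
	// Drop "const" and "volatile" wherever they appear as whole words.
	words := strings.Fields(t)
	kept := words[:0]
	for _, w := range words {
		if w == "const" || w == "volatile" {
			continue
		}
		kept = append(kept, w)
	}
	// Trim trailing '*', '&', and whitespace: "Dog **" becomes "Dog".
	return strings.TrimRight(strings.Join(kept, " "), "*& \t")
}

func main() {
	for _, t := range []string{"Dog*", "const Dog&", "Dog **", "volatile Dog &&"} {
		fmt.Printf("%q -> %q\n", t, normaliseReceiverType(t))
	}
}
```

Filtering the qualifiers before trimming means a trailing qualifier such as `Dog* const` also reduces to `Dog`.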

### Class member tracking (Pass 2)

- Class method return types registered on the type engine for both definitions and declarations — header-only inline declarations still seed receiver-typed resolution.
- Field types extracted from `field_declaration` nodes for future field-chain resolution (`obj.field.method()`).

### Method-to-class association

Same byte-range containment used in PR-05 (`BuildCppModuleRegistry`): for each method, find the smallest class declaration whose `[StartByte, EndByte)` range contains the method's start byte. Nested classes (`class Outer { class Inner { ... }; };`) resolve to the innermost match. The builder doesn't depend on parser-internal context tracking, so it stays composable across future parser refactors.
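
The containment rule can be sketched as below. The `span` type and `enclosingClass` function are assumptions for this example; only the `[StartByte, EndByte)` convention and the innermost-match behaviour come from the description above.

```go
package main

import "fmt"

// span is a hypothetical stand-in for a class declaration's byte range.
type span struct {
	Name      string
	StartByte uint32
	EndByte   uint32
}

// enclosingClass returns the smallest class whose half-open range
// [StartByte, EndByte) contains methodStart, so nested classes resolve
// to the innermost declaration.
func enclosingClass(classes []span, methodStart uint32) (span, bool) {
	var best span
	found := false
	for _, c := range classes {
		if methodStart < c.StartByte || methodStart >= c.EndByte {
			continue // method does not start inside this class
		}
		if !found || c.EndByte-c.StartByte < best.EndByte-best.StartByte {
			best, found = c, true
		}
	}
	return best, found
}

func main() {
	classes := []span{
		{Name: "Outer", StartByte: 0, EndByte: 200},
		{Name: "Inner", StartByte: 40, EndByte: 120},
	}
	for _, pos := range []uint32{60, 150, 250} {
		c, ok := enclosingClass(classes, pos)
		fmt.Printf("byte %d -> %q found=%v\n", pos, c.Name, ok)
	}
}
```

A linear scan is enough for a sketch; a builder handling very large translation units might sort ranges first, but nothing in the description says which approach is used.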

## Test plan

- [x] `go build ./...`
- [x] `go test ./...` — full suite green
- [x] `go vet ./...`
- [x] `golangci-lint run ./graph/callgraph/builder/` — 0 issues
- [x] Coverage on `cpp_builder.go` lines: ~93%
- [x] Spec scenarios covered:
  - Namespace-qualified call (`mylib::process`)
  - Static method via NamespaceIndex (`Socket::create`)
  - Method on typed receiver (`dog.speak()` where `dog: Dog*`)
  - `this->method()` inside a method body
  - Free-function fallthrough
  - `printf` (stdlib) → unresolved with failure reason
  - Receiver not in scope → unresolved (no panic)
  - Method return type registered on the type engine; void dropped
  - Field types registered on the type engine
  - Receiver normalisation across pointer/reference/const/volatile/whitespace shapes
  - Nested classes pick the innermost match
  - Suffix match on namespaced classes (`Foo` → `ns::Foo::bar`)
  - Merges cleanly into a unified graph
  - Receiver binding with nil Type doesn't panic
  - `this->method()` on a non-method caller falls through cleanly
  - Header-only declarations still register class method return types
  - Non-C++ nodes ignored (mixed-language safety)

## Stacked on

`shiva/cpp-c-call-graph` (#674)