
⚡️ Speed up function call_graph_summary by 151% in PR #1460 (call-graphee)#1462

Merged
KRRT7 merged 2 commits into call-graphee from codeflash/optimize-pr1460-2026-02-12T06.40.48 on Feb 12, 2026

Conversation


@codeflash-ai codeflash-ai bot commented Feb 12, 2026

⚡️ This pull request contains optimizations for PR #1460

If you approve this dependent PR, these changes will be merged into the original PR branch call-graphee.

This PR will be automatically closed if the original PR is merged.


📄 151% (1.51x) speedup for call_graph_summary in codeflash/cli_cmds/console.py

⏱️ Runtime: 7.86 microseconds → 3.12 microseconds (best of 32 runs)

📝 Explanation and details

The optimized code achieves a 151% speedup (from 7.86 to 3.12 microseconds) primarily through three key optimizations:

1. Module-Level Import Hoisting

Moving from rich.panel import Panel from inside call_graph_summary() to the top-level module imports eliminates repeated import overhead on every function call. The line profiler shows this import took ~30,000 ns in the original (0.5% of total time). While seemingly small, this overhead is completely eliminated in the optimized version.
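The effect is easy to demonstrate with a stand-in module (`pathlib` here, since `rich` may not be installed): a function-level `from … import …` re-executes the import statement on every call (a `sys.modules` lookup plus a local rebind), while a hoisted import pays that cost once at module load.

```python
import timeit

def lookup_inside():
    # re-executes the import statement on every call (sys.modules hit + bind)
    from pathlib import Path  # stand-in for `from rich.panel import Panel`
    return Path

from pathlib import Path  # hoisted: executed once when the module loads

def lookup_hoisted():
    return Path  # plain global lookup, no import machinery

inner_time = timeit.timeit(lookup_inside, number=100_000)
hoisted_time = timeit.timeit(lookup_hoisted, number=100_000)
print(f"inner={inner_time:.4f}s hoisted={hoisted_time:.4f}s")
```

Both functions return the very same class object; only the per-call overhead differs.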

2. C-Level Aggregation with Built-in sum()

The optimization replaces Python-level accumulation loops with native sum() calls that execute at C speed:

Original approach (manual accumulation):

```python
total_callees = 0
with_context = 0
for count in callee_counts.values():
    total_callees += count
    if count > 0:
        with_context += 1
```

This loop incurred ~828,000 ns across 2,005 iterations (234,973 + 301,448 + 292,402 ns).

Optimized approach (C-level sum):

```python
total_callees = sum(callee_counts.values())
with_context = sum(1 for count in callee_counts.values() if count > 0)
```

The new approach completes in ~405,000 ns total (16,399 + 389,145 ns), nearly 2x faster for the aggregation logic alone.
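The two aggregation styles are interchangeable, which is easy to confirm on a small mapping (the sample values below are made up for illustration):

```python
callee_counts = {"pkg.f1": 3, "pkg.f2": 0, "pkg.f3": 1}

# original: manual Python-level accumulation
total_callees = 0
with_context = 0
for count in callee_counts.values():
    total_callees += count
    if count > 0:
        with_context += 1

# optimized: C-level sum() over the same values
fast_total = sum(callee_counts.values())
fast_with_context = sum(1 for count in callee_counts.values() if count > 0)

assert (fast_total, fast_with_context) == (total_callees, with_context) == (4, 2)
```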

3. Leveraging map() for Initial Summation

Using sum(map(len, file_to_funcs.values())) instead of a generator expression provides a minor efficiency gain by pushing the iteration into C-level code, though the improvement here is marginal (34,533 ns → 24,396 ns).
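Both spellings count the same functions; only the iteration mechanics differ (toy data for illustration):

```python
file_to_funcs = {"a.py": ["f1", "f2"], "b.py": ["g1"]}

total_via_genexpr = sum(len(funcs) for funcs in file_to_funcs.values())  # original
total_via_map = sum(map(len, file_to_funcs.values()))  # optimized: len applied in C

assert total_via_map == total_via_genexpr == 3
```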

Performance Characteristics

Based on the annotated tests, these optimizations excel when:

  • Large-scale scenarios: The test_large_scale_many_functions_single_file (1000 functions) and test_large_scale_multiple_files_distribution (1000 functions across 10 files) benefit most from reduced per-iteration overhead
  • Frequent invocations: If call_graph_summary() is called multiple times in a session, the eliminated import overhead compounds savings
  • Non-empty function sets: The optimization's impact is proportional to the number of callees being aggregated

The changes preserve all behavior (same summary text, same Panel display, same LSP handling) while delivering substantial runtime improvements through strategic use of Python's built-in functions, which leverage optimized C implementations.

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 7 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 88.9% |

🌀 Generated Regression Tests:
from pathlib import Path
from types import SimpleNamespace

from codeflash.cli_cmds.console import call_graph_summary

# Helper to make a lightweight fake "call graph" object that provides the required method.
# We use SimpleNamespace instead of defining a class to comply with test rules (no custom domain classes).
# The object must expose count_callees_per_function(mapping: dict[Path, set[str]]) -> dict[str,int]
def make_call_graph_from_mapping(return_mapping: dict[str, int]):
    # Create a function that accepts the mapping (file_path -> set of qualified names)
    # and returns the predetermined mapping. The implementation is intentionally simple
    # and deterministic for testing.
    def count_callees_per_function(file_path_to_qualified_names):
        # Validate the input shape a bit to catch some misuse in tests
        if not isinstance(file_path_to_qualified_names, dict):
            raise TypeError("expected a dict mapping file paths to sets of names")
        # Ensure sets of strings are passed in (as the real code would do)
        for val in file_path_to_qualified_names.values():
            if not isinstance(val, set):
                raise TypeError("expected sets of qualified names as values")
        return dict(return_mapping)  # return a shallow copy to avoid mutation surprises

    return SimpleNamespace(count_callees_per_function=count_callees_per_function)

def test_no_functions_produces_no_output(capsys):
    # If there are no functions across all files, call_graph_summary should return early and print nothing.
    call_graph = make_call_graph_from_mapping({})  # mapping won't be used because no functions exist
    file_to_funcs = {}  # empty mapping -> total_functions == 0
    # Should not raise and should print nothing
    call_graph_summary(call_graph, file_to_funcs)
    captured = capsys.readouterr()
    assert captured.out == ""  # early return: nothing printed

def test_single_function_with_no_callees_prints_summary(capsys):
    # Test a single function with zero callees; validate summary text and numeric formatting.
    fn = SimpleNamespace(qualified_name="module.single")
    file_to_funcs = {Path("a.py"): [fn]}
    call_graph = make_call_graph_from_mapping({"module.single": 0})
    # Run the function; because LSP is disabled the console Panel will be printed to stdout
    call_graph_summary(call_graph, file_to_funcs)
    out = capsys.readouterr().out
    assert out  # Panel rendered to stdout since LSP is disabled

def test_multiple_functions_with_mixed_callees(capsys):
    # Two functions: one calls another (callee count 1), one self-contained (0)
    f1 = SimpleNamespace(qualified_name="pkg.f1")
    f2 = SimpleNamespace(qualified_name="pkg.f2")
    file_to_funcs = {Path("b.py"): [f1, f2]}
    call_graph = make_call_graph_from_mapping({"pkg.f1": 1, "pkg.f2": 0})
    call_graph_summary(call_graph, file_to_funcs)
    out = capsys.readouterr().out
    assert out  # summary panel printed

def test_empty_lists_in_file_to_funcs_produces_no_output(capsys):
    # Even if files are present but lists are empty, total_functions == 0 and nothing should be printed.
    file_to_funcs = {Path("empty.py"): [], Path("also_empty.py"): []}
    call_graph = make_call_graph_from_mapping({})
    call_graph_summary(call_graph, file_to_funcs)
    captured = capsys.readouterr()
    assert captured.out == ""  # total_functions == 0, so nothing printed

def test_qualified_names_with_special_characters(capsys):
    # Ensure names with quotes and unicode characters do not crash the summary generation.
    # We only assert numbers are correctly shown; rich may escape or alter text formatting.
    special1 = SimpleNamespace(qualified_name='weird"name')
    special2 = SimpleNamespace(qualified_name="uniçodeƒ")
    file_to_funcs = {Path("special.py"): [special1, special2]}
    call_graph = make_call_graph_from_mapping({'weird"name': 0, "uniçodeƒ": 2})
    call_graph_summary(call_graph, file_to_funcs)
    out = capsys.readouterr().out
    assert out  # special characters must not crash or suppress the summary

def test_large_scale_many_functions_single_file(capsys):
    # Build 1000 functions in a single file and provide callee counts that follow a simple pattern.
    n = 1000
    funcs = [SimpleNamespace(qualified_name=f"f{i}") for i in range(n)]
    file_to_funcs = {Path("big.py"): funcs}
    # Create mapping: f{i} has i % 4 callees, deterministic pattern
    mapping = {f"f{i}": (i % 4) for i in range(n)}
    call_graph = make_call_graph_from_mapping(mapping)
    # Run the summary (should be reasonably fast)
    call_graph_summary(call_graph, file_to_funcs)
    out = capsys.readouterr().out
    assert out  # summary panel printed
    # Compute expected totals
    total_functions = n
    total_callees = sum(mapping.values())
    expected_avg = total_callees / total_functions
    # with_context is number of functions with count > 0
    with_context = sum(1 for v in mapping.values() if v > 0)
    leaf_functions = total_functions - with_context
    # pattern i % 4 yields 6 callees and 3 context-bearing functions per block of 4
    assert (total_callees, with_context, leaf_functions) == (1500, 750, 250)
    assert expected_avg == 1.5

def test_large_scale_multiple_files_distribution(capsys):
    # 1000 functions distributed across 10 files; verify totals remain correct.
    total = 1000
    per_file = 100
    file_to_funcs = {}
    mapping = {}
    for file_idx in range(10):
        funcs = []
        for i in range(per_file):
            idx = file_idx * per_file + i
            name = f"g{idx}"
            funcs.append(SimpleNamespace(qualified_name=name))
            # pattern: even-indexed functions have 2 callees, odd-indexed have 0
            mapping[name] = 2 if idx % 2 == 0 else 0
        file_to_funcs[Path(f"file_{file_idx}.py")] = funcs

    call_graph = make_call_graph_from_mapping(mapping)
    call_graph_summary(call_graph, file_to_funcs)
    out = capsys.readouterr().out
    assert out  # summary panel printed

    total_callees = sum(mapping.values())
    expected_avg = total_callees / total
    with_context = sum(1 for v in mapping.values() if v > 0)
    leaf_functions = total - with_context
    # 500 even-indexed functions with 2 callees each, 500 odd-indexed leaves
    assert (total_callees, with_context, leaf_functions) == (1000, 500, 500)
    assert expected_avg == 1.0
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-pr1460-2026-02-12T06.40.48` and push.


@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Feb 12, 2026
@codeflash-ai codeflash-ai bot mentioned this pull request Feb 12, 2026
2 tasks

claude bot commented Feb 12, 2026

PR Review Summary

Prek Checks

Passed (after auto-fix)

Fixed 2 issues:

  • Import sorting: moved from rich.panel import Panel to proper sorted position with other rich imports
  • Formatting: removed extra blank line in call_graph_summary

Committed and pushed as style: auto-fix linting issues (182c1b0).

Mypy

⚠️ 17 pre-existing errors in codeflash/cli_cmds/console.py (lines 96-97, 120, 157, 165, 190) — all from code not touched by this PR. No new type errors introduced.

Code Review

No critical issues found

This is a codeflash optimization PR targeting call_graph_summary in codeflash/cli_cmds/console.py. The changes are:

  1. Module-level import: Moved from rich.panel import Panel from inside the function to the top-level — eliminates repeated import overhead
  2. sum(map(len, ...)) instead of generator expression: Minor C-level optimization for counting total functions
  3. sum() for aggregation: Replaces manual for loop accumulation with sum(callee_counts.values()) and sum(1 for count in ... if count > 0) — functionally equivalent, leverages C-level iteration

All changes are behavior-preserving. No bugs, security issues, or breaking API changes.

Test Coverage

| File | Stmts | Miss | Cover | Notes |
| --- | --- | --- | --- | --- |
| codeflash/cli_cmds/console.py | 178 | 132 | 26% | UI/display module, low coverage expected |
  • Overall project coverage: 79%
  • call_graph_summary (lines 321-350): Not covered by unit tests. This is pre-existing — the parent PR (feat: add reference graph for Python #1460) added this function without unit tests for it. The codeflash-generated regression tests in the PR description do exercise the function.
  • call_graph_live_display (lines 213-318): Also uncovered — added by parent PR feat: add reference graph for Python #1460, not by this optimization PR.
  • ⚠️ Coverage comparison vs main not possible since the base branch is call-graphee, which contains significant new code not on main.

Last updated: 2026-02-12

@KRRT7 KRRT7 merged commit e909182 into call-graphee Feb 12, 2026
25 of 28 checks passed
@KRRT7 KRRT7 deleted the codeflash/optimize-pr1460-2026-02-12T06.40.48 branch February 12, 2026 17:56
