Skip to content

agodianel/Graphoptim-Code-Optimizer

GraphOptim Logo

GraphOptim

Graph-theoretic structural optimizer for Python code

Detects and fixes structural inefficiencies that rule-based linters cannot catch — especially effective on AI-generated modules.

CI PyPI version Python versions License: MIT Code of Conduct Changelog


Why GraphOptim?

Production Python code — whether written by AI assistants (Copilot, Claude, Cursor, Gemini) or humans — often contains graph-level structural inefficiencies that traditional linters miss:

  • 🔴 Redundant execution paths — duplicate conditional branches
  • 💀 Dead code blocks — unreachable code after returns/raises
  • 🎯 Over-centralized functions — bottleneck nodes everything flows through
  • 📏 Inflated complexity — unnecessarily deep execution chains

How is this different from existing tools?

Tool Approach Misses
flake8 / ruff Rule-based token patterns Graph-level redundancy, dead paths
pylint AST heuristics + rules Topological inefficiencies
SonarQube Rule-based + some metrics Graph-theoretic pass selection
GraphOptim Weighted CFG + graph algorithms + knapsack optimization

⚡ Quickstart

Install

pip install graphoptim

30-Second Demo

import graphoptim as go

code = """
def process(items):
    results = []
    for item in items:
        if item > 0:
            results.append(item)
    return results
    print("unreachable")  # Dead code!
"""

# Analyze
report = go.analyze(code)
print(f"Score: {report.total_score}/100")
# → Score: 85/100

# Optimize
optimized = go.optimize(code, passes=["dead_code"])
# → Dead code removed, valid Python output

CLI

# Analyze a file
graphoptim analyze myfile.py

# Analyze with detailed metrics
graphoptim analyze myfile.py --verbose

# Show a diff of what would change, and auto-save the generated file 
# into a graphoptimized/ folder for review (recommended first step)
graphoptim optimize myfile.py --diff

# Show diff for an entire project
graphoptim optimize ./src --diff

# Optimize (preview mode — prints full optimized code)
graphoptim optimize myfile.py

# Optimize in-place (creates .bak backup)
graphoptim optimize myfile.py --inplace

# Analyze an entire project
graphoptim analyze ./src

# Run the empirical benchmark
graphoptim benchmark --samples 100 --models claude,gpt4o,gemini

CLI Output Example

╭──────────────────────────────────────────────────╮
│ GraphOptim Analysis — myfile.py                   │
╰──────────────────────────────────────────────────╯
  Functions analyzed: 4
  Total optimization score: 62/100

process_data()       score: 45/100  ⚠ needs attention
├─ Cyclomatic complexity: 12   (threshold: 7)
├─ Dead nodes: 2               (unreachable at lines 34, 51)
├─ Betweenness bottleneck: 0.9 (line 22 is critical path bottleneck)
└─ Suggested passes: dead_code, centrality_split

validate_input()     score: 88/100  ✓ good
format_output()      score: 71/100  ~ acceptable
main()               score: 44/100  ⚠ needs attention

📊 How It Works

GraphOptim treats your code as a graph problem:

Source Code → AST → Control Flow Graph (CFG) → Weighted DiGraph → Analysis + Optimization
  1. Parse — Builds a CFG where nodes are basic blocks, edges are control flow transitions
  2. Analyze — Extracts 7 graph metrics (cyclomatic complexity, dead nodes, CFG diameter, avg shortest path, betweenness centrality, node/edge ratio, LOC)
  3. Detect — Pattern detectors identify dead nodes, redundant paths, bottlenecks, deep chains
  4. Select — A 0/1 Knapsack solver selects the optimal subset of optimization passes within your risk budget
  5. Optimize — Selected passes transform the AST, producing valid Python

📖 For full architecture details, see the How It Works wiki page.


🔧 Optimization Passes

Pass Detects Fixes Risk
dead_code Unreachable code after return/raise/break Removes dead AST nodes Low (0.2)
guard_clause Deeply nested if-blocks Refactors to early-return guards Med-Low (0.4)
unused_variable Assigned-but-never-read variables Removes safe unused assignments Low-Med (0.3)
constant_folding Constant arithmetic/string expressions Evaluates at parse time Very Low (0.1)
path_shortener Duplicate conditional branches Merges identical branches Medium (0.5)
centrality High-betweenness bottleneck nodes Suggests helper extraction Medium (0.6)

Knapsack Pass Selection

Instead of running all passes blindly, GraphOptim uses dynamic programming (0/1 Knapsack) to select the optimal subset:

from graphoptim.optimizer.passes.knapsack import KnapsackSelector, PassInfo

selector = KnapsackSelector(budget=0.6)  # max total risk = 0.6
result = selector.select(available_passes)
print(result.summary())

🧩 Want to add your own pass? See the Adding Passes guide.


📈 Benchmark

GraphOptim includes a benchmark pipeline that empirically validates its core hypothesis against HumanEval and MBPP datasets using three LLM providers:

Provider Model API Key Env Var
Anthropic Claude Opus 4.6 ANTHROPIC_API_KEY
OpenAI GPT-5.2 OPENAI_API_KEY
Google Gemini 2.5 Pro GOOGLE_API_KEY
# Set API keys
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
export GOOGLE_API_KEY="AI..."

# Run benchmark
graphoptim benchmark --samples 100 --models claude,gpt4o,gemini

The benchmark produces:

  • benchmark_results.json — Raw metrics for all functions
  • statistical_report.csv — Mann-Whitney U test results
  • benchmark_report.md — Human-readable findings with plots

🛠 Development

# Clone and set up
git clone https://github.com/agodianel/Graphoptim-Code-Optimizer.git
cd graphoptim

# Install with uv (recommended)
uv pip install -e ".[dev,benchmark]"

# Run tests
uv run pytest tests/ -v

# Run linting
uv run ruff check .

# Run formatting
uv run black .

# Type checking
uv run mypy graphoptim/ --ignore-missing-imports

See CONTRIBUTING.md for the full contributing guide.


📁 Project Structure

graphoptim/
├── parser/           # Source → Graph conversion
│   ├── cfg_builder   # Control Flow Graph builder
│   ├── dfg_builder   # Data Flow Graph builder
│   └── ast_utils     # AST helpers and node weighing
├── analyzer/         # Graph → Insights
│   ├── metrics       # 7 graph metric extraction
│   ├── patterns      # Structural pattern detectors
│   └── reporter      # Scored report generation
├── optimizer/        # Insights → Better Code
│   ├── passes/       # Individual optimization passes
│   │   ├── dead_code, path_shortener, centrality
│   │   └── knapsack  # 0/1 Knapsack pass selector
│   └── rewriter      # Pass orchestration + validation
├── benchmark/        # Empirical validation pipeline
├── cli.py            # Click + Rich CLI
└── config.py         # Configuration (env var loading)

📋 Links

📦 PyPI pypi.org/project/graphoptim
📖 Wiki docs/wiki
🐛 Issues GitHub Issues
📝 Changelog CHANGELOG.md
🔒 Security SECURITY.md
🤝 Contributing CONTRIBUTING.md
📜 License MIT

License

MIT License — See LICENSE for details.

About

Graph-theoretic structural optimizer for Python code - Detects and fixes structural inefficiencies that rule-based linters cannot catch — especially effective on AI-generated modules.

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Languages