# Falkor Basic Analysis Workflow

This notebook walks through the basic workflow for analyzing a codebase with Falkor:

1. Setting up the Neo4j connection
2. Ingesting a codebase
3. Running analysis
4. Understanding the health report
5. Exploring findings
6. Exporting results
7. Cleaning up

## 1. Setup and Configuration

First, let's import Falkor and configure our connection to Neo4j.

In [None]:
# Import necessary modules
from pathlib import Path
from falkor.graph import Neo4jClient, GraphSchema
from falkor.pipeline import IngestionPipeline
from falkor.detectors import AnalysisEngine
from falkor.config import load_config
import json

In [None]:
# Load configuration (or use defaults)
config = load_config()

# Connect to Neo4j
neo4j_client = Neo4jClient(
    uri=config.neo4j.uri,
    username=config.neo4j.user,
    password=config.neo4j.password,
    max_retries=config.neo4j.max_retries
)

print(f"✓ Connected to Neo4j at {config.neo4j.uri}")

## 2. Ingest a Codebase

Let's ingest a sample codebase into the knowledge graph. Replace `/path/to/your/repo` with the path to your repository.

In [None]:
# Path to the repository to analyze
repo_path = "/path/to/your/repo"

# Create ingestion pipeline
pipeline = IngestionPipeline(
    repo_path=repo_path,
    neo4j_client=neo4j_client,
    follow_symlinks=False,  # Security: don't follow symlinks
    max_file_size_mb=10,
    batch_size=100
)

print(f"✓ Created ingestion pipeline for {repo_path}")

In [None]:
# Run ingestion with progress tracking
def progress_callback(current, total, filename):
    if current % 10 == 0:  # Print every 10 files
        percentage = (current / total) * 100
        print(f"Progress: {current}/{total} ({percentage:.1f}%) - {Path(filename).name}")

# Ingest the codebase
pipeline.ingest(
    patterns=["**/*.py"],  # Python files only
    progress_callback=progress_callback
)

print("\n✓ Ingestion complete!")
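
To see which files a pattern such as `**/*.py` selects, here is a minimal standalone sketch using only the standard library. The `discover_files` helper is hypothetical and not part of Falkor; it only mimics the `follow_symlinks` and `max_file_size_mb` options passed to the pipeline above, and Falkor's own file discovery may behave differently.

```python
import tempfile
from pathlib import Path


def discover_files(repo_root, pattern="**/*.py", max_file_size_mb=10, follow_symlinks=False):
    """Return files under repo_root that match pattern and pass both filters.

    Hypothetical helper for illustration only; not part of Falkor.
    """
    root = Path(repo_root)
    max_bytes = max_file_size_mb * 1024 * 1024
    selected = []
    for path in sorted(root.glob(pattern)):
        if not path.is_file():
            continue  # skip directories whose names happen to match
        if path.is_symlink() and not follow_symlinks:
            continue  # skip symlinked files unless explicitly allowed
        if path.stat().st_size > max_bytes:
            continue  # skip files above the size limit
        selected.append(path)
    return selected


# Build a throwaway repository to show which files the pattern selects.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "pkg").mkdir()
    (repo / "pkg" / "module.py").write_text("x = 1\n")
    (repo / "main.py").write_text("print('hi')\n")
    (repo / "README.md").write_text("# demo\n")

    found = [p.relative_to(repo).as_posix() for p in discover_files(repo)]
    print(found)
```

The `**` component matches zero or more directory levels, so the pattern picks up Python files at the repository root as well as in nested packages, while `README.md` is left out.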

In [None]:
# Check what was ingested
stats = neo4j_client.get_stats()

print("\n📊 Graph Statistics:")
print(f"  Files: {stats['total_files']}")
print(f"  Classes: {stats['total_classes']}")
print(f"  Functions: {stats['total_functions']}")
print(f"  Total Nodes: {stats['total_nodes']}")
print(f"  Relationships: {stats['total_relationships']}")

## 3. Run Analysis

Now let's run the analysis engine to detect code smells and calculate health scores.

In [None]:
# Create analysis engine
engine = AnalysisEngine(
    neo4j_client=neo4j_client,
    detector_config=config.detectors.__dict__ if hasattr(config, 'detectors') else None
)

# Run analysis
print("Running analysis...")
health = engine.analyze()

print(f"\n✓ Analysis complete! Grade: {health.grade}")

## 4. Understanding the Health Report

The health report contains:
- Overall grade (A-F)
- Category scores (Structure, Quality, Architecture)
- Detailed metrics
- Findings by severity

In [None]:
# Display overall health
print("\n" + "="*60)
print("CODEBASE HEALTH REPORT")
print("="*60)
print(f"\nOverall Grade: {health.grade}")
print(f"Overall Score: {health.overall_score:.1f}/100\n")

# Category scores
print("Category Scores:")
print(f"  Structure    (40% weight): {health.structure_score:.1f}/100")
print(f"  Quality      (30% weight): {health.quality_score:.1f}/100")
print(f"  Architecture (30% weight): {health.architecture_score:.1f}/100")

# Findings summary
print("\nFindings Summary:")
print(f"  Critical: {health.findings_summary.critical}")
print(f"  High:     {health.findings_summary.high}")
print(f"  Medium:   {health.findings_summary.medium}")
print(f"  Low:      {health.findings_summary.low}")
print(f"  Info:     {health.findings_summary.info}")
print(f"  Total:    {health.findings_summary.total}")
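
The weights printed next to each category suggest how the three scores could combine into one number. The sketch below computes a plain weighted average from those weights. That Falkor derives `overall_score` exactly this way is an assumption, and the example scores are made up for illustration.

```python
# Weights as shown in the category labels of the report cell (40% / 30% / 30%).
WEIGHTS = {"structure": 0.40, "quality": 0.30, "architecture": 0.30}


def weighted_overall(scores, weights=WEIGHTS):
    """Combine 0-100 category scores into a single 0-100 weighted average."""
    return sum(scores[name] * weight for name, weight in weights.items())


# Made-up category scores, for illustration only.
example_scores = {"structure": 80.0, "quality": 70.0, "architecture": 90.0}
overall = weighted_overall(example_scores)
print(f"Overall: {overall:.1f}/100")
```

With the notebook's own values, you could pass `health.structure_score`, `health.quality_score`, and `health.architecture_score` and compare the result with `health.overall_score`.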

In [None]:
# Display detailed metrics
m = health.metrics

print("\n" + "="*60)
print("DETAILED METRICS")
print("="*60)

print("\nStructure Metrics:")
print(f"  Modularity:             {m.modularity:.2f} (0.3-0.7 is good)")
print(f"  Average Coupling:       {m.avg_coupling:.1f} (lower is better)")
print(f"  Circular Dependencies:  {m.circular_dependencies}")
print(f"  Bottleneck Count:       {m.bottleneck_count}")

print("\nQuality Metrics:")
print(f"  Dead Code:              {m.dead_code_percentage*100:.1f}%")
print(f"  Duplication:            {m.duplication_percentage*100:.1f}%")
print(f"  God Classes:            {m.god_class_count}")

print("\nArchitecture Metrics:")
print(f"  Layer Violations:       {m.layer_violations}")
print(f"  Boundary Violations:    {m.boundary_violations}")
print(f"  Abstraction Ratio:      {m.abstraction_ratio:.2f} (0.3-0.7 is good)")

print("\nCodebase Statistics:")
print(f"  Total Files:            {m.total_files}")
print(f"  Total Classes:          {m.total_classes}")
print(f"  Total Functions:        {m.total_functions}")
print(f"  Total LOC:              {m.total_loc:,}")

## 5. Exploring Findings

Let's examine the findings in detail, starting with the highest severity issues.

In [None]:
# Sort findings by severity
severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}
sorted_findings = sorted(
    health.findings,
    key=lambda f: severity_order.get(f.severity.value, 5)
)

# Display top 5 findings
print("\n" + "="*60)
print("TOP FINDINGS")
print("="*60)

for i, finding in enumerate(sorted_findings[:5], 1):
    print(f"\n{i}. [{finding.severity.value.upper()}] {finding.title}")
    print(f"   Detector: {finding.detector}")
    print(f"   Files affected: {len(finding.affected_files)}")
    print(f"   Description: {finding.description[:100]}...")
    if finding.suggested_fix:
        print(f"   Fix: {finding.suggested_fix[:80]}...")
    print(f"   Effort: {finding.estimated_effort}")
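
Besides sorting by severity, it can help to see which detectors produce the most findings. The sketch below tallies findings per detector with `collections.Counter`. The `Severity` and `Finding` classes are minimal stand-ins so the example runs on its own, and the detector names are hypothetical; the real Falkor objects carry more fields.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    """Stand-in for Falkor's severity type (assumed to expose `.value`)."""
    HIGH = "high"
    MEDIUM = "medium"


@dataclass
class Finding:
    """Minimal stand-in with only the attributes this sketch uses."""
    detector: str
    severity: Severity


def count_by_detector(findings):
    """Return a Counter mapping detector name to number of findings."""
    return Counter(f.detector for f in findings)


# Hypothetical findings; the detector names are invented for the example.
sample_findings = [
    Finding("circular_dependency", Severity.HIGH),
    Finding("god_class", Severity.MEDIUM),
    Finding("god_class", Severity.HIGH),
]

by_detector = count_by_detector(sample_findings)
for detector, count in by_detector.most_common():
    print(f"{detector}: {count}")
```

In this notebook, `count_by_detector(health.findings)` should work the same way, since it relies only on the `detector` attribute used in the cell above.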

In [None]:
# Detailed view of a specific finding
if sorted_findings:
    finding = sorted_findings[0]
    
    print("\n" + "="*60)
    print(f"DETAILED FINDING: {finding.title}")
    print("="*60)
    print(f"\nID: {finding.id}")
    print(f"Detector: {finding.detector}")
    print(f"Severity: {finding.severity.value.upper()}")
    print(f"\nDescription:\n{finding.description}")
    
    print(f"\nAffected Files ({len(finding.affected_files)}):")
    for file in finding.affected_files[:5]:
        print(f"  - {file}")
    
    if finding.suggested_fix:
        print(f"\nSuggested Fix:\n{finding.suggested_fix}")
    
    print(f"\nEstimated Effort: {finding.estimated_effort}")
    
    print("\nGraph Context:")
    for key, value in finding.graph_context.items():
        print(f"  {key}: {value}")

## 6. Export Results

Save the health report for further analysis or reporting.

In [None]:
# Export to JSON
output_file = "health_report.json"
with open(output_file, "w", encoding="utf-8") as f:
    json.dump(health.to_dict(), f, indent=2)

print(f"✓ Health report saved to {output_file}")
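
A saved report can be loaded back later for comparison or reporting. The sketch below round-trips a placeholder report through a temporary file. The keys `grade` and `overall_score` are assumptions chosen for illustration, since the exact layout of `health.to_dict()` is not shown in this notebook.

```python
import json
import tempfile
from pathlib import Path


def load_report(path):
    """Read a JSON report from disk and return it as a dictionary."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    report_path = Path(tmp) / "health_report.json"

    # Placeholder content; the real report comes from health.to_dict().
    placeholder = {"grade": "B", "overall_score": 82.5}
    report_path.write_text(json.dumps(placeholder, indent=2), encoding="utf-8")

    report = load_report(report_path)
    print(sorted(report))  # list the top-level keys
```

Listing the top-level keys first is a safe way to inspect a real report before relying on any particular field.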

In [None]:
# Export to HTML report
from falkor.reporters import HTMLReporter

reporter = HTMLReporter(repo_path=Path(repo_path))
reporter.generate(health, "health_report.html")

print("✓ HTML report generated: health_report.html")

## 7. Cleanup

Close the database connection when done.

In [None]:
neo4j_client.close()
print("✓ Connection closed")
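
If an earlier cell raises an exception, the cleanup cell may never run. Outside a notebook, one way to guarantee the connection is closed is `contextlib.closing`, which calls `close()` when the block exits, even on error. The sketch below uses a `FakeClient` stand-in rather than a real `Neo4jClient`, so it runs without a database; it relies only on the client having a `close()` method, as used above.

```python
from contextlib import closing


class FakeClient:
    """Stand-in for Neo4jClient; it only records whether close() ran."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


client = FakeClient()
try:
    with closing(client):
        # Simulate a failure in the middle of the workflow.
        raise RuntimeError("simulated analysis failure")
except RuntimeError as exc:
    print(f"Analysis failed: {exc}")

print(f"Connection closed: {client.closed}")
```

The same pattern should apply to the real client: wrap ingestion and analysis in `with closing(neo4j_client):` so the connection is released on both success and failure.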

## Next Steps

- Check out `02_custom_queries.ipynb` to learn how to write custom Cypher queries
- See `03_visualization.ipynb` for graph visualization techniques
- Explore `04_batch_analysis.ipynb` for analyzing multiple codebases

## Summary

In this notebook, you learned how to:

1. Configure and connect to Neo4j
2. Ingest a codebase into the knowledge graph
3. Run the analysis engine
4. Interpret health scores and metrics
5. Explore findings and get actionable insights
6. Export results in multiple formats

Falkor's graph-based analysis summarizes codebase health as a grade, category scores, metrics, and findings. Use these results to prioritize refactoring work and track code quality over time.