# Visualizing the Knowledge Graph

This notebook shows how to visualize your codebase knowledge graph with Neo4j Browser, NetworkX, matplotlib, and Plotly, and how to export it for external tools.

Topics covered:
1. Neo4j Browser visualization
2. Import dependency graph with NetworkX
3. Complexity chart with matplotlib
4. Interactive network with Plotly
5. Circular dependency visualization
6. Exporting graphs for other tools

In [None]:
# Install required packages (run once)
# !pip install networkx matplotlib plotly pandas

In [None]:
from falkor.graph import Neo4jClient
from falkor.config import load_config
import networkx as nx
import matplotlib.pyplot as plt
import plotly.graph_objects as go
import pandas as pd

# Setup
config = load_config()
db = Neo4jClient(
    uri=config.neo4j.uri,
    username=config.neo4j.user,
    password=config.neo4j.password
)

print("✓ Connected to Neo4j")

## 1. Neo4j Browser Visualization

The easiest way to visualize the graph is Neo4j Browser, served by default at http://localhost:7474

Try these queries in the browser, running each one separately:

```cypher
// View sample of the graph
MATCH (n)
RETURN n
LIMIT 25

// View import relationships
MATCH (f:File)-[r:IMPORTS]->(m:Module)
RETURN f, r, m
LIMIT 50

// View function call graph
MATCH (f1:Function)-[r:CALLS]->(f2:Function)
RETURN f1, r, f2
LIMIT 50
```

## 2. Import Dependency Graph with NetworkX

In [None]:
# Fetch import relationships
query = """
MATCH (f1:File)-[:IMPORTS]->(m:Module)<-[:CONTAINS]-(f2:File)
WHERE f1 <> f2
RETURN DISTINCT f1.filePath AS source, f2.filePath AS target
LIMIT 50
"""

result = db.execute_query(query)

# Build NetworkX graph
G = nx.DiGraph()
for record in result:
    # Use bare file names as labels for readability (files that share a
    # name in different directories collapse into a single node)
    source = record['source'].split('/')[-1]
    target = record['target'].split('/')[-1]
    G.add_edge(source, target)

print(f"Graph: {G.number_of_nodes()} nodes, {G.number_of_edges()} edges")
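
Because the cell above labels nodes by bare file name, two files such as `core/utils.py` and `api/utils.py` would merge into one node. The following is a minimal sketch of one way to keep labels short but distinct; `short_labels` is a hypothetical helper, not part of the notebook's libraries, and the paths are made up for illustration. It keeps only the last two path components, which may still collide in deeply nested layouts.

```python
from collections import Counter
from pathlib import PurePosixPath


def short_labels(paths):
    """Map each file path to a short label that stays distinct where possible.

    A bare file name is used when it is unambiguous; otherwise the parent
    directory is kept as a prefix (e.g. "core/utils.py").
    """
    unique_paths = list(dict.fromkeys(paths))  # de-duplicate, keep order
    name_counts = Counter(PurePosixPath(p).name for p in unique_paths)
    labels = {}
    for p in unique_paths:
        path = PurePosixPath(p)
        if name_counts[path.name] == 1:
            labels[p] = path.name
        else:
            # Keep the last two components to tell same-named files apart
            labels[p] = "/".join(path.parts[-2:])
    return labels


# Hypothetical paths, for illustration only
labels = short_labels(["src/core/utils.py", "src/api/utils.py", "src/main.py"])
print(labels)
```

To use it with the query result, one option is to collect every `source` and `target` path first, build the label map once, and then call `G.add_edge(labels[source], labels[target])`.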

In [None]:
# Visualize with matplotlib
plt.figure(figsize=(15, 10))

# Use spring layout for better node positioning
pos = nx.spring_layout(G, k=0.5, iterations=50)

# Draw nodes
nx.draw_networkx_nodes(G, pos, node_color='lightblue', 
                       node_size=500, alpha=0.8)

# Draw edges
nx.draw_networkx_edges(G, pos, edge_color='gray', 
                       arrows=True, arrowsize=10, alpha=0.5)

# Draw labels
nx.draw_networkx_labels(G, pos, font_size=8)

plt.title("File Import Dependencies", fontsize=16)
plt.axis('off')
plt.tight_layout()
plt.show()

## 3. Complexity Heatmap

In [None]:
# Get complexity by file
query = """
MATCH (f:File)-[:CONTAINS]->(func:Function)
WITH f, sum(func.complexity) AS total_complexity
RETURN f.filePath AS file, 
       total_complexity
ORDER BY total_complexity DESC
LIMIT 20
"""

result = db.execute_query(query)
df = pd.DataFrame(result)
df['file'] = df['file'].apply(lambda x: x.split('/')[-1])

# Create bar chart
plt.figure(figsize=(12, 6))
colors = plt.cm.RdYlGn_r(df['total_complexity'] / df['total_complexity'].max())
plt.barh(df['file'], df['total_complexity'], color=colors)
plt.gca().invert_yaxis()  # Put the most complex file at the top
plt.xlabel('Total Complexity', fontsize=12)
plt.title('Complexity by File (Top 20)', fontsize=14)
plt.tight_layout()
plt.show()

print("\nHigher complexity = more red = higher maintenance burden")
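
The color mapping above divides each value by the maximum, which yields NaN values when every file has a total complexity of 0. A small sketch of a guarded normalization follows; `color_fractions` is a hypothetical helper name and the sample numbers are illustrative only.

```python
def color_fractions(values):
    """Scale non-negative values into [0, 1] for use with a colormap.

    Returns all zeros when the input is empty or its maximum is 0, instead
    of dividing by zero.
    """
    values = list(values)
    peak = max(values, default=0)
    if peak <= 0:
        return [0.0 for _ in values]
    return [v / peak for v in values]


print(color_fractions([40, 10, 0]))  # [1.0, 0.25, 0.0]
print(color_fractions([0, 0]))       # [0.0, 0.0]
```

The result can be passed to the colormap in place of the inline division, for example `plt.cm.RdYlGn_r(color_fractions(df['total_complexity']))`.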

## 4. Interactive Network with Plotly

In [None]:
# Build interactive graph
pos = nx.spring_layout(G)

# Create edge traces
edge_x = []
edge_y = []
for edge in G.edges():
    x0, y0 = pos[edge[0]]
    x1, y1 = pos[edge[1]]
    edge_x.extend([x0, x1, None])
    edge_y.extend([y0, y1, None])

edge_trace = go.Scatter(
    x=edge_x, y=edge_y,
    line=dict(width=0.5, color='#888'),
    hoverinfo='none',
    mode='lines'
)

# Create node traces
node_x = []
node_y = []
node_text = []
for node in G.nodes():
    x, y = pos[node]
    node_x.append(x)
    node_y.append(y)
    node_text.append(f"{node}<br>In-degree: {G.in_degree(node)}<br>Out-degree: {G.out_degree(node)}")

node_trace = go.Scatter(
    x=node_x, y=node_y,
    mode='markers+text',
    hoverinfo='text',
    text=[node for node in G.nodes()],
    textposition="top center",
    hovertext=node_text,
    marker=dict(
        showscale=True,
        colorscale='YlGnBu',
        color=[G.degree(node) for node in G.nodes()],
        size=10,
        colorbar=dict(
            thickness=15,
            # `titleside` is deprecated; set the side on the title dict instead
            title=dict(text='Node Connections', side='right'),
            xanchor='left'
        ),
        line_width=2
    )
)

# Create figure
fig = go.Figure(data=[edge_trace, node_trace],
                layout=go.Layout(
                    title='Interactive Dependency Graph',
                    showlegend=False,
                    hovermode='closest',
                    margin=dict(b=0,l=0,r=0,t=40),
                    xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
                    yaxis=dict(showgrid=False, zeroline=False, showticklabels=False)
                ))

fig.show()
print("\nHover over nodes to see details. Darker nodes have more connections.")
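
The edge trace above relies on a Plotly convention: a single `Scatter` trace is drawn as one polyline, and a `None` entry breaks the line. The sketch below isolates that step so the resulting lists can be inspected; `edge_segments` and the demo positions are hypothetical names and values used only for illustration.

```python
def edge_segments(pos, edges):
    """Flatten edges into x/y lists with None between segments.

    Each edge contributes its two endpoints followed by None, so the trace
    renders every edge as its own separate line segment.
    """
    xs, ys = [], []
    for source, target in edges:
        x0, y0 = pos[source]
        x1, y1 = pos[target]
        xs.extend([x0, x1, None])
        ys.extend([y0, y1, None])
    return xs, ys


# Hypothetical layout positions and edges, for illustration only
pos_demo = {"a.py": (0.0, 0.0), "b.py": (1.0, 0.5), "c.py": (0.5, 1.0)}
edge_x_demo, edge_y_demo = edge_segments(
    pos_demo, [("a.py", "b.py"), ("b.py", "c.py")]
)
print(edge_x_demo)  # [0.0, 1.0, None, 1.0, 0.5, None]
print(edge_y_demo)  # [0.0, 0.5, None, 0.5, 1.0, None]
```

Without the `None` separators, the trace would also draw a line from the end of each edge to the start of the next one.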

## 5. Circular Dependency Visualization

In [None]:
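# Find circular dependencies
# Note: this pattern assumes File nodes are linked directly by IMPORTS;
# the earlier queries use File-[:IMPORTS]->Module, so adjust it to your schema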
query = """
MATCH (f1:File)-[:IMPORTS*2..5]->(f2:File)-[:IMPORTS*1..3]->(f1)
WHERE elementId(f1) < elementId(f2)
RETURN DISTINCT f1.filePath AS file1, f2.filePath AS file2
LIMIT 10
"""

result = db.execute_query(query)

if result:
    print(f"Found {len(result)} circular dependencies:")
    
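    # Visualize up to the first five file pairs that take part in a cycle
    cycle_graph = nx.DiGraph()
    for record in result[:5]:
        f1 = record['file1'].split('/')[-1]
        f2 = record['file2'].split('/')[-1]
        # Draw both directions to mark the pair as cyclic; the actual import
        # paths may pass through intermediate files
        cycle_graph.add_edge(f1, f2)
        cycle_graph.add_edge(f2, f1)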
    
    plt.figure(figsize=(10, 8))
    pos = nx.circular_layout(cycle_graph)
    nx.draw_networkx(cycle_graph, pos, 
                     node_color='red', 
                     node_size=1000,
                     font_size=10,
                     font_weight='bold',
                     arrows=True,
                     arrowsize=20,
                     edge_color='darkred',
                     width=2)
    plt.title("Circular Dependencies (⚠️ Break These!)", fontsize=14, color='red')
    plt.axis('off')
    plt.tight_layout()
    plt.show()
else:
    print("✓ No circular dependencies found!")
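
The variable-length pattern in the query above starts at two hops, so a direct mutual import (A imports B and B imports A) may not be matched. As a complementary check, the sketch below looks for such pairs in a plain edge list; `mutual_imports` is a hypothetical helper and the demo edges are made up.

```python
def mutual_imports(edges):
    """Return sorted (a, b) pairs, a < b, where a imports b and b imports a."""
    edge_set = set(edges)
    pairs = {
        (min(a, b), max(a, b))
        for a, b in edge_set
        if a != b and (b, a) in edge_set
    }
    return sorted(pairs)


# Hypothetical edges, for illustration only
demo_edges = [("a.py", "b.py"), ("b.py", "a.py"), ("b.py", "c.py")]
print(mutual_imports(demo_edges))  # [('a.py', 'b.py')]
```

It can be applied to the import graph from section 2 with `mutual_imports(G.edges())`. For longer cycles on that same graph, `nx.simple_cycles(G)` is one option, though it can be slow on large graphs.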

## 6. Export for External Tools

In [None]:
# Export to GEXF (for Gephi)
nx.write_gexf(G, "dependency_graph.gexf")
print("✓ Exported to dependency_graph.gexf (open in Gephi)")

# Export to GraphML (for yEd)
nx.write_graphml(G, "dependency_graph.graphml")
print("✓ Exported to dependency_graph.graphml (open in yEd)")

# Export to CSV
edges_df = pd.DataFrame([(u, v) for u, v in G.edges()], columns=['source', 'target'])
edges_df.to_csv("dependency_edges.csv", index=False)
print("✓ Exported to dependency_edges.csv")
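
Graphviz reads plain-text DOT files, so the same edge list can also be exported without extra packages. This is a minimal sketch under that assumption; `edges_to_dot`, the graph name, and the sample edges are illustrative and not part of the notebook's libraries.

```python
def edges_to_dot(edges, graph_name="dependencies"):
    """Render directed edges as Graphviz DOT text."""
    def quote(name):
        # Inside a double-quoted DOT ID, embedded double quotes are escaped
        return '"' + str(name).replace('"', '\\"') + '"'

    lines = [f"digraph {graph_name} {{"]
    for source, target in edges:
        lines.append(f"    {quote(source)} -> {quote(target)};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical edges, for illustration only
dot_text = edges_to_dot([("main.py", "utils.py"), ("utils.py", "config.py")])
print(dot_text)
```

Writing `edges_to_dot(G.edges())` to a file such as `dependency_graph.dot` would then allow rendering with the Graphviz command line, for example `dot -Tsvg dependency_graph.dot -o dependency_graph.svg`.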

## Cleanup

In [None]:
db.close()
print("✓ Connection closed")

## Summary

This notebook demonstrated:

1. **Neo4j Browser**: Native graph visualization
2. **NetworkX**: Python-based graph analysis and visualization
3. **Matplotlib**: Static graph drawings and a color-coded complexity chart
4. **Plotly**: Interactive visualizations
5. **Export**: Sharing graphs with external tools

## Visualization Best Practices

- Limit node count (25-100) for readable visualizations
- Use color to encode information (complexity, centrality, etc.)
- Add interactivity for exploration
- Export to specialized tools (Gephi, yEd) for large graphs
- Focus on specific subgraphs rather than the entire codebase
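
One way to apply the node-count and subgraph advice above is to keep only the most connected files before plotting. The sketch below works on a plain edge list; `top_degree_edges` and the sample edges are hypothetical and only illustrate the idea.

```python
from collections import Counter


def top_degree_edges(edges, max_nodes):
    """Keep only edges whose endpoints are both among the most connected nodes."""
    edges = list(edges)
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    keep = {node for node, _ in degree.most_common(max_nodes)}
    return [(s, t) for s, t in edges if s in keep and t in keep]


# Hypothetical edges, for illustration only
demo = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("a", "b"), ("d", "e")]
print(top_degree_edges(demo, max_nodes=3))
# [('a', 'hub'), ('b', 'hub'), ('a', 'b')]
```

With the NetworkX graph from section 2, a comparable result can be obtained by selecting the same nodes and calling `G.subgraph(...)` on them.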

## Recommended Tools

- **Neo4j Browser**: Quick exploration
- **Neo4j Bloom**: Business-friendly visualization
- **Gephi**: Advanced graph analysis and visualization
- **yEd**: Professional diagram layout
- **Graphviz**: Automated layout algorithms

## Next Steps

- Explore `04_batch_analysis.ipynb` for multi-project analysis
- Try different layout algorithms (circular, hierarchical, force-directed)
- Create custom visualizations for specific patterns in your code