# High-Impact Examples: Converting Directed Graphs to Undirected Graphs

This notebook demonstrates the **prime graph technique** for converting directed graphs to undirected bipartite graphs while preserving the graph structure and properties.

## Key Concept
For each directed edge `u → v`, we create:
- A prime node `v'` (one per target node, shared by every edge that points into `v`)
- Two undirected edges: `u — v'` and `v' — v`

This creates a bipartite graph where:
- Original nodes are one partition
- Prime nodes are the other partition
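- The directed edges can be recovered from the undirected graph, provided the original has no self-loops (a self-loop `u → u` collapses onto the edge between `u` and `u'`)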
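
As a quick illustration of the construction above, the following sketch builds the prime graph of a small made-up graph by hand. The toy edge list and the variable names are assumptions for illustration; the `'p'` suffix for prime nodes mirrors the helper functions used in this notebook.

```python
import networkx as nx

# Hypothetical toy graph with edges 1 -> 2, 3 -> 2, 2 -> 3
toy_edges = [("1", "2"), ("3", "2"), ("2", "3")]

toy_prime = nx.Graph()
for u, v in toy_edges:
    prime = v + "p"               # prime node v', shared by every edge into v
    toy_prime.add_edge(u, prime)  # u - v'
    toy_prime.add_edge(prime, v)  # v' - v

toy_primes = {n for n in toy_prime if n.endswith("p")}
toy_originals = set(toy_prime) - toy_primes

print("Original nodes:", sorted(toy_originals))
print("Prime nodes:   ", sorted(toy_primes))
print("Edges:         ", sorted(tuple(sorted(e)) for e in toy_prime.edges()))
print("Bipartite:     ", nx.is_bipartite(toy_prime))
```

Node `2` receives two edges but gets a single prime node `2p`, so three directed edges become five undirected edges rather than six.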

---

In [None]:
import networkx as nx
import matplotlib.pyplot as plt
import numpy as np
import re
from matplotlib.patches import FancyBboxPatch

# Set up plotting style
plt.style.use('seaborn-v0_8-darkgrid')
%matplotlib inline

## Helper Functions

In [None]:
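def directed_to_prime_graph(G):
    """
    Convert a directed graph to an undirected bipartite graph using prime nodes.

    Node labels are converted to strings, and the prime node of `v` is
    named `str(v) + 'p'`. Labels that already end in 'p' would collide with
    this scheme, and self-loops cannot be represented.
    
    Args:
        G: NetworkX DiGraph
    
    Returns:
        H: NetworkX Graph (undirected bipartite)
    """
    H = nx.Graph()
    
    # Keep every original node so that isolated nodes are not dropped
    H.add_nodes_from(str(node) for node in G.nodes())
    
    for src, tar in G.edges():
        src_str = str(src)
        tar_str = str(tar)
        prime_node = tar_str + 'p'
        
        # Add two undirected edges: src - tar' and tar' - tar
        H.add_edge(src_str, prime_node)
        H.add_edge(prime_node, tar_str)
    
    return H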


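def prime_graph_to_directed(H):
    """
    Convert a prime graph back to the original directed graph.

    Prime nodes are recognised by their trailing 'p'. Node labels in the
    result are strings, so the result matches the original up to relabeling.
    
    Args:
        H: NetworkX Graph (undirected bipartite)
    
    Returns:
        DG: NetworkX DiGraph
    """
    DG = nx.DiGraph()
    
    # Original nodes are the ones without the trailing prime marker
    DG.add_nodes_from(n for n in H.nodes() if not n.endswith('p'))
    
    for u, v in H.edges():
        # Every edge joins an original node and a prime node; put the original first
        if u.endswith('p'):
            u, v = v, u
        src = u
        tar = v[:-1]  # strip only the trailing marker, not every 'p' in the name
        
        # The edge tar - tar' is structural and carries no direction
        if src != tar:
            DG.add_edge(src, tar)
    
    return DG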


def visualize_conversion(G, H, title="Graph Conversion", figsize=(18, 6)):
    """
    Visualize the directed graph, its prime graph representation, and the recovered graph.
    """
    fig, axes = plt.subplots(1, 3, figsize=figsize)
    
    # Original directed graph
    ax1 = axes[0]
    pos1 = nx.spring_layout(G, seed=42)
    nx.draw_networkx_nodes(G, pos=pos1, node_color='lightblue', 
                          node_size=800, ax=ax1)
    nx.draw_networkx_edges(G, pos=pos1, edge_color='gray', 
                          arrows=True, arrowsize=20, 
                          arrowstyle='->', ax=ax1,
                          connectionstyle='arc3,rad=0.1')
    nx.draw_networkx_labels(G, pos=pos1, font_size=12, font_weight='bold', ax=ax1)
    ax1.set_title("Original Directed Graph", fontsize=14, fontweight='bold')
    ax1.axis('off')
    
    # Prime graph (bipartite)
    ax2 = axes[1]
    # Separate original and prime nodes (prime nodes end with the 'p' marker)
    original_nodes = [n for n in H.nodes() if not n.endswith('p')]
    prime_nodes = [n for n in H.nodes() if n.endswith('p')]
    
    top_nodes = set(original_nodes)
    pos2 = nx.bipartite_layout(H, top_nodes)
    
    nx.draw_networkx_nodes(H, pos=pos2, nodelist=original_nodes,
                          node_color='lightblue', node_size=800, ax=ax2)
    nx.draw_networkx_nodes(H, pos=pos2, nodelist=prime_nodes,
                          node_color='lightcoral', node_size=600, 
                          node_shape='s', ax=ax2)
    nx.draw_networkx_edges(H, pos=pos2, edge_color='gray', ax=ax2)
    nx.draw_networkx_labels(H, pos=pos2, font_size=10, ax=ax2)
    ax2.set_title("Prime Graph (Undirected Bipartite)", fontsize=14, fontweight='bold')
    ax2.axis('off')
    
    # Recovered directed graph
    ax3 = axes[2]
    DG_recovered = prime_graph_to_directed(H)
    pos3 = nx.spring_layout(DG_recovered, seed=42)
    nx.draw_networkx_nodes(DG_recovered, pos=pos3, node_color='lightgreen', 
                          node_size=800, ax=ax3)
    nx.draw_networkx_edges(DG_recovered, pos=pos3, edge_color='gray',
                          arrows=True, arrowsize=20,
                          arrowstyle='->', ax=ax3,
                          connectionstyle='arc3,rad=0.1')
    nx.draw_networkx_labels(DG_recovered, pos=pos3, font_size=12, 
                           font_weight='bold', ax=ax3)
    
    # Verify isomorphism
    is_isomorphic = nx.is_isomorphic(G, DG_recovered)
    color = 'green' if is_isomorphic else 'red'
    ax3.set_title(f"Recovered Graph\n(Isomorphic: {is_isomorphic})", 
                 fontsize=14, fontweight='bold', color=color)
    ax3.axis('off')
    
    plt.suptitle(title, fontsize=16, fontweight='bold', y=1.02)
    plt.tight_layout()
    plt.show()
    
    return is_isomorphic

---
## Example 1: Citation Network (Academic Papers)

**Use Case**: Citation networks are naturally directed (Paper A cites Paper B). Converting to undirected allows:
- Using undirected graph algorithms (community detection, clustering)
- Analyzing co-citation patterns
- Finding influential papers using bipartite metrics

In [None]:
# Create a citation network
# Papers (node labels): Foundation, Theory_A, Theory_B, Application, Extension
citation_graph = nx.DiGraph()
citation_graph.add_edges_from([
    ('Foundation', 'Theory_A'),
    ('Foundation', 'Theory_B'),
    ('Theory_A', 'Application'),
    ('Theory_B', 'Application'),
    ('Application', 'Extension'),
    ('Theory_A', 'Extension')
])

# Convert to prime graph
citation_prime = directed_to_prime_graph(citation_graph)

# Visualize
print("Citation Network Analysis")
print("=" * 50)
print(f"Original directed graph: {citation_graph.number_of_nodes()} nodes, {citation_graph.number_of_edges()} edges")
print(f"Prime graph: {citation_prime.number_of_nodes()} nodes, {citation_prime.number_of_edges()} edges")
print()

is_iso = visualize_conversion(citation_graph, citation_prime, 
                              title="Example 1: Citation Network")

# Analyze the prime graph
print("\nPrime Graph Properties:")
print(f"Is bipartite: {nx.is_bipartite(citation_prime)}")
# A bipartite graph contains no triangles, so this value is always 0.000
print(f"Average clustering coefficient: {nx.average_clustering(citation_prime):.3f}")

---
## Example 2: Gene Regulatory Network

**Use Case**: Gene regulation is directional (Gene A regulates Gene B). Converting to undirected enables:
- Finding gene modules using community detection
- Identifying hub genes in the bipartite representation
- Applying spectral methods that require undirected graphs
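
To make the spectral point concrete, here is a minimal sketch that computes the Laplacian spectrum of a small prime graph with NumPy. The edge list (including `TF9` and `Gene_Z`) is hypothetical and chosen only so that the graph has two components; the Laplacian is built by hand as `D - A` rather than through any particular library routine.

```python
import networkx as nx
import numpy as np

# Hypothetical regulatory edges (regulator -> target), for illustration only
reg_edges = [("TF1", "Gene_A"), ("TF1", "Gene_B"),
             ("TF2", "Gene_B"), ("TF9", "Gene_Z")]

reg_prime = nx.Graph()
for u, v in reg_edges:
    reg_prime.add_edge(u, v + "p")
    reg_prime.add_edge(v + "p", v)

# Unnormalized Laplacian L = D - A; symmetric because the prime graph is undirected
adjacency = nx.to_numpy_array(reg_prime)
laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
eigenvalues = np.linalg.eigvalsh(laplacian)

# The multiplicity of eigenvalue 0 equals the number of connected components
n_zero = int(np.sum(np.abs(eigenvalues) < 1e-8))
print("Zero eigenvalues:    ", n_zero)
print("Connected components:", nx.number_connected_components(reg_prime))
```

The symmetric adjacency matrix is what allows `eigvalsh` to be used here; the adjacency matrix of the directed graph is generally not symmetric.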

In [None]:
# Create a gene regulatory network
gene_network = nx.DiGraph()

# Transcription factors -> Target genes
gene_network.add_edges_from([
    ('TF1', 'Gene_A'),
    ('TF1', 'Gene_B'),
    ('TF2', 'Gene_B'),
    ('TF2', 'Gene_C'),
    ('Gene_A', 'Gene_D'),
    ('Gene_B', 'Gene_D'),
    ('Gene_C', 'TF3'),
    ('TF3', 'Gene_D'),
    ('Gene_D', 'Gene_E')
])

# Convert to prime graph
gene_prime = directed_to_prime_graph(gene_network)

print("Gene Regulatory Network Analysis")
print("=" * 50)
print(f"Original directed graph: {gene_network.number_of_nodes()} nodes, {gene_network.number_of_edges()} edges")
print(f"Prime graph: {gene_prime.number_of_nodes()} nodes, {gene_prime.number_of_edges()} edges")
print()

is_iso = visualize_conversion(gene_network, gene_prime,
                              title="Example 2: Gene Regulatory Network")

# Find most connected genes in prime graph
print("\nNode degrees in prime graph:")
degrees = dict(gene_prime.degree())
for node, degree in sorted(degrees.items(), key=lambda x: x[1], reverse=True)[:5]:
    node_type = "Prime" if node.endswith('p') else "Original"
    print(f"  {node} ({node_type}): degree {degree}")

---
## Example 3: Web Page Ranking Network

**Use Case**: Web links are directed (Page A links to Page B). Converting to undirected allows:
- Using algorithms designed for undirected graphs
- Analyzing link patterns with bipartite projections
- Computing different centrality measures

In [None]:
# Create a web link network
web_graph = nx.DiGraph()

# Simulate a small web with hub and authority pages
web_graph.add_edges_from([
    ('Home', 'About'),
    ('Home', 'Products'),
    ('Home', 'Blog'),
    ('About', 'Contact'),
    ('Products', 'Product_A'),
    ('Products', 'Product_B'),
    ('Blog', 'Post_1'),
    ('Blog', 'Post_2'),
    ('Post_1', 'Products'),
    ('Post_2', 'About')
])

# Convert to prime graph
web_prime = directed_to_prime_graph(web_graph)

print("Web Page Network Analysis")
print("=" * 50)
print(f"Original directed graph: {web_graph.number_of_nodes()} nodes, {web_graph.number_of_edges()} edges")
print(f"Prime graph: {web_prime.number_of_nodes()} nodes, {web_prime.number_of_edges()} edges")
print()

is_iso = visualize_conversion(web_graph, web_prime,
                              title="Example 3: Web Page Network")

# Compare PageRank in original vs centrality in prime graph
print("\nPageRank in original directed graph:")
pagerank = nx.pagerank(web_graph)
for node, score in sorted(pagerank.items(), key=lambda x: x[1], reverse=True)[:5]:
    print(f"  {node}: {score:.3f}")

print("\nDegree centrality in prime graph (original nodes only):")
centrality = nx.degree_centrality(web_prime)
original_nodes_centrality = {k: v for k, v in centrality.items() if not k.endswith('p')}
for node, score in sorted(original_nodes_centrality.items(), key=lambda x: x[1], reverse=True)[:5]:
    print(f"  {node}: {score:.3f}")

---
## Example 4: Workflow/Task Dependency Graph (DAG)

**Use Case**: Task dependencies form a DAG (Task A must complete before Task B). Converting to undirected enables:
- Using bipartite matching algorithms for resource allocation
- Finding critical paths using undirected algorithms
- Analyzing workflow patterns with clustering methods
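
As a rough sketch of the bipartite matching idea, the snippet below runs NetworkX's Hopcroft-Karp matching on the prime graph of a small, hypothetical task list. How a matching maps onto an actual resource allocation problem depends on the application and is not modeled here.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Hypothetical task dependencies, for illustration only
task_edges = [("Extract", "Validate"), ("Validate", "Clean"),
              ("Validate", "Filter"), ("Clean", "Merge"), ("Filter", "Merge")]

tasks_prime = nx.Graph()
for u, v in task_edges:
    tasks_prime.add_edge(u, v + "p")
    tasks_prime.add_edge(v + "p", v)

prime_side = {n for n in tasks_prime if n.endswith("p")}

# Maximum matching; top_nodes names one side of the bipartition
matching = bipartite.maximum_matching(tasks_prime, top_nodes=prime_side)

# The result maps each matched node to its partner in both directions
pairs = {(p, matching[p]) for p in prime_side if p in matching}
print(f"Matched {len(pairs)} of {len(prime_side)} prime nodes")
```

Every prime node `v'` is adjacent to its own `v`, so a matching that covers all prime nodes always exists in a prime graph.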

In [None]:
# Create a workflow DAG
workflow = nx.DiGraph()

# Data pipeline workflow
workflow.add_edges_from([
    ('Extract_Data', 'Validate'),
    ('Validate', 'Clean'),
    ('Validate', 'Filter'),
    ('Clean', 'Transform'),
    ('Filter', 'Transform'),
    ('Transform', 'Aggregate'),
    ('Aggregate', 'Load_DB'),
    ('Aggregate', 'Generate_Report'),
    ('Load_DB', 'Notify'),
    ('Generate_Report', 'Notify')
])

# Convert to prime graph
workflow_prime = directed_to_prime_graph(workflow)

print("Workflow Dependency Graph Analysis")
print("=" * 50)
print(f"Original DAG: {workflow.number_of_nodes()} nodes, {workflow.number_of_edges()} edges")
print(f"Prime graph: {workflow_prime.number_of_nodes()} nodes, {workflow_prime.number_of_edges()} edges")
print(f"Is DAG: {nx.is_directed_acyclic_graph(workflow)}")
print()

is_iso = visualize_conversion(workflow, workflow_prime,
                              title="Example 4: Workflow DAG", figsize=(20, 6))

# Analyze workflow structure
print("\nWorkflow Analysis:")
print(f"Topological generations: {list(nx.topological_generations(workflow))}")
print(f"\nBottleneck tasks (highest in-degree):")
in_degrees = dict(workflow.in_degree())
for node, degree in sorted(in_degrees.items(), key=lambda x: x[1], reverse=True)[:3]:
    print(f"  {node}: {degree} dependencies")

---
## Example 5: Social Network (Follow/Follower Relationships)

**Use Case**: Social media follows are directed (User A follows User B). Converting to undirected enables:
- Community detection algorithms that require undirected graphs
- Finding mutual connections through bipartite analysis
- Influence analysis using undirected centrality measures
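
The community detection use case can be sketched as follows, using NetworkX's greedy modularity method on a prime graph. The follow edges and user names are hypothetical, and greedy modularity is only one of several algorithms that could be applied here.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical follow edges (follower -> followed), for illustration only
follow_edges = [("Alice", "Bob"), ("Bob", "Alice"), ("Carol", "Alice"),
                ("Dan", "Erin"), ("Erin", "Dan"), ("Fay", "Dan")]

follow_prime = nx.Graph()
for u, v in follow_edges:
    follow_prime.add_edge(u, v + "p")
    follow_prime.add_edge(v + "p", v)

# Modularity-based communities on the undirected prime graph
communities = greedy_modularity_communities(follow_prime)

# Report only the original users in each community, dropping prime nodes
user_groups = [sorted(n for n in c if not n.endswith("p")) for c in communities]
for i, group in enumerate(g for g in user_groups if g):
    print(f"Community {i}: {group}")
```

The communities contain both original and prime nodes, so the prime nodes are filtered out before the result is interpreted in terms of users.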

In [None]:
# Create a social network
social = nx.DiGraph()

# User follow relationships
social.add_edges_from([
    ('Alice', 'Bob'),
    ('Alice', 'Charlie'),
    ('Bob', 'Alice'),      # Mutual follow
    ('Bob', 'David'),
    ('Charlie', 'David'),
    ('Charlie', 'Eve'),
    ('David', 'Eve'),
    ('Eve', 'Alice'),
    ('Eve', 'Charlie'),    # Mutual follow
    ('Frank', 'Alice'),
    ('Frank', 'Bob')
])

# Convert to prime graph
social_prime = directed_to_prime_graph(social)

print("Social Network Analysis")
print("=" * 50)
print(f"Original directed graph: {social.number_of_nodes()} nodes, {social.number_of_edges()} edges")
print(f"Prime graph: {social_prime.number_of_nodes()} nodes, {social_prime.number_of_edges()} edges")
print()

is_iso = visualize_conversion(social, social_prime,
                              title="Example 5: Social Network (Follow Graph)")

# Analyze influence
print("\nInfluence Analysis (Original Graph):")
in_degree = dict(social.in_degree())
out_degree = dict(social.out_degree())
print("\nFollowers (in-degree):")
for node, degree in sorted(in_degree.items(), key=lambda x: x[1], reverse=True)[:3]:
    print(f"  {node}: {degree} followers")
print("\nFollowing (out-degree):")
for node, degree in sorted(out_degree.items(), key=lambda x: x[1], reverse=True)[:3]:
    print(f"  {node}: following {degree} users")

print("\nPrime Graph Centrality (original nodes):")
betweenness = nx.betweenness_centrality(social_prime)
original_nodes_betweenness = {k: v for k, v in betweenness.items() if not k.endswith('p')}
for node, score in sorted(original_nodes_betweenness.items(), key=lambda x: x[1], reverse=True)[:3]:
    print(f"  {node}: betweenness {score:.3f}")

---
## Example 6: Comparative Analysis - Random vs Scale-Free Networks

**Use Case**: Demonstrating that the prime graph technique works on different graph topologies and preserves structural properties.

In [None]:
# Random graph
random_graph = nx.gnp_random_graph(10, 0.3, directed=True, seed=42)
random_prime = directed_to_prime_graph(random_graph)

print("Random Graph (Erdős-Rényi)")
print("=" * 50)
is_iso = visualize_conversion(random_graph, random_prime,
                              title="Example 6a: Random Directed Graph")

# Scale-free graph (using Barabási-Albert and adding direction)
ba_undirected = nx.barabasi_albert_graph(10, 2, seed=42)
rng = np.random.default_rng(42)  # seeded so the edge directions are reproducible
scale_free = nx.DiGraph()
for u, v in ba_undirected.edges():
    # Randomly assign direction
    if rng.random() > 0.5:
        scale_free.add_edge(u, v)
    else:
        scale_free.add_edge(v, u)

scale_free_prime = directed_to_prime_graph(scale_free)

print("\nScale-Free Graph (Barabási-Albert)")
print("=" * 50)
is_iso = visualize_conversion(scale_free, scale_free_prime,
                              title="Example 6b: Scale-Free Directed Graph")

# Compare properties
print("\nComparative Statistics:")
print("\nRandom Graph:")
print(f"  Density: {nx.density(random_graph):.3f}")
print(f"  Is strongly connected: {nx.is_strongly_connected(random_graph)}")
print(f"  Number of strongly connected components: {nx.number_strongly_connected_components(random_graph)}")

print("\nScale-Free Graph:")
print(f"  Density: {nx.density(scale_free):.3f}")
print(f"  Is strongly connected: {nx.is_strongly_connected(scale_free)}")
print(f"  Number of strongly connected components: {nx.number_strongly_connected_components(scale_free)}")

---
## Key Insights and Applications

### Why Convert Directed to Undirected?

1. **Algorithm Compatibility**: Many powerful graph algorithms (community detection, certain spectral methods) require undirected graphs

2. **Preserves Information**: The prime graph technique is **lossless** for graphs without self-loops: every directed edge can be recovered, with node labels converted to strings

3. **Bipartite Structure**: Creates a bipartite representation that enables:
   - Bipartite matching algorithms
   - Projection-based analysis
   - Different perspective on graph structure

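4. **Computational Benefits**: Some operations may be faster on undirected graphs, although the prime graph is larger than the original (one extra node and one extra edge for each node that has incoming edges)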
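
When weighing the cost of the conversion, it helps to know how large the prime graph gets. The sketch below checks the size of a prime graph against a simple count on a loop-free random graph; the graph parameters are arbitrary, and the construction is repeated inline so the snippet runs on its own.

```python
import networkx as nx

# Loop-free random directed graph; size, density and seed are arbitrary
G_size = nx.gnp_random_graph(12, 0.2, directed=True, seed=7)

H_size = nx.Graph()
H_size.add_nodes_from(str(n) for n in G_size.nodes())  # keep isolated nodes too
for u, v in G_size.edges():
    H_size.add_edge(str(u), f"{v}p")
    H_size.add_edge(f"{v}p", str(v))

# k = number of nodes with at least one incoming edge (each gets one prime node)
k = sum(1 for _, d in G_size.in_degree() if d > 0)

n, m = G_size.number_of_nodes(), G_size.number_of_edges()
print(f"Directed graph: {n} nodes, {m} edges")
print(f"Prime graph:    {H_size.number_of_nodes()} nodes, {H_size.number_of_edges()} edges")
print(f"Expected:       {n + k} nodes, {m + k} edges")
```

For a graph without self-loops, the prime graph therefore has at most twice as many nodes as the original and at most `n` additional edges.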

### Real-World Applications

- **Biology**: Gene regulatory networks, protein interactions
- **Social Sciences**: Citation networks, social media
- **Computer Science**: Web graphs, dependency graphs
- **Operations Research**: Workflow optimization, resource allocation
- **Data Science**: Feature engineering, graph embeddings

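### Mathematical Properties Preserved

Because the directed edges can be recovered, the following properties of the original graph can be recomputed after converting back. They do not carry over directly to the prime graph itself: two nodes that only share a successor are connected there, even when no directed path joins them.

- Graph isomorphism (isomorphic directed graphs yield isomorphic prime graphs)
- Reachability information
- Path structures
- Connectivity patterns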
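
A small sketch may help separate what the prime graph shows directly from what has to be recovered first. The two-edge graph below is a made-up example, and the recovery loop is a simplified inline version that assumes prime nodes end in `'p'`.

```python
import networkx as nx

# Hypothetical graph: a -> c and b -> c (a and b both point to c)
D = nx.DiGraph([("a", "c"), ("b", "c")])

H = nx.Graph()
for u, v in D.edges():
    H.add_edge(u, v + "p")
    H.add_edge(v + "p", v)

# No directed path joins a and b in either direction ...
print("Directed path a->b or b->a:",
      nx.has_path(D, "a", "b") or nx.has_path(D, "b", "a"))
# ... but they are connected in the prime graph through the shared node "cp"
print("Connected in prime graph:  ", nx.has_path(H, "a", "b"))

# Recover the directed edges: each (original, prime) edge u - v' with u != v
recovered = nx.DiGraph()
for x, y in H.edges():
    if x.endswith("p"):
        x, y = y, x              # orient as (original, prime)
    if x != y[:-1]:
        recovered.add_edge(x, y[:-1])

print("Edges recovered exactly:   ", set(recovered.edges()) == set(D.edges()))
```

Directed reachability should therefore be computed on the recovered graph, not read off from connectivity in the prime graph.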

---

## Quantitative Verification

Let's verify that all conversions are truly lossless:

In [None]:
# Test all examples for isomorphism
examples = [
    ("Citation Network", citation_graph),
    ("Gene Regulatory Network", gene_network),
    ("Web Page Network", web_graph),
    ("Workflow DAG", workflow),
    ("Social Network", social),
    ("Random Graph", random_graph),
    ("Scale-Free Graph", scale_free)
]

print("Isomorphism Verification")
print("=" * 60)
print(f"{'Example':<30} {'Nodes':<10} {'Edges':<10} {'Isomorphic'}")
print("=" * 60)

all_lossless = True
for name, G in examples:
    H = directed_to_prime_graph(G)
    DG_recovered = prime_graph_to_directed(H)
    is_iso = nx.is_isomorphic(G, DG_recovered)
    all_lossless = all_lossless and is_iso
    
    status = "✓ Yes" if is_iso else "✗ No"
    print(f"{name:<30} {G.number_of_nodes():<10} {G.number_of_edges():<10} {status}")

print("=" * 60)
if all_lossless:
    print("\nAll conversions are lossless! The original directed graphs")
    print("can be perfectly recovered from their prime graph representations.")
else:
    print("\nAt least one graph was not recovered; see the table above.")
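
The check above covers only the seven examples. A broader, still informal check is sketched below: it round-trips a batch of random loop-free graphs and then shows a self-loop case that does not survive. The functions `to_prime`, `from_prime` and `round_trip_ok` are hypothetical stand-ins written for this snippet, assuming the same `'p'` suffix convention as the helpers in this notebook.

```python
import networkx as nx

def to_prime(G):
    """Hypothetical stand-in for the conversion (prime suffix 'p')."""
    H = nx.Graph()
    H.add_nodes_from(str(n) for n in G.nodes())
    for u, v in G.edges():
        H.add_edge(str(u), f"{v}p")
        H.add_edge(f"{v}p", str(v))
    return H

def from_prime(H):
    """Hypothetical stand-in for the recovery step."""
    D = nx.DiGraph()
    D.add_nodes_from(n for n in H if not n.endswith("p"))
    for x, y in H.edges():
        if x.endswith("p"):
            x, y = y, x          # orient as (original, prime)
        if x != y[:-1]:
            D.add_edge(x, y[:-1])
    return D

def round_trip_ok(G):
    """True if nodes and edges match exactly after converting labels to str."""
    D = from_prime(to_prime(G))
    same_edges = set(D.edges()) == {(str(u), str(v)) for u, v in G.edges()}
    same_nodes = set(D.nodes()) == {str(n) for n in G.nodes()}
    return same_edges and same_nodes

# Loop-free random graphs over a range of seeds and densities
results = [round_trip_ok(nx.gnp_random_graph(15, p, directed=True, seed=s))
           for s in range(20) for p in (0.05, 0.2, 0.5)]
print(f"Round trips passed: {sum(results)}/{len(results)}")

# A self-loop is not recoverable: u -> u collapses onto the edge u - u'
looped = nx.DiGraph([(0, 1), (1, 1)])
print("Self-loop survives:", round_trip_ok(looped))
```

This compares node and edge sets directly, which is stricter than the isomorphism test used above.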