# Build Graphs and Networks

This notebook demonstrates how to build graphs and networks for perturbation analysis with `perturblab`, query them, and export them to other formats.

## Features
- Build co-expression networks
- Create perturbation graphs from Gene Ontology (GO) annotations
- Construct gene-gene interaction networks
- Convert graphs to different formats


In [1]:
import numpy as np
import pandas as pd
from perturblab.types import GeneVocab
from perturblab.methods.gears import build_perturbation_graph

# Create a gene vocabulary
genes = ['TP53', 'BRCA1', 'KRAS', 'MYC', 'EGFR', 'BRCA2', 'CDKN2A', 'PTEN']
gene_vocab = GeneVocab(genes)
print(f"Gene vocabulary: {len(gene_vocab)} genes")


Gene vocabulary: 8 genes


## Build Perturbation Graph from GO


In [2]:
# Build perturbation graph using GO annotations
pert_graph = build_perturbation_graph(
    gene_vocab,
    similarity='jaccard',
    threshold=0.1,
    num_workers=1,
    show_progress=True
)

print("Perturbation graph:")
print(f"  Nodes: {pert_graph.n_nodes}")
print(f"  Edges: {pert_graph.n_unique_edges}")
print(f"  Average degree: {2 * pert_graph.n_unique_edges / pert_graph.n_nodes:.2f}")

# Query the graph
print("\nExample queries:")
for gene in genes[:3]:
    neighbors = pert_graph.neighbors(gene)
    weights = pert_graph.get_weights(gene)  # Returns numpy array
    if len(weights) > 0:
        print(f"  {gene}: {len(neighbors)} neighbors, avg weight: {np.mean(weights):.3f}")
    else:
        print(f"  {gene}: {len(neighbors)} neighbors, no weights")


[perturblab] [INFO] 🧬 Building GEARS perturbation graph
[perturblab] [INFO]    Using provided GeneVocab: 8 genes
[perturblab] [INFO]    📖 Loading GO annotations: gene2go_all.pkl
[perturblab] [INFO]    Total genes in GO database: 67,832
[perturblab] [INFO]    ✓ Genes with GO annotations: 8
[perturblab] [INFO]    🔄 Computing pairwise gene similarities...
[perturblab] [INFO] 🧬 Building gene similarity network from GO annotations
[perturblab] [INFO]    Genes: 8
[perturblab] [INFO]    GO terms: 568
[perturblab] [INFO]    Gene-GO edges: 706
[perturblab] [INFO] 🔄 Projecting bipartite graph: 8 source nodes, 568 target nodes
[perturblab] [INFO] 📊 Retrieving neighbors for all source nodes...
[perturblab] [INFO] 🧮 Computing pairwise similarities (method=jaccard, threshold=0.1)...


Computing similarities: 100%|██████████| 8/8 [00:00<00:00, 7956.94it/s]

[perturblab] [INFO] ✅ Found 2 edges above threshold 0.1
[perturblab] [INFO] 📈 Created undirected graph: 2 unique edges, 4 total edges (undirected)
[perturblab] [INFO]    🔧 Building graph structure...
[perturblab] [INFO] ✅ GEARS perturbation graph built successfully:
[perturblab] [INFO]    Nodes: 6
[perturblab] [INFO]    Edges: 2
[perturblab] [INFO]    Average degree: 0.7
[perturblab] [INFO]    Similarity: jaccard, threshold: 0.1
Perturbation graph:
  Nodes: 6
  Edges: 2
  Average degree: 0.67

Example queries:
  TP53: 1 neighbors, avg weight: 0.107
  BRCA1: 1 neighbors, avg weight: 0.117
  KRAS: 0 neighbors, no weights
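
The log above reports Jaccard similarities computed over each gene's GO terms, with edges kept when the score is above the threshold. The sketch below illustrates that idea in plain Python. It is not `perturblab`'s implementation: the helper names `jaccard` and `jaccard_edges` are hypothetical, the annotations are made-up term IDs rather than real GO data, and the strict `>` comparison is an assumption based on the "above threshold" wording in the log.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def jaccard_edges(annotations: dict, threshold: float) -> list:
    """Return (gene_a, gene_b, weight) for every gene pair scoring above threshold."""
    edges = []
    for g1, g2 in combinations(sorted(annotations), 2):
        weight = jaccard(annotations[g1], annotations[g2])
        if weight > threshold:  # assumption: strict comparison
            edges.append((g1, g2, weight))
    return edges


# Toy annotations: hypothetical term IDs, not real GO data.
toy_go = {
    "GENE_A": {"T1", "T2", "T3", "T4"},
    "GENE_B": {"T3", "T4", "T5"},
    "GENE_C": {"T6", "T7"},
}

edges = jaccard_edges(toy_go, threshold=0.1)

# Average degree of an undirected graph: 2 * (unique edges) / (nodes).
avg_degree = 2 * len(edges) / len(toy_go)

print(edges)
print(f"Average degree: {avg_degree:.2f}")
```

Here `GENE_A` and `GENE_B` share 2 of 5 distinct terms, so their score is 0.4 and the pair becomes an edge, while `GENE_C` shares nothing and stays isolated. The same average-degree formula applied to the run above (2 unique edges, 6 nodes) gives 2 × 2 / 6 ≈ 0.67, matching the printed value.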





## Convert Graph to DataFrame


In [3]:
from perturblab.methods.gears import weighted_graph_to_dataframe

# Convert to edge list DataFrame
edge_df = weighted_graph_to_dataframe(pert_graph, include_node_names=True)
print(f"Edge DataFrame shape: {edge_df.shape}")
print("\nFirst 5 edges:")
print(edge_df.head())

# Can also convert to numeric indices
edge_df_numeric = weighted_graph_to_dataframe(pert_graph, include_node_names=False)
print(f"\nNumeric edge list shape: {edge_df_numeric.shape}")
print(edge_df_numeric.head())


Edge DataFrame shape: (4, 3)

First 5 edges:
  source target    weight
0  BRCA1  BRCA2  0.116667
1    MYC   TP53  0.107280
2  BRCA2  BRCA1  0.116667
3   TP53    MYC  0.107280

Numeric edge list shape: (4, 3)
   source  target    weight
0       1       5  0.116667
1       3       0  0.107280
2       5       1  0.116667
3       0       3  0.107280
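
The edge list above stores each undirected edge twice, once per direction. As a possible follow-up, the sketch below uses only pandas and NumPy to collapse it to one row per edge and to build a dense adjacency matrix. The DataFrame is retyped by hand from the printed output, so it runs without `perturblab`. The name-to-index mapping assumes that the numeric IDs are positions in the `genes` list, which is consistent with the output shown but is not confirmed by the library's documentation.

```python
import numpy as np
import pandas as pd

genes = ['TP53', 'BRCA1', 'KRAS', 'MYC', 'EGFR', 'BRCA2', 'CDKN2A', 'PTEN']

# Edge list copied from the printed output (both directions present).
edge_df = pd.DataFrame({
    "source": ["BRCA1", "MYC", "BRCA2", "TP53"],
    "target": ["BRCA2", "TP53", "BRCA1", "MYC"],
    "weight": [0.116667, 0.107280, 0.116667, 0.107280],
})

# 1) One row per undirected edge: keep the direction where source sorts first.
unique_edges = edge_df[edge_df["source"] < edge_df["target"]].reset_index(drop=True)

# 2) Map gene names to integer positions (assumed to mirror the numeric edge list).
name_to_idx = {gene: i for i, gene in enumerate(genes)}
rows = edge_df["source"].map(name_to_idx).to_numpy()
cols = edge_df["target"].map(name_to_idx).to_numpy()

# 3) Dense adjacency matrix; symmetric because both directions are listed.
adj = np.zeros((len(genes), len(genes)))
adj[rows, cols] = edge_df["weight"].to_numpy()
adj_df = pd.DataFrame(adj, index=genes, columns=genes)

print(unique_edges)
print(adj_df.loc[["TP53", "MYC"], ["TP53", "MYC"]])
```

A dense matrix is convenient for a handful of genes. For vocabularies with thousands of genes, a sparse representation would likely be a better fit.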
