# Helper functions

In [1]:
import igraph as ig
import random
from contextlib import contextmanager

ig.config["plotting.backend"] = "matplotlib"

_Note:_ igraph currently has two stable plotting backends, Cairo and Matplotlib, plus experimental support for Plotly. The Cairo backend depends on the pycairo or cairocffi libraries, which provide Python bindings to the Cairo library. We use the Matplotlib backend here to avoid additional dependencies, so we set `plotting.backend` in the config of each notebook that uses igraph's plotting functions.

---
Temporarily set the random state, restoring it afterwards.

For example:
```python
with local_random(seed=123):
    g.community_leiden()
```

In [2]:
@contextmanager
def local_random(seed=None):
    state = random.getstate()
    if seed is not None:
        random.seed(seed)
    try:
        yield
    finally:
        random.setstate(state)
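
The following sketch checks the behaviour described above using only Python's `random` module. It repeats the definition of `local_random` so that it runs on its own. Whether this also controls igraph's stochastic algorithms depends on igraph drawing its random numbers from Python's `random` module, which we assume here to be its default configuration.

```python
import random
from contextlib import contextmanager

@contextmanager
def local_random(seed=None):
    # Same definition as in the cell above, repeated so the snippet is self-contained
    state = random.getstate()
    if seed is not None:
        random.seed(seed)
    try:
        yield
    finally:
        random.setstate(state)

random.seed(2024)
expected_next = random.random()   # first value of the global stream after seeding

random.seed(2024)                 # rewind the global stream
with local_random(seed=123):
    first = random.random()       # drawn from the temporary, seeded stream
with local_random(seed=123):
    second = random.random()      # same seed, so the same value again

after = random.random()           # the global stream continues as if untouched
```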

---

This function is a thin wrapper that calls one of three `igraph` community detection algorithms (`multilevel`, `leiden`, or `fastgreedy`), selected by the `community_detection_method` parameter.
It passes `weights` to the algorithm only if `weight_attribute_name` exists among the graph's edge attributes, and it accepts a dictionary of extra parameters for the `leiden` algorithm.

In [3]:
def community_detection(graph: ig.Graph, community_detection_method: str = "multilevel", weight_attribute_name: str = "weight",
                        params: dict = None):
    # Use edge weights only if the named attribute exists on the graph's edges
    weights = weight_attribute_name if weight_attribute_name in graph.edge_attributes() else None
    if community_detection_method == "multilevel":
        return graph.community_multilevel(weights=weights)
    elif community_detection_method == "leiden":
        # Copy params so that the caller's dictionary is not modified
        leiden_params = dict(params) if params else {}
        leiden_params["weights"] = weights
        return graph.community_leiden(**leiden_params)
    elif community_detection_method == "fastgreedy":
        return graph.community_fastgreedy(weights=weights).as_clustering()
    else:
        # Fail loudly instead of silently returning None
        raise ValueError(f"Unknown community detection method: {community_detection_method}")

## Significance of community structure

### Method 1: Testing network structure with modularity

This function conducts a rewiring test to assess the statistical significance of your network's community structure.

It works by:

- Creating 500 randomized versions of the original graph, preserving the node degrees.

- Calculating the modularity of each randomized network's community structure.

- Returning a list of these 500 modularity values, which can then be compared to the modularity of the original graph.

In [4]:
def rewire(graph: ig.Graph, community_detection_method: str = "multilevel", params: dict = None):
    num_randomizations = 500  # Number of randomized networks to generate
    modularity_random_networks = []
    
    num_swaps_for_randomization = graph.ecount() * 10
    
    for _ in range(num_randomizations):
        # Graph.rewire() modifies the graph in place, so we must work on a copy.
        graph_random = graph.copy()

        graph_random.rewire(n=num_swaps_for_randomization)

        partition = community_detection(graph_random, community_detection_method, weight_attribute_name=None, params=params)
        modularity_random_networks.append(partition.modularity)

    return modularity_random_networks
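
To make the idea of degree-preserving randomization concrete, here is a minimal pure-Python sketch of a single-swap rewiring scheme on an edge list. It is an illustration of the general technique, not igraph's implementation; the function names `degree_preserving_rewire` and `degree_sequence` and the toy graph are hypothetical.

```python
import random
from collections import Counter

def degree_preserving_rewire(edges, num_swaps, rng):
    """Attempt `num_swaps` edge swaps on a simple undirected graph given as (u, v) tuples."""
    edges = list(edges)
    present = {frozenset(e) for e in edges}
    for _ in range(num_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        # Proposed swap: (a, b), (c, d) -> (a, d), (c, b)
        if a == d or c == b:
            continue  # would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # would create a multi-edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

def degree_sequence(edges):
    """Map each vertex to its degree."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return dict(counts)

# Toy graph: a ring of 10 vertices plus two chords
original = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5), (2, 7)]
rewired = degree_preserving_rewire(original, num_swaps=10 * len(original), rng=random.Random(42))
```

Every accepted swap removes one edge and adds one edge at each of the four endpoints, which is why the degree of every vertex stays the same.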

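This function tests the statistical significance of a network's community structure: it computes the modularity of the original graph, collects the modularity values of the rewired graphs with `rewire`, and plots both with `plot_histogram`.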

In [5]:
def test_community_structure(graph: ig.Graph, graph_name: str = "Karate Club Network", community_detection_method: str = "multilevel", 
                             params: dict = None):
    partition = community_detection(graph, community_detection_method, weight_attribute_name=None, params=params)
    modularity_orig = partition.modularity
    modularity_random_networks = rewire(graph, community_detection_method, params)
    plot_histogram(modularity_orig, modularity_random_networks, graph_name)
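
As a reminder of what the modularity score measures, the sketch below computes it by hand for an unweighted, undirected edge list, using `Q = sum over communities c of (e_c / m - (d_c / (2m))^2)`, where `m` is the number of edges, `e_c` the number of edges inside community `c`, and `d_c` the sum of degrees in `c`. The helper `modularity` and the toy graph are hypothetical and only meant to illustrate the formula.

```python
from collections import defaultdict

def modularity(edges, membership):
    """Modularity of an undirected, unweighted graph given as an edge list."""
    m = len(edges)
    internal_edges = defaultdict(int)  # e_c: edges with both endpoints in community c
    degree_sum = defaultdict(int)      # d_c: total degree of the vertices in community c
    for u, v in edges:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            internal_edges[membership[u]] += 1
    return sum(internal_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

# Toy graph: two triangles joined by one bridge edge
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]
two_groups = modularity(edges, [0, 0, 0, 1, 1, 1])  # one community per triangle
one_group = modularity(edges, [0, 0, 0, 0, 0, 0])   # everything in one community
```

Splitting the toy graph into its two triangles gives a clearly positive score, while putting all vertices into one community gives zero.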

This function, `plot_histogram`, generates a histogram to visually compare the modularity of an original network with the modularity of multiple randomized versions of the same network.

The purpose of this plot is to determine the statistical significance of the original network's community structure.

In [6]:
def plot_histogram(modularity_original: float, modularity_random_networks: list[float], graph_name: str="Karate Club Network"):
    import matplotlib.pyplot as plt
    
    plt.figure(figsize=(10, 6))
    plt.hist(modularity_random_networks, bins=30, alpha=0.7, color='lightgreen',
             edgecolor='black', label='Modularity of Randomized Networks')
    
    # Set x-axis limits from 0 to 1
    plt.xlim(0, 1)
    
    # Plot a vertical line for the original network's modularity
    plt.axvline(modularity_original, color='red', linestyle='dashed', linewidth=2,
                label=f'Original Network Modularity ({modularity_original:.4f})')
    
    plt.title(f'Modularity of Original vs. Randomized {graph_name} (igraph)')
    plt.xlabel('Modularity Score')
    plt.ylabel('Frequency')
    plt.legend()
    plt.grid(axis='y', alpha=0.75)
    plt.tight_layout()
    plt.show()
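
The histogram gives a visual impression; the same comparison can be summarized numerically. The sketch below shows one common way to do this, an empirical one-sided p-value with a +1 correction and a z-score. The helper names are hypothetical, and the null values are synthetic placeholders rather than results from any graph in this notebook.

```python
import statistics

def empirical_p_value(observed, null_values):
    """Share of null values at least as large as the observed one, with a +1 correction."""
    at_least = sum(1 for q in null_values if q >= observed)
    return (at_least + 1) / (len(null_values) + 1)

def z_score(observed, null_values):
    """Distance of the observed value from the null mean, in null standard deviations."""
    return (observed - statistics.mean(null_values)) / statistics.stdev(null_values)

# Synthetic placeholder values standing in for the output of rewire()
null_values = [0.25 + 0.0002 * i for i in range(500)]
observed = 0.42

p = empirical_p_value(observed, null_values)
z = z_score(observed, null_values)
```

With 500 randomized networks, the smallest p-value this estimate can report is 1/501, so more randomizations are needed if a finer resolution is required.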


### Method 2: Testing network structure with NMI values

This function, `check_key_existence`, checks if a list of keys exists in a dictionary.

In [7]:
def check_key_existence(keys, params):
    return all(key in params for key in keys)

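This function is a factory: it returns a callable that runs the chosen community detection algorithm on `graph`. The callable takes the run index as its only argument. For `leiden`, `params["resolution"]` must be a sequence with one resolution value per run, and the remaining entries of `params` are passed to `community_leiden` unchanged. For `multilevel`, the run index is ignored.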

In [8]:
def build_community_detection_method(graph, community_detection_method, params):
    if community_detection_method == "leiden":
        if params is None:
            raise ValueError("params must not be None for the leiden method")
        if not check_key_existence(["resolution"], params):
            raise KeyError("params must contain the key 'resolution'")
        # One resolution value per run: the callable's argument indexes into this list
        resolution_list = params["resolution"]
        leiden_other_params = {k: v for k, v in params.items() if k != "resolution"}
        community_detection = lambda run_idx: graph.community_leiden(resolution=resolution_list[run_idx], **leiden_other_params)
    elif community_detection_method == "multilevel":
        # The run index is ignored; pass the original params if any, or an empty dict
        community_detection = lambda _: graph.community_multilevel(**(params if params else {}))
    else:
        raise ValueError(f"Unknown community detection method: {community_detection_method}")

    return community_detection
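
The shape of `params` expected for the Leiden case is easy to get wrong, so here is a small sketch of the same factory pattern with a stand-in for the clustering call. `fake_leiden` and `build_runner` are hypothetical names used only so the snippet runs without a graph; the point is that `"resolution"` holds a list indexed by run number while all other keys are shared by every run.

```python
def fake_leiden(resolution, **other):
    # Stand-in for graph.community_leiden: simply echoes the arguments it received
    return {"resolution": resolution, **other}

def build_runner(params):
    """Split params into a per-run resolution list and keyword arguments shared by all runs."""
    resolutions = params["resolution"]
    shared = {k: v for k, v in params.items() if k != "resolution"}
    return lambda run_idx: fake_leiden(resolution=resolutions[run_idx], **shared)

params = {"resolution": [0.5, 1.0, 2.0], "objective_function": "modularity"}
runner = build_runner(params)

# One call per run; the list of resolutions must be at least as long as the number of runs
results = [runner(i) for i in range(len(params["resolution"]))]
```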

This function runs a community detection algorithm `num_runs` times and measures how consistent the results are by comparing each run to a reference partition. By default it returns a list of **Normalized Mutual Information (NMI)** scores, one per run, where a score of 1 means the partition is identical to the reference. If `return_partitions` is `True`, it returns the list of partitions instead.

In [9]:
def run_stochastic_community_detection(graph, reference_partition: ig.clustering.VertexClustering, num_runs: int, 
                                       community_detection_method: str = "multilevel", return_partitions: bool = False,
                                      params: dict = None):
    import random
    
    nmi_values = []
    all_partitions = []

    community_detection = build_community_detection_method(graph, community_detection_method, params)

    for i in range(num_runs):
        # Reseed without an argument so that each run starts from a fresh random state
        random.seed()
        current_partition = community_detection(i)
        all_partitions.append(current_partition)

        if not return_partitions:
            # Calculate NMI between the reference partition and the current partition
            # (method='nmi' selects Normalized Mutual Information)
            nmi = ig.compare_communities(reference_partition, current_partition, method='nmi')
            nmi_values.append(nmi)

    if return_partitions:
        return all_partitions
        
    return nmi_values
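
For readers who want to see what an NMI score is made of, the sketch below computes it from two membership lists in plain Python. It uses the normalization `2 * I(A; B) / (H(A) + H(B))`, which we assume corresponds to igraph's `method='nmi'`; the helper names are hypothetical, and returning 1 when both partitions consist of a single community is a convention chosen here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a membership list."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information between two membership lists of equal length."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    return sum(
        (n_ab / n) * math.log(n_ab * n / (count_a[x] * count_b[y]))
        for (x, y), n_ab in joint.items()
    )

def nmi(a, b):
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 and h_b == 0:
        return 1.0  # both partitions are one single community (convention assumed here)
    return 2 * mutual_information(a, b) / (h_a + h_b)

same_up_to_labels = nmi([0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0])  # only the labels differ
unrelated = nmi([0, 0, 1, 1], [0, 1, 0, 1])                      # knowing one tells nothing about the other
```

NMI depends only on how vertices are grouped, not on the community labels, which is why relabelled copies of the same partition score 1.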

This function, `plot_nmi_histogram`, creates a histogram to visualize the consistency of a stochastic community detection algorithm. It shows how often the algorithm produces similar community structures.

A histogram with a high mean indicates that the algorithm is highly consistent and reliably finds the same community structure, while a low mean suggests the results are highly variable and dependent on the random seed.

In [10]:
def plot_nmi_histogram(graph: ig.Graph, nmi_values: list[float], title: str):
    import numpy as np
    import matplotlib.pyplot as plt
    
    plt.figure(figsize=(10, 6))
    plt.hist(nmi_values, bins=20, edgecolor='black', alpha=0.7, color='lightcoral')
    plt.title(title)
    plt.xlabel('Normalized Mutual Information (NMI) Score')
    plt.ylabel('Frequency')
    plt.grid(axis='y', alpha=0.75)
    
    # Set x-axis limits from 0 to 1
    plt.xlim(0, 1)

    # Add a line for the mean NMI
    mean_nmi = np.mean(nmi_values)
    plt.axvline(mean_nmi, color='blue', linestyle='dashed', linewidth=2,
                label=f'Mean NMI: {mean_nmi:.4f}')

    plt.legend()
    plt.tight_layout()
    plt.show()

This function, `calculate_pairwise_nmi`, computes the **Normalized Mutual Information (NMI)** for every unique pair of community structures in a list.

In [11]:
def calculate_pairwise_nmi(partitions: list[ig.clustering.VertexClustering]):
    """
    Calculates Normalized Mutual Information (NMI) for all unique pairs of partitions.
    """
    import itertools
    
    pairwise_nmi_values = []
    for p1, p2 in itertools.combinations(partitions, 2):
        nmi = ig.compare_communities(p1, p2, method='nmi')
        pairwise_nmi_values.append(nmi)
    return pairwise_nmi_values
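
Because every unique pair is compared, the number of NMI computations grows quadratically with the number of partitions. The short sketch below, with the hypothetical helper `num_pairs` and placeholder labels instead of real partitions, shows how many pairs `itertools.combinations` produces.

```python
import itertools
import math

def num_pairs(num_partitions):
    """Number of unordered pairs that itertools.combinations(..., 2) yields."""
    return math.comb(num_partitions, 2)

# Placeholder labels standing in for partitions
runs = ["run0", "run1", "run2", "run3"]
pairs = list(itertools.combinations(runs, 2))

# 100 partitions already require 4950 pairwise comparisons
comparisons_for_100_runs = num_pairs(100)
```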

### Testing Significance of Community Structure on a Grid Graph

This function plots a graph on a grid and colors the nodes by their community membership. It's designed to visualize community detection results on networks with a grid-like structure.

In [12]:
def plot_leiden_on_grid(graph, grid_cols, communities, title="Graph with Leiden Communities", plot_size=(8, 8)):
    """
    Plots a graph in a grid layout with vertices colored by their community.
    The clustering itself is not computed here; it is passed in via `communities`.

    Args:
        graph (igraph.Graph): The graph to plot.
        grid_cols (int): The number of columns the grid graph has. Crucial for layout.
        communities (igraph.clustering.VertexClustering): The community detection result to visualize.
        title (str): Title for the plot.
        plot_size (tuple): Size of the matplotlib figure (width, height).

    Returns:
        None. The plot is displayed with `plt.show()`.
    """
    import matplotlib.pyplot as plt

    # Assign colors based on community membership
    palette = ig.GradientPalette("red", "blue", n=len(communities)) 
    if len(communities) > 1:
        vertex_colors = [palette.get(membership_id) for membership_id in communities.membership]
    else:
        vertex_colors = ["red"]
    
    # Generate the grid layout using the provided grid_cols
    layout = graph.layout_grid(width=grid_cols)
            
    fig, ax = plt.subplots(figsize=plot_size)

    ig.plot(
        graph,
        target=ax,
        layout=layout,
        vertex_size=10,
        vertex_color=vertex_colors, # Use community colors
        vertex_label=None,
        edge_width=0.8,
        edge_color="gray",
        bbox=(0, 0, 600, 600),
        margin=20
    )
    ax.set_title(title)
    ax.axis('off')
    plt.show()
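
The role of `grid_cols` can be illustrated without igraph. The sketch below assumes that vertices are numbered row by row and that a grid layout places vertex `i` in column `i % grid_cols` and row `i // grid_cols`; we believe this matches what `layout_grid` does, but treat it as an assumption. The function `grid_coordinates` is hypothetical.

```python
def grid_coordinates(num_vertices, grid_cols):
    """Place vertex i at (column, row), filling each row from left to right."""
    return [(i % grid_cols, i // grid_cols) for i in range(num_vertices)]

# Six vertices on a grid with three columns form two rows
coords = grid_coordinates(6, 3)
```

If `grid_cols` does not match the width used to generate the graph, neighbouring vertices no longer end up next to each other and the plot looks scrambled.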

## Community detection table

This function creates and displays an interactive table summarizing various igraph community detection algorithms and their features.

In [13]:
def show_table():
    import pandas as pd
    from itables import init_notebook_mode, show
    
    # This line enables itables to make all DataFrames interactive by default
    # If you only want specific tables to be interactive, remove this line and use show(df) explicitly
    init_notebook_mode(all_interactive=True)
    
    # Build the summary table: one entry per algorithm in each column
    data = {
        'Method': ['Edge Betweenness', 'Fast-Greedy', 'Fluid Communities', 'Infomap', 'Label Propagation', 'Leading Eigenvector', 'Leiden', 'Louvain (Multilevel)', 'Spinglass', 'Walktrap', 'Optimal Modularity', 'Voronoi'],
        'Function in igraph (Python)': ['`Graph.community_edge_betweenness()`', '`Graph.community_fastgreedy()`', '`Graph.community_fluid_communities()`', '`Graph.community_infomap()`', '`Graph.community_label_propagation()`', '`Graph.community_leading_eigenvector()`', '`Graph.community_leiden()`', '`Graph.community_multilevel()`', '`Graph.community_spinglass()`', '`Graph.community_walktrap()`', '`Graph.community_optimal_modularity()`', '`Graph.community_voronoi()`'],
        'Directed Graph Support': ['✅', '❌', '❌', '✅', '❌', '❌', '✅', '✅', '❌', '❌', '❌', '✅'],
        'Weighted Graph Support': ['✅', '✅', '❌ (weights ignored)', '✅', '✅', '✅', '✅', '✅', '✅', '✅', '✅', '✅'],
        'Signed Graph Support': ['❌', '❌', '❌', '❌', '❌', '❌', '❌', '❌', '✅', '❌', '❌', '❌'],
        'Sparse Graph Performance': ['✅', '✅ (Very efficient)', '✅', '✅', '✅', '✅', '✅', '✅', '✅', '✅', '❌ (Small graphs only)', '✅'],
        'Dense Graph Performance': ['❌ (Slow for large)', '✅ (Can handle)', '✅', '✅', '✅', '✅', '✅', '✅', '❌ (Slower)', '✅', '❌ (Small graphs only)', '✅'],
        'Deterministic': ['✅', '❌', '❌', '❌', '❌', '✅', '❌', '❌', '❌', '✅', '✅', '✅'],
    }
    
    df = pd.DataFrame(data)
    
    # To display the DataFrame with interactive features, including sticky headers AND frozen first column
    show(df, scrollY="300px", scrollCollapse=True, fixedColumns=True, pageLength=-1)

## Resolution parameter

This function, `sierpinski_graph`, generates a graph that represents a [Sierpiński triangle](https://en.wikipedia.org/wiki/Sierpiński_triangle). It uses a recursive approach to build the fractal structure and assigns coordinates to each vertex for accurate visualization.

In [14]:
def sierpinski_graph(depth):
    """
    Generates a Sierpiński triangle graph along with its vertex coordinates
    in the x and y vertex attributes.
    """
    import igraph as ig
    import numpy as np
    
    G = ig.Graph()
    if depth < 0:
        return G  # Negative depth: return an empty graph, matching the normal return type
    
    # Use a dictionary to map coordinates to a single vertex ID
    coord_to_id = {}
    next_id = 0
    
    def get_or_create_vertex(coord):
        nonlocal next_id
        coord_tuple = tuple(coord)
        if coord_tuple not in coord_to_id:
            coord_to_id[coord_tuple] = next_id
            G.add_vertex(x=coord[0], y=coord[1])
            next_id += 1
        return coord_to_id[coord_tuple]
    
    def add_triangle(g, p1, p2, p3, d):
        if d == 0:
            v_p1 = get_or_create_vertex(p1)
            v_p2 = get_or_create_vertex(p2)
            v_p3 = get_or_create_vertex(p3)
            
            if not g.are_adjacent(v_p1, v_p2): g.add_edge(v_p1, v_p2)
            if not g.are_adjacent(v_p2, v_p3): g.add_edge(v_p2, v_p3)
            if not g.are_adjacent(v_p3, v_p1): g.add_edge(v_p3, v_p1)
        else:
            a = ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2)
            b = ((p2[0] + p3[0]) / 2, (p2[1] + p3[1]) / 2)
            c = ((p3[0] + p1[0]) / 2, (p3[1] + p1[1]) / 2)
            
            add_triangle(g, p1, a, c, d - 1)
            add_triangle(g, a, p2, b, d - 1)
            add_triangle(g, c, b, p3, d - 1)

    p1, p2, p3 = (0, 0), (1, 0), (0.5, np.sqrt(3)/2)
    add_triangle(G, p1, p2, p3, depth)

    return G
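
A quick way to sanity-check the generated graph is to compare its size with the closed-form counts for this construction: `3 * (3**depth + 1) / 2` vertices and `3**(depth + 1)` edges. The sketch below reproduces the same recursion on an integer lattice, so that it needs neither igraph nor floating-point coordinates. The function `sierpinski_counts` is hypothetical, and the triangle is sheared, which changes the shape but not the connectivity.

```python
def sierpinski_counts(depth):
    """Return (vertex_count, edge_count) of the Sierpinski triangle graph at the given depth."""
    if depth < 0:
        return 0, 0
    vertices, edges = set(), set()

    def midpoint(p, q):
        # Exact on this lattice: coordinates at this level differ by even amounts
        return ((p[0] + q[0]) // 2, (p[1] + q[1]) // 2)

    def add_triangle(p1, p2, p3, d):
        if d == 0:
            vertices.update((p1, p2, p3))
            edges.update(frozenset(pair) for pair in ((p1, p2), (p2, p3), (p3, p1)))
            return
        a, b, c = midpoint(p1, p2), midpoint(p2, p3), midpoint(p3, p1)
        add_triangle(p1, a, c, d - 1)
        add_triangle(a, p2, b, d - 1)
        add_triangle(c, b, p3, d - 1)

    side = 2 ** depth  # power-of-two side length keeps all midpoints on integer coordinates
    add_triangle((0, 0), (side, 0), (0, side), depth)
    return len(vertices), len(edges)

counts = [sierpinski_counts(d) for d in range(4)]
```

The values in `counts` can be compared with `G.vcount()` and `G.ecount()` of the graph returned by `sierpinski_graph` for the same depth.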

## Membership vector

This function plots a graph with its communities and adds a legend to the plot. 

It works by:

- Drawing the graph, coloring nodes by their community.

- Creating a list of colored boxes with community labels.

- Displaying the list as a legend on the plot.

In [15]:
def ig_plot_w_legend(g, communities):
    import matplotlib.pyplot as plt
    import matplotlib.patches as mpatches
    # Create a matplotlib figure and axes object
    fig, ax = plt.subplots(figsize=(10, 8))
    colors_map = plt.colormaps["tab10"]  # 10 distinct colors; community ids of 10 or more all get the last color
    
    # Plot the graph onto the axes, coloring vertices by community
    ig.plot(
        communities,
        target=ax,
        vertex_color=[colors_map(c) for c in communities.membership],
        vertex_label=g.vs.indices,
        layout=g.layout_auto(),
    )
    
    # Create one legend entry per community
    unique_communities = sorted(set(communities.membership))
    legend_handles = []
    
    for comm_id in unique_communities:
        color = colors_map(comm_id)
        patch = mpatches.Patch(color=color, label=f'Community {comm_id}')
        legend_handles.append(patch)
    
    # Add the legend to the plot
    ax.legend(handles=legend_handles, title="Communities")
    
    # Display the plot
    plt.show()