# Network Metrics Visualization and Comparison

This notebook loads and displays computed network metrics for the three actor collaboration networks (Niche, Almost-niche, and Mixed), enabling direct comparison of their structural properties.

## 1. Distance Metrics

Load and display path length statistics:
- **Diameter**: The longest shortest path in each network
- **Average shortest path**: Mean distance between connected nodes in the main component
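
As an illustration of how these two statistics can be obtained, the sketch below computes them with NetworkX on a toy graph. The helper name `distance_metrics` is hypothetical, and treating edges as undirected is an assumption; the precomputed values in `distance.json` may have been produced differently.

```python
import networkx as nx


def distance_metrics(graph):
    """Return (diameter, average shortest path) of the largest connected component."""
    # Assumption: edge direction is ignored when measuring distances.
    undirected = graph.to_undirected()
    largest = max(nx.connected_components(undirected), key=len)
    component = undirected.subgraph(largest)
    return nx.diameter(component), nx.average_shortest_path_length(component)


# Toy graph: a path 0-1-2-3 plus a separate edge 4-5 (two components).
toy = nx.Graph([(0, 1), (1, 2), (2, 3), (4, 5)])
diameter, avg_sp = distance_metrics(toy)
print(diameter, avg_sp)
```

Restricting both measures to the largest component avoids infinite distances between disconnected nodes, which is why the table reports the shortest path for the main connected component only.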

In [1]:
import pandas as pd
distance = pd.read_json("analysis_5/distance.json")

print(distance)

                                        Niche graph  Almost niche graph  \
Diameter                                    5.00000            9.000000   
Main connected component shortest path      2.56384            3.098455   
Graph - number of nodes                   453.00000        14892.000000   
Graph - number of edges                  4669.00000       439100.000000   
Number of connected components              2.00000          751.000000   

                                         Mixed graph  
Diameter                                    6.000000  
Main connected component shortest path      3.754041  
Graph - number of nodes                 15345.000000  
Graph - number of edges                 83156.000000  
Number of connected components           5367.000000  


## 2. Structural Properties

Display comprehensive structural metrics for each network:
- **SCC (Strongly Connected Component)**: Size of the largest set of nodes in which every node can reach every other node along directed edges
- **WCC (Weakly Connected Component)**: Size of the largest set of nodes that is connected when edge direction is ignored
- **Density**: Ratio of actual edges to possible edges
- **Reciprocity**: Proportion of edges that are bidirectional
- **Transitivity**: Overall clustering measure (global)
- **Clustering coefficient**: Average local clustering
- **Category-specific clustering**: Separate clustering values for stars and emergent actors in the mixed network
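
The definitions above can be made concrete with a small NetworkX sketch on a toy directed graph. The helper `structure_metrics` is a hypothetical name, and computing transitivity and clustering on the undirected version of the graph is an assumption; the values in `structure.json` may have been computed with different conventions.

```python
import networkx as nx


def structure_metrics(graph):
    """Compute a subset of the structural metrics for a directed graph."""
    n = graph.number_of_nodes()
    largest_scc = max(nx.strongly_connected_components(graph), key=len)
    largest_wcc = max(nx.weakly_connected_components(graph), key=len)
    # Assumption: clustering measures are taken on the undirected version.
    undirected = graph.to_undirected()
    return {
        "Biggest SCC size": len(largest_scc),
        "Biggest SCC size ratio": len(largest_scc) / n,
        "Biggest WCC size": len(largest_wcc),
        "Biggest WCC size ratio": len(largest_wcc) / n,
        "Density": nx.density(graph),
        "Reciprocity": nx.reciprocity(graph),
        "Transitivity": nx.transitivity(undirected),
        "Clustering": nx.average_clustering(undirected),
    }


# Toy graph: a directed cycle 0 -> 1 -> 2 -> 0, a reciprocal edge 1 -> 0,
# and node 3 pointing into the cycle.
toy = nx.DiGraph([(0, 1), (1, 0), (1, 2), (2, 0), (3, 0)])
metrics = structure_metrics(toy)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this toy graph node 3 belongs to the largest WCC but not to the largest SCC, because no path leads back to it. The same effect explains why the SCC ratios in the table are always lower than the WCC ratios.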

In [2]:
import pandas as pd
structure = pd.read_json("analysis_5/structure.json")

print(structure)

                        Niche graph  Almost niche graph  Mixed graph
Biggest SCC size         350.000000         9043.000000  5507.000000
Biggest SCC size ratio     0.772627            0.607239     0.358879
Biggest WCC size         452.000000        14141.000000  9979.000000
Biggest WCC size ratio     0.997792            0.949570     0.650310
Density                    0.022803            0.001980     0.000353
Reciprocity                0.315699            0.419535     0.326880
Transitivity               0.085211            0.089475     0.000000
Clustering                 0.118710            0.245336     0.023889
Clustering Stars           0.000000            0.000000     0.011239
Clustering Emergents       0.000000            0.000000     0.024273


## 3. Assortativity Metrics

Examine mixing patterns in the networks:
- **Degree assortativity**: Whether nodes tend to connect to other nodes of similar degree
- **Attribute assortativity**: In the mixed network, whether actors preferentially collaborate within their success category

Positive values indicate homophily (like connects to like); negative values indicate heterophily.

In [4]:

import pandas as pd
assortativity = pd.read_json("analysis_5/assortativity.json")

print(assortativity)

                         Niche graph  Almost niche graph  Mixed graph
Degree assortativity       -0.058928            0.116357          NaN
Attribute assortativity          NaN                 NaN    -0.865414
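
Both coefficients can be illustrated on a toy graph with NetworkX. This is a minimal sketch: the attribute name `category` and the labels `star` and `emergent` are hypothetical, chosen only to mirror the mixed network's two actor groups.

```python
import networkx as nx

# Toy graph: one hub (node 0) connected to four leaves.
toy = nx.star_graph(4)

# Hypothetical labels: the hub is a "star", the leaves are "emergent".
labels = {node: ("star" if node == 0 else "emergent") for node in toy.nodes}
nx.set_node_attributes(toy, labels, name="category")

degree_r = nx.degree_assortativity_coefficient(toy)
attribute_r = nx.attribute_assortativity_coefficient(toy, "category")
print(f"Degree assortativity: {degree_r:.4f}")
print(f"Attribute assortativity: {attribute_r:.4f}")
```

Every edge in this toy graph joins a high-degree node to a low-degree node and a star to an emergent actor, so both coefficients sit at the heterophilous extreme. The strongly negative attribute assortativity of the mixed graph points in the same direction: most of its edges cross the two categories.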

## 4. Centrality Measures

Display node importance metrics across all three networks:
- **Betweenness centrality**: How often a node lies on shortest paths between other nodes
- **Degree centrality**: Number of direct connections, normalized by the maximum possible and reported separately for incoming and outgoing edges
- **Closeness centrality**: Inverse of the average shortest-path distance to other nodes
- **PageRank**: Importance based on network structure (like Google's algorithm)

Both average and maximum values are shown to indicate how influence is distributed across nodes.

In [3]:

import pandas as pd
centrality = pd.read_json("analysis_5/centrality.json")

print(centrality)

                                        Niche graph  Almost niche graph  \
Average betweenness centrality             0.004591            0.000092   
Maximum betweenness centrality             0.086162            0.022253   
Average IN-degree centrality               0.022803            0.001980   
Maximum IN-degree centrality               0.075221            0.042710   
Average OUT-degree centrality              0.022803            0.001980   
Maximum OUT-degree centrality              0.137168            0.048284   
Average closeness centrality               0.254731            0.181684   
Maximum closeness centrality               0.517883            0.441322   
Average pagerank centrality                0.002208            0.000067   
Maximum pagerank centrality                0.024040            0.001011   
Maximum normalized pagerank centrality    10.890214           15.054149   

                                        Mixed graph  
Average betweenness centrality             0.
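
A minimal sketch of how average and maximum centrality values can be derived with NetworkX, using a toy directed graph. The helper `summarize` is a hypothetical name; PageRank is left out here to keep the example small, and whether the stored values were computed exactly this way is an assumption.

```python
import networkx as nx


def summarize(scores):
    """Return (average, maximum) of a node -> score mapping."""
    values = list(scores.values())
    return sum(values) / len(values), max(values)


# Toy graph: a directed cycle 0 -> 1 -> 2 -> 0 plus an extra edge 0 -> 3.
toy = nx.DiGraph([(0, 1), (1, 2), (2, 0), (0, 3)])

measures = {
    "betweenness": nx.betweenness_centrality(toy),
    "IN-degree": nx.in_degree_centrality(toy),
    "OUT-degree": nx.out_degree_centrality(toy),
    "closeness": nx.closeness_centrality(toy),
}
summary = {name: summarize(scores) for name, scores in measures.items()}

for name, (average, maximum) in summary.items():
    print(f"Average {name} centrality: {average:.6f}")
    print(f"Maximum {name} centrality: {maximum:.6f}")
```

Because degree centrality divides each degree by N - 1, the average in-degree and out-degree centralities both equal the graph density. The tables show the same pattern: 0.022803 for the Niche graph and 0.001980 for the Almost-niche graph in both places.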

In [None]:
import pandas as pd

# Method 1: use the DataFrame index
df = pd.DataFrame({
    'Niche': [453, 4669, 0.0228],
    'Almost-niche': [14892, 439100, 0.0020],
    'Mixed': [15345, 83156, 0.0004]
}, index=['Nodes', 'Edges', 'Density'])  # <-- Row names as the index

# Convert with index=True (the default)
latex_table = df.to_latex(
    caption='Structural properties',
    label='tab:structure',
    float_format="%.4f"
)
print(latex_table)

## 5. Small-World Comparison

Compare observed metrics against random graph benchmarks to test for small-world properties.

A small-world network should show:
- Average path length close to that of a comparable random graph, approximately log(N)/log(k)
- Clustering coefficient much higher than that of a comparable random graph, approximately k/N

Where N = number of nodes, k = average degree.
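
The two benchmark formulas can be evaluated directly. The sketch below uses a hypothetical helper, `random_graph_benchmarks`, and illustrative values for N and k that are not taken from the three networks.

```python
import math


def random_graph_benchmarks(num_nodes, avg_degree):
    """Return (expected path length, expected clustering) for a random graph."""
    expected_path_length = math.log(num_nodes) / math.log(avg_degree)
    expected_clustering = avg_degree / num_nodes
    return expected_path_length, expected_clustering


# Illustrative values only: N = 1000 nodes, k = 10 average degree.
expected_path, expected_clustering = random_graph_benchmarks(1000, 10)
print(f"Random-graph path length: {expected_path:.4f}")
print(f"Random-graph clustering: {expected_clustering:.4f}")
```

A network with N = 1000 and k = 10 would look small-world if its observed average path length stayed near 3 while its clustering coefficient was well above 0.01.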

In [None]:
import json

import networkx as nx

# Load the three graphs from their node-link JSON files
with open("niche_graph.json") as f:
    graph_dict = json.load(f)

niche_graph = nx.node_link_graph(graph_dict)

with open("almost_niche_graph.json") as f:
    graph_dict = json.load(f)

almost_niche_graph = nx.node_link_graph(graph_dict)

with open("mixed_graph.json") as f:
    graph_dict = json.load(f)

mixed_graph = nx.node_link_graph(graph_dict)

In [None]:
from networkx.algorithms import bipartite

# 1. Collect the degrees of the original top-set nodes
#    (kept for reference; the Erdos-Renyi model below does not use them)
top_nodes = [n for n, d in mixed_graph.nodes(data=True) if d['bipartite'] == 1]
deg_sequence = [d for n, d in mixed_graph.degree(top_nodes)]

# 2. Generate a random bipartite graph with the same density (Erdos-Renyi model)
# n = number of stars, m = number of emergent actors, p = edge probability (density)
n = len(top_nodes)
m = mixed_graph.number_of_nodes() - n
p = mixed_graph.number_of_edges() / (n * m)

G_random = bipartite.random_graph(n, m, p)

# 3. Compute clustering on the random graph and compare with the observed value
random_clust = bipartite.clustering(G_random)
avg_random_clust = sum(random_clust.values()) / len(random_clust)

# 0.0239 is the mixed graph's clustering (0.023889) from the structure table
print(f"Real: 0.0239, Random: {avg_random_clust}")

In [5]:
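import networkx as nx
import networkit as nk
import numpy as np

# 1. Generate the random BIPARTITE benchmark with NetworkX
# n and m are the two node-set sizes (453 + 14892 = 15345, the mixed graph's node count).
# p appears to be the mixed graph's density (0.000353) rounded to 0.0004;
# note that the previous cell instead derives p as edges / (n * m).
n = 453
m = 14892
p = 0.0004
nxbip = nx.bipartite.random_graph(n, m, p)

# 2. Convert to NetworKit
G_rand = nk.nxadapter.nx2nk(nxbip)

# 3. Compute the average path length.
# APSP is exact (all-pairs shortest paths, no sampling) and is used instead of EffectiveDiameter.
dist_alg = nk.distance.APSP(G_rand)
dist_alg.run()
# Despite its name, avg_path_rand holds the full distance matrix, not the average
avg_path_rand = dist_alg.getDistances(asarray=True)
# Drop self-distances (0) and unreachable pairs (reported as a very large value)
valid_distances = avg_path_rand[avg_path_rand > 0]
valid_distances = valid_distances[valid_distances < G_rand.numberOfNodes()]
average_path_length = np.mean(valid_distances)
print(f"ASP of the bipartite benchmark: {average_path_length}")

ASP of the bipartite benchmark: 7.030791838966542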


In [None]:
import pandas as pd
import math
import numpy as np

# Average degree of each graph (on a directed graph, degree() counts in- plus out-edges)
degree_niche = float(np.mean([d for n, d in niche_graph.degree()]))
degree_almost_niche = float(np.mean([d for n, d in almost_niche_graph.degree()]))
nodes_niche = 453
nodes_almost = 14892
nodes_mixed = 15345


# Method 1: use the DataFrame index
# Observed values come from the distance and structure tables above.
# Niche and Almost-niche benchmarks use log(N)/log(k) and k/N;
# the Mixed benchmark uses the random bipartite graphs generated earlier.
df = pd.DataFrame({
    'Niche': [round(2.56384, 4), round(0.118710, 4)],
    'Niche benchmark': [math.log(nodes_niche) / math.log(degree_niche), degree_niche / nodes_niche],
    'Almost-niche': [round(3.098455, 4), round(0.245336, 4)],
    'Almost-niche benchmark': [math.log(nodes_almost) / math.log(degree_almost_niche), degree_almost_niche / nodes_almost],
    'Mixed': [round(3.754041, 4), round(0.023889, 4)],
    # average_path_length is the scalar mean; avg_path_rand is the full distance matrix
    'Mixed benchmark': [round(float(average_path_length), 4), round(avg_random_clust, 4)]
}, index=['Average SP', 'Clustering Coefficient'])  # <-- Row names as the index

# Convert with index=True (the default)
latex_table = df.to_latex(
    caption='Small-world comparison',
    label='tab:small_world',
    float_format="%.4f"
)
print(latex_table)