In [1]:
import pandas as pd
import networkx as nx
import numpy as np
from tqdm import tqdm
import pickle
with open('graph_objects/scigrid.pkl', 'rb') as f:
    G = pickle.load(f)
    G.name = 'SciGrid'

----
# Network Robustness Analysis Based on Maximum Flow (Cai et al., 2021) 
Link: https://www.frontiersin.org/articles/10.3389/fphy.2021.792410/full
#### *Capacity Robustness Based on Maximum Flow*
The network maximum flow matrix *W* is defined as the matrix consisting of the maximum flow values between all pairs of nodes in the network:

$
W = 
\left[
\begin{matrix}
0 & c_{f_{max}}(v_1, v_2) & \cdots & c_{f_{max}}(v_1, v_N) \\
c_{f_{max}}(v_2, v_1) & 0 & \cdots & c_{f_{max}}(v_2, v_N) \\
\vdots & \vdots & \ddots & \vdots \\
c_{f_{max}}(v_N, v_1) & c_{f_{max}}(v_N, v_2) & \cdots & 0 \\
\end{matrix}
\right]
$

$N$ and $V$ denote the size of the network and the set of nodes, respectively. $V_d$ is the set of damaged nodes and $N_d$ is the number of nodes in $V_d$. $V_s$ is the set of remaining nodes and $N_s$ is the number of nodes in $V_s$.

$G = (V, E, c)$ denotes the undamaged network. $G^*_s = (V_s, E_s, c_s)$ denotes the damaged network.

Based on $W$, $W_c$ is defined as the matrix after removing the nodes in the set $V_d$ from the maximum flow matrix $W$:

$ 
W_c = 
\left[
\begin{matrix}
0 & c_{f_{max}}(v_i, v_{i+1}) & \cdots & c_{f_{max}}(v_i, v_{i+N_s-1}) \\
c_{f_{max}}(v_{i+1}, v_i) & 0 & \cdots & c_{f_{max}}(v_{i+1}, v_{i+N_s-1}) \\
\vdots & \vdots & \ddots & \vdots \\
c_{f_{max}}(v_{i+N_s-1}, v_i) & c_{f_{max}}(v_{i+N_s-1}, v_{i+1}) & \cdots & 0 \\
\end{matrix}
\right]
$

$W^*_c$ is defined as the maximum flow matrix recomputed from the damaged network $G^*_s$:

$ 
W^*_c = 
\left[
\begin{matrix}
0 & c^*_{f_{max}}(v_i, v_{i+1}) & \cdots & c^*_{f_{max}}(v_i, v_{i+N_s-1}) \\
c^*_{f_{max}}(v_{i+1}, v_i) & 0 & \cdots & c^*_{f_{max}}(v_{i+1}, v_{i+N_s-1}) \\
\vdots & \vdots & \ddots & \vdots \\
c^*_{f_{max}}(v_{i+N_s-1}, v_i) & c^*_{f_{max}}(v_{i+N_s-1}, v_{i+1}) & \cdots & 0 \\
\end{matrix}
\right]
$

Finally, the flow capacity robustness $C$ is defined as:

$
C = \frac{\sum(W^*_c)}{\sum(W_c)}
$

Cai et al. provide a method for systematically testing network performance in a max-flow context, but do not prescribe the removal heuristic itself (i.e., how to identify the critical network components).
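
As a rough illustration of the definitions above, the sketch below computes $C$ on a small undirected toy graph. The function names `max_flow_matrix` and `capacity_robustness`, the toy graph, and the `capacity` edge attribute are assumptions made for this example. The damaged set $V_d$ is supplied by the caller, since the paper does not prescribe how to choose it.

```python
import networkx as nx
import numpy as np

def max_flow_matrix(G, nodes):
    """Matrix of max flow values between all ordered pairs in `nodes` (zero diagonal)."""
    W = np.zeros((len(nodes), len(nodes)))
    for a, s in enumerate(nodes):
        for b, t in enumerate(nodes):
            if s != t:
                W[a, b] = nx.maximum_flow_value(G, s, t, capacity='capacity')
    return W

def capacity_robustness(G, damaged):
    """C = sum(W*_c) / sum(W_c) for the damaged node set `damaged` (V_d)."""
    nodes = list(G.nodes)
    keep = [i for i, v in enumerate(nodes) if v not in damaged]
    # W_c: rows and columns of the damaged nodes removed from the original matrix W
    W_c = max_flow_matrix(G, nodes)[np.ix_(keep, keep)]
    # W*_c: max flow matrix recomputed on the damaged network G*_s
    G_s = G.copy()
    G_s.remove_nodes_from(damaged)
    W_star = max_flow_matrix(G_s, [nodes[i] for i in keep])
    return W_star.sum() / W_c.sum()

# Toy example (hypothetical data): a 4-cycle with unit capacities, node 0 damaged
toy = nx.cycle_graph(4)
nx.set_edge_attributes(toy, 1, 'capacity')
C = capacity_robustness(toy, damaged={0})
```

Solving one max flow problem per ordered pair is quadratic in the number of nodes, so on a network the size of SciGrid the matrix $W$ would be worth caching between damage scenarios.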

----
# Vulnerability analysis method based on risk assessment for gas transmission capabilities of natural gas pipeline networks (Wang et al., 2021)

Link: https://www.sciencedirect.com/science/article/pii/S0951832021006384

### "Node Importance"
Node importance is characterized by two features, the node degree and the gas transmission capacity, each of which is assigned a weight.

The importance of node $i$, denoted as $\lambda_i$, is calculated as follows:

$$\lambda_i = w_1*B_i + w_2*Q_i$$
where:
- $\lambda_i$ represents the importance of node $i$
- $B_i$ is the normalized degree of node $i$: the ratio of the number of edges connected to the $i$-th node to the number of edges of the node with the largest number of connected edges
- $Q_i$ is the gas transmission capacity characteristic of the $i$-th node: the ratio of the node's capacity to the pipeline network's capacity
- $w_1$ and $w_2$ are weights such that $w_1 + w_2 = 1$


In [5]:
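def calculate_node_importance(G, w1, w2):
    """Rank nodes by importance w1 * B_i + w2 * Q_i, in descending order."""
    # B_i: degree of each node, normalized by the largest degree in the network
    node_degrees = dict(G.degree())
    max_degree = max(node_degrees.values())
    normalized_degrees = {node: degree / max_degree for node, degree in node_degrees.items()}

    # Q_i: capacity of each node, taken as the summed 'capacity' of the edges returned by G.edges(node).
    # Note: the normalization below divides by the largest node capacity, not by the capacity of the whole network.
    node_capacities = {node: sum(data.get('capacity', 0) for _, _, data in G.edges(node, data=True)) for node in G.nodes()}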
    max_capacity = max(node_capacities.values())
    normalized_capacities = {node: capacity / max_capacity for node, capacity in node_capacities.items()}

    # Calculate the importance of each node
    node_importance = {node: w1 * normalized_degrees[node] + w2 * normalized_capacities[node] for node in G.nodes()}
    node_name = {node: data.get('name', '') for node, data in G.nodes(data=True)}

    # Convert the dictionary to a pandas DataFrame
    df = pd.DataFrame.from_dict(node_importance, orient='index', columns=['nri']).reset_index().rename(columns={'index': 'node'}).sort_values(by='nri', ascending=False)
    df['name'] = df['node'].map(node_name)

    df = df.reindex(columns=['node', 'name', 'nri'])    
    return df

# Call the function with weights w1 and w2
node_importance_df = calculate_node_importance(G, w1=0.5, w2=0.5)
node_importance_df

index,node,name,nri
348,INET_N_103,Bacton,1.000000
237,INET_N_1630,Wloclawek,0.897106
422,INET_N_912,Maldegem,0.642890
319,INET_N_317,Cortemaggiore,0.620278
16,NO_N_17,N_22,0.616541
...,...,...,...
305,INET_N_525,Gate Rotterdam,0.062500
615,INET_N_680,Irboska,0.062500
611,INET_N_1117,Parnu,0.062500
628,INET_N_569,Grenada,0.062500


### "Pipeline Importance"
The pipeline importance $E$ combines two indicators: the edge loss rate and the weighted flow capacity rate.

The importance of a pipeline, denoted as $E$, is calculated as follows:

$$E = z_1*G_{i,j} + z_2*Q_{(i,j)}^{FCR}$$

where:
- $E$ represents the importance of the pipeline
- $G_{i,j}$ is the edge loss rate, which is defined as $G_{i,j} = \frac{L_{i,j}}{L}$, where $L$ represents the number of edges in the initial network and $L_{i,j}$ represents the total edge loss of the network after the edge between node $i$ and node $j$ is attacked
- $Q_{(i,j)}^{FCR}$ is the weighted flow capacity rate
- $z_1$ and $z_2$ are weights such that $z_1 + z_2 = 1$


In [3]:
# TODO: implement the function

### Other things:
"Risk indicators for pipeline network components" describe the performance of the network at the global level. The indicators below are plugged into the preference utility function to obtain the severity of consequences for each of the three types of risk indicator.

*The vulnerability of pipeline network is divided into the node vulnerability and pipeline vulnerability. It is the product of the risk value and importance.*

**Decreased pipeline network connectivity index**
$$I_L = 1-C_L$$
where $I_L$ is the decrease rate of the connectivity; 1 represents the connection reliability of the pipe network in the normal state; and $C_L$ is the reduction in connection reliability of the pipeline network under the failure state

**Reduction of gas transmission capacity index**
$$I_C = [Q_{MA}-Q_{MB}] / Q_{MB}$$
where $I_C$ is the reduction of the gas transmission capacity index; $Q_{MA}$ is the maximum transmission capacity of the pipe network under the failure state; and $Q_{MB}$ is the maximum transmission capacity before the pipe network failure.

**Number of users experiencing interruptions index**
$$I_{BZ} = n/N$$
where $I_{BZ}$ is the number of users experiencing interruptions index; $n$ is the total number of users suffering gas shortages; and $N$ is the total number of users served.


----
# Flow-based vulnerability measures for network component importance: Experimentation with preparedness planning (Nicholson et al., 2016)

Link: https://www.sciencedirect.com/science/article/pii/S0951832015002562?ref=pdf_download&fr=RR-2&rr=862305915caa0b51

### Max Flow Edge Count (I<sub>MFcount</sub>)
Measures how often an edge is used across all source-sink (s–t) max flow problems. It is calculated as:

$$I_{\text{MFcount}}(i, j) = \frac{1} {n(n-1)} \sum_{s,t} {\mu_{st}(i, j)}$$
where $\mu_{st}(i, j) = 1$ if edge $(i,j)$ carries flow in the max flow solution for a given $s-t$ pair, and 0 otherwise. The tally is divided by the total number of $s-t$ pairs, $n(n-1)$.

In [4]:
def max_flow_edge_count(G):
    """Count, for every edge, the s-t max flow solutions that route flow through it."""
    nodes = list(G.nodes)
    n = len(nodes)

    edge_count = {(u, v): 0 for u, v in G.edges}

    # Solve one max flow problem per unordered node pair (i < j)
    for i in tqdm(range(n), desc='Calculating max flow'):
        for j in range(i + 1, n):
            flow_value, flow_dict = nx.maximum_flow(G, nodes[i], nodes[j], flow_func=nx.algorithms.flow.edmonds_karp)
            for u, flows in flow_dict.items():
                for v, flow in flows.items():
                    if flow > 0:
                        # For an undirected graph, flow_dict may report the edge
                        # in the opposite orientation to G.edges
                        if (u, v) in edge_count:
                            edge_count[(u, v)] += 1
                        else:
                            edge_count[(v, u)] += 1

    # Normalize by n(n-1), the number of ordered s-t pairs in the formula
    edge_count_combined = {k: {'edge': k, 'name': G.edges[k]['name'], 'max_flow_edge_count': v, 'max_flow_edge_count_normalized': v / (n * (n - 1))} for k, v in edge_count.items()}

    # Tuple keys become a two-level index; drop it after reset_index
    df = pd.DataFrame.from_dict(edge_count_combined, orient='index').reset_index()
    df.drop(columns=['level_0', 'level_1'], inplace=True)

    df = df.sort_values(by='max_flow_edge_count', ascending=False)

    return df

# max_flow_edge_count_df = max_flow_edge_count(G)
# max_flow_edge_count_df.to_pickle('results/max_flow_edge_count_df.pkl')


In [5]:
max_flow_edge_count_df = pd.read_pickle('results/max_flow_edge_count_df.pkl')
max_flow_edge_count_df

index,edge,name,max_flow_edge_count,max_flow_edge_count_normalized
130,"(INET_N_572, INET_N_1333)",Scheemda_Schuilenburg2,1315,0.002583
259,"(INET_N_1325, INET_N_1084)",Scheemda_Schuilenburg0,1313,0.002579
212,"(INET_N_1079, INET_N_1035)",Schuilenburg_Wieringermeer1,1305,0.002563
812,"(INET_N_1084, INET_N_572)",Scheemda_Schuilenburg1,1257,0.002469
813,"(INET_N_1333, INET_N_1079)",Schuilenburg_Wieringermeer0,1257,0.002469
...,...,...,...,...
597,"(INET_N_60, INET_N_1071)",PENTA_West2,0,0.000000
591,"(INET_N_179, INET_N_1081)",Oltrona_Bizzarone,0,0.000000
588,"(INET_N_1432, INET_N_819)",Laneuvelotte_Strasbourg,0,0.000000
566,"(INET_N_1136, INET_N_765)",KIP,0,0.000000


### Min Cutset Count (I<sub>cutset</sub>)
Reflects the total number of times an edge is a member of the min cutset over all s–t pairs. It is calculated as:

$$I_{\text{cutset}}(i, j) = \frac{1} {n(n-1)} \sum_{s,t} \delta_{st}(i, j)$$

where $\delta_{st}(i, j) = 1$ if edge $(i,j)$ belongs to the min cutset for a given $s-t$ pair, and 0 otherwise. The measure highlights edges that act as bottlenecks in max flow problems; damage to these edges can reduce the overall flow.

In [6]:
def edge_cutset_count(graph):
    """Count, for every edge, the ordered s-t pairs whose minimum edge cut contains it."""
    n = len(graph.nodes)
    edge_importance = {edge: 0 for edge in graph.edges}

    for s in tqdm(graph.nodes, desc='Calculating min cutset count'):
        for t in graph.nodes:
            if s != t:
                # nx.minimum_edge_cut returns a cut of minimum cardinality;
                # edge capacities are not taken into account
                min_cutset = nx.minimum_edge_cut(graph, s, t)
                for u, v in min_cutset:
                    # The cut may list an edge in the opposite orientation to graph.edges
                    if (u, v) in edge_importance:
                        edge_importance[(u, v)] += 1
                    elif (v, u) in edge_importance:
                        edge_importance[(v, u)] += 1

    data = [{
        'edge': edge,
        'name': graph.edges[edge]['name'],  # use the argument, not the global G
        'min_cutset_count': value,
        'min_cutset_count_normalized': (value / (n * (n - 1))),
    } for edge, value in edge_importance.items()]

    df = pd.DataFrame(data)
    df = df.sort_values(by='min_cutset_count', ascending=False)

    return df

# edge_cutset_count_df = edge_cutset_count(G)
# edge_cutset_count_df.to_pickle('results/edge_cutset_count_df.pkl')

In [7]:
edge_cutset_count_df = pd.read_pickle('results/edge_cutset_count_df.pkl')
edge_cutset_count_df

index,edge,name,min_cutset_count,min_cutset_count_normalized
377,"(INET_N_1656, INET_N_912)",Maldegem_Zeebrugge_Extra,311,0.436185
241,"(INET_N_1219, INET_N_870)",MIDAL_15,226,0.316971
541,"(INET_N_987, INET_N_1451)",Maastrict_Paris_60,222,0.311360
338,"(INET_N_358, INET_N_529)",EUGAL_14,213,0.298738
254,"(INET_N_1273, INET_N_885)",Transitgas_10,202,0.283310
...,...,...,...,...
470,"(INET_N_166, INET_N_127)",Beziers_Barbaira,1,0.001403
471,"(INET_N_166, INET_N_1538)",Beziers_St Martin De Crau0,1,0.001403
472,"(INET_N_144, INET_N_120)",BGTP,1,0.001403
473,"(INET_N_144, INET_N_406)",Belfast_Dundalk,1,0.001403


### Weighted Flow Capacity Rate (WFCR)
The WFCR is an index used to quantify the criticality of an edge. It has been employed in the following papers:
- Vulnerability analysis method based on risk assessment for gas transmission capabilities of natural gas pipeline networks (Wang et al., 2021)
- A systematic framework of vulnerability analysis of a natural gas pipeline network (Su et al., 2018)
- Flow-based vulnerability measures for network component importance: Experimentation with preparedness planning (Nicholson et al., 2016)

It was first defined and employed by Nicholson et al.
> "The flow capacity rate (FCR) quantifies how close a given edge is to becoming a potential bottleneck based on flow amount and capacity. If an edge is significantly underutilized with respect to its capacity, then it is inherently robust to disruptions that reduce capacity. [...] An edge with a high flow capacity rate is more likely to become a bottleneck than an edge with a lower value, but the expected impact to the overall network performance should also be a function of the expected contribution of the given edge [...]"

$$I_{\text{WFCR}}(i,j) = \frac{1} {n(n-1) \sum_{s,t}\omega_{st}} \sum_{s,t} \frac{[\omega_{st}(i,j)]^2} {c_{ij}}$$

In the formula for $I_{\text{WFCR}}(i,j)$:
- $\omega_{st}$ denotes the value of the maximum flow from source $s$ to target $t$
- $\omega_{st}(i,j)$ denotes the part of the $s-t$ maximum flow that passes through edge $(i,j)$
- $c_{ij}$ denotes the capacity of the pipeline between nodes $i$ and $j$


In [12]:
def WFCR(G):
    """Weighted flow capacity rate of every edge that carries flow in at least one s-t max flow solution."""
    nodes = list(G.nodes)
    n = len(nodes)

    # Edge orientations as stored in G, so that both directions of an
    # undirected edge accumulate under a single key
    edge_keys = set(G.edges)
    edge_WFCR = {}
    tot_flow = 0

    for i in tqdm(range(n), desc='Calculating WFCR'):
        for j in range(i + 1, n):
            flow_value, flow_dict = nx.maximum_flow(G, nodes[i], nodes[j], flow_func=nx.algorithms.flow.edmonds_karp)
            tot_flow += flow_value

            for u, flows in flow_dict.items():
                for v, flow in flows.items():
                    if flow > 0:
                        key = (u, v) if (u, v) in edge_keys else (v, u)
                        capacity = G.edges[key]['capacity']

                        # Guard against division by zero
                        if capacity > 0:
                            edge_WFCR[key] = edge_WFCR.get(key, 0) + (flow ** 2) / capacity

    # No flow anywhere in the network: nothing to rank
    if tot_flow == 0:
        return pd.DataFrame()

    data = [{
        'edge': k,
        'name': G.edges[k]['name'],
        'wfcr': v / (n * (n - 1) * tot_flow),
    } for k, v in edge_WFCR.items()]

    df = pd.DataFrame(data)
    df = df.sort_values(by='wfcr', ascending=False)

    return df

# wfcr_df = WFCR(G)
# wfcr_df.to_pickle('results/wfcr_df.pkl')


In [13]:
wfcr_df = pd.read_pickle('results/wfcr_df.pkl')
wfcr_df

index,edge,name,wfcr
67,"(INET_N_572, INET_N_1333)",Scheemda_Schuilenburg2,2.726823e-07
90,"(INET_N_1079, INET_N_1035)",Schuilenburg_Wieringermeer1,2.719365e-07
59,"(INET_N_1325, INET_N_1084)",Scheemda_Schuilenburg0,2.704316e-07
68,"(INET_N_1333, INET_N_1079)",Schuilenburg_Wieringermeer0,2.587927e-07
91,"(INET_N_1035, INET_N_1637)",Schuilenburg_Wieringermeer2,2.575221e-07
...,...,...,...
146,"(NO_N_20, NO_N_21)","16 Gas HEIDRUN, TJELDBERGODDEN",2.882725e-11
629,"(INET_N_1626, INET_N_1624)",MEGAL_Sued_1_12,2.780053e-11
785,"(INET_N_325, INET_N_886)",Craughwell_Loughshinny,1.947456e-11
684,"(INET_N_1545, INET_N_1560)",Uzhgorod_VelkeKapusany,9.560105e-12


----
# Network Robustness Index: A new method for identifying critical links and evaluating the performance of transportation networks (Scott et al., 2006)

Link: https://www.sciencedirect.com/science/article/pii/S0966692305000694

### Network Robustness Index (NRI)
Scott et al. introduce the Network Robustness Index (NRI) to assess how critical a highway segment is within a network. The NRI measures the change in travel-time cost that results from rerouting all traffic when the segment becomes unusable.

Let:
- $x_a$: flow (traffic volume) on link $a$
- $t_a$: travel time on link $a$, where $t_a = t_a(x_a)$ represents the link performance function or volume-delay curve

The NRI is computed based on the relationship between traffic flow and travel time, providing a more realistic evaluation compared to assumptions of no traffic and free-flow speeds.

The system-wide travel-time cost when link $a$ is removed, $c_a$, is given by:
$$c_a = \sum_{a} t_a \cdot x_a \cdot \delta_a$$

where $\delta_a$ is 1 if link $a$ is not the link removed, and 0 otherwise.

The cost is compared to the base case, given by:
$$q_a = c_a - c$$
where
$$c = \sum_{a} t_a \cdot x_a$$
and $q_a$ is the value of the NRI for link $a$.
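
The sketch below illustrates the $q_a = c_a - c$ comparison under a strong simplification: link travel times are held fixed and each origin-destination flow is assigned entirely to its shortest path, instead of the volume-dependent $t_a(x_a)$ and equilibrium rerouting that the paper relies on. The function names, the `time` edge attribute, the toy network and the demand table are all assumptions for illustration.

```python
import networkx as nx

def total_travel_cost(G, demand):
    """Sum of t_a * x_a when every OD flow uses its shortest path (fixed link times)."""
    cost = 0.0
    for (o, d), volume in demand.items():
        # Raises nx.NetworkXNoPath if o and d are disconnected
        cost += volume * nx.shortest_path_length(G, o, d, weight='time')
    return cost

def network_robustness_index(G, demand):
    """q_a = c_a - c for every link a, removing one link at a time."""
    base = total_travel_cost(G, demand)
    nri = {}
    for u, v in list(G.edges):
        H = G.copy()
        H.remove_edge(u, v)
        try:
            nri[(u, v)] = total_travel_cost(H, demand) - base
        except nx.NetworkXNoPath:
            # Some demand can no longer be served at all
            nri[(u, v)] = float('inf')
    return nri

# Toy example (hypothetical data): a triangle with one OD pair
toy = nx.Graph()
toy.add_edge('A', 'B', time=1)
toy.add_edge('B', 'C', time=1)
toy.add_edge('A', 'C', time=3)
nri = network_robustness_index(toy, {('A', 'C'): 10})
```

In the toy network the demand from A to C normally travels via B, so removing either A-B or B-C forces it onto the slower direct link, while removing A-C changes nothing.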

In [None]:
# TODO: implement the function

----
# Correlation analysis of different vulnerability metrics on power grids (Ouyang et al., 2013)

Link: https://www.sciencedirect.com/science/article/pii/S0378437113010133

(Correlation analysis performed on the below vulnerability metrics)

### Efficiency-based Vulnerability (VE)
Measures the normalized average inverse geodesic path distance, assessing the change in efficiency after a damage event.

Efficiency (E): $$ E = \frac{1} {N(N-1)} \sum_{i \neq j} \frac{1}{d_{ij}} $$
VE: $$ VE = \frac{E_{\text{norm}} - E_{\text{damg}}} {E_{\text{norm}}} $$
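
A minimal sketch of VE, assuming an undirected, unweighted graph so that `nx.global_efficiency` can stand in for $E$ (it averages $1/d_{ij}$ over all ordered node pairs and counts unreachable pairs as 0). The function name and the toy damage event, an edge removal that leaves $N$ unchanged, are assumptions for illustration.

```python
import networkx as nx

def efficiency_vulnerability(G_norm, G_damg):
    """VE = (E_norm - E_damg) / E_norm, with E taken from nx.global_efficiency."""
    E_norm = nx.global_efficiency(G_norm)
    E_damg = nx.global_efficiency(G_damg)
    return (E_norm - E_damg) / E_norm

# Toy example (hypothetical data): path 0-1-2, damaged by removing edge (1, 2)
G_norm = nx.path_graph(3)
G_damg = G_norm.copy()
G_damg.remove_edge(1, 2)
ve = efficiency_vulnerability(G_norm, G_damg)
```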

### Source–Demand Efficiency-based Vulnerability (VSDE)
Evaluates the change in efficiency, considering the shortest path between generators and load substations after a damage event.

Source–demand efficiency (SDE): $$ SDE = \frac{1} {Ng \cdot Nd} \sum_{i \in Ng, j \in Nd} \frac{1}{d_{ij}} $$
VSDE: $$ VSDE = \frac{SDE_{\text{norm}} - SDE_{\text{damg}}} {SDE_{\text{norm}}} $$
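
A possible sketch of SDE and VSDE, assuming hop counts as the distance $d_{ij}$ and treating unreachable source-demand pairs as contributing 0. The function name, the `sources` and `demands` lists and the toy graph are assumptions for illustration.

```python
import networkx as nx

def source_demand_efficiency(G, sources, demands):
    """Mean inverse shortest-path length over all (source, demand) pairs."""
    total = 0.0
    for s in sources:
        # Hop distances from s to every node it can reach
        lengths = nx.single_source_shortest_path_length(G, s)
        for d in demands:
            if d != s and d in lengths:
                total += 1.0 / lengths[d]
    return total / (len(sources) * len(demands))

# Toy example (hypothetical data): path 0-1-2-3, one source and two demand nodes
G_norm = nx.path_graph(4)
G_damg = G_norm.copy()
G_damg.remove_edge(2, 3)
sde_norm = source_demand_efficiency(G_norm, sources=[0], demands=[2, 3])
sde_damg = source_demand_efficiency(G_damg, sources=[0], demands=[2, 3])
vsde = (sde_norm - sde_damg) / sde_norm
```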

### Largest Component Size-based Vulnerability (VLCS)
Quantifies the change in the number of nodes in the largest connected sub-grid after a damage event.

Largest Component Size (LCS): $$ LCS = \frac{Nl} {N} $$
VLCS: $$ VLCS = \frac{LCS_{\text{norm}} - LCS_{\text{damg}}} {LCS_{\text{norm}}} $$
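
A short sketch of LCS and VLCS for an undirected graph. It assumes that $N$ stays fixed at the size of the undamaged network, so that removed nodes count as lost; the function name and toy graph are hypothetical.

```python
import networkx as nx

def largest_component_fraction(G, n_total):
    """LCS = N_l / N, where N_l is the size of the largest connected component."""
    if G.number_of_nodes() == 0:
        return 0.0
    largest = max(nx.connected_components(G), key=len)
    return len(largest) / n_total

# Toy example (hypothetical data): path 0-1-2-3-4, damaged by removing node 2
G_norm = nx.path_graph(5)
G_damg = G_norm.copy()
G_damg.remove_node(2)
N = G_norm.number_of_nodes()
lcs_norm = largest_component_fraction(G_norm, N)
lcs_damg = largest_component_fraction(G_damg, N)
vlcs = (lcs_norm - lcs_damg) / lcs_norm
```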

### Connectivity Level-based Vulnerability (VCL)
Measures the change in the average fraction of generators connected by each load node in the power grid.

Connectivity Level (CL): $$ CL = \frac{\sum_{i}^{g} Ng_i}{N} $$
VCL: $$ VCL = \frac{CL_{\text{norm}} - CL_{\text{damg}}} {CL_{\text{norm}}} $$

### Clustering Coefficient-based Vulnerability (VCC)
Assesses the change in the probability that adjacent nodes are connected in the power grid after a damage event.

Clustering Coefficient (CC): $$ CC = \frac{1} {N} \sum_{i}^{N} c_i $$
$$ c_i = \frac{\text{number of existing lines among neighbors of node } i}{\text{number of pairs of neighbors of node } i} $$
VCC: $$ VCC = \frac{CC_{\text{norm}} - CC_{\text{damg}}} {CC_{\text{norm}}} $$

### Power Supply-based Vulnerability (VPS)
Quantifies the change in the amount of power supplied from generators to load substations (blackout size) after a damage event.

VPS, where PS is the power supply: $$ VPS = \frac{PS_{\text{norm}} - PS_{\text{damg}}} {PS_{\text{norm}}} $$


----
# Graph Vulnerability and Robustness: A Survey (Freitas et al.)

Link: https://arxiv.org/pdf/2105.00419.pdf

----
# Bottlenecks Identification and Resilience Improvement of Power Networks in Extreme Events (Tu et al., 2022)

Link: https://www.frontiersin.org/articles/10.3389/fphy.2022.941165/full

### Congestion Link Identification Algorithm

LPS: Largest Power Supply

The congestion link identification algorithm aims to enhance power network capacity through iterative greedy search. The process involves:

1. **Network Initialization:**
   - Initialize parameters and power network topology.
   - Simulate extreme disasters by randomly removing 30% of links.

2. **Connectivity Detection:**
   - Identify connected clusters in G after intense disturbances.
   - Unserved components in each cluster Si without generators are removed, setting LPS to 0.

3. **Power Adjustment:**
   - Improve LPS by adjusting remaining component parameters using interior point optimization based on limited resources after extreme events.

4. **Capacity Expansion:**
   - For each link (i, j), iteratively expand its capacity.
   - If the updated LPS increment ΔLPS is greater than the capacity increment δ, accumulate the capacity increment for the corresponding link Δij.

5. **Iteration Steps:**
   - Repeat power adjustment and capacity expansion until the capacity of each link reaches its maximum value.

6. **Results Output:**
   - Select links based on the descending order of Δ.
   - Output the set of congestion links and the LPS after quick mode adjustment and elimination of transmission bottlenecks.
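
The first two steps above (random link removal and connectivity detection) can be sketched as follows. Power adjustment and capacity expansion need an optimal power flow model and are not reproduced here. The function name `damage_and_prune`, the `generators` argument, the fixed seed and the grid-shaped toy network are assumptions for illustration.

```python
import random
import networkx as nx

def damage_and_prune(G, generators, fraction=0.3, seed=0):
    """Remove a random fraction of links, then drop clusters that contain no generator."""
    rng = random.Random(seed)
    H = G.copy()
    edges = list(H.edges)
    # Step 1: simulate the extreme event by removing a random subset of links
    H.remove_edges_from(rng.sample(edges, int(fraction * len(edges))))
    # Step 2: keep only the connected clusters that still contain a generator
    gen_set = set(generators)
    served = [c for c in nx.connected_components(H) if c & gen_set]
    unserved = set(H.nodes) - set().union(*served)
    H.remove_nodes_from(unserved)
    return H, served

# Toy example (hypothetical data): 4x4 grid with a single generator at (0, 0)
toy = nx.grid_2d_graph(4, 4)
H, served = damage_and_prune(toy, generators=[(0, 0)])
```

The returned graph `H` contains only components that can still be supplied, which is the input that the later power adjustment and capacity expansion steps would operate on.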
