### Multiplex Network Construction Documentation

This document describes how a multiplex network is built from Oklahoma Gas and Electric (OGE) incident data. The nodes are substations, and every layer contains the same set of nodes. Each layer connects substations according to a different attribute of the incidents recorded at them.

#### Layers in the Multiplex Network

1. **Job Region**
   - **Description**: This layer represents the geographical regions where the substations are located. Nodes (substations) are connected if they belong to the same job region.
   
2. **Job Area (DISTRICT)**
   - **Description**: This layer represents the specific districts within the regions. Nodes are connected if they belong to the same job area or district.
   
3. **Month/Day/Year**
   - **Description**: This temporal layer represents the date on which incidents occurred. Nodes are connected if incidents at these substations occurred on the same day.
   
4. **Custs Affected Interval**
   - **Description**: This layer categorizes incidents by the number of customers affected. Nodes are connected if their incidents fall within the same interval (Very Low, Low, Medium, or High). The intervals come from the binning step later in this document, which places bin edges at 2, 6, and 101 customers.
   
5. **OGE Causes**
   - **Description**: This layer categorizes incidents based on their causes as defined by the Oklahoma Gas and Electric company. Nodes are connected if incidents share the same cause.
   
6. **Major Storm Event (Yes or No)**
   - **Description**: This layer represents whether an incident occurred during a major storm event. Nodes are connected if incidents at these substations carry the same flag value (both Yes or both No); the flag does not identify which storm was involved.
   
7. **Distribution, Substation, Transmission Type**
   - **Description**: This layer represents the type of infrastructure associated with the incidents. Nodes are connected if they belong to the same type, such as distribution, substation, or transmission.

Together, these layers capture geographic, temporal, severity, cause, weather, and infrastructure relationships between substations, so the incident data can be analyzed one layer at a time or across layers.
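
As a rough illustration of the rule shared by all layers (two substations are linked when they share a value in the layer's column), the sketch below builds two layers from a made-up table. The substation names, the values, and the helper `build_layer` are hypothetical and are not part of the notebook's own code.

```python
from itertools import combinations

import networkx as nx
import pandas as pd

# Hypothetical toy incidents; names and values are invented for illustration.
toy = pd.DataFrame({
    "Job Substation": ["A", "B", "C", "A"],
    "Job Region": ["NORTH", "NORTH", "SOUTH", "NORTH"],
    "OGE Causes": ["Equipment", "Weather", "Equipment", "Weather"],
})


def build_layer(df, column, node_col="Job Substation"):
    """Link every pair of distinct substations that share a value in `column`."""
    graph = nx.Graph()
    # Add all substations first so isolated nodes still appear in the layer.
    graph.add_nodes_from(df[node_col].unique())
    for _, group in df.groupby(column):
        graph.add_edges_from(combinations(group[node_col].unique(), 2))
    return graph


toy_layers = {col: build_layer(toy, col) for col in ["Job Region", "OGE Causes"]}
for name, graph in toy_layers.items():
    print(name, graph.number_of_nodes(), graph.number_of_edges())
```

Because every layer starts from the same list of substations, the layers can differ in their edges while keeping an identical node set.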


In [28]:
import pandas as pd
import numpy as np
import os

# Load the dataset
file_path = '/Volumes/Data/NDSU/PhD Work/Research/IME Research/AI-Energy/Data/SPP/Incidents_400.xlsx'  
data = pd.read_excel(file_path)

# Display the first few rows of the dataset to understand its structure
data.head()


Unnamed: 0,Job Display ID,CAD_ID,Job Region,Job Area (DISTRICT),Job Substation,Job Feeder,Feeder ID,Job OFF Time,Job ON Time,Job Duration Mins,...,Feeder SAIDI,AM Notes,OGE Causes,Major Storm Event Y (Yes) or N (No),"Distribution, Substation, Transmission","Transmission Voltage (69kV, 138kV, 161kv) feeding distribution substation",Month/Day/Year,Year,Equipment Desc that should be excluded from reported indices,Ark Grid Mod or OK Grid Enhancement Circuits
0,J2001.000006,PD-01012020-00063,METRO EAST,EAST,8617:SUNNYLANE,SUNNYLANE_1722,861722,2020-01-01 00:21:50,2020-01-01 09:22:20,540.5,...,,,Cause Exclusion,N,DISTRIBUTION,69kV,2020-01-01,2020,,
1,J2001.000021,PD-01012020-00363,NORTHWEST,WOODWARD,4606:CEDAR AVE,CEDAR_AVE_631,460631,2020-01-01 03:00:30,2020-01-01 04:23:00,82.5,...,,,Equipment,N,DISTRIBUTION,69kV,2020-01-01,2020,,
2,J2001.000041,,METRO WEST,EL RENO,8905:EL RENO,EL_RENO_522,890522,2020-01-01 08:39:50,2020-01-01 09:56:25,76.58,...,,,Equipment,N,DISTRIBUTION,138kV,2020-01-01,2020,,
3,J2001.000045,PD-01012020-00859,SOUTHERN,SULPHUR,5712:JOLLYVILLE,JOLLYVILLE_1264,571264,2020-01-01 08:48:18,2020-01-01 13:39:00,290.7,...,,,Equipment,N,DISTRIBUTION,138kV,2020-01-01,2020,,
4,J2001.000046,PD-01012020-00858,NORTHWEST,WOODWARD,4606:CEDAR AVE,CEDAR_AVE_622,460622,2020-01-01 08:50:25,2020-01-01 11:00:00,129.58,...,,,Equipment,N,DISTRIBUTION,69kV,2020-01-01,2020,,


In [38]:
# Extract relevant columns
relevant_columns = [
    'Job Substation', 'Job Region', 'Job Area (DISTRICT)', 'Month/Day/Year', 
    'Custs Affected', 'OGE Causes', 'Major Storm Event  Y (Yes) or N (No)', 
    'Distribution, Substation, Transmission'
]
# Copy the selection so that later column assignments modify an independent DataFrame
# rather than a slice of `data` (this avoids pandas' SettingWithCopyWarning)
data_subset = data[relevant_columns].copy()

# Replace spaces in Job Substation names with underscores
data_subset['Job Substation'] = data_subset['Job Substation'].str.replace(' ', '_')

# Display the first few rows of the extracted data
data_subset.head()


Unnamed: 0,Job Substation,Job Region,Job Area (DISTRICT),Month/Day/Year,Custs Affected,OGE Causes,Major Storm Event Y (Yes) or N (No),"Distribution, Substation, Transmission"
0,8617:SUNNYLANE,METRO EAST,EAST,2020-01-01,2,Cause Exclusion,N,DISTRIBUTION
1,4606:CEDAR_AVE,NORTHWEST,WOODWARD,2020-01-01,138,Equipment,N,DISTRIBUTION
2,8905:EL_RENO,METRO WEST,EL RENO,2020-01-01,1,Equipment,N,DISTRIBUTION
3,5712:JOLLYVILLE,SOUTHERN,SULPHUR,2020-01-01,1,Equipment,N,DISTRIBUTION
4,4606:CEDAR_AVE,NORTHWEST,WOODWARD,2020-01-01,1,Equipment,N,DISTRIBUTION


In [39]:
# Determine appropriate bin edges with slight adjustments to avoid duplicates.
# pd.cut intervals are right-closed, so the bins are
# [min, 2], (2, 6], (6, 101], (101, max + 1]
bins = [data_subset['Custs Affected'].min(), 2, 6, 101, data_subset['Custs Affected'].max() + 1]
labels = ['Very Low', 'Low', 'Medium', 'High']

# Work on an explicit copy so the column assignment below does not trigger
# pandas' SettingWithCopyWarning
data_subset = data_subset.copy()

# Use pd.cut to create intervals with the defined bins and labels
data_subset['Custs Affected Interval'] = pd.cut(data_subset['Custs Affected'], bins=bins, labels=labels, include_lowest=True)

# Display the updated dataframe with intervals
data_subset.head()


Unnamed: 0,Job Substation,Job Region,Job Area (DISTRICT),Month/Day/Year,Custs Affected,OGE Causes,Major Storm Event Y (Yes) or N (No),"Distribution, Substation, Transmission",Custs Affected Interval
0,8617:SUNNYLANE,METRO EAST,EAST,2020-01-01,2,Cause Exclusion,N,DISTRIBUTION,Very Low
1,4606:CEDAR_AVE,NORTHWEST,WOODWARD,2020-01-01,138,Equipment,N,DISTRIBUTION,High
2,8905:EL_RENO,METRO WEST,EL RENO,2020-01-01,1,Equipment,N,DISTRIBUTION,Very Low
3,5712:JOLLYVILLE,SOUTHERN,SULPHUR,2020-01-01,1,Equipment,N,DISTRIBUTION,Very Low
4,4606:CEDAR_AVE,NORTHWEST,WOODWARD,2020-01-01,1,Equipment,N,DISTRIBUTION,Very Low
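
To make the interval boundaries concrete, here is a small sketch that applies the same bin pattern to a handful of invented customer counts placed on and around the edges. The counts are hypothetical; only the edges 2, 6, and 101 and the four labels come from the cell above.

```python
import pandas as pd

# Hypothetical customer counts chosen to sit on and around the bin edges.
counts = pd.Series([1, 2, 3, 6, 7, 101, 102, 500], name="Custs Affected")

# Same edge pattern as the cell above: [min, 2, 6, 101, max + 1].
bins = [counts.min(), 2, 6, 101, counts.max() + 1]
labels = ["Very Low", "Low", "Medium", "High"]

# Intervals are right-closed; include_lowest=True also closes the first one on the left.
intervals = pd.cut(counts, bins=bins, labels=labels, include_lowest=True)

summary = pd.DataFrame({"count": counts, "interval": intervals})
print(summary)
```

With right-closed intervals, a count equal to an edge (2, 6, or 101) lands in the lower of the two neighboring categories.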


In [40]:
import networkx as nx

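# Initialize the multiplex network as a dictionary of NetworkX graphs, one per layer.
# Note: relevant_columns[1:] holds the raw 'Custs Affected' counts rather than the binned
# 'Custs Affected Interval' column, so that layer links substations with identical counts.
multiplex_network = {layer: nx.Graph() for layer in relevant_columns[1:]}

# Function to add edges to the multiplex network
def add_edges_to_multiplex(data, multiplex_network):
    for layer, graph in multiplex_network.items():
        for idx, row in data.iterrows():
            substation = row['Job Substation']
            graph.add_node(substation)
            # Substations with at least one incident sharing this row's value in the layer column
            connections = data[data[layer] == row[layer]]['Job Substation'].unique()
            for connection in connections:
                # Skip self-loops
                if substation != connection:
                    graph.add_edge(substation, connection)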

# Add edges to the multiplex network
add_edges_to_multiplex(data_subset, multiplex_network)

# Display the number of nodes and edges in each layer of the multiplex network
for layer, graph in multiplex_network.items():
    print(f'Layer: {layer}')
    print(f'Number of nodes: {graph.number_of_nodes()}')
    print(f'Number of edges: {graph.number_of_edges()}')
    print('---------------------------------')
    


Layer: Job Region
Number of nodes: 157
Number of edges: 1531
---------------------------------
Layer: Job Area (DISTRICT)
Number of nodes: 157
Number of edges: 676
---------------------------------
Layer: Month/Day/Year
Number of nodes: 157
Number of edges: 6770
---------------------------------
Layer: Custs Affected
Number of nodes: 157
Number of edges: 9298
---------------------------------
Layer: OGE Causes
Number of nodes: 157
Number of edges: 7689
---------------------------------
Layer: Major Storm Event  Y (Yes) or N (No)
Number of nodes: 157
Number of edges: 12246
---------------------------------
Layer: Distribution, Substation, Transmission
Number of nodes: 157
Number of edges: 11586
---------------------------------
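
The per-layer edge counts show how dense each layer is, but not whether the same pairs of substations are linked in several layers. One possible way to check this is to count, for every pair, the number of layers that contain an edge between them. The sketch below does so on a small stand-in for `multiplex_network`; the toy graphs and the helper `edge_overlap` are assumptions made for illustration, not part of the notebook.

```python
from collections import Counter

import networkx as nx

# Hypothetical stand-in for `multiplex_network`: layer name -> nx.Graph.
toy_multiplex = {
    "Job Region": nx.Graph([("S1", "S2"), ("S2", "S3")]),
    "OGE Causes": nx.Graph([("S1", "S2"), ("S1", "S3")]),
    "Month/Day/Year": nx.Graph([("S2", "S1")]),
}


def edge_overlap(multiplex):
    """Count, for each unordered node pair, the number of layers that link it."""
    overlap = Counter()
    for graph in multiplex.values():
        for u, v in graph.edges():
            # frozenset makes (u, v) and (v, u) count as the same pair.
            overlap[frozenset((u, v))] += 1
    return overlap


overlap = edge_overlap(toy_multiplex)
for pair, n_layers in overlap.most_common():
    print(sorted(pair), n_layers)
```

Pairs with a high count are connected under many criteria at once, which may be a useful starting point for cross-layer analysis.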


In [32]:
import matplotlib.pyplot as plt

# Function to visualize each layer in the multiplex network and print the number of nodes and edges
def visualize_multiplex_network(multiplex_network):
    for layer, graph in multiplex_network.items():
        num_nodes = graph.number_of_nodes()
        num_edges = graph.number_of_edges()
        print(f"Layer: {layer}")
        print(f"Number of nodes: {num_nodes}")
        print(f"Number of edges: {num_edges}")
        print("---------------------------------")
        
        # Create the axes explicitly and pass them to nx.draw. This avoids the
        # "TypeError: '_AxesStack' object is not callable" raised by some
        # NetworkX/Matplotlib version combinations when nx.draw has to look up
        # the current axes itself.
        fig, ax = plt.subplots(figsize=(12, 12))
        pos = nx.spring_layout(graph)
        nx.draw(graph, pos, ax=ax, with_labels=True, node_size=50, font_size=8)
        ax.set_title(f"Multiplex Network Layer: {layer}")

# Visualize the multiplex network layers
visualize_multiplex_network(multiplex_network)
plt.show()  # Show all plots at the end


In [41]:
# Function to create an adjacency matrix
def create_adjacency_matrix(data, column):
    unique_substations = data['Job Substation'].unique()
    substation_index = {substation: idx for idx, substation in enumerate(unique_substations)}
    size = len(unique_substations)
    
    adjacency_matrix = np.zeros((size, size), dtype=int)
    
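    # Group data by the specified column. observed=True keeps only categories that
    # actually occur, which matters for categorical columns such as the pd.cut result
    grouped = data.groupby(column, observed=True)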
    
    for _, group in grouped:
        substations = group['Job Substation'].unique()
        for i in range(len(substations)):
            for j in range(i + 1, len(substations)):
                idx1, idx2 = substation_index[substations[i]], substation_index[substations[j]]
                adjacency_matrix[idx1, idx2] = 1
                adjacency_matrix[idx2, idx1] = 1
    
    return adjacency_matrix, unique_substations
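
The same adjacency structure can also be written as a matrix product. If B is the substation-by-category membership matrix (B[i, k] = 1 when substation i has at least one incident with category value k), then two distinct substations i and j are adjacent exactly when (B Bᵀ)[i, j] > 0. The sketch below applies this to invented data; the table is hypothetical, and the rows come out in sorted order rather than in order of first appearance as in `create_adjacency_matrix`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy incidents; names and values are invented for illustration.
toy = pd.DataFrame({
    "Job Substation": ["S1", "S2", "S3", "S1", "S4"],
    "OGE Causes": ["Equipment", "Equipment", "Weather", "Weather", "Other"],
})

# Membership counts: rows are substations, columns are category values.
membership = pd.crosstab(toy["Job Substation"], toy["OGE Causes"])
b = (membership.to_numpy() > 0).astype(int)

# Two substations are adjacent when they share at least one category value.
adjacency = (b @ b.T > 0).astype(int)
np.fill_diagonal(adjacency, 0)  # no self-loops, as in the loop-based version

adjacency_df = pd.DataFrame(adjacency, index=membership.index, columns=membership.index)
print(adjacency_df)
```

For large tables this avoids the nested Python loops, at the cost of building the dense membership matrix in memory.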


In [42]:
# Create adjacency matrices for the specified layers
layers = ['Job Region', 'Job Area (DISTRICT)', 'Month/Day/Year', 'Custs Affected Interval', 
          'OGE Causes', 'Major Storm Event  Y (Yes) or N (No)', 
          'Distribution, Substation, Transmission']

output_path = '/Volumes/Data/NDSU/PhD Work/Research/IME Research/AI-Energy/Data/SPP/adjaceny matric'

# Create the output directory if it does not exist yet
os.makedirs(output_path, exist_ok=True)

for layer in layers:
    # Replace invalid characters in layer names
    clean_layer_name = layer.replace("/", "_").replace(" ", "_").replace("(", "").replace(")", "")
    adjacency_matrix, unique_substations = create_adjacency_matrix(data_subset, layer)
    adjacency_df = pd.DataFrame(adjacency_matrix, index=unique_substations, columns=unique_substations)
    file_name = f'{clean_layer_name}_adjacency_matrix.csv'
    adjacency_df.to_csv(os.path.join(output_path, file_name))
    print(f'Adjacency matrix for {layer} saved to {os.path.join(output_path, file_name)}')


Adjacency matrix for Job Region saved to /Volumes/Data/NDSU/PhD Work/Research/IME Research/AI-Energy/Data/SPP/adjaceny matric/Job_Region_adjacency_matrix.csv
Adjacency matrix for Job Area (DISTRICT) saved to /Volumes/Data/NDSU/PhD Work/Research/IME Research/AI-Energy/Data/SPP/adjaceny matric/Job_Area_DISTRICT_adjacency_matrix.csv
Adjacency matrix for Month/Day/Year saved to /Volumes/Data/NDSU/PhD Work/Research/IME Research/AI-Energy/Data/SPP/adjaceny matric/Month_Day_Year_adjacency_matrix.csv
Adjacency matrix for Custs Affected Interval saved to /Volumes/Data/NDSU/PhD Work/Research/IME Research/AI-Energy/Data/SPP/adjaceny matric/Custs_Affected_Interval_adjacency_matrix.csv
Adjacency matrix for OGE Causes saved to /Volumes/Data/NDSU/PhD Work/Research/IME Research/AI-Energy/Data/SPP/adjaceny matric/OGE_Causes_adjacency_matrix.csv
Adjacency matrix for Major Storm Event  Y (Yes) or N (No) saved to /Volumes/Data/NDSU/PhD Work/Research/IME Research/AI-Energy/Data/SPP/adjaceny matric/Major_St
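
A saved matrix can later be read back and turned into a graph again. The following sketch assumes the CSV layout produced above (substation names as both the first column and the header row) and uses a small invented matrix written to a temporary directory, so the file name and node names are hypothetical.

```python
import os
import tempfile

import networkx as nx
import pandas as pd

# Hypothetical 3-substation matrix in the same layout as the exported CSV files.
nodes = ["S1", "S2", "S3"]
toy_adjacency = pd.DataFrame(
    [[0, 1, 0], [1, 0, 1], [0, 1, 0]], index=nodes, columns=nodes
)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "toy_adjacency_matrix.csv")
    toy_adjacency.to_csv(csv_path)
    # index_col=0 restores the substation names as the row index.
    loaded = pd.read_csv(csv_path, index_col=0)

# Nonzero entries become edges; zero entries are ignored.
layer_graph = nx.from_pandas_adjacency(loaded)
print(layer_graph.number_of_nodes(), layer_graph.number_of_edges())
```

`nx.from_pandas_adjacency` expects the row and column labels to match, which holds for files written from a square DataFrame with identical index and columns.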

