# Introduction

In this assignment you will work through a series of tasks using the `power-grid-model` library. The tasks are:

1. [Load Input Data](#Assignment-1:-Load-Input-Data)
2. [Validate Input Data](#Assignment-2:-Validate-Input-Data)
3. [Construct Model](#Assignment-3:-Construct-Model)
4. [Calculate One-Time Power Flow](#Assignment-4:-Calculate-One-Time-Power-Flow)
5. [Time Series Batch Calculation](#Assignment-5:-Time-Series-Batch-Calculation)
6. [N-1 Scenario Batch Calculation](#Assignment-6:-N-1-Scenario-Batch-Calculation)

The input data are CSV files in the `data/` folder:
* `node.csv`
* `line.csv`
* `source.csv`
* `sym_load.csv`


# Preparation

First import everything we need for this workshop:

In [1]:
import time
from typing import Dict

import numpy as np
import pandas as pd

from power_grid_model import (
    PowerGridModel,
    CalculationType,
    CalculationMethod,
    initialize_array
)

from power_grid_model.validation import (
    assert_valid_input_data,
    assert_valid_batch_data
)

Let's define a timer class to easily benchmark the calculations:

In [2]:
class Timer:
    def __init__(self, name: str):
        self.name = name
        self.start = None

    def __enter__(self):
        self.start = time.perf_counter()

    def __exit__(self, *args):
        print(f'Execution time for {self.name} is {(time.perf_counter() - self.start):0.6f} s')

The following example measures the time taken to add two NumPy arrays element-wise.

In [3]:
a = np.random.rand(1000000)
b = np.random.rand(1000000)
with Timer("Add Operation"):
    c = a + b

Execution time for Add Operation is 0.002995 s
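If you want to compare timings programmatically instead of reading them from the printed output, the timer can also store the elapsed time. The following is a minimal sketch; `RecordingTimer` and its `elapsed` attribute are hypothetical names that are not used elsewhere in this workshop.

```python
import time


class RecordingTimer:
    """Hypothetical variant of Timer that keeps the elapsed time for later use."""

    def __init__(self, name: str):
        self.name = name
        self.start = None
        self.elapsed = None

    def __enter__(self):
        self.start = time.perf_counter()
        return self  # allows `with RecordingTimer(...) as t:`

    def __exit__(self, *args):
        # Store the duration so it can be inspected after the block
        self.elapsed = time.perf_counter() - self.start
        print(f"Execution time for {self.name} is {self.elapsed:0.6f} s")


with RecordingTimer("Sum of squares") as t:
    total = sum(i * i for i in range(100_000))

# t.elapsed now holds the measured duration in seconds
```

Returning `self` from `__enter__` is what makes the `as t` binding work; the `Timer` class above does not need it because it only prints.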


# Assignment 1: Load Input Data

The following function loads the CSV data files from the `data/` folder and converts them into one dictionary of NumPy structured arrays, which is the input format expected by the constructor of `PowerGridModel`. Complete the function so that it builds this input data.

In [4]:
def load_input_data() -> Dict[str, np.ndarray]:
    input_data = {}
    for component in ['node', 'line', 'source', 'sym_load']:
        
        # Use pandas to read CSV data
        df = pd.read_csv(f'data/{component}.csv')

        # Initialize array
        input_data[component] = initialize_array('input', component, len(df))

        # Fill the attributes
        for attr, values in df.items():
            input_data[component][attr] = values

        # Print some debug info
        print(f"{component:9s}: {len(input_data[component]):4d}")

    return input_data

# Load input data
with Timer("Loading Input Data"):
    input_data = load_input_data()


node     : 2001
line     : 2000
source   :    1
sym_load : 2000
Execution time for Loading Input Data is 0.052330 s
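To see what "filling the attributes" means without the library, the sketch below performs the same column-wise assignment on a plain NumPy structured array. The dtype is a simplified stand-in chosen for illustration; the real field layout is whatever `initialize_array` returns, and the values are made up.

```python
import numpy as np

# Simplified, hypothetical dtype: the real layout comes from initialize_array
node_dtype = np.dtype([("id", np.int32), ("u_rated", np.float64)])

# Stand-in for the columns of a DataFrame read from a CSV file (made-up values)
columns = {"id": [1, 2, 3], "u_rated": [10500.0, 10500.0, 400.0]}

# One record per row, one field per attribute
node = np.zeros(len(columns["id"]), dtype=node_dtype)

# Assigning to a field name fills that attribute for all records at once
for attr, values in columns.items():
    node[attr] = values

print(node["id"])
print(node[2])
```

Each field behaves like an ordinary array column, which is why the loop in `load_input_data` can copy a whole DataFrame column in a single assignment.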


# Assignment 2: Validate Input Data

It is recommended to validate your data before constructing the `PowerGridModel`. If you are confident about your input data, you can skip this step for performance reasons. The easiest way to validate your input data is using `assert_valid_input_data`, which will raise an exception if there are any errors in your data. Please have a look at the [Validation Examples](https://github.com/alliander-opensource/power-grid-model/blob/main/examples/Validation%20Examples.ipynb) for more detailed information on the validation functions.

In [5]:
# Validate input data
with Timer("Validating Input Data"):
    assert_valid_input_data(input_data=input_data, calculation_type=CalculationType.power_flow)

Execution time for Validating Input Data is 0.009636 s


# Assignment 3: Construct Model

Create an instance of `PowerGridModel` using the input data. Benchmark the construction time.

In [6]:
# Construct model
with Timer("Model Construction"):
    model = PowerGridModel(input_data=input_data)

# Print the number of objects
print(model.all_component_count)

Execution time for Model Construction is 0.001296 s
{'line': 2000, 'node': 2001, 'source': 1, 'sym_load': 2000}


# Assignment 4: Calculate One-Time Power Flow

* Calculate a one-time power flow and print the highest and lowest loading of the lines.
* Try both the Newton-Raphson method and the linear method, and compare their results and speed.

In [7]:
# Newton-Raphson Power Flow
with Timer("Newton-Raphson Power Flow"):
    result = model.calculate_power_flow(calculation_method=CalculationMethod.newton_raphson)
    
# Print min and max line loading
print("Min line loading:", min(result["line"]["loading"]))
print("Max line loading:", max(result["line"]["loading"]))

Execution time for Newton-Raphson Power Flow is 0.122817 s
Min line loading: 0.14188449783807638
Max line loading: 1.6292378285645182


In [8]:
# Linear Power Flow
with Timer("Linear Power Flow"):
    result = model.calculate_power_flow(calculation_method=CalculationMethod.linear)
    
# Print min and max line loading
print("Min line loading:", min(result["line"]["loading"]))
print("Max line loading:", max(result["line"]["loading"]))

Execution time for Linear Power Flow is 0.003432 s
Min line loading: 0.1395686087394204
Max line loading: 1.6156849991056184
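To put a number on how far the linear method deviates from Newton-Raphson, you can compute a relative deviation between the two loading arrays. The sketch below uses only the two extreme values printed above as stand-ins for the full per-line arrays; `max_relative_deviation` is a hypothetical helper, and in the notebook you would need to keep both results under separate names because `result` is overwritten.

```python
import numpy as np


def max_relative_deviation(reference: np.ndarray, approximation: np.ndarray) -> float:
    """Largest element-wise relative difference, taking `reference` as ground truth."""
    return float(np.max(np.abs(approximation - reference) / np.abs(reference)))


# Extreme loadings copied from the outputs above (not the full result arrays)
loading_newton_raphson = np.array([0.14188449783807638, 1.6292378285645182])
loading_linear = np.array([0.1395686087394204, 1.6156849991056184])

deviation = max_relative_deviation(loading_newton_raphson, loading_linear)
print(f"Max relative deviation: {deviation:.2%}")
```

With the full arrays, the same function compares the methods line by line instead of only at the extremes.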


# Assignment 5: Time Series Batch Calculation

## Load Profile

Below we randomly generate a DataFrame containing a load profile.

* The column names are the IDs of `sym_load`
* Each row is one scenario
* Each entry specifies the active power of the load
* The reactive power is zero


In [9]:
# Generate random load profile
n_scenarios = 1000
n_loads = len(input_data["sym_load"]) 
load_id = input_data["sym_load"]["id"]
load_p = input_data["sym_load"]["p_specified"]
profile = np.tile(load_p, (n_scenarios, 1)) + 1e4 * np.random.randn(n_scenarios, n_loads)
df_load_profile = pd.DataFrame(profile, columns=load_id)
display(df_load_profile)

Unnamed: 0,4002,4003,4004,4005,4006,4007,4008,4009,4010,4011,...,5992,5993,5994,5995,5996,5997,5998,5999,6000,6001
0,9.915574e+05,1.041937e+06,920479.625505,972107.888410,1.082858e+06,891937.045335,926657.373359,1.077957e+06,1.001909e+06,1.020692e+06,...,1.010317e+06,1.072578e+06,1.062850e+06,922852.019563,904239.197838,9.836537e+05,938690.083481,1.087556e+06,911676.467193,1.113923e+06
1,9.812465e+05,1.022385e+06,908367.948469,972162.905093,1.064926e+06,900278.056699,928993.730573,1.079509e+06,1.027292e+06,1.012067e+06,...,1.001604e+06,1.068702e+06,1.027396e+06,933749.871334,910464.189725,9.819365e+05,934418.756586,1.088987e+06,943760.559586,1.094036e+06
2,9.952271e+05,1.045293e+06,935740.236611,979902.302184,1.062770e+06,906971.825838,923945.046753,1.040995e+06,1.029800e+06,1.018948e+06,...,1.019384e+06,1.080574e+06,1.040944e+06,914347.525585,901601.171481,9.949111e+05,931006.971170,1.093212e+06,929287.365023,1.083244e+06
3,9.920631e+05,1.041287e+06,935290.059630,974774.739552,1.063666e+06,916073.963838,921941.854547,1.049433e+06,1.029044e+06,1.012838e+06,...,9.955819e+05,1.092287e+06,1.053018e+06,914337.464299,897821.146663,9.852155e+05,952822.357441,1.091220e+06,917377.886449,1.086366e+06
4,9.991754e+05,1.046883e+06,920477.800913,967905.490084,1.061321e+06,911676.321158,939514.382604,1.069454e+06,1.027150e+06,1.014885e+06,...,9.895435e+05,1.068490e+06,1.032782e+06,940125.474433,892227.483635,9.926455e+05,916315.025623,1.090465e+06,899889.684559,1.096315e+06
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,9.783880e+05,1.039982e+06,916964.496076,955890.782880,1.063642e+06,912633.332756,916711.275236,1.055183e+06,1.024239e+06,1.033267e+06,...,1.000847e+06,1.078962e+06,1.055384e+06,926148.888358,912486.821907,9.881993e+05,928695.937347,1.092757e+06,931931.649067,1.073828e+06
996,9.862508e+05,1.040024e+06,900568.892802,969863.876075,1.072430e+06,910842.845593,934348.359513,1.061792e+06,1.016693e+06,1.018378e+06,...,1.005536e+06,1.072628e+06,1.028477e+06,932514.384662,887349.872759,9.846185e+05,930496.807198,1.089242e+06,937716.181746,1.092802e+06
997,1.003151e+06,1.046678e+06,927742.613354,964917.835529,1.073179e+06,904855.255208,948430.444555,1.059572e+06,1.042647e+06,1.036259e+06,...,1.011350e+06,1.081160e+06,1.049811e+06,920548.904965,907760.484428,1.010574e+06,935994.963105,1.098474e+06,910193.149463,1.092341e+06
998,9.937902e+05,1.025852e+06,900835.270579,977393.738430,1.075662e+06,897561.026643,936761.567890,1.043752e+06,1.028722e+06,1.023018e+06,...,1.012671e+06,1.068783e+06,1.054626e+06,916723.336269,910574.231281,9.781457e+05,920907.040642,1.084224e+06,918655.031368,1.094858e+06


## Run Time Series Calculation

We want to run a time-series power flow batch calculation using the DataFrame.

* Convert the load profile into the compatible batch update dataset.
* Run the batch calculation.
* Compare the calculation methods `newton_raphson` and `linear`.

In [10]:
# Initialize an empty load profile
load_profile = initialize_array("update", "sym_load", df_load_profile.shape)

# Set the attributes for the batch calculation (assume q_specified = 0.0)
load_profile["id"] = df_load_profile.columns.to_numpy()
load_profile["p_specified"] = df_load_profile.to_numpy()
load_profile["q_specified"] = 0.0

# Construct the update data
update_data = {"sym_load": load_profile}
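The assignment to `load_profile["id"]` relies on NumPy broadcasting: a one-dimensional array of IDs is repeated for every scenario of the two-dimensional batch array. The sketch below shows this on a small structured array; the dtype is a simplified stand-in for what `initialize_array` returns, and the numbers are made up.

```python
import numpy as np

# Simplified, hypothetical dtype for a sym_load update record
update_dtype = np.dtype(
    [("id", np.int32), ("p_specified", np.float64), ("q_specified", np.float64)]
)

ids = np.array([4002, 4003, 4004])            # one ID per load
p = np.array([[1.0e6, 1.1e6, 0.9e6],
              [1.2e6, 1.0e6, 0.8e6]])         # 2 scenarios x 3 loads

# Batch array with shape (n_scenarios, n_loads)
batch = np.zeros(p.shape, dtype=update_dtype)

batch["id"] = ids            # shape (3,) is broadcast across both scenarios
batch["p_specified"] = p     # shape (2, 3) matches the batch exactly
batch["q_specified"] = 0.0   # a scalar is broadcast to every entry

print(batch["id"])
```

Every row of `batch["id"]` ends up identical, so each scenario updates the same loads in the same order.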

In [11]:
# Validating batch data can take a long time.
# It is recommended to only validate batch data when you run into trouble.
with Timer("Validating Batch Data"):
    assert_valid_batch_data(input_data=input_data, update_data=update_data, calculation_type=CalculationType.power_flow)

Execution time for Validating Batch Data is 30.252498 s


In [12]:
# Run Newton Raphson power flow (this may take a minute...)
with Timer("Batch Calculation using Newton-Raphson"):
    model.calculate_power_flow(update_data=update_data, calculation_method=CalculationMethod.newton_raphson)

Execution time for Batch Calculation using Newton-Raphson is 8.673790 s


In [13]:
# Run linear power flow
with Timer("Batch Calculation using linear calculation"):
    model.calculate_power_flow(update_data=update_data, calculation_method=CalculationMethod.linear)

Execution time for Batch Calculation using linear calculation is 1.283085 s


# Assignment 6: N-1 Scenario Batch Calculation

We want to run an N-1 scenario analysis. In each scenario of the batch, exactly one `line` is disconnected at both its from-side and its to-side.

In [14]:
n_lines = len(input_data["line"])

# Initialize an empty line profile
line_profile = initialize_array("update", "line", (n_lines, n_lines))

# Set the attributes for the batch calculation
line_profile["id"] = input_data["line"]["id"]
line_profile["from_status"] = 1 - np.eye(n_lines, dtype=np.uint8)
line_profile["to_status"] = 1 - np.eye(n_lines, dtype=np.uint8)

# Construct the update data
update_data = {"line": line_profile}
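The expression `1 - np.eye(n_lines, dtype=np.uint8)` builds the status matrix for all scenarios at once: row `i` is scenario `i`, and the zero on the diagonal switches off line `i` while all other lines stay connected. A small illustration with three lines:

```python
import numpy as np

n_lines = 3

# Identity matrix has ones on the diagonal; subtracting it from 1 inverts it
status = 1 - np.eye(n_lines, dtype=np.uint8)
print(status)
# [[0 1 1]
#  [1 0 1]
#  [1 1 0]]

# Each scenario (row) has exactly one line out of service
lines_out_per_scenario = (status == 0).sum(axis=1)
print(lines_out_per_scenario)
# [1 1 1]
```

Because the batch has one scenario per line, its size grows quadratically with the number of lines, which is worth keeping in mind when reading the timings.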

In [15]:
# Validating batch data can take a long time.
# It is recommended to only validate batch data when you run into trouble.
with Timer("Validating Batch Data"):
    assert_valid_batch_data(input_data=input_data, update_data=update_data, calculation_type=CalculationType.power_flow)

Execution time for Validating Batch Data is 61.045334 s


In [16]:
# Run Newton Raphson power flow (this may take a minute...)
with Timer("Batch Calculation using Newton-Raphson"):
    model.calculate_power_flow(update_data=update_data, calculation_method=CalculationMethod.newton_raphson)

Execution time for Batch Calculation using Newton-Raphson is 19.149320 s


In [17]:
# Run linear power flow
with Timer("Batch Calculation using linear calculation"):
    model.calculate_power_flow(update_data=update_data, calculation_method=CalculationMethod.linear)

Execution time for Batch Calculation using linear calculation is 4.194670 s


## Parallel processing
The `calculate_power_flow` method has an optional `threading` argument that sets the number of threads run in parallel. Experiment with different `threading` values and compare the execution times.

In [18]:
# By default, sequential threading is used
with Timer("Sequential"):
    model.calculate_power_flow(update_data=update_data)

# Single thread, this is essentially the same as running sequentially
with Timer("Single thread"):
    model.calculate_power_flow(update_data=update_data, threading=1)

# Two threads should be faster    
with Timer("Two threads in parallel"):
    model.calculate_power_flow(update_data=update_data, threading=2)

# Four threads should be even faster    
with Timer("Four threads in parallel"):
    model.calculate_power_flow(update_data=update_data, threading=4)

# Use number of threads based on the machine hardware
with Timer("Use number of threads based on the machine hardware"):
    model.calculate_power_flow(update_data=update_data, threading=0)

Execution time for Sequential is 19.125076 s
Execution time for Single thread is 19.060767 s
Execution time for Two threads in parallel is 9.916578 s
Execution time for Four threads in parallel is 5.763393 s
Execution time for Use number of threads based on the machine hardware is 4.290474 s
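One way to compare these runs is to express each timing as a speedup relative to the sequential run. The sketch below does this with the timings recorded above; the dictionary keys are shortened labels chosen for illustration, and your own numbers will differ by machine.

```python
# Timings in seconds, copied from the output above
timings = {
    "Sequential": 19.125076,
    "Single thread": 19.060767,
    "Two threads": 9.916578,
    "Four threads": 5.763393,
    "Hardware-based": 4.290474,
}

baseline = timings["Sequential"]

# Speedup > 1 means the run was faster than the sequential baseline
speedups = {name: baseline / seconds for name, seconds in timings.items()}

for name, speedup in speedups.items():
    print(f"{name:15s}: {speedup:5.2f}x")
```

In these recorded timings the speedup is close to, but below, the thread count for two and four threads, so scaling on this machine was slightly less than linear.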
