# Introduction

In this assignment you will work through a series of tasks using the `power-grid-model` library. The tasks include:

1. [Load Input Data](#Assignment-1:-Load-Input-Data)
2. [Validate Input Data](#Assignment-2:-Validate-Input-Data)
3. [Construct Model](#Assignment-3:-Construct-Model)
4. [Calculate One-Time Power Flow](#Assignment-4:-Calculate-One-Time-Power-Flow)
5. [Time Series Batch Calculation](#Assignment-5:-Time-Series-Batch-Calculation)
6. [N-1 Scenario Batch Calculation](#Assignment-6:-N-1-Scenario-Batch-Calculation)

The input data are CSV files in the `data/` folder:
* `node.csv`
* `line.csv`
* `source.csv`
* `sym_load.csv`


# Preparation

First import everything we need for this workshop:

In [None]:
import time
from typing import Dict

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from power_grid_model import (
    PowerGridModel,
    CalculationType,
    CalculationMethod,
    initialize_array
)

from power_grid_model.validation import (
    assert_valid_input_data,
    assert_valid_batch_data
)

Let's define a timer class to easily benchmark the calculations:

In [None]:
class Timer:
    def __init__(self, name: str):
        self.name = name
        self.start = None

    def __enter__(self):
        self.start = time.perf_counter()

    def __exit__(self, *args):
        print(f'Execution time for {self.name} is {(time.perf_counter() - self.start):0.6f} s')

The following example measures the time for a simple add operation of two numpy arrays.

In [None]:
a = np.random.rand(1000000)
b = np.random.rand(1000000)
with Timer("Add Operation"):
    c = a + b

# Assignment 1: Load Input Data

The following function loads the CSV data files from the `data/` folder and converts them into one dictionary of numpy structured arrays. The returned dictionary is a compatible input for the constructor of `PowerGridModel`. Please complete the function so that it constructs this input data.

In [None]:
def load_input_data() -> Dict[str, np.ndarray]:
    input_data = {}
    for component in ['node', 'line', 'source', 'sym_load']:
        
        # Use pandas to read CSV data
        df = pd.read_csv(f'data/{component}.csv')

        # TODO: Initialize array
        input_data[component] = ...

        # TODO: Fill the attributes
        for attr ...:
            input_data[component][attr] = ...

        # Print some debug info
        print(f"{component:9s}: {len(input_data[component]):4d}")

    return input_data

# TODO: Load input data
with Timer("Loading Input Data"):
    input_data = ...
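
The structured arrays used above are filled one attribute (column) at a time. The sketch below shows that pattern with plain numpy only. The dtype `node_dtype` and the values in `columns` are made up for illustration; in the assignment the array and its dtype come from `initialize_array`, and the columns come from the dataframe.

```python
import numpy as np

# Stand-in dtype for illustration only; the real dtype is defined by the library.
node_dtype = np.dtype([("id", "i4"), ("u_rated", "f8")])

# Columns as they might be read from a CSV file (hypothetical values).
columns = {"id": [1, 2, 3], "u_rated": [10.5e3, 10.5e3, 400.0]}

# Allocate one record per row, then copy each column into the field of the same name.
nodes = np.zeros(3, dtype=node_dtype)
for name in nodes.dtype.names:
    nodes[name] = columns[name]

print(nodes["u_rated"])
```

Assigning to `nodes[name]` writes into the whole column at once, so no explicit loop over rows is needed.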


# Assignment 2: Validate Input Data

It is recommended to validate your data before constructing the `PowerGridModel`. If you are confident about your input data, you can skip this step for performance reasons. The easiest way to validate your input data is using `assert_valid_input_data`, which will raise an exception if there are any errors in your data. Please have a look at the [Validation Examples](https://github.com/alliander-opensource/power-grid-model/blob/main/examples/Validation%20Examples.ipynb) for more detailed information on the validation functions.

In [None]:
# TODO: Validate input data
with Timer("Validating Input Data"):
    assert_valid_input_data(...)

# Assignment 3: Construct Model

Create an instance of `PowerGridModel` using the input data. Benchmark the construction time.

In [None]:
# TODO: Construct model
with Timer("Model Construction"):
    model = PowerGridModel(...)

# Print the number of objects
print(model.all_component_count)

# Assignment 4: Calculate One-Time Power Flow

* Calculate a one-time power flow and print the highest and lowest loading of the lines.
* Try both the Newton-Raphson and the linear method, then compare the results and speed.

In [None]:
# TODO: Newton-Raphson Power Flow
with Timer("Newton-Raphson Power Flow"):
    result = ...
    
# TODO: Print min and max line loading
print("Min line loading:", ...)
print("Max line loading:", ...)

In [None]:
# TODO: Linear Power Flow
with Timer("Linear Power Flow"):
    result = ...
    
# TODO: Print min and max line loading
print("Min line loading:", ...)
print("Max line loading:", ...)
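
One possible way to compare the results of the two methods is to look at the largest absolute deviation between them. The sketch below uses two small hand-written arrays (`loading_nr` and `loading_lin` are hypothetical stand-ins for the line loading produced by each method), not actual calculation output.

```python
import numpy as np

# Hypothetical line loadings from two calculation methods (made-up numbers).
loading_nr = np.array([0.42, 0.87, 0.15])
loading_lin = np.array([0.40, 0.90, 0.15])

# Element-wise absolute deviation and the position where it is largest.
abs_diff = np.abs(loading_nr - loading_lin)
worst = int(np.argmax(abs_diff))

print(f"Largest deviation: {abs_diff[worst]:.3f} at position {worst}")
```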

# Assignment 5: Time Series Batch Calculation

## Load Profile

Below we randomly generate a dataframe containing a load profile.

* The column names are the IDs of `sym_load`
* Each row is one scenario
* Each entry specifies the active power of the load
* The reactive power is zero


In [None]:
# Generate random load profile of hourly data
n_scenarios = 1000
n_loads = len(input_data["sym_load"]) 
load_id = input_data["sym_load"]["id"]
load_p = input_data["sym_load"]["p_specified"]
profile = np.tile(load_p, (n_scenarios, 1)) + 1e4 * np.random.randn(n_scenarios, n_loads)
dti = pd.date_range("2022-01-01", periods=n_scenarios, freq="H")
df_load_profile = pd.DataFrame(profile, columns=load_id, index=dti)
display(df_load_profile)

## Run Time Series Calculation

We want to run a time-series load flow batch calculation using the dataframe.

* Convert the load profile into the compatible batch update dataset.
* Run the batch calculation.
* Compare the calculation methods `newton_raphson` and `linear`.

In [None]:
# TODO: Initialize an empty load profile
load_profile = initialize_array(..., ..., ...)

# TODO: Set the attributes for the batch calculation (assume q_specified = 0.0)
load_profile["id"] = ...
load_profile["p_specified"] = ...
load_profile["q_specified"] = ...

# Construct the update data
update_data = {"sym_load": load_profile}
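
A batch update array of this kind is two-dimensional: one row per scenario and one column per object. The sketch below illustrates that layout with plain numpy, assuming a made-up dtype `update_dtype` and made-up ids; in the assignment the array comes from `initialize_array`.

```python
import numpy as np

n_scenarios, n_loads = 4, 3
load_ids = np.array([7, 8, 9])  # hypothetical load ids
# Hypothetical profile: one row per scenario, one column per load.
profile = np.arange(n_scenarios * n_loads, dtype=float).reshape(n_scenarios, n_loads)

# Stand-in dtype for illustration only.
update_dtype = np.dtype([("id", "i4"), ("p_specified", "f8"), ("q_specified", "f8")])
batch = np.zeros((n_scenarios, n_loads), dtype=update_dtype)

batch["id"] = load_ids            # the 1-D id vector is broadcast to every scenario row
batch["p_specified"] = profile    # shapes match, so values are copied element-wise
batch["q_specified"] = 0.0        # a scalar is broadcast to all entries

print(batch.shape)
```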

In [None]:
# Validating batch data can take a long time.
# It is recommended to only validate batch data when you run into trouble.
with Timer("Validating Batch Data"):
    assert_valid_batch_data(input_data=input_data, update_data=update_data, calculation_type=CalculationType.power_flow)

In [None]:
# TODO: Run Newton-Raphson power flow (this may take a minute...)
with Timer("Batch Calculation using Newton-Raphson"):
    output_data = model.calculate_power_flow(...)

In [None]:
# TODO: Run linear power flow
with Timer("Batch Calculation using linear calculation"):
    output_data_linear = model.calculate_power_flow(...)

### Plotting batch results

Let's say we wish to plot the loading of `line 7` over time. We can use matplotlib to do so. (Note: the grid and results are randomly generated, so don't be alarmed to see loading above 100% or other unrealistic results.)

In [None]:
# TODO: Prepare data to be plotted
result_loading = output_data["line"]["loading"][...]
plt.plot(result_loading)
plt.title('Loading of line no. 7')
plt.xlabel('Time')
plt.ylabel('Loading')
plt.show()

### Indexing the results

Find the time stamps where the loading of `line 7` is greater than `68.4%`.

In [None]:
# TODO: Fill condition
ind = np.where(...)
df_load_profile.index[ind]
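
Selecting one object from a two-dimensional batch result and thresholding it can be sketched with plain numpy as follows. The arrays are made up, and two assumptions are made here: that `line 7` means the line whose `id` equals 7, and that loading is stored as a fraction (so 68.4% corresponds to 0.684).

```python
import numpy as np

# Hypothetical batch result: 3 scenarios (rows) x 3 lines (columns).
ids = np.array([5, 6, 7])
loading = np.array([
    [0.10, 0.20, 0.70],
    [0.10, 0.20, 0.50],
    [0.30, 0.10, 0.90],
])

# Find the column that belongs to id 7, then take that column for all scenarios.
col = np.flatnonzero(ids == 7)[0]
series = loading[:, col]

# np.where on a boolean condition returns a tuple of index arrays; take the first.
hits = np.where(series > 0.684)[0]
print(hits)
```

The resulting indices can be used to index a time axis of the same length, such as the index of the load profile dataframe.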

# Assignment 6: N-1 Scenario Batch Calculation

We want to run an N-1 scenario analysis. In each scenario of the batch, one `line` is disconnected at both its from-side and its to-side.

In [None]:
n_lines = len(input_data["line"])

# TODO: Initialize an empty line profile
line_profile = initialize_array(..., ..., ...)

# TODO: Set the attributes for the batch calculation
line_profile["id"] =  ...
line_profile["from_status"] = ...
line_profile["to_status"] = ...

# Construct the update data
update_data = {"line": line_profile}
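
The N-1 pattern itself, where scenario `k` switches off exactly object `k`, can be expressed as a square status matrix. The following is a minimal numpy sketch with a made-up number of lines; it only builds the matrix and does not use the library.

```python
import numpy as np

n_lines = 4  # hypothetical number of lines

# Identity matrix has ones on the diagonal; subtracting it from 1 gives
# a matrix of ones with zeros on the diagonal.
# Row k is scenario k: every line is connected (1) except line k (0).
status = 1 - np.eye(n_lines, dtype=np.int8)

print(status)
```

Because both sides of the line are switched together, the same matrix can serve for the from-side and the to-side status.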

In [None]:
# Validating batch data can take a long time.
# It is recommended to only validate batch data when you run into trouble.
with Timer("Validating Batch Data"):
    assert_valid_batch_data(input_data=input_data, update_data=update_data, calculation_type=CalculationType.power_flow)

In [None]:
# TODO: Run Newton-Raphson power flow (this may take a minute...)
with Timer("Batch Calculation using Newton-Raphson"):
    model.calculate_power_flow(...)

In [None]:
# TODO: Run linear power flow
with Timer("Batch Calculation using linear calculation"):
    model.calculate_power_flow(...)

## Parallel processing
The `calculate_power_flow` method has an optional `threading` argument that defines the number of threads run in parallel. Experiment with different `threading` values and compare the execution times.

In [None]:
# By default, sequential threading is used
with Timer("Sequential"):
    model.calculate_power_flow(update_data=update_data)

# TODO: Single thread, this is essentially the same as running sequentially
with Timer("Single thread"):
    model.calculate_power_flow(update_data=update_data, threading=...)

# TODO: Two threads should be faster    
with Timer("Two threads in parallel"):
    model.calculate_power_flow(update_data=update_data, threading=...)

# TODO: Four threads should be even faster    
with Timer("Four threads in parallel"):
    model.calculate_power_flow(update_data=update_data, threading=...)

# TODO: Use number of threads based on the machine hardware    
with Timer("Use number of threads based on the machine hardware"):
    model.calculate_power_flow(update_data=update_data, threading=...)
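
To compare several runs side by side rather than reading printed lines, the timings can be collected in a dictionary. The sketch below is one possible helper; `time_call` and `workload` are hypothetical names, and `workload` is a stand-in for the actual calculation.

```python
import time


def time_call(func, *args, repeats=3, **kwargs):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


def workload(n):
    # Stand-in for an expensive calculation.
    return sum(i * i for i in range(n))


# Collect one timing per setting so they can be compared afterwards.
timings = {n: time_call(workload, n) for n in (1_000, 100_000)}
for n, seconds in timings.items():
    print(f"n={n:>7d}: {seconds:0.6f} s")
```

Taking the best of a few repeats reduces the influence of one-off delays on the comparison.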