# Prepare Power Flow Data

This notebook presents the data preparation process for building power flow regimes. The process is split into stages (see the [DVC config](../dvc.yaml)):
 - "parse": extract the necessary parameters from the raw data
 - "transform": combine and convert the data for use in further steps
 - "prepare": build the final dataset
 - "model": sample power flow cases.

The naming of variables in the final dataset corresponds to the [project convention](../convention.md). Some parameters necessary for data preparation are listed in [definitions](../definitions.py).

Since the NREL-118 dataset is mostly intended for optimal power flow (OPF) tasks, it omits much of the information that is not needed to solve this kind of task. To fill in the missing information, the [JEAS-118 dataset](http://motor.ece.iit.edu/data/JEAS_IEEE118.doc), which is the primary source of NREL-118, is used. The following sections describe every stage of data processing and the decisions made when preparing the final dataset.

In [1]:
import os

from src.data import parse_jeas118_lines
from src.data import parse_jeas118_loads
from src.data import parse_jeas118_trafos
from src.data import parse_nrel118_buses
from src.data import parse_nrel118_escalators_ts
from src.data import parse_nrel118_gens
from src.data import parse_nrel118_hydros_nondisp_ts
from src.data import parse_nrel118_hydros_ts
from src.data import parse_nrel118_lines
from src.data import parse_nrel118_loads_ts
from src.data import parse_nrel118_outages_ts
from src.data import parse_nrel118_solars_ts
from src.data import parse_nrel118_winds_ts
from src.data import prepare_branches
from src.data import prepare_buses
from src.data import prepare_gens
from src.data import prepare_gens_ts
from src.data import prepare_loads
from src.data import prepare_loads_ts
from src.data import transform_gens_escalated_ts
from src.data import transform_loads
from src.data import transform_outages_ts
from src.data import transform_gens
from src.data import transform_gens_ts


PATH_NREL118 = os.path.join("..", "data", "raw", "nrel118")
PATH_JEAS118 = os.path.join("..", "data", "raw", "jeas118")
PATH_MANUAL = os.path.join("..", "data", "raw", "manual")

## Buses

To build a power flow model, the following information about buses is necessary:
- rated voltage level
- if the bus is in service or out of service
- name (optional)
- region (optional)
- bus coordinates for plots (optional)

Let's load and parse the bus data of the NREL-118 power system:

In [2]:
path_nrel118_buses = os.path.join(PATH_NREL118, "additional-files-mti-118", "Buses.csv")
nrel118_buses = parse_nrel118_buses(raw_data=path_nrel118_buses)
nrel118_buses.head(2)

Unnamed: 0,bus_name,region,load_participation_factor
0,bus_001,r1,0.047169
1,bus_002,r1,0.018496


The NREL-118 dataset contains only the names and regions of buses (`load_participation_factor` is used for load modelling, see [Section "Loads"](#loads)). To fill in the missing values, the following is assumed:
- all buses are in service
- bus "bus_069" is considered to be a slack bus
- the rated voltage of buses 8, 9, 10, 26, 30, 38, 63, 64, 65, 68, 81, and 116 equals 345 kV; all other buses have a rated voltage of 138 kV. This corresponds to the transformer data from the JEAS-118 dataset.
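
The assumptions above can be sketched in a few lines of pandas. This is only an illustration, not the code of `prepare_buses`: the `sketch_buses` frame and the `is_slack` column are hypothetical names introduced here.

```python
import pandas as pd

# Bus numbers assumed to operate at 345 kV (taken from the list above).
HV_BUS_NUMBERS = {8, 9, 10, 26, 30, 38, 63, 64, 65, 68, 81, 116}

# Hypothetical stand-in for the parsed NREL-118 bus table.
sketch_buses = pd.DataFrame({"bus_name": ["bus_001", "bus_008", "bus_069"]})

# Extract the numeric part of "bus_XXX" and map it to a voltage level.
bus_numbers = (
    sketch_buses["bus_name"].str.replace("bus_", "", regex=False).astype(int)
)
sketch_buses["v_rated_kv"] = bus_numbers.map(
    lambda number: 345 if number in HV_BUS_NUMBERS else 138
)

# All buses are assumed to be in service; "bus_069" is the slack bus.
sketch_buses["in_service"] = True
sketch_buses["is_slack"] = sketch_buses["bus_name"] == "bus_069"  # hypothetical column
```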

Coordinates of buses were [added manually](../data/raw/manual/bus_coordinates.csv) after designing [the power system plot](../resources/plot/plot.jpg).

Thus, the final bus data look as follows:

In [3]:
path_bus_coordinates = os.path.join(PATH_MANUAL, "bus_coordinates.csv")
buses = prepare_buses(
    parsed_nrel118_buses=nrel118_buses, bus_coordinates=path_bus_coordinates
)
buses.head(2)

Unnamed: 0,bus_name,region,in_service,v_rated_kv,x_coordinate,y_coordinate
0,bus_001,r1,True,138,626.0,-324.0
1,bus_002,r1,True,138,678.0,-324.0


## Branches

"Branches" is a common term both for lines and transformers. The following parameters about branches are necessary to build models:
- start and end buses of the branch
- number of parallel branch systems
- resistance
- reactance
- susceptance
- in service or out of service
- maximum power flow (optional)
- transformation ratio if the branch is a transformer (optional)
- name (optional)

Let's load and parse the line data of the NREL-118 power system:

In [4]:
path_nrel118_lines = os.path.join(PATH_NREL118, "additional-files-mti-118", "Lines.csv")
nrel118_lines = parse_nrel118_lines(raw_data=path_nrel118_lines)
nrel118_lines.head(2)

Unnamed: 0,branch_number,from_bus,to_bus,max_p_mw,x_pu,r_pu
0,1,bus_001,bus_002,600.0,0.0999,0.0303
1,2,bus_001,bus_003,600.0,0.0424,0.0129


Since the NREL-118 dataset omits the susceptance and the number of parallel branches, let's load them from the JEAS-118 dataset:

In [5]:
path_jeas118_lines = os.path.join(PATH_JEAS118, "JEAS_IEEE118.doc")
jeas118_lines = parse_jeas118_lines(raw_data=path_jeas118_lines)
jeas118_lines.head(2)

Unnamed: 0,from_bus,to_bus,parallel,b_pu
0,bus_001,bus_002,1,0.0254
1,bus_001,bus_003,1,0.01082


In the NREL-118 dataset, transformers are represented as lines without transformation ratios. Therefore, these values are loaded from the JEAS-118 dataset:

In [6]:
path_jeas118_trafos = os.path.join(PATH_JEAS118, "JEAS_IEEE118.doc")
jeas118_trafos = parse_jeas118_trafos(raw_data=path_jeas118_trafos)
jeas118_trafos.head(2)

Unnamed: 0,from_bus,to_bus,parallel,trafo_ratio_rel
0,bus_008,bus_005,1,0.985
1,bus_026,bus_025,1,0.96


Thus, the final branch data look as follows:

In [7]:
branches = prepare_branches(
    parsed_nrel118_lines=nrel118_lines,
    parsed_jeas118_lines=jeas118_lines,
    parsed_jeas118_trafos=jeas118_trafos,
    prepared_buses=buses,
)
branches.head(2)

Unnamed: 0,branch_name,from_bus,to_bus,parallel,in_service,r_ohm,x_ohm,b_µs,trafo_ratio_rel,max_i_ka
0,branch_001_002_1,bus_001,bus_002,1,True,5.770332,19.024956,133.375341,,2.510219
1,branch_001_003_1,bus_001,bus_003,1,True,2.456676,8.074656,56.815795,,2.510219


It is assumed that branches are always in service.
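
The prepared table stores branch parameters in physical units, while the parsed data are in per unit. A minimal sketch of such a conversion is given below. It assumes a 100 MVA system base and treats `max_p_mw` as the apparent power limit at rated voltage. Both are assumptions that happen to reproduce the values displayed for the first branch; the function name is hypothetical and this is not necessarily how `prepare_branches` is implemented.

```python
import math

BASE_MVA = 100.0  # assumption: system base power, not stated in the notebook


def per_unit_to_physical(r_pu, x_pu, b_pu, max_p_mw, v_rated_kv, base_mva=BASE_MVA):
    """Convert per-unit branch parameters to ohms, microsiemens and kiloamperes."""
    z_base_ohm = v_rated_kv**2 / base_mva  # base impedance, kV^2 / MVA
    return {
        "r_ohm": r_pu * z_base_ohm,
        "x_ohm": x_pu * z_base_ohm,
        "b_us": b_pu / z_base_ohm * 1e6,  # siemens -> microsiemens
        # Assumption: the power limit is converted to a current limit
        # at rated voltage.
        "max_i_ka": max_p_mw / (math.sqrt(3) * v_rated_kv),
    }


# First branch from the parsed data (bus_001 to bus_002, 138 kV).
branch = per_unit_to_physical(
    r_pu=0.0303, x_pu=0.0999, b_pu=0.0254, max_p_mw=600.0, v_rated_kv=138.0
)
```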

## Loads

Here is the list of necessary load variables:

- bus where the load is located
- active and reactive power of the load
- if the load is in service
- name (optional)

The share of the regional active load located at each bus is stored in the `load_participation_factor` variable of the NREL-118 bus data:

In [8]:
nrel118_buses.head(2)

Unnamed: 0,bus_name,region,load_participation_factor
0,bus_001,r1,0.047169
1,bus_002,r1,0.018496


The active load of each region is stored in the NREL-118 time-series data:

In [9]:
path_nrel118_loads_ts = os.path.join(PATH_NREL118, "Input files", "RT", "Load")
nrel118_loads_ts = parse_nrel118_loads_ts(raw_data=path_nrel118_loads_ts)
nrel118_loads_ts.head(2)

Unnamed: 0,datetime,region_name,region_load
0,2024-01-01 00:00:00,r1,5698.083154
1,2024-01-01 00:00:00,r2,1967.41709


To calculate reactive power of loads, let's get the JEAS-118 data:

In [10]:
path_jeas118_loads = os.path.join(PATH_JEAS118, "JEAS_IEEE118.doc")
jeas118_loads = parse_jeas118_loads(raw_data=path_jeas118_loads)
jeas118_loads.head(2)

Unnamed: 0,bus_name,p_mw,q_mvar
0,bus_001,54.14,8.66
1,bus_002,21.23,9.55


The JEAS-118 load data help to estimate the power factor of each load and to derive its reactive power at each time step from the time series of active demand:

In [11]:
transformed_loads = transform_loads(
    parsed_nrel118_buses=nrel118_buses, parsed_jeas118_loads=jeas118_loads
)
transformed_loads.head(2)

Unnamed: 0,load_name,bus_name,region,load_participation_factor,load_power_factor
0,load_001,bus_001,r1,0.047169,0.987447
1,load_002,bus_002,r1,0.018496,0.911978
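
As an illustration of how `load_power_factor` could be derived from the JEAS-118 values, the sketch below uses the standard definition of power factor, P / sqrt(P² + Q²). The function name is hypothetical; `transform_loads` may compute it differently.

```python
import math


def power_factor(p_mw, q_mvar):
    """Power factor from active and reactive power: P / sqrt(P^2 + Q^2)."""
    return p_mw / math.hypot(p_mw, q_mvar)


# JEAS-118 values for the first two loads shown above.
pf_load_001 = power_factor(54.14, 8.66)
pf_load_002 = power_factor(21.23, 9.55)
```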


Thus, it is necessary to prepare two files with load data. The first file will contain the load power variation over time, the other will contain basic load information (location, etc.). It is assumed that loads are always in service.

In [12]:
loads = prepare_loads(transformed_loads=transformed_loads)
loads.head(2)

Unnamed: 0,load_name,bus_name
0,load_001,bus_001
1,load_002,bus_002


In [13]:
loads_ts = prepare_loads_ts(
    transformed_loads=transformed_loads, parsed_nrel118_loads_ts=nrel118_loads_ts
)
loads_ts.head(2)

Unnamed: 0,datetime,load_name,in_service,p_mw,q_mvar
0,2024-01-01,load_001,True,268.770998,42.991445
1,2024-01-01,load_002,True,105.393569,47.409731
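
The time-series values above are consistent with the following relations: the active power is the regional load multiplied by the participation factor, and the reactive power follows from the power factor. The sketch below is an assumption about the calculation, not the code of `prepare_loads_ts`. It also assumes that all loads are inductive (positive reactive power), and the function name is hypothetical.

```python
import math


def load_power(region_load_mw, participation_factor, power_factor):
    """Return (p_mw, q_mvar) of a single load at one time step."""
    p_mw = region_load_mw * participation_factor
    # Q = P * tan(phi), where cos(phi) is the power factor.
    q_mvar = p_mw * math.tan(math.acos(power_factor))
    return p_mw, q_mvar


# load_001 at the first time step, using the rounded values displayed above.
p_mw, q_mvar = load_power(
    region_load_mw=5698.083154,
    participation_factor=0.047169,
    power_factor=0.987447,
)
```

Because the displayed inputs are rounded, the result matches the table only to about two decimal places.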


## Generators


To build a power flow model, the following information about generators is necessary:
- bus where the generator is located
- active power of the generator
- if the generator is in service
- voltage set point of the generator
- max and min limits of reactive power output
- name (optional)
- max limit of active output (optional)

Let's start by parsing generator data from the NREL-118 dataset:

In [14]:
path_nrel118_gens = os.path.join(
    PATH_NREL118, "additional-files-mti-118", "Generators.csv"
)
nrel118_gens = parse_nrel118_gens(raw_data=path_nrel118_gens)
nrel118_gens.head(2)

Unnamed: 0,gen_name,bus_name,max_p_mw
0,biomass_001,bus_012,3.0
1,biomass_002,bus_012,3.0


Next, time-series data from the NREL-118 dataset are parsed:

In [15]:
# Hydro plants
path_nrel118_hydros_ts = os.path.join(PATH_NREL118, "Input files", "Hydro")
nrel118_hydros_ts = parse_nrel118_hydros_ts(raw_data=path_nrel118_hydros_ts)
nrel118_hydros_ts.head(2)

Unnamed: 0,datetime,gen_name,p_mw
0,2024-01-01 00:00:00,hydro_016,0.17696
1,2024-01-01 00:00:00,hydro_017,0.29862


In [16]:
# Solar plants
path_nrel118_solars_ts = os.path.join(PATH_NREL118, "Input files", "RT", "Solar")
nrel118_solars_ts = parse_nrel118_solars_ts(raw_data=path_nrel118_solars_ts)
nrel118_solars_ts.head(2)

Unnamed: 0,datetime,gen_name,p_mw
0,2024-01-01 00:00:00,solar_001,0.0
1,2024-01-01 00:00:00,solar_002,0.0


In [17]:
# Wind plants
path_nrel118_winds_ts = os.path.join(PATH_NREL118, "Input files", "RT", "Wind")
nrel118_winds_ts = parse_nrel118_winds_ts(raw_data=path_nrel118_winds_ts)
nrel118_winds_ts.head(2)

Unnamed: 0,datetime,gen_name,p_mw
0,2024-01-01 00:00:00,wind_001,0.458135
1,2024-01-01 00:00:00,wind_002,3.724274


In [18]:
# Non-dispatchable hydro plants
path_nrel118_hydros_nondisp_ts = os.path.join(
    PATH_NREL118,
    "additional-files-mti-118",
    "Hydro_nondipatchable.csv",
)
nrel118_hydros_nondisp_ts = parse_nrel118_hydros_nondisp_ts(
    raw_data=path_nrel118_hydros_nondisp_ts
)
nrel118_hydros_nondisp_ts.head(2)

Unnamed: 0,datetime,gen_name,p_mw
0,2024-01-01 00:00:00,hydro_036,0.51
1,2024-01-01 00:00:00,hydro_037,2.23


Escalators are used to adjust the generation profile to the season or another time period for all generators except wind, solar, and hydro:

In [19]:
# Escalators data
path_nrel118_escalators_ts = os.path.join(
    PATH_NREL118, "additional-files-mti-118", "Escalators.csv"
)
nrel118_escalators_ts = parse_nrel118_escalators_ts(raw_data=path_nrel118_escalators_ts)
nrel118_escalators_ts.head(2)

Unnamed: 0,datetime,gen_name,escalator_ratio
0,2024-01-01 00:00:00,biomass_001,0.35
1,2024-01-01 00:00:00,biomass_002,0.35


Thus, the maximum output of each generator can be multiplied by its escalator to estimate the power output over time for all generators except wind, solar, and hydro:

In [20]:
gens_escalated_ts = transform_gens_escalated_ts(
    parsed_nrel118_gens=nrel118_gens, parsed_nrel118_escalators_ts=nrel118_escalators_ts
)
gens_escalated_ts.head(2)

Unnamed: 0,datetime,gen_name,p_mw
0,2024-01-01 00:00:00,biomass_001,1.05
1,2024-01-01 00:00:00,biomass_002,1.05
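
The multiplication described above can be sketched as a merge followed by a column product. The frames below are small hypothetical stand-ins for the parsed data; `transform_gens_escalated_ts` may differ in its details.

```python
import pandas as pd

# Hypothetical stand-ins for the parsed generator and escalator tables.
sketch_gens = pd.DataFrame(
    {"gen_name": ["biomass_001", "biomass_002"], "max_p_mw": [3.0, 3.0]}
)
sketch_escalators_ts = pd.DataFrame(
    {
        "datetime": pd.to_datetime(["2024-01-01 00:00:00"] * 2),
        "gen_name": ["biomass_001", "biomass_002"],
        "escalator_ratio": [0.35, 0.35],
    }
)

# Attach the maximum output to every time step and scale it by the escalator.
sketch_escalated_ts = sketch_escalators_ts.merge(
    sketch_gens, on="gen_name", how="left"
)
sketch_escalated_ts["p_mw"] = (
    sketch_escalated_ts["max_p_mw"] * sketch_escalated_ts["escalator_ratio"]
)
sketch_escalated_ts = sketch_escalated_ts[["datetime", "gen_name", "p_mw"]]
```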


In [21]:
# Outages
path_nrel118_outages_ts = os.path.join(
    PATH_NREL118, "Input files", "Others", "GenOut.csv"
)
nrel118_outages_ts = parse_nrel118_outages_ts(raw_data=path_nrel118_outages_ts)
nrel118_outages_ts.head(2)

Unnamed: 0,datetime,gen_name,in_outage
0,2024-01-01 00:00:00,PSH_001,False
1,2024-01-01 00:00:00,PSH_002,False


Next, some intermediate calculations are performed with the following assumptions:
1. Missing outputs of power plants are set to zero (see [this script](../src/data/transform/gens_ts.py) for details).
2. The reactive output range of each generator is set from -0.3 to 0.75 of its actual active output, which matches the `q_min_mvar` and `q_max_mvar` values in the prepared data (see [this script](../src/data/transform/gens_ts.py) for details).
3. The rated voltage of each generator is equal to the voltage of its bus (see [this script](../src/data/transform/gens_ts.py) for details).
4. Missing power plants in the outage data are always in service (see [this script](../src/data/transform/outages_ts.py) for details).
5. Unknown power plants in the outage data are dropped (see [this script](../src/data/transform/outages_ts.py) for details).
6. Duplicated generator outages are treated as typos that caused the state of generator "internal_combustion_gas_001" to be attributed to other generators (see [this script](../src/data/transform/outages_ts.py) for details).
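
Some of these assumptions can be expressed as small helper functions. The sketch below is hypothetical and does not reproduce the linked scripts. The lower reactive limit factor of -0.3 is inferred from the ratio of `q_min_mvar` to `p_mw` in the prepared output and should be checked against the script. The `outages` mapping is an illustrative structure, not the format of the parsed data.

```python
Q_MAX_RATIO = 0.75
Q_MIN_RATIO = -0.3  # assumption: inferred from the prepared output


def reactive_limits(p_mw):
    """Return (q_max_mvar, q_min_mvar) for a given active output."""
    # Assumption 1: a missing output is treated as zero.
    p_mw = 0.0 if p_mw is None else p_mw
    return Q_MAX_RATIO * p_mw, Q_MIN_RATIO * p_mw


def is_in_service(gen_name, outages):
    """Return True unless the generator is explicitly marked as in outage."""
    # Assumption 4: generators absent from the outage data are in service.
    return not outages.get(gen_name, False)


q_max_mvar, q_min_mvar = reactive_limits(126.21025)
outages = {"PSH_001": False, "PSH_002": True}  # hypothetical outage flags
```

With these definitions, `is_in_service("plant_004", outages)` returns `True` because the generator does not appear in the outage mapping.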

In [22]:
transformed_gens = transform_gens(parsed_nrel118_gens=nrel118_gens)
transformed_outages_ts = transform_outages_ts(
    parsed_nrel118_outages_ts=nrel118_outages_ts
)
transformed_gens_escalated_ts = transform_gens_escalated_ts(
    parsed_nrel118_gens=nrel118_gens,
    parsed_nrel118_escalators_ts=nrel118_escalators_ts,
)
transformed_gens_ts = transform_gens_ts(
    parsed_nrel118_winds_ts=nrel118_winds_ts,
    parsed_nrel118_solars_ts=nrel118_solars_ts,
    parsed_nrel118_hydros_ts=nrel118_hydros_ts,
    parsed_nrel118_hydros_nondisp_ts=nrel118_hydros_nondisp_ts,
    transformed_gens_escalated_ts=transformed_gens_escalated_ts,
    prepared_buses=buses,
    transformed_gens=transformed_gens,
)

Finally, let's concatenate all datasets to build two files with generation data: general generator information (location, etc.) and time-series data (`p_mw`, `in_service`, etc.):

In [23]:
prepared_gens = prepare_gens(transformed_gens=transformed_gens)
prepared_gens.head(2)

Unnamed: 0,gen_name,bus_name,max_p_mw
0,plant_004,bus_004,144.7
1,plant_006,bus_006,168.4


In [24]:
prepared_gens_ts = prepare_gens_ts(
    transformed_gens=transformed_gens,
    transformed_gens_ts=transformed_gens_ts,
    transformed_outages_ts=transformed_outages_ts,
)
prepared_gens_ts.head(2)

Unnamed: 0,datetime,gen_name,in_service,p_mw,v_set_kv,q_max_mvar,q_min_mvar
0,2024-01-01,plant_004,True,126.21025,138.0,94.657687,-37.863074
1,2024-01-01,plant_006,True,148.89524,138.0,111.67143,-44.668572


If `PLANT_MODE` in [the configuration file](../definitions.py) is `True`, generators are grouped by bus into power plants. In this case, the rated voltage of each power plant is calculated as the average of its generator voltages, and its output is the sum of the outputs of its generators.
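
A minimal sketch of such a grouping with pandas is shown below. The input frame is a hypothetical example (the bus assigned to `hydro_016` and its output are made up), and the `plant_XXX` naming is inferred from the prepared output above. The actual implementation in the project may differ.

```python
import pandas as pd

# Hypothetical generator time series with bus locations.
sketch_gens_ts = pd.DataFrame(
    {
        "datetime": pd.to_datetime(["2024-01-01"] * 3),
        "gen_name": ["biomass_001", "biomass_002", "hydro_016"],
        "bus_name": ["bus_012", "bus_012", "bus_025"],  # hydro bus is made up
        "p_mw": [1.05, 1.05, 0.5],
        "v_set_kv": [138.0, 138.0, 138.0],
    }
)

# Group generators by time step and bus: sum the outputs, average the voltages.
plants_ts = sketch_gens_ts.groupby(["datetime", "bus_name"], as_index=False).agg(
    p_mw=("p_mw", "sum"),
    v_set_kv=("v_set_kv", "mean"),
)

# Name each plant after its bus, e.g. "bus_012" -> "plant_012".
plants_ts["gen_name"] = plants_ts["bus_name"].str.replace(
    "bus_", "plant_", regex=False
)
```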
