# Adding time-series injections

In this tutorial we outline how to produce time series of the parameters of our network elements.

Most of these time series can be produced with just our grid data files. In the example below, we take the time series of the voltage level connected to one end of each line in our grid (here the element is the line, and the time-varying parameter is the `voltage_level1_id`).

Other time series require injection data, such as the injected power of each generator (the element is the generator, and the time-varying parameter is `target_p`).

In [1]:
import pypowsybl as pp
import pandas as pd
from pathlib import Path

grid = pp.network.load(file='data/recollement-auto-20210101-0000-enrichi.xiidm.bz2')
print('Grid model loaded')

Grid model loaded


In [2]:
grid.get_lines()

id,name,r,x,g1,b1,g2,b2,p1,q1,i1,p2,q2,i2,voltage_level1_id,voltage_level2_id,bus1_id,bus2_id,connected1,connected2
.CTLHL31.CTLO,,1.149352,2.103875,0.0,0.000006,0.0,0.000006,,,,,,,.CTLHP3,.CTLOP3,.CTLHP3_0,,True,False
.CTLHL32.CTLO,,1.149352,2.103875,0.0,0.000006,0.0,0.000006,,,,,,,.CTLHP3,.CTLOP3,.CTLHP3_1,.CTLOP3_3,True,True
.CTLOL31FINS,,2.127000,4.709000,0.0,0.000017,0.0,0.000045,,,,,,,.CTLOP3,FINS P3,.CTLOP3_0,FINS P3_0,True,True
.CTLOL31ZLIEB,,3.870000,9.045000,0.0,0.000032,0.0,0.000033,,,,,,,.CTLOP3,ZLIEBP3,.CTLOP3_0,ZLIEBP3_0,True,True
.G.ROL51HOSPI,,1.599000,4.665999,0.0,0.000032,0.0,0.000028,,,,,,,.G.ROP5,HOSPIP5,,HOSPIP5_0,False,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
ZNAUSL31ZPRA5,,6.660255,12.191515,0.0,0.000040,0.0,0.000041,,,,,,,ZPRA5P3,ZNAUSP3,ZPRA5P3_0,ZNAUSP3_0,True,True
ZPRRFL32ZPRR5,,0.050000,0.120000,0.0,0.000000,0.0,0.000000,,,,,,,ZPRRFP3,ZPRR5P3,ZPRRFP3_0,ZPRR5P3_0,True,True
ZQUINL61ZSSCR,,0.350000,2.370000,0.0,0.000009,0.0,0.000009,,,,,,,ZQUINP6,ZSSCRP6,ZQUINP6_0,ZSSCRP6_0,True,True
ZS.SEL31ZVLET,,0.100000,0.190000,0.0,0.000001,0.0,0.000001,,,,,,,ZS.SEP3,ZVLETP3,ZS.SEP3_0,ZVLETP3_0,True,True


If we want to get a time series of a parameter of an element, we will need to create a loop that opens all the files that we want to evaluate:

In [3]:
# Assuming we already have a list of files from the path, and we want to store all the 'voltage_level1_id' values of all the lines

list_files = ['data/recollement-auto-20210101-0000-enrichi.xiidm.bz2', 'data/recollement-auto-20210101-0005-enrichi.xiidm.bz2', 'data/recollement-auto-20210101-0010-enrichi.xiidm.bz2']
vl1data = {}
for i, file in enumerate(list_files):
    grid = pp.network.load(file)
    vl1_values = grid.get_lines(attributes=['voltage_level1_id'])['voltage_level1_id'] 
    timestep = Path(file).stem # Here we use the snapshot name to be the name of the column. You could choose some other name for the columns of the time series.
    vl1data[f'Snap{i}: {timestep}'] = vl1_values

vl1_timeseries = pd.DataFrame(vl1data)
vl1_timeseries

id,Snap0: recollement-auto-20210101-0000-enrichi.xiidm,Snap1: recollement-auto-20210101-0005-enrichi.xiidm,Snap2: recollement-auto-20210101-0010-enrichi.xiidm
.CTLHL31.CTLO,.CTLHP3,.CTLHP3,.CTLHP3
.CTLHL32.CTLO,.CTLHP3,.CTLHP3,.CTLHP3
.CTLOL31FINS,.CTLOP3,.CTLOP3,.CTLOP3
.CTLOL31ZLIEB,.CTLOP3,.CTLOP3,.CTLOP3
.G.ROL51HOSPI,.G.ROP5,.G.ROP5,.G.ROP5
...,...,...,...
ZNAUSL31ZPRA5,ZPRA5P3,ZPRA5P3,ZPRA5P3
ZPRRFL32ZPRR5,ZPRRFP3,ZPRRFP3,ZPRRFP3
ZQUINL61ZSSCR,ZQUINP6,ZQUINP6,ZQUINP6
ZS.SEL31ZVLET,ZS.SEP3,ZS.SEP3,ZS.SEP3
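
The loop above assumes that `list_files` already exists. As a minimal sketch of one way to build it, the helpers below scan a folder and order the snapshots by the timestamp encoded in each file name. The function names `parse_snapshot_time` and `list_snapshot_files` are hypothetical, and the naming convention is only inferred from the three file names used in this tutorial.

```python
from datetime import datetime
from pathlib import Path
import re

# Assumed naming convention, inferred from the file names in this tutorial:
# recollement-auto-YYYYMMDD-HHMM-enrichi.xiidm.bz2
SNAPSHOT_PATTERN = re.compile(r"recollement-auto-(\d{8})-(\d{4})-enrichi\.xiidm\.bz2$")


def parse_snapshot_time(filename):
    """Return the datetime encoded in a snapshot file name, or None if it does not match."""
    match = SNAPSHOT_PATTERN.search(Path(filename).name)
    if match is None:
        return None
    return datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M")


def list_snapshot_files(folder):
    """Return (timestamp, path) pairs for every snapshot in `folder`, oldest first."""
    pairs = []
    for path in Path(folder).glob("*.xiidm.bz2"):
        timestamp = parse_snapshot_time(path.name)
        if timestamp is not None:  # skip files that do not follow the convention
            pairs.append((timestamp, path))
    return sorted(pairs, key=lambda pair: pair[0])
```

The paths returned by `list_snapshot_files("data")` could replace the hard-coded `list_files`, and `timestamp.isoformat()` could serve as a column label instead of the file stem.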


We can follow a similar procedure to produce a time series from injection data. We take our injection data from [here](https://huggingface.co/datasets/PGLearn/rte7000), which contains synthetic injection data from the beginning of 2021 to the end of 2023. The Python script below downloads the data for a chosen month (here January 2021) in the form of seven .parquet files (branch, bus, gen, load, sub, switch and vol data).

In [4]:
import huggingface_hub
from huggingface_hub import snapshot_download
import os
import glob

# Hugging Face dataset
repo_id = "PGLearn/rte7000"
repo_type = "dataset"

# Local root folder where files will be saved
local_root = "rte7000"

print("Downloading selected parquet files...")

# Download only files that match the pattern "*2021-01*.parquet"
snapshot_download( # Not a snapshot in our single-timestamp sense!
    repo_id=repo_id,
    repo_type=repo_type,
    local_dir=local_root,
    allow_patterns=["*2021-01*.parquet"],
)

Downloading selected parquet files...


Fetching 7 files: 100%|██████████| 7/7 [00:00<00:00, 3432.73it/s]


'C:\\Users\\Josh (eRoots)\\Documents\\VSCode\\rte7000'

In [5]:
import pandas as pd
import glob
import os
from collections import defaultdict

# List downloaded files for verification
all_downloaded = glob.glob(os.path.join(local_root, "**", "*2021-01*.parquet"), recursive=True)
print(f"Downloaded {len(all_downloaded)} parquet files:")
for f in all_downloaded:
    print("  ", f)


# Target timestamp
target_timestamp = pd.Timestamp("2021-01-01T00:10:00")
timestamp_str = target_timestamp.strftime("%Y-%m-%dT%H-%M-%S")

# Top-level folder
top_folder = "rte7000"

# Output folder for the extracted snapshot files
output_folder = "data"
os.makedirs(output_folder, exist_ok=True)

# Pattern: all parquet files in subfolders containing "_2021-01"
pattern = os.path.join(top_folder, "*", "*_2021-01*.parquet")
parquet_files = glob.glob(pattern)

print(f"Found {len(parquet_files)} matching parquet files")

# Group files by their immediate parent folder (bus, branch, etc.)
files_by_folder = defaultdict(list)

for file in parquet_files:
    folder_name = os.path.basename(os.path.dirname(file))
    files_by_folder[folder_name].append(file)

# Process each folder separately
for folder_name, files in files_by_folder.items():
    print(f"\nProcessing folder: {folder_name}")

    snapshot_list = []

    for file in files:
        print(f"  Reading {file}")

        df = pd.read_parquet(file)
        df['datetime'] = pd.to_datetime(df['datetime'])

        snapshot = df[df['datetime'] == target_timestamp]

        if not snapshot.empty:
            snapshot_list.append(snapshot)

    if snapshot_list:
        snapshot_df = pd.concat(snapshot_list, ignore_index=True)
    else:
        snapshot_df = pd.DataFrame()

    # Save inside data/
    output_file = os.path.join(
        output_folder,
        f"snapshot_{folder_name}_{timestamp_str}.parquet"
    )

    snapshot_df.to_parquet(output_file, index=False)

    print(f"  -> Saved {len(snapshot_df)} rows to {output_file}")

Downloaded 7 parquet files:
   rte7000\branch\branch_2021-01.parquet
   rte7000\bus\bus_2021-01.parquet
   rte7000\gen\gen_2021-01.parquet
   rte7000\load\load_2021-01.parquet
   rte7000\sub\sub_2021-01.parquet
   rte7000\switch\switch_2021-01.parquet
   rte7000\vol\vol_2021-01.parquet
Found 7 matching parquet files

Processing folder: branch
  Reading rte7000\branch\branch_2021-01.parquet
  -> Saved 9518 rows to data\snapshot_branch_2021-01-01T00-10-00.parquet

Processing folder: bus
  Reading rte7000\bus\bus_2021-01.parquet
  -> Saved 6467 rows to data\snapshot_bus_2021-01-01T00-10-00.parquet

Processing folder: gen
  Reading rte7000\gen\gen_2021-01.parquet
  -> Saved 5625 rows to data\snapshot_gen_2021-01-01T00-10-00.parquet

Processing folder: load
  Reading rte7000\load\load_2021-01.parquet
  -> Saved 6876 rows to data\snapshot_load_2021-01-01T00-10-00.parquet

Processing folder: sub
  Reading rte7000\sub\sub_2021-01.parquet
  -> Saved 4811 rows to data\snapshot_sub_2021-01-01T00-
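
The cell above extracts a single `target_timestamp`. A time series needs several timestamps, so the sketch below shows one way to pull out several of them in a single pass over a table. The helper `extract_snapshots` is hypothetical, and the toy table only imitates the `datetime`, `id` and `target_p` columns used in this tutorial; the generator ids are invented.

```python
import pandas as pd


def extract_snapshots(df, timestamps, time_column="datetime"):
    """Split a table into one DataFrame per requested timestamp."""
    df = df.copy()
    df[time_column] = pd.to_datetime(df[time_column])
    wanted = pd.to_datetime(list(timestamps))
    # Keep only the requested rows, then split them by timestamp
    selected = df[df[time_column].isin(wanted)]
    return {ts: rows.reset_index(drop=True) for ts, rows in selected.groupby(time_column)}


# Toy stand-in for a monthly table (invented values, not real data)
toy = pd.DataFrame({
    "datetime": ["2021-01-01T00:00:00", "2021-01-01T00:00:00",
                 "2021-01-01T00:05:00", "2021-01-01T00:10:00"],
    "id": ["GEN_A", "GEN_B", "GEN_A", "GEN_A"],
    "target_p": [0.15, 0.0, 0.15, 0.2],
})

snapshots = extract_snapshots(toy, ["2021-01-01T00:00:00", "2021-01-01T00:10:00"])
for ts, rows in snapshots.items():
    # Same timestamp format as `timestamp_str` in the cell above
    print(ts.strftime("%Y-%m-%dT%H-%M-%S"), len(rows))
```

Each entry of the returned dictionary could then be written out with `to_parquet`, using the same file naming as in the cell above.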

As before, to create a time series of a parameter of an element, we loop over all of the relevant files. We use the pyarrow engine to read these .parquet files into pandas, and index each table by `id` so that the injection data can be matched correctly to the corresponding grid elements.

In [6]:
list_inj_files = ['data/snapshot_gen_2021-01-01T00-00-00.parquet', 'data/snapshot_gen_2021-01-01T00-05-00.parquet', 'data/snapshot_gen_2021-01-01T00-10-00.parquet'] 

p_timeseries = pd.concat(
    [
        pd.read_parquet(f, engine="pyarrow")
          .set_index('id')['target_p']
          .rename(f'Snap{i}: @{i*5}min')
        for i, f in enumerate(list_inj_files)
    ],
    axis=1
)

p_timeseries


id,Snap0: @0min,Snap1: @5min,Snap2: @10min
.CTLO3GROUP.1,0.150,1.500000e-01,1.500000e-01
.CTLO3GROUP.2,0.150,1.500000e-01,1.500000e-01
ARGIAINF,0.000,0.000000e+00,0.000000e+00
ARGOEIN2,0.000,0.000000e+00,0.000000e+00
ARGOEIN3,0.144,1.440000e-01,1.440000e-01
...,...,...,...
YQUELING,0.000,0.000000e+00,0.000000e+00
YVETOINF,0.000,0.000000e+00,0.000000e+00
YVETOING,0.000,0.000000e+00,0.000000e+00
YZEURIN2,0.000,0.000000e+00,0.000000e+00
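
Indexing by `id` matters because pandas aligns on the index rather than on row order. The sketch below illustrates this with toy data: two snapshots list the same generators in a different order, and the result is then reindexed onto a list of grid generator ids. All ids and values here are invented; in practice the ids would presumably come from something like `grid.get_generators().index`, assuming the dataset and the grid files share the same identifiers.

```python
import pandas as pd

# Toy stand-in for the generator ids of the grid (invented ids)
grid_gen_ids = pd.Index(["GEN_A", "GEN_B", "GEN_C"], name="id")

# Two toy snapshots whose rows are deliberately in a different order
snap0 = pd.DataFrame({"id": ["GEN_B", "GEN_A"], "target_p": [0.0, 0.15]})
snap1 = pd.DataFrame({"id": ["GEN_A", "GEN_B"], "target_p": [0.2, 0.0]})

# Same construction as the cell above: one column per snapshot, aligned on `id`
toy_timeseries = pd.concat(
    [
        snap.set_index("id")["target_p"].rename(label)
        for label, snap in [("Snap0", snap0), ("Snap1", snap1)]
    ],
    axis=1,
)

# Reorder the rows to follow the grid; generators without injection data become NaN
aligned = toy_timeseries.reindex(grid_gen_ids)
missing = aligned.index[aligned.isna().all(axis=1)].tolist()

print(aligned)
print("No injection data for:", missing)
```

Checking `missing` before using the time series is a cheap way to spot identifiers that exist in the grid but not in the injection files.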
