# Selecting a subset of time periods

Running a capacity expansion model for multiple regions with 8760 hours of load and generation data can be too computationally expensive. PowerGenome includes a method (with more to come in the future) for selecting a representative subset of periods. Doing so requires first generating all load and generation profiles.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from pathlib import Path
import itertools

import pandas as pd
from powergenome.load_profiles import make_final_load_curves
from powergenome.generators import GeneratorClusters
from powergenome.util import (
    build_scenario_settings,
    init_pudl_connection,
    load_settings,
    reverse_dict_of_lists
)

from powergenome.GenX import reduce_time_domain
from powergenome.external_data import make_generator_variability

## Import settings
This assumes that the settings file is set up for multiple scenarios/planning periods. If you are using a settings file with only a single scenario/planning period, remove or comment out the line with `build_scenario_settings`.

In [3]:
pudl_engine, pudl_out = init_pudl_connection()
cwd = Path.cwd()

settings_path = (
    cwd.parent / "example_system" / "test_settings.yml"
)
settings = load_settings(settings_path)
settings["input_folder"] = settings_path.parent / settings["input_folder"]
scenario_definitions = pd.read_csv(
    settings["input_folder"] / settings["scenario_definitions_fn"]
)
scenario_settings = build_scenario_settings(settings, scenario_definitions)

## Load curves

In [4]:
load_curves = make_final_load_curves(pudl_engine, scenario_settings[2030]["p1"])
load_curves

time_index,CA_N,CA_S,WECC_AZ
1,13668,17427,11038
2,15587,19766,11764
3,15774,20006,11449
4,15288,19398,11061
5,14756,18745,10714
...,...,...,...
8756,15283,16710,11083
8757,14957,16367,10837
8758,14665,16064,10675
8759,14237,15546,10615


## Generation profiles

In [5]:
gc = GeneratorClusters(pudl_engine, pudl_out, scenario_settings[2030]["p1"])
all_gens = gc.create_all_generators()

961.5000000000002  MW without lat/lon
Creating gdf
['Solar Photovoltaic', 'Onshore Wind Turbine', 'Batteries', 'Biomass', 'Conventional Hydroelectric', 'Natural Gas Fired Combined Cycle', 'Other_peaker', 'All Other', 'Natural Gas Fired Combustion Turbine', 'Other Natural Gas']


No model tag values found for DR ('NoneType' object has no attribute 'items')
Selected technology landbasedwind capacity in region CA_N less than minimum (8424.4314 < 25000 MW)
Selected technology landbasedwind capacity in region CA_S less than minimum (23639.682500000003 < 45000 MW)
No model tag values found for DR ('NoneType' object has no attribute 'items')
Transmission investment costs are missing or zero for some resources and will not be included in the total investment costs.


In [6]:
gen_variability = make_generator_variability(all_gens)

## Reduce time domain
The `reduce_time_domain` function selects `N` representative periods of `x` days each from the 8760 hours of the year. It uses the following settings parameters:
- `reduce_time_domain` (a boolean value)
- `time_domain_periods` (`N`)
- `time_domain_days_per_period` (`x`)
- `include_peak_day` (whether the day of peak demand should always be included in the output)
- `demand_weight_factor` (weighting factor for demand relative to generation profiles)

It outputs time-reduced generation profiles and load profiles, along with a dataframe that tracks the sequential order of cluster slots in a year. The sequential order is needed to track long-duration storage across time. The hour weights for each cluster are listed in cluster order in the `Sub_Weights` column of the load profile.
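
To see why the sequential order matters, the sketch below rebuilds a chronological series from a slot-to-cluster mapping. It is a toy illustration with made-up numbers and a hypothetical helper, `rebuild_chronology`, not PowerGenome's implementation. It assumes every slot of the year is represented by exactly one cluster.

```python
def rebuild_chronology(slot_to_cluster, cluster_profiles):
    """Concatenate the representative profile assigned to each slot, in slot order."""
    hours = []
    for slot in sorted(slot_to_cluster):
        hours.extend(cluster_profiles[slot_to_cluster[slot]])
    return hours


# Hypothetical toy data: 2 representative periods of 3 hours each,
# and a "year" made up of 4 sequential slots.
cluster_profiles = {1: [10, 12, 11], 2: [20, 25, 22]}
slot_to_cluster = {1: 2, 2: 1, 3: 1, 4: 2}

full_series = rebuild_chronology(slot_to_cluster, cluster_profiles)
print(full_series)
```

Without the slot order, a model would know only how often each representative period occurs, not which period follows which, so it could not carry a storage state of charge from one period into the next.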

In [7]:
(
    reduced_resource_profile,
    reduced_load_profile,
    long_duration_storage,
) = reduce_time_domain(gen_variability, load_curves, scenario_settings[2030]["p1"])

In [8]:
# Resource profiles are in the same column order as rows in all_gens
reduced_resource_profile

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,51,52,53,54,55,56,57,58,59,60
1,1,0.407363,1,1,1,1,0.4475,0.407363,0.6086,1,...,0.812324,0.708245,1,1,1,1,1,1,0.102265,0.102282
2,1,0.407463,1,1,1,1,0.3886,0.407463,0.5136,1,...,0.650029,0.564608,1,1,1,1,1,1,0.057640,0.057697
3,1,0.407563,1,1,1,1,0.3113,0.407563,0.3523,1,...,0.410820,0.335105,1,1,1,1,1,1,0.036849,0.036848
4,1,0.407663,1,1,1,1,0.3344,0.407663,0.1554,1,...,0.000000,0.032329,1,1,1,1,1,1,0.019270,0.019276
5,1,0.407770,1,1,1,1,0.5849,0.407770,0.0000,1,...,0.000000,0.000000,1,1,1,1,1,1,0.022819,0.022817
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
476,1,0.376590,1,1,1,1,0.0005,0.376590,0.6647,1,...,0.802279,0.855651,1,1,1,1,1,1,0.634888,0.634933
477,1,0.376590,1,1,1,1,0.0008,0.376590,0.6729,1,...,0.715021,0.855857,1,1,1,1,1,1,0.615957,0.616050
478,1,0.376590,1,1,1,1,0.0342,0.376590,0.7098,1,...,0.657944,0.880527,1,1,1,1,1,1,0.529412,0.529504
479,1,0.376590,1,1,1,1,0.0737,0.376590,0.6530,1,...,0.643267,0.804631,1,1,1,1,1,1,0.444557,0.444532


In [9]:
# The load profile is formatted for GenX; drop any columns you don't need.
# Label each hour with its cluster to match long_duration_storage below.
case_settings = scenario_settings[2030]["p1"]  # same settings passed to reduce_time_domain
hours_per_cluster = case_settings["time_domain_days_per_period"] * 24
cluster_labels = [
    [n] * hours_per_cluster
    for n in range(1, case_settings["time_domain_periods"] + 1)
]
reduced_load_profile["cluster"] = list(itertools.chain.from_iterable(cluster_labels))
reduced_load_profile

Unnamed: 0,Voll,Demand_segment,Cost_of_demand_curtailment_perMW,Max_demand_curtailment,$/MWh,Subperiods,Hours_per_period,Sub_Weights,Time_index,CA_N,CA_S,WECC_AZ,cluster
0,9000.0,1.0,1.000,1.000,9000.0,4.0,120.0,3000.0,1,19865.0,21116.0,13342.0,1
1,,2.0,0.067,0.075,603.0,,,3000.0,2,19719.0,20946.0,13300.0,1
2,,,,,,,,2640.0,3,19278.0,20516.0,13184.0,1
3,,,,,,,,120.0,4,18258.0,19517.0,13351.0,1
4,,,,,,,,,5,17150.0,18288.0,12985.0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...
475,,,,,,,,,476,24884.0,28142.0,20787.0,4
476,,,,,,,,,477,26090.0,29347.0,21592.0,4
477,,,,,,,,,478,27270.0,30508.0,22184.0,4
478,,,,,,,,,479,28180.0,31294.0,22650.0,4
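
As a quick sanity check on the output above, the sketch below uses the `Sub_Weights` and `Hours_per_period` values copied from the table. It assumes each weight counts the hours of the original year represented by that cluster, so the weights should sum to 8760. The variable names are illustrative only.

```python
HOURS_PER_YEAR = 8760
hours_per_period = 120  # Hours_per_period from the first row of the table
sub_weights = [3000, 3000, 2640, 120]  # Sub_Weights for clusters 1 through 4

# The weights should account for every hour of the year.
assert sum(sub_weights) == HOURS_PER_YEAR

# Number of original periods (slots) that each cluster stands in for.
periods_represented = [w // hours_per_period for w in sub_weights]
total_slots = sum(periods_represented)

print(periods_represented)  # [25, 25, 22, 1]
print(total_slots)  # 73
```

Under that assumption, the year splits into 73 slots of 120 hours each, and the fourth cluster represents only a single slot.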


In [10]:
long_duration_storage.head(50)

Unnamed: 0,slot,cluster
25,1,2
50,2,3
51,3,3
52,4,3
53,5,3
54,6,3
55,7,3
56,8,3
57,9,3
58,10,3
