# Group existing generators within regions and create new resources

This notebook shows how to use the PowerGenome API to cluster existing resources and create new-build resource options by model region. You'll need a settings file, such as the `test_settings.yml` provided in the `example_systems/CA_AZ` folder, with the following parameters.

For both existing and new resources:
- model_regions
- region_aggregations
- model_year
- target_usd_year
- atb_usd_year
- startup_fuel_use
- startup_vom_costs_mw
- startup_vom_costs_usd_year
- startup_costs_type
- startup_costs_per_cold_start_mw
- startup_costs_per_cold_start_usd_year
- existing_startup_costs_tech_map
- new_build_startup_costs

Specific to existing resources:
- num_clusters
- retirement_ages
- atb_existing_year
- existing_om_muiltiplier
- eia_atb_tech_map
- proposed_status_included
- proposed_gen_heat_rates
- proposed_min_load

Specific to new-build resources:
- atb_cost_case
- atb_financial_case
- atb_cap_recovery_years
- atb_new_gen
- renewables_clusters
- cost_multiplier_region_map
- cost_multiplier_technology_map
- transmission_investment_cost (if spur-lines are needed for non-renewable resources)

To calculate fuel costs for each resource:
- aeo_fuel_region_map
- eia_series_region_names
- eia_series_fuel_names
- eia_aeo_year
- eia_series_scenario_names
- aeo_fuel_scenarios
- aeo_fuel_usd_year
- tech_fuel_map
- fuel_emission_factors
- carbon_tax (optional)

And if CCS resources are included:
- ccs_fuel_map
- ccs_capture_rate
- ccs_disposal_cost

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from pathlib import Path

import geopandas as gpd
import numpy as np
import pandas as pd
from powergenome.generators import GeneratorClusters
from powergenome.GenX import reduce_time_domain
from powergenome.load_profiles import make_final_load_curves
from powergenome.params import DATA_PATHS
from powergenome.util import (
    build_scenario_settings,
    init_pudl_connection,
    load_settings,
    check_settings
)
from powergenome.external_data import (
    make_demand_response_profiles,
    make_generator_variability,
)

pd.options.display.max_columns = 200

## Import settings and create database connections
This assumes that the settings file is set up for multiple scenarios/planning periods. If you are using a settings file with only a single scenario/planning period, remove or comment out the line with `build_scenario_settings`.

Once the settings file is loaded, use the `data_years` parameter to limit the amount of data processed in `pudl_out`. Limiting the data years will reduce run time.

Finally, check the settings for some common user errors. These currently include:
- Are all aggregated regions in `region_aggregations` valid IPM regions?
- Are all model regions included in the parameters `cost_multiplier_region_map` and `aeo_fuel_region_map`?
- Are any column names included more than once in `generator_columns`?
- The AEO reference scenario names for fuel cost and demand growth are of the form `REF<AEO year>`. Does the AEO year match the parameter `eia_aeo_year`?
- Are the technologies in `atb_new_gen` all valid names?

Most of these catch simple mistakes like misspelled region/technology names. If you would like any additional checks included, [submit an issue](https://github.com/PowerGenome/PowerGenome/issues) or make the changes yourself and submit a pull request.
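
To give a sense of what one of these checks involves, here is a minimal sketch of finding repeated names in a list such as `generator_columns`. It is an illustration only and does not reproduce the code inside `check_settings`; the helper `duplicated_names` and the example list are hypothetical.

```python
from collections import Counter


def duplicated_names(names):
    """Return the names that appear more than once, in first-seen order."""
    counts = Counter(names)
    return [name for name, n in counts.items() if n > 1]


# Hypothetical `generator_columns` list with one repeated entry.
generator_columns = ["region", "Resource", "Cap_size", "Fuel", "Cap_size"]
duplicates = duplicated_names(generator_columns)
print(duplicates)
```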

In [3]:
cwd = Path.cwd()

settings_path = (
    cwd.parent / "example_systems" / "CA_AZ" / "test_settings.yml"
)
settings = load_settings(settings_path)
settings["input_folder"] = settings_path.parent / settings["input_folder"]
scenario_definitions = pd.read_csv(
    settings["input_folder"] / settings["scenario_definitions_fn"]
)
scenario_settings = build_scenario_settings(settings, scenario_definitions)

pudl_engine, pudl_out, pg_engine = init_pudl_connection(
    freq="AS",
    start_year=min(settings.get("data_years")),
    end_year=max(settings.get("data_years")),
)

check_settings(settings, pg_engine)

## Initialize a `GeneratorClusters` object

`GeneratorClusters` is how existing generators are clustered, and it provides a convenience method for creating new-build resources (`GeneratorClusters.create_new_generators`).

In [7]:
gc = GeneratorClusters(pudl_engine, pudl_out, pg_engine, scenario_settings[2045]["p1"])

794.1999999999999  MW without lat/lon


### Existing generators

Let's look at how existing generators have been clustered in the different regions. The raw output from `create_region_technology_clusters` has lots of extra columns that might not be needed. Some of them include:

- `resource` is a snake case version of `technology`.
- `unmodified_cap_size` is the average size of generating units. `Cap_size` is `unmodified_cap_size` multiplied by the capacity factor for technologies that are derated by their CF.
- `Existing_Cap_MW` is `Cap_size` multiplied by `num_units`.
- `heat_rate_mmbtu_mwh_iqr` and `heat_rate_mmbtu_mwh_std` are measurements of how widely the heat rate varies within a cluster. If these values are large compared to `Heat_Rate_MMBTU_per_MWh` then you might consider increasing the number of clusters for that technology/region.
- `Fuel` is a combination of the region (assigned by the settings parameter `aeo_fuel_region_map`), the AEO scenario (assigned by the settings parameter `aeo_fuel_scenarios`), and the fuel type for that generator resource (assigned by the settings parameters `tech_fuel_map` and `ccs_fuel_map`). Only fuels in AEO are included, which means that biomass, hydrogen, RNG, etc., are not options.
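
The capacity relationships in this list can be checked by hand. The sketch below uses values copied from the CA_N Biomass row of the `existing_gen` output shown later in this notebook. Because the displayed values are rounded, the comparison uses a small tolerance (an assumption of this example, not something PowerGenome does).

```python
import math

# Values copied from the CA_N Biomass row of the displayed output.
unmodified_cap_size = 3.007  # average unit size (MW)
capacity_factor = 0.364
num_units = 56
reported_cap_size = 1.094
reported_existing_cap_mw = 61.264

# Cap_size is the unit size derated by the capacity factor.
derated_size = unmodified_cap_size * capacity_factor
# Existing_Cap_MW is Cap_size multiplied by the number of units.
total_capacity = reported_cap_size * num_units

# Displayed values are rounded, so compare with a tolerance.
size_matches = math.isclose(derated_size, reported_cap_size, abs_tol=0.005)
total_matches = math.isclose(total_capacity, reported_existing_cap_mw, abs_tol=0.005)
print(size_matches, total_matches)

# Heat rate spread relative to the cluster heat rate.
heat_rate = 16.157
heat_rate_iqr = 1.483
relative_spread = heat_rate_iqr / heat_rate
print(round(relative_spread, 3))
```

A relative spread of this kind is one way to judge whether a technology/region might benefit from more clusters; what counts as "large" is left to the user.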

The various "model tags" that are included in the settings file are unique to GenX. If you find them useful, they can be used to assign various values of any data type to generators in a column name of your choosing. If you don't want to use them, they can be ignored or removed via `settings.pop(<name>)`.

In [8]:
existing_gen = gc.create_region_technology_clusters()

Technology Conventional Hydroelectric changed capacity from 12699.700000000008 to 12704.200000000008


Creating gdf
['Solar Photovoltaic', 'Onshore Wind Turbine', 'Batteries', 'Biomass', 'Conventional Hydroelectric', 'Natural Gas Fired Combined Cycle', 'Other_peaker', 'All Other', 'Natural Gas Fired Combustion Turbine', 'Other Natural Gas']


No model tag values found for MinCapTag_2 ('MinCapTag_2')
No model tag values found for Reg_Max ('Reg_Max')
No model tag values found for Rsv_Max ('Rsv_Max')


In [9]:
existing_gen

Unnamed: 0,region,technology,cluster,index,Cap_size,minimum_load_mw,Heat_Rate_MMBTU_per_MWh,Fixed_OM_Cost_per_MWyr,Var_OM_Cost_per_MWh,heat_rate_mmbtu_mwh_iqr,heat_rate_mmbtu_mwh_std,fixed_o_m_mw_std,Min_Power,num_units,plant_id_eia,unit_id_pudl,capacity_factor,unmodified_cap_size,Existing_Cap_MW,unmodified_existing_cap_mw,Start_fuel_MMBTU_per_MW,Fuel,Start_Cost_per_MW,THERM,VRE,Num_VRE_Bins,MUST_RUN,STOR,FLEX,HYDRO,Commit,ESR_1,ESR_2,New_Build,Hydro_level,CapRes_1,CapRes_2,Hydro_Energy_to_Power_Ratio,MinCapTag_1,MinCapTag_2,Reg_Max,Rsv_Max,Resource,profile
0,CA_N,Biomass,1,0,1.094,1.064,16.157,122976.0,5.655,1.483,4.067,0.0,0.354,56,"[10748, 10748, 10748, 50112, 54517, 56080, 560...","[10748_U1C06, 10748_U3J16, 10748_U4J16, 50112_...",0.364,3.007,61.264,168.392,0.0,,0.0,0,0,0,1,0,0,0,1,1,1,-1,0.0,0.9,0.9,0.0,0,0,0,0,biomass,
1,CA_N,Conventional Hydroelectric,1,1,62.171,3.74,9.104,44560.0,0.0,0.0,0.01,0.0,0.06,117,"[217, 218, 218, 219, 220, 220, 221, 222, 222, ...","[217_1, 218_2, 218_3, 219_1, 220_H1, 220_H2, 2...",,62.171,7274.007,7274.007,0.0,,0.0,0,0,0,0,0,0,1,0,0,1,-1,0.5,0.8,0.8,158.730159,0,0,0,0,conventional_hydroelectric,"[0.21514035275363813, 0.2148970782125778, 0.21..."
2,CA_N,Geothermal,1,2,27.477,14.878,9.104,198040.0,0.0,0.0,0.0,0.0,0.389,23,"[286, 286, 286, 286, 286, 286, 286, 286, 286, ...","[286_U11, 286_U12, 286_U13, 286_U14, 286_U16, ...",0.718,38.278,631.971,880.394,0.0,,0.0,0,0,0,1,0,0,0,0,1,1,-1,0.0,0.9,0.9,0.0,0,0,0,0,geothermal,
3,CA_N,Hydroelectric Pumped Storage,1,3,89.55,23.85,0.0,38460.0,0.0,0.0,0.0,0.0,0.266,20,"[104, 437, 437, 437, 446, 446, 446, 446, 446, ...","[104_1, 437_2, 437_4, 437_6, 446_1, 446_2, 446...",-0.022,89.55,1791.0,1791.0,0.0,,0.0,0,0,0,0,1,0,0,0,0,0,0,0.0,0.95,0.95,0.0,0,0,0,0,hydroelectric_pumped_storage,
4,CA_N,Natural Gas Fired Combined Cycle,1,4,274.485,138.315,7.361,11243.353,3.697,1.149,2.85,3050.441,0.504,13,"[7307, 55748, 55933, 55970, 56078, 56298, 5629...","[1.0, 1.0, 1.0, 1.0, 1.0, 56298_0001, 56298_00...",0.396,274.485,3568.305,3568.305,0.0,pacific_reference_naturalgas,87.209891,1,0,0,0,0,0,0,1,0,0,1,0.0,0.9,0.9,0.0,0,0,0,0,natural_gas_fired_combined_cycle,
5,CA_N,Natural Gas Fired Combustion Turbine,1,5,86.648,43.714,10.865,9960.122,4.409,0.879,1.449,1220.348,0.505,21,"[7315, 7315, 7315, 50064, 56135, 56135, 56639,...","[7315_2, 7315_3, 7315_4, 50064_003, 56135_1, 5...",0.144,86.648,1819.608,1819.608,0.0,pacific_reference_naturalgas,113.079725,1,0,0,0,0,0,0,1,0,0,1,0.0,0.9,0.9,0.0,0,0,0,0,natural_gas_fired_combustion_turbine,
6,CA_N,Onshore Wind Turbine,1,6,2.5,0.033,9.183,43205.0,0.0,0.053,0.061,0.0,0.013,3,"[61067, 62654, 62654]","[61067_WTG1, 62654_WTG1, 62654_WTG2]",0.289,2.5,1327.6,7.5,0.0,,0.0,0,1,1,0,0,0,0,0,1,1,0,0.0,0.8,0.8,0.0,0,0,0,0,onshore_wind_turbine,"[0.0152, 0.0066, 0.0403, 0.1088, 0.059, 0.0401..."
7,CA_N,Small Hydroelectric,1,7,1.769,0.468,9.104,44560.0,0.0,0.0,0.003,0.0,0.085,133,"[34, 161, 161, 161, 162, 162, 180, 214, 215, 2...","[34_1P, 161_1, 161_2, 161_3, 162_1, 162_2, 180...",0.323,5.484,235.277,729.372,0.0,,0.0,0,0,0,1,0,0,0,0,1,1,-1,0.0,0.8,0.8,0.0,0,0,0,0,small_hydroelectric,"[0.21514035275363813, 0.2148970782125778, 0.21..."
8,CA_N,Solar Photovoltaic,1,8,17.275,0.0,9.122,18760.0,0.0,0.0,0.025,0.0,0.0,190,"[56768, 56813, 56813, 56875, 56909, 56909, 569...","[56768_CR1, 56813_1, 56813_2, 56875_TBD, 56909...",0.236,17.275,2787.1,3282.25,0.0,,0.0,0,1,1,0,0,0,0,0,1,1,0,0.0,0.8,0.8,0.0,0,0,0,0,solar_photovoltaic,"[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
9,CA_S,Biomass,1,9,1.457,1.535,11.354,122976.0,3.974,1.859,2.554,0.0,0.51,46,"[10387, 10387, 52204, 52204, 56898, 56898, 571...","[10387_1A, 10387_2A, 52204_OTA5, 52204_OTA6, 5...",0.484,3.011,67.022,138.506,0.0,,0.0,0,0,0,1,0,0,0,1,1,1,-1,0.0,0.9,0.9,0.0,0,0,0,0,biomass,


### New generators

New generators are based on data from NREL's Annual Technology Baseline (ATB). ATB uses a format of `<technology>_<tech_detail>` to describe resources, such as `NaturalGas_CTAvgCF`. There are additional parameters including the cost case and financial case that are needed to specify the capex for a specific generator type. Note that many of the `tech_detail` values have identical capex and O&M values, so it doesn't matter which one is used. Examples include the capacity factor of combustion resources (`CTAvgCF` vs `CTHighCF`) and UtilityPV (`Chicago`, `KansasCity`, and `LosAngeles` have different capacity factors that we don't use in PowerGenome).

The raw output from `create_new_generators` has many more columns than existing generators. Several of these, such as `basis_year`, `capex_mw`, `capex_mwh`, `cap_recovery_years`, `waccnomtech`, `regional_cost_multiplier`, and `interconnect_annuity`, are used to calculate the final `Inv_Cost_per_MWyr` and `Inv_Cost_per_MWhyr`. The `lcoe` column is pre-calculated using 2030 mid-range ATB costs and is not specific to the model year for each run. The underlying data are retained so they can be easily reviewed; keep only the columns that you want or need.

The column `profile` has array values with the hourly generation profile for a resource. If these profiles represent 2012 they have 8784 hourly values.
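
The 8784 figure follows from 2012 being a leap year (366 days of 24 hours). A short sketch of that arithmetic, using a hypothetical helper `hours_in_year`:

```python
import calendar


def hours_in_year(year):
    """Number of hourly values needed to cover a full calendar year."""
    days = 366 if calendar.isleap(year) else 365
    return days * 24


print(hours_in_year(2012), hours_in_year(2013))  # 8784 8760
```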

The column `Max_Cap_MW` has a value of -1 if there is no limit on the capacity. 

If the user has included demand response profiles as an external file (using the `demand_response_fn` parameter in settings and the associated `demand_response` and `demand_response_resources` parameters), demand response resources will be included.

### Modified and custom new generators

By default, new generators are created using data from NREL ATB. But it is possible to modify an existing ATB generator type or create a modified copy of an ATB generator.

Modified ATB generators are included in the `atb_modifiers` parameter. In ATB 2019, NREL used a blend of combustion turbine and combined cycle types that we felt was unrealistic. Therefore we have included, by default, the following modifications. The modifying values are provided as `[<op>, <value>]`, where `op` is one of the operators `add`, `mul`, `sub`, or `truediv`. Again, this parameter modifies an existing technology in-place.

```
atb_modifiers:
  ngct:
    technology: NaturalGas
    tech_detail: CTAvgCF
    capex: [mul, 0.76]
    Var_OM_cost_per_MWh: [mul, 1.51]
    Fixed_OM_cost_per_MWyr: [mul, 0.56]
    Heat_rate_MMBTU_per_MWh: [mul, 0.97]
  ngcc:
    technology: NaturalGas
    tech_detail: CCAvgCF
    capex: [mul, 0.89]
    Var_OM_cost_per_MWh: [mul, 0.73]
    Fixed_OM_cost_per_MWyr: [mul, 0.95]
    Heat_rate_MMBTU_per_MWh: [mul, 0.98]
```
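
The operator names `add`, `mul`, `sub`, and `truediv` match functions in Python's standard `operator` module, so one way to picture how an `[<op>, <value>]` pair acts on a base value is the sketch below. This is an illustration under that assumption, not PowerGenome's implementation; the helper `apply_modifier` and the base values are hypothetical.

```python
import operator

ALLOWED_OPS = {"add", "mul", "sub", "truediv"}


def apply_modifier(base_value, modifier):
    """Apply an [op, value] pair to a base value using the operator module."""
    op_name, value = modifier
    if op_name not in ALLOWED_OPS:
        raise ValueError(f"Unsupported operator: {op_name}")
    return getattr(operator, op_name)(base_value, value)


# Hypothetical base values for a combustion turbine (not ATB data).
base = {"capex": 900_000.0, "Heat_rate_MMBTU_per_MWh": 10.0}

# Modifiers in the same [op, value] form as the settings above.
modifiers = {"capex": ["mul", 0.76], "Heat_rate_MMBTU_per_MWh": ["mul", 0.97]}

modified = {key: apply_modifier(base[key], modifiers[key]) for key in base}
print(modified)
```

Restricting the operator names to a known set avoids calling arbitrary attributes of the `operator` module if a settings file contains a typo.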

New generators based on modifications of an existing ATB generator are included in the parameter `modified_atb_new_gen`. The example below creates an NGCC resource called `NaturalGas_CCS100_Mid` with 100% CO₂ capture. Any new resources need to be added to the parameters `cost_multiplier_technology_map`, `new_build_startup_costs`, and `model_tag_values` (optional). Names are linked using string matching, so the full name doesn't need to be included in these parameters (but it should be long enough to be unique).

```
modified_atb_new_gen:
  NGCCS100:
    new_technology: NaturalGas
    new_tech_detail: CCS100
    new_cost_case: Mid
    atb_technology: NaturalGas
    atb_tech_detail: CCCCSAvgCF
    atb_cost_case: Mid
    size_mw: 500
    capex: [add, 116000]
    heat_rate: [add, 0.365]
    o_m_fixed_mw: [add, 9670]
    o_m_variable_mwh: [mul, 1.076]
```
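
The sketch below illustrates why partial names used for string matching should be long enough to be unique. It assumes the resource name is the three `new_*` fields joined with underscores (consistent with `NaturalGas_CCS100_Mid` above) and that matching is a case-insensitive substring test. Both are assumptions for illustration, and the helpers and candidate keys are hypothetical.

```python
def new_resource_name(entry):
    """Join the new_* fields into a <technology>_<tech_detail>_<cost_case> name."""
    return "_".join(
        [entry["new_technology"], entry["new_tech_detail"], entry["new_cost_case"]]
    )


def matching_keys(resource_name, keys):
    """Return every key that is contained in the resource name."""
    return [key for key in keys if key.lower() in resource_name.lower()]


entry = {
    "new_technology": "NaturalGas",
    "new_tech_detail": "CCS100",
    "new_cost_case": "Mid",
}
name = new_resource_name(entry)

# Hypothetical partial names that might appear as keys in a settings map.
candidate_keys = ["NaturalGas_CC", "NaturalGas_CCS100", "NaturalGas_CT"]
matches = matching_keys(name, candidate_keys)
print(name, matches)
```

The short key `NaturalGas_CC` matches the new CCS resource as well as ordinary combined cycle names, which is the kind of ambiguity a longer key avoids.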

### New wind/solar resources

In the settings, users can specify the type of resource, minimum capacity, number of clusters, and maximum LCOE (optional) in a model region. With these parameters, PowerGenome combines pre-clustered groups of resources, calculating the total capacity and a weighted generation profile for each resource. Utility PV capacity in these clusters is calculated at 45 MW/km^2, and we recommend that users deflate this value (`cap_multiplier`). For offshore wind, specify fixed/floating (`turbine_type`) and if the sites are part of BOEM lease areas (`pref_site`).

```
renewables_clusters:
  - region: CA_N
    technology: landbasedwind
    max_clusters: 2
    max_lcoe: 110
    min_capacity: 25000
  - region: CA_N
    technology: offshorewind
    turbine_type: floating
    pref_site: 1
    max_clusters: 3
    min_capacity: 40000
  - region: CA_S
    technology: landbasedwind
    max_clusters: 4
    max_lcoe: 100
    min_capacity: 45000
  - region: CA_S
    technology: utilitypv
    max_clusters: 5
    max_lcoe: 75
    min_capacity: 100000
    cap_multiplier: 0.2
  - region: WECC_AZ
    technology: utilitypv
    max_clusters: 3
    min_capacity: 100000
    cap_multiplier: 0.2
```
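
To make "total capacity and a weighted generation profile" concrete, here is a minimal sketch that combines two sites into one cluster. It assumes the profile is weighted by site capacity and that `cap_multiplier` simply scales the combined capacity; the helper `combine_sites` and the numbers are hypothetical and are not taken from PowerGenome.

```python
def combine_sites(capacities_mw, profiles, cap_multiplier=1.0):
    """Combine sites into one cluster.

    Returns the (deflated) total capacity and the capacity-weighted
    average of the hourly profiles.
    """
    total = sum(capacities_mw)
    n_hours = len(profiles[0])
    weighted_profile = [
        sum(cap * prof[hour] for cap, prof in zip(capacities_mw, profiles)) / total
        for hour in range(n_hours)
    ]
    return total * cap_multiplier, weighted_profile


# Two hypothetical sites with three-hour profiles.
capacities = [100.0, 300.0]
profiles = [
    [0.2, 0.4, 0.0],
    [0.6, 0.8, 0.4],
]

cluster_capacity, cluster_profile = combine_sites(
    capacities, profiles, cap_multiplier=0.2
)
print(cluster_capacity, cluster_profile)
```

The larger site dominates the combined profile, and the multiplier reduces 400 MW of raw capacity to roughly 80 MW without changing the profile shape.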

**NOTE:** Wind/solar resources have a value in the `lcoe` column. This is a pre-computed LCOE that assumes 2030 mid-range capex values from ATB 2019, and is not specific to the settings file you are using. This LCOE is used to cluster project areas using a hierarchical clustering method.

### Interconnection costs
Wind and solar resources have pre-calculated interconnection costs, so the user doesn't need to supply any additional data. But if a spur line is needed for thermal resources, these distances should be included under the column `spur_miles` in the file specified by `capacity_limit_spur_fn`. This spur line distance is then multiplied by costs for each model region listed in the `transmission_investment_cost` parameter. Spur line capex in the example settings file is from ReEDS documentation. You can look at [a mapping of IPM regions to ReEDS regions](https://github.com/gschivley/pg_misc/blob/master/create_clusters/site_interconnection_costs.py#L32-L155) and the associated transmission costs that I have compiled.
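
As a rough illustration of how a spur line distance can turn into an annual cost per MW, the sketch below multiplies the distance by a per-MW-mile capex and annualizes it with a standard capital recovery factor. How PowerGenome annualizes these costs is not described in this notebook, so the formula, the function names, and every input value here are assumptions.

```python
def capital_recovery_factor(wacc, years):
    """Standard capital recovery factor for a given discount rate and lifetime."""
    growth = (1 + wacc) ** years
    return wacc * growth / (growth - 1)


def spur_line_annuity(spur_miles, capex_usd_per_mw_mile, wacc, years):
    """Annualized spur line cost in $/MW-yr (illustrative only)."""
    capex_per_mw = spur_miles * capex_usd_per_mw_mile
    return capex_per_mw * capital_recovery_factor(wacc, years)


# Hypothetical inputs: 20 miles, $10,000/MW-mile, 5% WACC, 30-year recovery.
crf = capital_recovery_factor(0.05, 30)
annuity = spur_line_annuity(20, 10_000, 0.05, 30)
print(round(crf, 5), round(annuity, 1))
```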

In [10]:
new_gen = gc.create_new_generators()

Selected technology landbasedwind capacity in region CA_N less than minimum (8424.4314 < 25000 MW)
Selected technology landbasedwind capacity in region CA_S less than minimum (23639.682500000003 < 45000 MW)
No model tag values found for MinCapTag_2 ('MinCapTag_2')
No model tag values found for Reg_Max ('Reg_Max')
No model tag values found for Rsv_Max ('Rsv_Max')
Transmission investment costs are missing or zero for some resources and will not be included in the total investment costs.


In [11]:
new_gen.columns

Index(['technology', 'basis_year', 'Fixed_OM_Cost_per_MWyr',
       'Fixed_OM_Cost_per_MWhyr', 'Var_OM_Cost_per_MWh', 'capex_mw',
       'capex_mwh', 'Inv_Cost_per_MWyr', 'Inv_Cost_per_MWhyr',
       'Heat_Rate_MMBTU_per_MWh',
       ...
       'Flexible_Demand_Energy_Eff', 'Ramp_Up_Percentage',
       'Ramp_Dn_Percentage', 'Up_Time', 'Down_Time', 'NACC_Eff',
       'NACC_Peak_to_Base', 'Reg_Cost', 'Rsv_Cost', 'Resource'],
      dtype='object', length=106)

#### Spur line/interconnection distances and capacity limits
Interconnection distances and the maximum available capacity for UtilityPV, LandbasedWind, and OffshoreWind are all included in the data used to select and combine clusters. Spur line distances for other resources must be provided by the user in a CSV file and included in the settings file as `capacity_limit_spur_fn`.

In [12]:
cols = [
    "region",
    "technology",
    "cluster",
    "Max_Cap_MW",
    "lcoe",
    "capex_mw",
    "regional_cost_multiplier",
    "Inv_Cost_per_MWyr",
    "plant_inv_cost_mwyr",
    "Start_Cost_per_MW",
    "interconnect_annuity",
    "spur_inv_mwyr",
    "spur_miles",
    "offshore_spur_inv_mwyr",
    "tx_inv_mwyr",
    "profile",
]
new_gen[cols]

Unnamed: 0,region,technology,cluster,Max_Cap_MW,lcoe,capex_mw,regional_cost_multiplier,Inv_Cost_per_MWyr,plant_inv_cost_mwyr,Start_Cost_per_MW,interconnect_annuity,spur_inv_mwyr,spur_miles,offshore_spur_inv_mwyr,tx_inv_mwyr,profile
0,CA_N,NaturalGas_CCCCSAvgCF_Mid,0.0,-1.0,0.0,2145350.0,1.090697,175864.910407,163125.0,95.033926,0.0,12739.910407,20.0,0.0,0.0,0
1,CA_N,NaturalGas_CCAvgCF_Mid,0.0,-1.0,0.0,830173.7,1.339623,95861.0,95861.0,95.033926,0.0,0.0,0.0,0.0,0.0,0
2,CA_N,NaturalGas_CTAvgCF_Mid,0.0,-1.0,0.0,646953.0,1.211268,67546.0,67546.0,123.224672,0.0,0.0,0.0,0.0,0.0,0
3,CA_N,Battery_*_Mid,0.0,-1.0,0.0,128342.2,1.041215,10876.0,10876.0,0.0,0.0,0.0,0.0,0.0,0.0,0
4,CA_N,Nuclear_mid__,0.0,-1.0,0.0,4815000.0,1.249179,504976.776018,473127.0,255.782928,0.0,31849.776018,50.0,0.0,0.0,0
5,CA_N,NaturalGas_CCS100_Mid,0.0,-1.0,0.0,2261350.0,1.090697,184684.910407,171945.0,95.033926,0.0,12739.910407,20.0,0.0,0.0,0
6,CA_N,LandbasedWind_LTRG1_Mid_110,1.0,1259.55,109.3174,1161777.0,2.109174,202819.4931,164722.0,0.0,38097.4931,7928.683242,12.447,0.0,0.0,"[0.33980817, 0.43698293, 0.46549338, 0.6501417..."
7,CA_N,LandbasedWind_LTRG1_Mid_110,2.0,7164.8814,101.403571,1161777.0,2.109174,201407.661403,164722.0,0.0,36685.661403,19510.610728,30.629118,0.0,0.0,"[0.06773967, 0.07600677, 0.095823325, 0.154638..."
8,CA_N,OffShoreWind_OTRG10_Mid_floating_1,1.0,18920.5433,161.759154,2448976.0,1.308483,683766.25509,220590.0,0.0,463176.25509,138210.374951,216.972287,0.0,0.0,"[0.02386371, 0.051461402, 0.079115495, 0.18678..."
9,CA_N,OffShoreWind_OTRG10_Mid_floating_1,2.0,15748.0,172.627669,2448976.0,1.308483,600024.037885,220590.0,0.0,379434.037885,45103.736382,70.806991,0.0,0.0,"[0.5650235, 0.5299122, 0.2336824, 0.15720288, ..."
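
The cost columns in the rows above can be cross-checked by hand. In the displayed rows, `Inv_Cost_per_MWyr` equals `plant_inv_cost_mwyr` plus `interconnect_annuity` where that value is nonzero (wind), and plus `spur_inv_mwyr` otherwise (thermal). The sketch below verifies this for three rows copied from the output. It is an observation about these rows, not a statement of PowerGenome's internal formula.

```python
import math

# (technology, Inv_Cost_per_MWyr, plant_inv_cost_mwyr,
#  interconnect_annuity, spur_inv_mwyr, spur_miles), copied from the table.
rows = [
    ("NaturalGas_CCCCSAvgCF_Mid", 175864.910407, 163125.0, 0.0, 12739.910407, 20.0),
    ("Nuclear_mid__", 504976.776018, 473127.0, 0.0, 31849.776018, 50.0),
    ("LandbasedWind_LTRG1_Mid_110", 202819.4931, 164722.0, 38097.4931, 7928.683242, 12.447),
]

checks = []
spur_cost_per_mile = []
for tech, inv, plant, interconnect, spur, miles in rows:
    # Use the interconnection annuity when present, otherwise the spur cost.
    added_cost = interconnect if interconnect > 0 else spur
    checks.append(math.isclose(inv, plant + added_cost, abs_tol=0.01))
    spur_cost_per_mile.append(spur / miles)

print(checks)
print([round(value, 1) for value in spur_cost_per_mile])
```

All three rows imply a similar spur cost per mile, which is consistent with a single regional cost being applied to each distance.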


#### Generation profiles

Hourly generation profiles are saved in the `profile` column of the dataframe. These are then extracted using the function `make_generator_variability`. The columns of the variability (generation profile) dataframe are in the same order as the rows of the generator dataframe.

In [13]:
existing_variability = make_generator_variability(existing_gen)
existing_variability

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25
0,1.0,0.215140,1.0,1.0,1.0,1.0,0.0152,0.215140,0.0000,1.0,0.104891,1.0,1.0,1.0,1.0,0.431704,0.104891,0.000000,1.0,0.2601,1.0,1.0,1.0,1.0,1.0,0.0000
1,1.0,0.214897,1.0,1.0,1.0,1.0,0.0066,0.214897,0.0000,1.0,0.104837,1.0,1.0,1.0,1.0,0.571841,0.104837,0.000000,1.0,0.2603,1.0,1.0,1.0,1.0,1.0,0.0000
2,1.0,0.214654,1.0,1.0,1.0,1.0,0.0403,0.214654,0.0000,1.0,0.104783,1.0,1.0,1.0,1.0,0.530605,0.104783,0.000000,1.0,0.2606,1.0,1.0,1.0,1.0,1.0,0.0000
3,1.0,0.214411,1.0,1.0,1.0,1.0,0.1088,0.214411,0.0000,1.0,0.104728,1.0,1.0,1.0,1.0,0.484157,0.104728,0.000000,1.0,0.2608,1.0,1.0,1.0,1.0,1.0,0.0000
4,1.0,0.214167,1.0,1.0,1.0,1.0,0.0590,0.214167,0.0000,1.0,0.104620,1.0,1.0,1.0,1.0,0.589168,0.104620,0.000000,1.0,0.2610,1.0,1.0,1.0,1.0,1.0,0.0000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8755,1.0,0.215936,1.0,1.0,1.0,1.0,0.0004,0.215936,0.2899,1.0,0.232444,1.0,1.0,1.0,1.0,0.009415,0.232444,0.278583,1.0,0.2591,1.0,1.0,1.0,1.0,1.0,0.2031
8756,1.0,0.215179,1.0,1.0,1.0,1.0,0.0293,0.215179,0.0000,1.0,0.232256,1.0,1.0,1.0,1.0,0.022796,0.232256,0.000000,1.0,0.2593,1.0,1.0,1.0,1.0,1.0,0.0000
8757,1.0,0.214423,1.0,1.0,1.0,1.0,0.0617,0.214423,0.0000,1.0,0.232069,1.0,1.0,1.0,1.0,0.053236,0.232069,0.000000,1.0,0.2595,1.0,1.0,1.0,1.0,1.0,0.0000
8758,1.0,0.213666,1.0,1.0,1.0,1.0,0.0811,0.213666,0.0000,1.0,0.231880,1.0,1.0,1.0,1.0,0.196757,0.231880,0.000000,1.0,0.2597,1.0,1.0,1.0,1.0,1.0,0.0000


Since the variability column names are only integers, it can help to replace them with descriptive strings.

In [14]:
existing_variability.columns = (
    existing_gen["region"]
    + "_"
    + existing_gen["Resource"]
    + "_"
    + existing_gen["cluster"].astype(str)
)
existing_variability

Unnamed: 0,CA_N_biomass_1,CA_N_conventional_hydroelectric_1,CA_N_geothermal_1,CA_N_hydroelectric_pumped_storage_1,CA_N_natural_gas_fired_combined_cycle_1,CA_N_natural_gas_fired_combustion_turbine_1,CA_N_onshore_wind_turbine_1,CA_N_small_hydroelectric_1,CA_N_solar_photovoltaic_1,CA_S_biomass_1,CA_S_conventional_hydroelectric_1,CA_S_geothermal_1,CA_S_hydroelectric_pumped_storage_1,CA_S_natural_gas_fired_combined_cycle_1,CA_S_natural_gas_fired_combustion_turbine_1,CA_S_onshore_wind_turbine_1,CA_S_small_hydroelectric_1,CA_S_solar_photovoltaic_1,WECC_AZ_biomass_1,WECC_AZ_conventional_hydroelectric_1,WECC_AZ_conventional_steam_coal_1,WECC_AZ_hydroelectric_pumped_storage_1,WECC_AZ_natural_gas_fired_combined_cycle_1,WECC_AZ_natural_gas_fired_combustion_turbine_1,WECC_AZ_nuclear_1,WECC_AZ_solar_photovoltaic_1
0,1.0,0.215140,1.0,1.0,1.0,1.0,0.0152,0.215140,0.0000,1.0,0.104891,1.0,1.0,1.0,1.0,0.431704,0.104891,0.000000,1.0,0.2601,1.0,1.0,1.0,1.0,1.0,0.0000
1,1.0,0.214897,1.0,1.0,1.0,1.0,0.0066,0.214897,0.0000,1.0,0.104837,1.0,1.0,1.0,1.0,0.571841,0.104837,0.000000,1.0,0.2603,1.0,1.0,1.0,1.0,1.0,0.0000
2,1.0,0.214654,1.0,1.0,1.0,1.0,0.0403,0.214654,0.0000,1.0,0.104783,1.0,1.0,1.0,1.0,0.530605,0.104783,0.000000,1.0,0.2606,1.0,1.0,1.0,1.0,1.0,0.0000
3,1.0,0.214411,1.0,1.0,1.0,1.0,0.1088,0.214411,0.0000,1.0,0.104728,1.0,1.0,1.0,1.0,0.484157,0.104728,0.000000,1.0,0.2608,1.0,1.0,1.0,1.0,1.0,0.0000
4,1.0,0.214167,1.0,1.0,1.0,1.0,0.0590,0.214167,0.0000,1.0,0.104620,1.0,1.0,1.0,1.0,0.589168,0.104620,0.000000,1.0,0.2610,1.0,1.0,1.0,1.0,1.0,0.0000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8755,1.0,0.215936,1.0,1.0,1.0,1.0,0.0004,0.215936,0.2899,1.0,0.232444,1.0,1.0,1.0,1.0,0.009415,0.232444,0.278583,1.0,0.2591,1.0,1.0,1.0,1.0,1.0,0.2031
8756,1.0,0.215179,1.0,1.0,1.0,1.0,0.0293,0.215179,0.0000,1.0,0.232256,1.0,1.0,1.0,1.0,0.022796,0.232256,0.000000,1.0,0.2593,1.0,1.0,1.0,1.0,1.0,0.0000
8757,1.0,0.214423,1.0,1.0,1.0,1.0,0.0617,0.214423,0.0000,1.0,0.232069,1.0,1.0,1.0,1.0,0.053236,0.232069,0.000000,1.0,0.2595,1.0,1.0,1.0,1.0,1.0,0.0000
8758,1.0,0.213666,1.0,1.0,1.0,1.0,0.0811,0.213666,0.0000,1.0,0.231880,1.0,1.0,1.0,1.0,0.196757,0.231880,0.000000,1.0,0.2597,1.0,1.0,1.0,1.0,1.0,0.0000


In [15]:
make_generator_variability(new_gen)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36
0,1.0,1.0,1.0,1.0,1.0,1.0,0.339808,0.067740,0.023864,0.565023,0.011022,1.0,1.0,1.0,1.0,1.0,1.0,0.430082,0.286035,0.239627,0.490622,0.000000,0.000000,0.000000,0.000000,0.000000,1.0,1.0,1.0,1.0,1.0,1.0,0.000000,0.000000,0.00000,0.089289,0.089232
1,1.0,1.0,1.0,1.0,1.0,1.0,0.436983,0.076007,0.051461,0.529912,0.031337,1.0,1.0,1.0,1.0,1.0,1.0,0.496679,0.237119,0.292402,0.473638,0.000000,0.000000,0.000000,0.000000,0.000000,1.0,1.0,1.0,1.0,1.0,1.0,0.000000,0.000000,0.00000,0.083419,0.083400
2,1.0,1.0,1.0,1.0,1.0,1.0,0.465493,0.095823,0.079115,0.233682,0.053202,1.0,1.0,1.0,1.0,1.0,1.0,0.486465,0.165805,0.376477,0.453231,0.000000,0.000000,0.000000,0.000000,0.000000,1.0,1.0,1.0,1.0,1.0,1.0,0.000000,0.000000,0.00000,0.032132,0.032114
3,1.0,1.0,1.0,1.0,1.0,1.0,0.650142,0.154638,0.186783,0.157203,0.127425,1.0,1.0,1.0,1.0,1.0,1.0,0.454154,0.154096,0.429210,0.442136,0.000000,0.000000,0.000000,0.000000,0.000000,1.0,1.0,1.0,1.0,1.0,1.0,0.000000,0.000000,0.00000,0.024408,0.024365
4,1.0,1.0,1.0,1.0,1.0,1.0,0.656204,0.154106,0.388197,0.082434,0.221323,1.0,1.0,1.0,1.0,1.0,1.0,0.481397,0.178022,0.480092,0.411889,0.000000,0.000000,0.000000,0.000000,0.000000,1.0,1.0,1.0,1.0,1.0,1.0,0.000000,0.000000,0.00000,0.015448,0.015418
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8755,1.0,1.0,1.0,1.0,1.0,1.0,0.077191,0.021500,0.045171,0.030488,0.335281,1.0,1.0,1.0,1.0,1.0,1.0,0.035673,0.055388,0.052983,0.056044,0.383319,0.382493,0.371831,0.387646,0.397305,1.0,1.0,1.0,1.0,1.0,1.0,0.302154,0.293488,0.29772,0.621627,0.621585
8756,1.0,1.0,1.0,1.0,1.0,1.0,0.062224,0.018055,0.019462,0.017133,0.428032,1.0,1.0,1.0,1.0,1.0,1.0,0.060415,0.136356,0.100320,0.111646,0.000000,0.000000,0.000000,0.000000,0.000000,1.0,1.0,1.0,1.0,1.0,1.0,0.000000,0.000000,0.00000,0.604531,0.604569
8757,1.0,1.0,1.0,1.0,1.0,1.0,0.031747,0.026687,0.098691,0.144561,0.378416,1.0,1.0,1.0,1.0,1.0,1.0,0.078338,0.272622,0.140229,0.158945,0.000000,0.000000,0.000000,0.000000,0.000000,1.0,1.0,1.0,1.0,1.0,1.0,0.000000,0.000000,0.00000,0.511843,0.511823
8758,1.0,1.0,1.0,1.0,1.0,1.0,0.071849,0.033144,0.077648,0.101721,0.351909,1.0,1.0,1.0,1.0,1.0,1.0,0.128073,0.348353,0.227002,0.284965,0.000000,0.000000,0.000000,0.000000,0.000000,1.0,1.0,1.0,1.0,1.0,1.0,0.000000,0.000000,0.00000,0.089289,0.089232
