# Settings/scenario management

Capacity expansion modeling is often an exercise in exploring how technical, cost, or policy scenarios differ across a range of planning years, so PowerGenome has a built-in method for creating modified versions of a single baseline scenario. Within the settings file this shows up in how planning periods are defined and in a nested dictionary that allows any "normal" parameter to be modified for different scenarios.

## Scenario management files

Scenario management is deeply built into the input file structure. So much so, in fact, that it might be difficult to create inputs for a single scenario without following the layout designed for multiple scenarios.

### Scenario names

Each scenario has a long name and a short identifier, defined in the `case_id_description_fn` file (`test_case_id_description.csv` in the example). These cases are assumed to be the same across planning periods. When using the command line interface, case folders are created using the format `<case_id>_<model_year>_<case_description>`, so they look something like `p1_2030_Tech_CES_with_RPS`. Case IDs are also used in the `scenario_definitions_fn` file (`test_scenario_inputs.csv` or `test_scenario_inputs_short.csv` in the example) and the `emission_policies_fn` file (`test_rps_ces_emission_limits.csv`).
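As a rough illustration of the folder naming format, the sketch below builds a name from its three parts. The helper `case_folder_name` is hypothetical and not part of PowerGenome, and the replacement of spaces with underscores is an assumption based on the example folder name above.

```python
def case_folder_name(case_id: str, model_year: int, case_description: str) -> str:
    """Build a name in the <case_id>_<model_year>_<case_description> format."""
    # Assumption: spaces in the long description become underscores.
    description = case_description.replace(" ", "_")
    return f"{case_id}_{model_year}_{description}"


folder = case_folder_name("p1", 2030, "Tech CES with RPS")
print(folder)
```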

## Planning periods

When running a single planning period, many functions expect the parameters `model_year` and `model_first_planning_year` to be integers (a single year). In a multi-planning period settings file, each of these parameters should be a list of integers, and the two lists should be the same length. Together they represent the paired first and last years of each planning period to be investigated: `model_first_planning_year` is the first year of a period and `model_year` is the last.

```
model_year: [2030, 2045]
model_first_planning_year: [2020, 2031]
```

In this case, planning years of 2030 and 2045 will be investigated. Hourly demand is calculated for planning years. The first year in a planning period is needed because technology costs are calculated as the average of all costs over a planning period. So for the first planning period of 2020-2030, load/demand will be calculated for 2030 and the cost of building a new generator will be the average of all values from 2020-2030.
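The pairing and averaging described above can be sketched in a few lines. The annual cost values here are made up for illustration (they are not ATB data), `period_average_cost` is a hypothetical helper rather than a PowerGenome function, and the average is assumed to include both the first and last year of each period.

```python
model_year = [2030, 2045]
model_first_planning_year = [2020, 2031]

# Pair the first year of each planning period with its final (model) year.
planning_periods = list(zip(model_first_planning_year, model_year))

# Hypothetical capital costs in $/kW that decline linearly over time.
annual_cost = {year: 1500 - 10 * (year - 2020) for year in range(2020, 2051)}


def period_average_cost(first_year, last_year, costs):
    """Average the annual costs over a planning period, inclusive of both ends."""
    years = range(first_year, last_year + 1)
    return sum(costs[year] for year in years) / len(years)


# Key each averaged cost by the model year it applies to.
period_costs = {
    last: period_average_cost(first, last, annual_cost)
    for first, last in planning_periods
}
print(period_costs)
```

With these made-up inputs, the 2030 model year uses the mean of the 2020 through 2030 values, and the 2045 model year uses the mean of the 2031 through 2045 values.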

## Settings management

The parameter `settings_management` is a nested dictionary with alternative values for any parameters that will be modified as part of a sensitivity, or that might have different values across planning periods. The structure of this dictionary is:

```
settings_management:
  <model year>:
    <sensitivity column name>:
      <sensitivity value name>:
        <settings parameter name>:
          <settings parameter value>
```

`<sensitivity column name>` is the name of a column in the `scenario_definitions_fn` file (`test_scenario_inputs.csv` in the example). The first columns of this file are `case_id` and `year`, which together uniquely define each model run. Model runs might test the effect of different natural gas prices (`ng_price` in the example file), with values of `reference` and `low`. The corresponding section of the `settings_management` parameter for the planning year 2030 will look like:

```
settings_management:
  2030:
    ng_price:  # <sensitivity column name>
      reference:  # <sensitivity value name>
        aeo_fuel_scenarios:  # <settings parameter name>
          naturalgas: reference  # <settings parameter value>
      low:
        aeo_fuel_scenarios:
          naturalgas: high_resource
```

So in this case we're modifying the settings parameter `aeo_fuel_scenarios` by defining different AEO scenario names for the `naturalgas` fuel type. By default, this section of the settings file looks like:

```
eia_series_scenario_names:
  reference: REF2020
  low_price: LOWPRICE
  high_price: HIGHPRICE
  high_resource: HIGHOGS
  low_resource: LOWOGS

aeo_fuel_scenarios:
  coal: reference
  naturalgas: reference
  distillate: reference
  uranium: reference
```

So we're changing the AEO case from `reference` to `high_resource` (which correspond to `REF2020` and `HIGHOGS` in the EIA open data API).
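To make the lookup chain concrete, here is a minimal sketch that follows one case from its scenario definition to the EIA series name. It uses plain dictionaries copied from the excerpts above rather than PowerGenome functions, and the `case` row is a hypothetical example of one line in the scenario definitions file.

```python
# Default values copied from the settings excerpt above.
settings = {
    "eia_series_scenario_names": {
        "reference": "REF2020",
        "low_price": "LOWPRICE",
        "high_price": "HIGHPRICE",
        "high_resource": "HIGHOGS",
        "low_resource": "LOWOGS",
    },
    "aeo_fuel_scenarios": {
        "coal": "reference",
        "naturalgas": "reference",
        "distillate": "reference",
        "uranium": "reference",
    },
}

settings_management = {
    2030: {
        "ng_price": {
            "reference": {"aeo_fuel_scenarios": {"naturalgas": "reference"}},
            "low": {"aeo_fuel_scenarios": {"naturalgas": "high_resource"}},
        }
    }
}

# Hypothetical row from the scenario definitions file.
case = {"case_id": "p1", "year": 2030, "ng_price": "low"}

# <model year> -> <sensitivity column name> -> <sensitivity value name>
overrides = settings_management[case["year"]]["ng_price"][case["ng_price"]]

# Only the keys listed in the override change; other fuels keep their defaults.
fuel_scenarios = {**settings["aeo_fuel_scenarios"], **overrides["aeo_fuel_scenarios"]}

# Translate each AEO scenario name into the EIA series name.
eia_series = {
    fuel: settings["eia_series_scenario_names"][name]
    for fuel, name in fuel_scenarios.items()
}
print(eia_series)
```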

It's important to understand that parameter values are updated by searching for `key:value` pairs in a dictionary and updating them. This means that in the example above I was able to change the AEO scenario for just natural gas prices, and I didn't have to list the other fuel types. But if the `value` is a list and only one item should be changed, then the entire list must be included in `settings_management`. As an example, cost scenarios for new-build generators are usually defined like:

```
# Format for each list item is <technology>, <tech_detail>, <cost_case>, <size>
atb_new_gen:
  - [NaturalGas, CCCCSAvgCF, Mid, 500]
  - [NaturalGas, CCAvgCF, Mid, 500]
  - [NaturalGas, CTAvgCF, Mid, 100]
  - [LandbasedWind, LTRG1, Mid, 1]
  - [OffShoreWind, OTRG10, Mid, 1]
  - [UtilityPV, LosAngeles, Mid, 1]
  - [Battery, "*", Mid, 1]
```

If I want to have low cost renewables capex in a scenario, the corresponding section of `settings_management` should include all technologies, even if they don't change. This is because the ATB technologies are defined in a list of lists.

```
settings_management:
  2030:
    renewable_capex:
      low:
        atb_new_gen:
          - [NaturalGas, CCCCSAvgCF, Mid, 500]
          - [NaturalGas, CCAvgCF, Mid, 500]
          - [NaturalGas, CTAvgCF, Mid, 100]
          - [LandbasedWind, LTRG1, Low, 1]
          - [OffShoreWind, OTRG10, Low, 1]
          - [UtilityPV, LosAngeles, Low, 1]
          - [Battery, "*", Low, 1]
```
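The difference between dictionary and list values can be shown with a small recursive update. This is only a sketch of the `key:value` behavior described above, not PowerGenome's own code, and `update_nested` is a hypothetical helper: nested dictionaries are merged key by key, while any other value, including a list, replaces the original outright.

```python
from collections.abc import Mapping
from copy import deepcopy


def update_nested(base, overrides):
    """Return a copy of base with overrides applied key by key."""
    result = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, Mapping) and isinstance(result.get(key), Mapping):
            # Dictionaries are merged, so unlisted keys keep their values.
            result[key] = update_nested(result[key], value)
        else:
            # Anything else (including a list) replaces the old value entirely.
            result[key] = deepcopy(value)
    return result


base = {
    "aeo_fuel_scenarios": {"coal": "reference", "naturalgas": "reference"},
    "atb_new_gen": [
        ["NaturalGas", "CCAvgCF", "Mid", 500],
        ["LandbasedWind", "LTRG1", "Mid", 1],
    ],
}

overrides = {
    "aeo_fuel_scenarios": {"naturalgas": "high_resource"},
    # Listing only wind drops the natural gas entry from the result.
    "atb_new_gen": [["LandbasedWind", "LTRG1", "Low", 1]],
}

updated = update_nested(base, overrides)
print(updated["aeo_fuel_scenarios"])
print(updated["atb_new_gen"])
```

In this sketch the `coal` entry survives because dictionaries are merged, but the natural gas technology disappears from `atb_new_gen` because the list was replaced as a whole. That is why the full list has to be repeated in `settings_management`.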

```
%load_ext autoreload
%autoreload 2
```

```
from pathlib import Path

import pandas as pd
from powergenome.util import (
    build_scenario_settings,
    init_pudl_connection,
    load_settings,
    check_settings
)
```

## Import settings

Settings are imported by reading the YAML file and converting it to a Python dictionary. In the code below I'm loading the settings and creating a nested dictionary `scenario_settings` that has all of the modified parameters for each case.

Settings can also be checked for some common errors using the `check_settings` function.

```
cwd = Path.cwd()

settings_path = (
    cwd.parent / "example_systems" / "CONUS-3-zone" / "settings"
)
settings = load_settings(settings_path)
settings["input_folder"] = settings_path.parent / settings["input_folder"]
scenario_definitions = pd.read_csv(
    settings["input_folder"] / settings["scenario_definitions_fn"]
)
scenario_settings = build_scenario_settings(settings, scenario_definitions)

pudl_engine, pudl_out, pg_engine = init_pudl_connection(
    freq="AS",
    start_year=min(settings.get("eia_data_years")),
    end_year=max(settings.get("eia_data_years")),
)

check_settings(settings, pg_engine)
```

We can check to see that the `p1` case (no emission constraints) does not point to an emissions policy file and that `p2` does.

```
scenario_settings[2050]["p1"]["emission_policies_fn"]
```

```
scenario_settings[2050]["p2"]["emission_policies_fn"]
```

```
'emission_policies.csv'
```