In [None]:
import pandas as pd
import SALib
from pathlib import Path
import os
import subprocess
import sys
import matplotlib.pyplot as plt

To view this notebook as slides, run `jupyter nbconvert examples/example.ipynb --to slides --post serve` on the command line

# ECEMP Skills Workshop: Global Sensitivity Analysis

- Will Usher, KTH Royal Institute of Technology
- Trevor Barnes, Simon Fraser University

# Introduction

# What is global sensitivity analysis?

- Global sensitivity analysis is “*the study of how uncertainty in the output of a model...can be apportioned to different sources of uncertainty in the model input*”. 
- “Global” means that all parameters are varied over their input ranges at the same time.
- By contrast, in one-at-a-time (OAT) sensitivity analysis, only one dimension of the model input space is explored at a time, while all other dimensions are held at their central values.
- An excellent introduction to global sensitivity analysis can be found in Saltelli (2007).
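
To make the contrast concrete, the following minimal sketch builds a one-at-a-time design and a global design for two hypothetical inputs. The input names, the bounds and the helpers `oat_design` and `global_design` are illustrative assumptions, not part of the workshop code, and plain uniform random sampling stands in here for the more structured sampling schemes used in practice.

```python
import random

# Hypothetical inputs and ranges, for illustration only
bounds = {'x1': (0.0, 1.0), 'x2': (10.0, 20.0)}


def oat_design(bounds, steps=5):
    """Vary one input at a time; all others stay at their central value."""
    centre = {name: (lo + hi) / 2 for name, (lo, hi) in bounds.items()}
    design = []
    for name, (lo, hi) in bounds.items():
        for s in range(steps):
            point = dict(centre)
            point[name] = lo + (hi - lo) * s / (steps - 1)
            design.append(point)
    return design


def global_design(bounds, n, seed=0):
    """Draw n points in which every input varies at the same time."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        for _ in range(n)
    ]


oat_points = oat_design(bounds)
global_points = global_design(bounds, n=10)
```

Every point in `oat_points` lies on one of the two axes through the centre of the input space, whereas the points in `global_points` are spread across the whole rectangle.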

# Why perform global sensitivity analysis?

- Most energy system optimisation modelling studies use scenario analysis
- Global sensitivity analysis can supplement and support scenario analysis
  - identify key drivers of model results (factor prioritisation)
  - identify unimportant parameters (factor fixing)
  - identify interesting areas of the input space (factor mapping)

When models are used in a static manner, for example when influential parameters are held near a constant central value, these aspects of optimisation models (the apparent redundancy of parameters, the lack of interaction effects and non-linear behaviour) may induce modellers to overlook parametric uncertainty. For example, modellers understandably tend to neglect the dimensions of the input space that are not represented in the results, so parts of the model go “stale” as assumptions are not revisited or refreshed. On the other hand, modellers devote a lot of attention to the parameters associated with technologies that do appear in the model results, even if those parameters are less influential than one of the static parameters.

## Methods for conducting global sensitivity analysis

Variance-based approaches compute the proportion of the variance in the output that is explained by variation in the input. 

- Computation of a first-order (only the direct effect) and total-order (direct and interaction effects) index for each input recovers all interaction and non-linear effects of a model. 
- Variance-based approaches require a relatively large number of model runs, $500(k+2)$, where $k$ is the number of inputs.

The elementary effects test, or “Method of Morris”, lowers the computational needs to $10(k+1)$ model runs by computing an average of local derivatives over a discrete space of input parameters.

- It provides an estimate for the total-order index produced by more advanced variance-based approaches.
- It can be applied to groups of inputs to increase coverage and reduce computational demands.
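
The idea behind the elementary effects test can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the full Method of Morris: inputs are scaled to the unit interval, every step is a positive step of size `delta`, and the toy model and the function name `elementary_effects` are hypothetical. SALib, imported at the top of this notebook, provides Morris sampling and analysis routines for real studies.

```python
import random


def elementary_effects(model, k, r=10, levels=4, seed=0):
    """Return the mean absolute elementary effect per input and the run count."""
    rng = random.Random(seed)
    delta = levels / (2 * (levels - 1))
    # Start only from grid levels where x + delta stays inside [0, 1]
    grid = [i / (levels - 1) for i in range(levels // 2)]
    effects = [[] for _ in range(k)]
    runs = 0
    for _ in range(r):
        x = [rng.choice(grid) for _ in range(k)]
        y = model(x)
        runs += 1
        # One trajectory: move each input once, in random order
        for i in rng.sample(range(k), k):
            x_next = list(x)
            x_next[i] += delta
            y_next = model(x_next)
            runs += 1
            effects[i].append((y_next - y) / delta)
            x, y = x_next, y_next
    mu_star = [sum(abs(e) for e in ee) / len(ee) for ee in effects]
    return mu_star, runs


def toy_model(x):
    # Hypothetical linear model: the first input matters three times as much
    return 3 * x[0] + x[1]


k = 2
mu_star, morris_runs = elementary_effects(toy_model, k, r=10)
variance_based_runs = 500 * (k + 2)
```

With two inputs and ten trajectories the sketch uses $10(k+1) = 30$ model runs, compared with $500(k+2) = 2000$ for the variance-based estimate quoted above.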

# References

Andrea Saltelli, Marco Ratto, Terry Andres, Francesca Campolongo, Jessica Cariboni, Debora Gatelli, Michaela Saisana, and Stefano Tarantola. Global Sensitivity Analysis. The Primer. 1st ed. John Wiley & Sons, Ltd, 2007. https://doi.org/10.1002/9780470725184.


# Examples

## Example One

In this example, we will first build a simple model and perform
**LOCAL** sensitivity analysis.

### Reference Energy System

Consider the simple energy system optimization model below. It includes:

- A mining technology for uranium
- A nuclear power generation technology
- Uranium fuel
- Electricity fuel

![RES](./assets/Example_One/RES.PNG)

### Input Data

Let's first create the structural input data for the model.

In [None]:
# REFERENCE ONLY. 
# DATA ALREADY CREATED. 
# DO NOT CHANGE
data_dir = Path('assets/Example_One/')

YEARS = range(2020, 2070)
TECHNOLOGIES = ['MINE_URANIUM', 'NUCLEAR']
FUELS = ['URANIUM', 'ELECTRICITY']
REGIONS = ['R1']
TIME_SLICE = ['S1D1', 'S1D2', 'S2D1', 'S2D2','S3D1','S3D2','S4D1','S4D2']
YEAR_SPLIT = 0.125
MODES = [1]

#### Add Capital Costs

In [None]:
def add_capex(cost):
    years = range(2020, 2070)
    df = pd.DataFrame(
        [['R1', 'NUCLEAR', years[0], cost]] * len(years), 
        columns=['REGION', 'TECHNOLOGY', 'YEAR', 'VALUE'])
    df['YEAR'] = years
    df.to_csv(Path(data_dir, 'data', 'CapitalCost.csv'), index=False)
    return df

In [None]:
capex = 4000 # M$ / GW
add_capex(capex).head()

#### Add Fixed Operational Costs

In [None]:
def add_fixed(cost):
    # One row per model year, all with the same fixed cost
    df = pd.DataFrame(
        [['R1', 'NUCLEAR', YEARS[0], cost]] * len(YEARS), 
        columns=['REGION', 'TECHNOLOGY', 'YEAR', 'VALUE'])
    df['YEAR'] = YEARS
    df.to_csv(Path(data_dir, 'data', 'FixedCost.csv'), index=False)
    return df

In [None]:
fixed = 50 # M$ / GW
add_fixed(fixed).head()

#### Add Variable Costs

In [None]:
def add_variable(cost):
    # One row per model year, all with the same variable cost
    df = pd.DataFrame(
        [['R1', 'MINE_URANIUM', 1, YEARS[0], cost]] * len(YEARS), 
        columns=['REGION','TECHNOLOGY','MODE_OF_OPERATION','YEAR','VALUE'])
    df['YEAR'] = YEARS
    df.to_csv(Path(data_dir, 'data', 'VariableCost.csv'), index=False)
    return df

In [None]:
variable = 2.5 # M$ / PJ
add_variable(variable).head()

#### Add Nuclear Power Plant Efficiency 

In [None]:
def add_efficiency(eff):
    years = range(2020, 2070)
    df = pd.DataFrame(
        [['R1', 'NUCLEAR', 'URANIUM', 1, years[0], eff]] * len(years), 
        columns=['REGION','TECHNOLOGY','FUEL','MODE_OF_OPERATION','YEAR','VALUE'])
    df['YEAR'] = years
    df.to_csv(Path(data_dir, 'data', 'InputActivityRatio.csv'), index=False)
    return df

In [None]:
eff = 1.25 # input activity ratio; 1 / 1.25 = 80 % efficiency
add_efficiency(eff).head()
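
The value written to `InputActivityRatio.csv` is the amount of uranium consumed per unit of activity, not the efficiency itself. A minimal sketch of the conversion follows, assuming the plant's output activity ratio is 1 (that file is not shown here); the helper name `input_activity_ratio` is hypothetical.

```python
def input_activity_ratio(efficiency):
    """Fuel input needed per unit of activity for a given conversion efficiency.

    Assumes an output activity ratio of 1.
    """
    if not 0 < efficiency <= 1:
        raise ValueError('efficiency must be in the interval (0, 1]')
    return 1 / efficiency


# An 80 % efficient plant needs 1.25 units of uranium per unit of activity
ratio = input_activity_ratio(0.80)
```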

#### Add Demand

In [None]:
def add_demand(start_value, yearly_increase):
    demand_data = []
    for year in range(2020, 2070):
        demand_data.append([
            'R1', 
            'ELECTRICITY',
            year,
            start_value*(1+yearly_increase)**(year-YEARS[0])
        ])
    df = pd.DataFrame(demand_data, columns=['REGION','FUEL','YEAR','VALUE'])
    df.to_csv(Path(data_dir, 'data', 'SpecifiedAnnualDemand.csv'), index=False)
    return df

In [None]:
start_value = 1000 # PJ
yearly_increase = 0.05 # fraction per year (5 %)
add_demand(start_value, yearly_increase).head()
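
The demand series follows compound growth from the first model year. The sketch below isolates that formula so individual years can be checked by hand; the function name `demand_in_year` and the `base_year` argument are illustrative and not part of the notebook.

```python
def demand_in_year(start_value, yearly_increase, year, base_year=2020):
    """Compound growth: start_value * (1 + yearly_increase) ** (year - base_year)."""
    return start_value * (1 + yearly_increase) ** (year - base_year)


# Demand two years after the base year: 1000 PJ grown by 5 % twice
demand_2022 = demand_in_year(1000, 0.05, 2022)
```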

#### Add Operational Life

In [None]:
def add_operational_life(op_life):
    df = pd.DataFrame([['R1','NUCLEAR',op_life]], columns=['REGION','TECHNOLOGY','VALUE'])
    df.to_csv(Path(data_dir, 'data', 'OperationalLife.csv'), index=False)
    return df

In [None]:
op_life = 50 # years
add_operational_life(op_life).head()

### One-at-a-time Sensitivity Analysis

Run the model with the baseline input parameters and get a reference objective value

In [None]:
def run_model(path_to_data_dir, path_to_model_file, result_file_name):
    # Convert the datapackage into a GNU MathProg data file
    os.system(
        'otoole convert datapackage datafile {} {}'.format(
            Path(path_to_data_dir,'datapackage.json'), 
            Path(path_to_data_dir,'data.txt')
        )
    )
    # Write the LP file from the model and data; --check stops before solving
    os.system(
        'glpsol -m {} -d {} --wlp {} --check'.format(
            Path(path_to_model_file),
            Path(path_to_data_dir,'data.txt'),
            Path(path_to_data_dir,'model.lp'),
        )
    )
    # Solve the LP file with CBC and write the solution file
    os.system(
        'cbc {} solve -solu {}'.format(
            Path(path_to_data_dir,'model.lp'),
            Path(path_to_data_dir,result_file_name)
        )
    )

def get_objective_value(result_files):
    # Print the first line of each solution file, which holds the objective value
    for result_file in result_files:
        print(f'{result_file.name}')
        os.system('head -1 {}'.format(Path(result_file)))

In [None]:
run_model(data_dir, '../resources/osemosys.txt', 'baseline.txt')
get_objective_value([Path(data_dir, 'baseline.txt')])
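
The same three steps can also be driven through `subprocess.run`, which raises an error if a tool fails instead of silently continuing, and the first line of a result file can be read in plain Python rather than with `head`. The following is a hedged sketch of that alternative: the command-line arguments are copied from `run_model` above, while the helper names `build_commands`, `run_commands` and `first_line` are hypothetical.

```python
import subprocess
from pathlib import Path


def build_commands(path_to_data_dir, path_to_model_file, result_file_name):
    """Return the three command lines used by run_model as argument lists."""
    data_dir = Path(path_to_data_dir)
    data_file = data_dir / 'data.txt'
    lp_file = data_dir / 'model.lp'
    return [
        ['otoole', 'convert', 'datapackage', 'datafile',
         str(data_dir / 'datapackage.json'), str(data_file)],
        ['glpsol', '-m', str(path_to_model_file), '-d', str(data_file),
         '--wlp', str(lp_file), '--check'],
        ['cbc', str(lp_file), 'solve', '-solu', str(data_dir / result_file_name)],
    ]


def run_commands(commands):
    """Run each command in turn and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


def first_line(path):
    """Return the first line of a text file (a portable stand-in for head -1)."""
    with open(path) as f:
        return f.readline().rstrip('\n')


commands = build_commands('assets/Example_One', '../resources/osemosys.txt', 'baseline.txt')
# run_commands(commands)  # requires otoole, glpsol and cbc on the PATH
```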

Now that we have a baseline objective value ($572721), we choose a parameter and alter it a few times to understand how it affects the overall objective value

In [None]:
result_files = []

# Iterate over capital cost
capital_costs = [2500, 3000, 3500, 4500, 5000]
for capital_cost in capital_costs:
    add_capex(capital_cost)
    result_file = f'capex_{capital_cost}.txt'
    result_files.append(Path(data_dir, result_file))
    run_model(data_dir, '../resources/osemosys.txt', result_file)

# reset to original capital cost
add_capex(4000)

# Iterate over variable cost
var_costs = [1.5, 2.0, 3.0, 3.5, 4.0]
for var_cost in var_costs:
    add_variable(var_cost)
    result_file = f'var_cost_{var_cost}.txt'
    result_files.append(Path(data_dir, result_file))
    run_model(data_dir, '../resources/osemosys.txt', result_file)

# reset to original variable cost
add_variable(2.5)

In [None]:
get_objective_value(result_files)

Create simple plots to see how the costs change with parameters

In [None]:
df_capex = pd.DataFrame([
    [2500, 446506],
    [3000, 488577],
    [3500, 530649],
    [4000, 572721],
    [4500, 614793],
    [5000, 656865],
], columns=['Capex', 'Objective_Cost']).set_index('Capex')

df_var = pd.DataFrame([
    [1.5, 511727],
    [2.0, 542224],
    [2.5, 572721],
    [3.0, 603218],
    [3.5, 633715],
    [4.0, 664212],
], columns=['Var_Cost', 'Objective_Cost']).set_index('Var_Cost')

fig, axs = plt.subplots(1,2, figsize=(14,5), sharey=True)
df_capex.plot(ax=axs[0], marker='o', title='Effects of Capital Cost', xlabel='Capital Cost (M$/GW)', ylabel='Objective Cost (M$)')
df_var.plot(ax=axs[1], marker='o', title='Effects of Variable Cost', xlabel='Variable Cost (M$/PJ)', ylabel='Objective Cost (M$)')

### Problems with local sensitivity analysis

- Does not capture interactions
- Difficult to tell which parameter is more influential on the results
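
One partial remedy for the second point is to normalise each slope by the baseline values, which gives a unit-free number (the percentage change in objective cost per percentage change in the parameter). The sketch below applies this to the end points of the two sweeps plotted above. It is only an illustration: the function name `normalised_sensitivity` is hypothetical, and the measure is still local, so it says nothing about interactions.

```python
def normalised_sensitivity(x_low, y_low, x_high, y_high, x_base, y_base):
    """Slope of the sweep, scaled by the baseline parameter and objective values."""
    slope = (y_high - y_low) / (x_high - x_low)
    return slope * x_base / y_base


baseline_objective = 572721

# End points of the capital cost sweep (M$/GW -> objective cost)
capex_sensitivity = normalised_sensitivity(
    2500, 446506, 5000, 656865, x_base=4000, y_base=baseline_objective)

# End points of the variable cost sweep (M$/PJ -> objective cost)
var_sensitivity = normalised_sensitivity(
    1.5, 511727, 4.0, 664212, x_base=2.5, y_base=baseline_objective)

print(f'Capital cost:  {capex_sensitivity:.2f}')
print(f'Variable cost: {var_sensitivity:.2f}')
```

Around the baseline, the objective responds roughly twice as strongly to a relative change in capital cost as to the same relative change in variable cost.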

## Example Two 

We will repeat Example One, but this time employ **GLOBAL** sensitivity analysis using
the Method of Morris. This will be done through an automated workflow. 

### Reference Energy System

Our RES remains the same. It includes:
- A mining technology for uranium
- A nuclear power generation technology
- Uranium fuel
- Electricity fuel

![RES](examples/assets/Example_One/RES.PNG)

## Introduce the workflow

1. Update config files
2. Run the workflow
3. Assess results

### Update the configuration files

Navigate to the `config/scenarios.csv` file and update it to match the following:


| name | description      | path                             |
|------|------------------|----------------------------------|
|0     | "Simple Example" | examples/assets/datapackage.json |

Navigate to the [`config/parameters.csv`](../edit/config/parameters.csv) file and update it to match the following:

|name |group|indexes|min_value_base_year|max_value_base_year|min_value_end_year|max_value_end_year|dist|interpolation_index|action|
|-----|-----|-------|-------------------|-------------------|------------------|------------------|----|-------------------|------|
|CapitalCost  |capital  |"R1,NUCLEAR"        |4000 |5000 |2000 |3000 |"unif" |"YEAR" |"interpolate" |
|VariableCost |variable |"R1,MINE_URANIUM,1" |3.5  |4.0  |1.5  |1.75 |"unif" |"YEAR" |"interpolate" |

In this table, we are specifying one of these three different methods for interpolating values over time. 

![GSA_options](examples/assets/Example_One/GSA_options.PNG)
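
To give a feel for what one row of `parameters.csv` describes, the sketch below turns a pair of sampled values into a yearly series. It assumes that `"interpolate"` means straight-line interpolation between a sampled base-year value and a sampled end-year value, and that a sample arrives as a number between 0 and 1; the workflow's actual implementation may differ, and the helper names are hypothetical.

```python
def scale_to_range(unit_sample, min_value, max_value):
    """Map a sample from [0, 1] onto [min_value, max_value] (uniform distribution)."""
    return min_value + unit_sample * (max_value - min_value)


def interpolate_over_years(base_value, end_value, years):
    """Linearly interpolate from base_value in the first year to end_value in the last."""
    years = list(years)
    span = years[-1] - years[0]
    return {
        year: base_value + (end_value - base_value) * (year - years[0]) / span
        for year in years
    }


years = range(2020, 2070)

# CapitalCost row: base year in [4000, 5000], end year in [2000, 3000]
unit_sample = 0.5  # assumed value drawn by the sampler
base_value = scale_to_range(unit_sample, 4000, 5000)
end_value = scale_to_range(unit_sample, 2000, 3000)
capital_cost = interpolate_over_years(base_value, end_value, years)
```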

### Run the Workflow

In [None]:
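def clean_snakemake():
    wd = sys.path[0]
    os.chdir(wd)
    # subprocess.run blocks, so the clean step finishes before anything else starts
    subprocess.run("snakemake --cores 4 clean", cwd="..", shell=True, check=True)

def run_snakemake():
    wd = sys.path[0]
    os.chdir(wd)
    # Block until the workflow completes; raise if snakemake exits with an error
    subprocess.run("snakemake --cores 4", cwd="..", shell=True, check=True)

clean_snakemake()
run_snakemake()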

In [None]:
!snakemake --cores 4

In [None]:
!ls
