# Lake model continued

In the previous week you used the lake problem to get acquainted with the workbench. In this assignment we continue with the lake problem, focusing explicitly on using it for open exploration. You can use the second part of [this tutorial](https://emaworkbench.readthedocs.io/en/latest/indepth_tutorial/open-exploration.html) for help.

**It is paramount that you use the lake problem with 100 decision variables, rather than the one found on the website with the separate anthropogenic release decision.**

## Apply sensitivity analysis
There is substantial support in the ema_workbench for global sensitivity analysis. For this, the workbench relies on [SALib](https://salib.readthedocs.io/en/latest/) and on feature scoring, which is a machine learning alternative to global sensitivity analysis.


1. Apply Sobol with 3 separate release policies (0, 0.05, and 0.1) and analyse the results for each release policy separately, focusing on the reliability objective. Do the sensitivities change depending on the release policy? Can you explain why or why not?

*hint: you can use Sobol sampling for the uncertainties, and set policies to a list with the 3 different release policies. Next, for the analysis, you can use logical indexing on the experiments.policy column to select the results for each release policy and apply Sobol to each of the three separately. If this sounds too complicated, just run the experiments and the analysis for each release policy separately.*
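The logical indexing mentioned in the hint can be sketched on toy data. The DataFrame and outcome values below are made-up stand-ins for the workbench output (not real model results), and the names `outcomes` and `per_policy` are only illustrative.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the workbench output (hypothetical values, not real model results).
experiments = pd.DataFrame({
    "b": [0.10, 0.20, 0.30, 0.10, 0.20, 0.30],
    "policy": ["0", "0", "0", "0.05", "0.05", "0.05"],
})
outcomes = {"reliability": np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4])}

per_policy = {}
for policy in experiments["policy"].unique():
    # Boolean mask: True for the rows that were run under this policy.
    logical = (experiments["policy"] == policy).to_numpy()
    # Apply the same mask to every outcome array so rows stay aligned.
    per_policy[policy] = {name: values[logical] for name, values in outcomes.items()}

print(per_policy["0.05"]["reliability"])
```

The same mask can be applied to the experiments DataFrame itself, which keeps the inputs and outcomes of one policy aligned row by row.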



In [1]:
from lakemodel_function import lake_problem

from ema_workbench import Model, RealParameter, TimeSeriesOutcome, ScalarOutcome, Policy, ema_logging

if __name__ == "__main__":
    ema_logging.log_to_stderr(level=ema_logging.INFO)

    model = Model('lakemodel', function=lake_problem)
    
    model.levers = [RealParameter(f"l{i}",0,0.1) for i in range(100)]
    
    model.uncertainties = [RealParameter('mean', 0.01, 0.05),
                           RealParameter('stdev', 0.001, 0.005),
                           RealParameter('b', 0.1, 0.45),
                           RealParameter('q', 2, 4.5),
                           RealParameter('delta', 0.93, 0.99)]
    
    model.outcomes = [ScalarOutcome('max_P'),
                      ScalarOutcome('utility'),
                      ScalarOutcome('inertia'),
                      ScalarOutcome('reliability')]

In [2]:
from ema_workbench import Policy

policies = [Policy('0', **{l.name:0 for l in model.levers}),
            Policy('0.05', **{l.name:0.05 for l in model.levers}),
            Policy('0.1', **{l.name:0.1 for l in model.levers})]

In [3]:
from ema_workbench import MultiprocessingEvaluator, ema_logging,SequentialEvaluator
from ema_workbench.em_framework.evaluators import SOBOL
from ema_workbench.em_framework import get_SALib_problem

ema_logging.log_to_stderr(ema_logging.INFO)

with MultiprocessingEvaluator(model) as evaluator:
    experimentsSOBOL, resultsSOBOL = evaluator.perform_experiments(100, policies, uncertainty_sampling=SOBOL)

[MainProcess/INFO] pool started
[MainProcess/INFO] performing 1200 scenarios * 3 policies * 1 model(s) = 3600 experiments
[MainProcess/INFO] 360 cases completed
[MainProcess/INFO] 720 cases completed
[MainProcess/INFO] 1080 cases completed
[MainProcess/INFO] 1440 cases completed
[MainProcess/INFO] 1800 cases completed
[MainProcess/INFO] 2160 cases completed
[MainProcess/INFO] 2520 cases completed
[MainProcess/INFO] 2880 cases completed
[MainProcess/INFO] 3240 cases completed
[MainProcess/INFO] 3600 cases completed
[MainProcess/INFO] experiments finished
[MainProcess/INFO] terminating pool
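
The log reports 1200 scenarios although 100 was requested. This is consistent with the Saltelli sampling scheme used for Sobol analysis, which generates N(2D + 2) samples for D uncertain factors when second-order indices are included. A minimal sketch of that arithmetic follows; the helper name `saltelli_sample_size` is hypothetical and not part of the workbench or SALib.

```python
def saltelli_sample_size(n_base, n_vars, second_order=True):
    """Number of model runs in a Saltelli design for Sobol analysis."""
    # N * (2D + 2) with second-order indices, N * (D + 2) without them.
    multiplier = 2 * n_vars + 2 if second_order else n_vars + 2
    return n_base * multiplier


# 100 base samples and the 5 uncertainties of the lake model.
n_scenarios = saltelli_sample_size(100, 5)
# Every scenario is run for each of the 3 release policies.
n_experiments = n_scenarios * 3
print(n_scenarios, n_experiments)
```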


In [4]:
problem = get_SALib_problem(model.uncertainties)
problem

{'num_vars': 5,
 'names': ['b', 'delta', 'mean', 'q', 'stdev'],
 'bounds': [(0.1, 0.45), (0.93, 0.99), (0.01, 0.05), (2, 4.5), (0.001, 0.005)]}

In [None]:
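from SALib.analyze.sobol import analyze

sobol_results = {}

for policy in experimentsSOBOL.policy.unique():
    # select only the experiments that were run under this release policy
    logical = experimentsSOBOL.policy == policy
    y = resultsSOBOL['reliability'][logical]
    # analyze() defaults to calc_second_order=True, matching the 1200 sampled scenarios
    indices = analyze(problem, y)
    sobol_results[policy] = indices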

In [None]:
from IPython.display import display
import pandas as pd

# show total-order (ST) and first-order (S1) indices with their confidence bounds
for policy, indices in sobol_results.items():
    Si_filter = {k: indices[k] for k in ['ST', 'ST_conf', 'S1', 'S1_conf']}
    Si_df = pd.DataFrame(Si_filter, index=problem['names'])
    display(policy, Si_df)
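
For reading the tables, it may help to recall the standard variance-based definitions behind `S1` and `ST`. They are stated here in general form, with $Y$ the outcome, $X_i$ one uncertain factor, and $X_{\sim i}$ all other factors; this notation is not taken from the assignment itself.

$$
S_i = \frac{\operatorname{Var}_{X_i}\left(\mathbb{E}_{X_{\sim i}}[Y \mid X_i]\right)}{\operatorname{Var}(Y)},
\qquad
S_{Ti} = 1 - \frac{\operatorname{Var}_{X_{\sim i}}\left(\mathbb{E}_{X_i}[Y \mid X_{\sim i}]\right)}{\operatorname{Var}(Y)}
$$

A clear gap between `ST` and `S1` for a factor suggests that it influences the outcome mainly through interactions with other factors.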

2. Repeat the above analysis for the 3 release policies, but now with extra trees feature scoring and for all outcomes of interest. As a bonus, use the Sobol experiment results as input for extra trees, and compare the results with those from Latin hypercube sampling.

*hint: you can use [seaborn heatmaps](https://seaborn.pydata.org/generated/seaborn.heatmap.html) for a nice figure of the results*


In [None]:
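from ema_workbench.em_framework.evaluators import LHS

# use the LHS sampler constant, consistent with the SOBOL constant used earlier
with MultiprocessingEvaluator(model) as evaluator:
    experimentsLHS, resultsLHS = evaluator.perform_experiments(100, policies, uncertainty_sampling=LHS)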

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt
from ema_workbench.analysis import feature_scoring

# keep only the uncertainty columns as features
dropped = [l.name for l in model.levers] + ['policy', 'model']
# pair each sampling scheme with its experiments and outcomes
samples = {'SOBOL': (experimentsSOBOL, resultsSOBOL),
           'LHS': (experimentsLHS, resultsLHS)}

for policy in experimentsLHS.policy.unique():

    fig, axes = plt.subplots(1, 2, figsize=(15, 5))
    fig.suptitle(f'Policy: {policy}')

    for ax, (name, (experiments, results)) in zip(axes, samples.items()):
        # select the rows that belong to this release policy
        logical = experiments.policy == policy
        features = experiments.drop(columns=dropped)[logical]
        subset_results = {k: v[logical] for k, v in results.items()}
        scores = feature_scoring.get_feature_scores_all(features, subset_results, alg='extra trees')

        ax.set_title(name)
        sns.heatmap(scores, annot=True, cmap='viridis', ax=ax)

    plt.show()