# Lake model continued

In the previous week you used the lake problem as a means of getting acquainted with the workbench. In this assignment we will continue with the lake problem, focusing explicitly on using it for open exploration. You can use the second part of [this tutorial](https://emaworkbench.readthedocs.io/en/latest/indepth_tutorial/open-exploration.html) for help.

**It is paramount that you use the lake problem with 100 decision variables, rather than the one found on the website with the separate anthropogenic release decision.**

## Apply scenario discovery

1. Generate 10 policies and 1000 scenarios and evaluate them.
2. The experiments array contains the values for each of the 100 decision levers. This might easily mess up the analysis. Remove these columns from the experiment array. *hint: use `experiments.drop`*
3. Apply scenario discovery, focussing on the 10 percent of worst outcomes for reliability


In [1]:
# Import the required libraries
from lakemodel_function import lake_problem
from ema_workbench import (RealParameter, ScalarOutcome, Constant,
                           Model)
import pandas as pd
from ema_workbench.analysis import prim
import numpy as np
import matplotlib.pyplot as plt  # pyplot provides plt.show(), used below

ModuleNotFoundError: No module named 'lakemodel_function'

**Step 1**: Specify the uncertainties, levers and outcomes of the lake problem. These are used in the open exploration.

In [None]:
from lakemodel_function import lake_problem

from ema_workbench import (Model, RealParameter, ScalarOutcome)

#instantiate the model
lake_model = Model('lakeproblem', function=lake_problem)
lake_model.time_horizon = 100 # used to specify the number of timesteps

#specify uncertainties
lake_model.uncertainties = [RealParameter('mean', 0.01, 0.05),
                            RealParameter('stdev', 0.001, 0.005),
                            RealParameter('b', 0.1, 0.45),
                            RealParameter('q', 2.0, 4.5),
                            RealParameter('delta', 0.93, 0.99)]

# set levers, one for each time step
lake_model.levers = [RealParameter(f"l{i}", 0, 0.1) for i in 
                     range(lake_model.time_horizon)] # we use time_horizon here

#specify outcomes 
lake_model.outcomes = [ScalarOutcome('max_P'),
                       ScalarOutcome('utility'),
                       ScalarOutcome('inertia'),
                       ScalarOutcome('reliability')]  # default kind (INFO); no optimization direction is needed for open exploration

**Step 2**: We now run the experiments for 1000 scenarios and 10 policies, as specified in the problem description. For this, we use the workbench's `MultiprocessingEvaluator` class.

In [None]:
from ema_workbench import MultiprocessingEvaluator

n_scenarios = 1000   # number of scenarios
n_policies = 10      # number of policies

#performing the experiments for given number of scenarios and policies
with MultiprocessingEvaluator(lake_model) as evaluator:
    experiments, outcomes = evaluator.perform_experiments(n_scenarios, n_policies)
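
As far as we understand the workbench, `perform_experiments` combines scenarios and policies in a full factorial design by default, so 1000 scenarios and 10 policies give 10000 experiments. The sketch below illustrates that crossing with plain Python. The names `scenarios`, `policies` and `design` are hypothetical stand-ins, not workbench objects.

```python
from itertools import product

# Hypothetical stand-ins for sampled scenarios and policies (small numbers for readability)
scenarios = [f"s{i}" for i in range(4)]
policies = [f"p{j}" for j in range(3)]

# Full factorial design: every policy is evaluated under every scenario
design = list(product(policies, scenarios))

print(len(design))   # number of experiments = len(policies) * len(scenarios)
print(design[:4])    # first policy paired with each of the four scenarios
```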

In [None]:
#Print the experiment results
experiments   

**Step 3**: As required in the problem description, we drop the columns of the 100 decision levers so that they do not mess up the analysis.

In [None]:
# Build the list of lever column names: l0, l1, ..., l99
droplist = [f"l{x}" for x in range(100)]

print(droplist)  # show the lever names that will be dropped

In [None]:
adjusted_experiments = experiments.drop(droplist, axis=1)  # drop the lever columns listed in droplist
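
To check what the drop does without running the lake model, it can be tried on a small toy frame. This is a minimal sketch: `toy_experiments` and its values are made up, and only 3 levers are used instead of 100.

```python
import pandas as pd

n_levers = 3  # the assignment uses 100; 3 keeps the toy frame readable

# Made-up experiments frame: two uncertainties, the lever columns, and a policy label
toy_experiments = pd.DataFrame({
    "b": [0.2, 0.3],
    "q": [2.5, 3.0],
    **{f"l{i}": [0.05, 0.01] for i in range(n_levers)},
    "policy": ["p0", "p1"],
})

# Same naming scheme as the lever columns in the model above
lever_names = [f"l{i}" for i in range(n_levers)]

# drop returns a new frame; the original is left untouched
trimmed = toy_experiments.drop(columns=lever_names)

print(list(trimmed.columns))
```

Note that `drop(columns=...)` is equivalent to `drop(..., axis=1)`, and that the `policy` column is kept, so the policies can still be told apart.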

In [None]:
adjusted_experiments   # Print new experiment results 

In [None]:
outcomes['reliability']   #print reliability column 

In [None]:
outcomes = pd.DataFrame.from_dict(outcomes)   # convert outcomes to dataFrame 

In [None]:
outcomes   #print outcomes 

**Step 4**: Here we focus on the 10% worst outcomes for reliability. We first subset those values with the `nsmallest` function, passing 1000 as the number of rows, since 10 policies times 1000 scenarios gives 10000 experiments and 10% of 10000 is 1000.

In [None]:
a = outcomes

In [None]:
lowest_reliability = outcomes.nsmallest(1000, "reliability")   #subset 10% worst reliability values

In [None]:
lowest_reliability_df = lowest_reliability["reliability"]

In [None]:
lowest_reliability_df.max()   # Find the maximum of the worst values 
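
When the cut-off value from `nsmallest` is turned into a boolean classification, tied values matter: a strict `<` leaves out every experiment that equals the cut-off, while `<=` can include more than the intended share. The sketch below shows this on made-up reliability scores (not model output), using the worst 20% so that the toy example has two cases.

```python
import pandas as pd

# Made-up reliability scores for 10 hypothetical experiments; note the tie at 0.1
toy = pd.DataFrame(
    {"reliability": [0.0, 0.1, 0.1, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]}
)

n_worst = len(toy) // 5                      # worst 20% of 10 rows = 2 rows
worst = toy.nsmallest(n_worst, "reliability")
cutoff = worst["reliability"].max()          # largest value among the worst rows

strict = toy["reliability"] < cutoff         # excludes rows equal to the cut-off
inclusive = toy["reliability"] <= cutoff     # includes all rows tied at the cut-off

print(cutoff, int(strict.sum()), int(inclusive.sum()))
```

It is worth checking `y.sum()` on the real data to see how many cases of interest the chosen comparison actually yields.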

In [None]:
x = adjusted_experiments
# cases of interest: reliability strictly below the largest of the 1000 worst values
y = outcomes['reliability'] < lowest_reliability_df.max()
prim_alg = prim.Prim(x, y, threshold=0.1, peel_alpha=0.11)
box1 = prim_alg.find_box()

In [None]:
box1.show_tradeoff()
plt.show()  # display the coverage-density trade-off along the peeling trajectory

In [None]:
box1.inspect_tradeoff()

In [None]:
#Show the results in table and visual format
box1.inspect(4)
box1.inspect(4, style='graph')
plt.show()      

In [None]:
# Show the results as scatter plots 
box1.select(23)  # select box 23 on the peeling trajectory
fig = box1.show_pairs_scatter()
fig.set_size_inches((12,12))
plt.show()

## Visualize the results using Dimensional Stacking
Take the classification of outcomes as used in step 3 of scenario discovery, and instead visualize the results using dimensional stacking. How do these results compare to the insights from scenario discovery?

In [None]:
from ema_workbench.analysis import dimensional_stacking   # import the required library for dimensional stacking 

x = adjusted_experiments  # experiments without the 100 lever columns
y = outcomes['reliability'] <lowest_reliability_df.max()
dimensional_stacking.create_pivot_plot(x,y, 2, nbins=3)
plt.show()  # display the pivot plot
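
To make the idea behind the pivot plot more concrete, the sketch below bins two uncertainties into three bins each and computes the fraction of cases of interest per cell with plain pandas. It is only an illustration under assumptions: the data are random, the rule for `of_interest` is invented, and the two uncertainties are picked by hand, whereas the workbench function selects and orders the uncertainties itself.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 2000

# Random stand-ins for two uncertainties, sampled over the ranges used in the model
toy = pd.DataFrame({
    "b": rng.uniform(0.1, 0.45, n),
    "q": rng.uniform(2.0, 4.5, n),
})

# Invented rule standing in for the classification of outcomes (not the lake model)
toy["of_interest"] = ((toy["b"] < 0.2) & (toy["q"] < 3.0)).astype(float)

# Discretize each uncertainty into three equal-width bins
labels = ["low", "mid", "high"]
toy["b_bin"] = pd.cut(toy["b"], bins=3, labels=labels)
toy["q_bin"] = pd.cut(toy["q"], bins=3, labels=labels)

# Each cell holds the fraction of cases of interest for that combination of bins
pivot = toy.pivot_table(
    index="b_bin", columns="q_bin", values="of_interest",
    aggfunc="mean", observed=False,
)

print(pivot.round(2))
```

Cells with a high fraction point to the combinations of uncertainty ranges where poor outcomes concentrate, which is the same kind of information the PRIM box limits give.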