Welcome to the README for the Sensitivity Modular Pipeline.

To run this notebook, first run the cells in the `Set Up` section at the bottom of the notebook to create a dummy catalog and parameters.

This cell is hidden in built docs

# Sensitivity Modular Pipeline

In predictive-prescriptive projects (such as Optimus), understanding how the predictive model (the objective function) responds as control conditions vary is critical at several stages of the project.

Sensitivity data and charts improve predictive modeling by quickly surfacing physical relationships that the model captures incorrectly. They can also increase operator and client buy-in by helping operators understand the logic and relationships that the model has learned.

On the optimization side, sensitivity charts and sensitivity data can help when deciding which recommendations to accept. For example, an operator might choose to set a control at a value different from the recommendation if the sensitivity chart is relatively flat in the region of the recommendation, which implies that the objective function would not change much by tweaking that single control.
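The "relatively flat" judgement can be made concrete with a small check. The sketch below is illustrative only and is not part of the pipeline: the helper name `objective_spread_near`, the window size and the toy curve are all assumptions. It measures how much the objective varies across curve points close to a recommended control value.

```python
def objective_spread_near(curve, recommended, window):
    """Max-min spread of the objective over points within +/- window of the recommendation.

    `curve` is a list of (control_value, objective_value) pairs.
    """
    nearby = [obj for ctrl, obj in curve if abs(ctrl - recommended) <= window]
    if not nearby:
        raise ValueError("no curve points fall inside the window")
    return max(nearby) - min(nearby)


# Toy curve with made-up numbers: the objective plateaus once the control exceeds 200.
curve = [(float(c), float(min(c, 200))) for c in range(0, 500, 10)]

# On the plateau the spread is zero, so deviating from the recommendation costs little.
flat_spread = objective_spread_near(curve, recommended=300.0, window=30.0)

# On the rising part of the curve the same deviation changes the objective noticeably.
steep_spread = objective_spread_near(curve, recommended=100.0, window=30.0)
print(flat_spread, steep_spread)
```

A small spread suggests the operator has room to deviate from the recommendation for that control; what counts as "small" depends on the units of the objective and is a project-specific choice.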

## Overview

The main pipeline in the sensitivity modular pipeline creates a dataframe from which the sensitivity charts described above can later be built. The data is produced at the recommendation level: for each recommendation, it provides the estimated objective at a range of values of a control tag, holding all other variables constant. CSTs can use this data in three ways:

1. To create static visuals themselves.
2. To leverage the sensitivity charts within the UI.
3. To interactively query the objective function (predictive model) in the bundled model "simulator". The model simulator is a `streamlit` application which can be found in the same directory as this pipeline, and which creates an interactive sensitivity chart. The user can use this to view the objective function values while dynamically tweaking control and state values.

## Parameters

The sensitivity pipeline requires a small number of parameters to run. In particular, users need to specify the default resolution of the plot, as well as the columns to use as unique ids when identifying a block/shift and its associated recommendations. If using neural network predictive models, users may also need to specify a dictionary of extra keyword arguments.

When leveraging the interactive model simulator, users will need to specify a mapping of dataset names. The model simulator loads a few generic data sources into memory when it runs. Depending on their use case, users may need to change the dataset names so that they point to the appropriate entries in their own data catalog.


```yaml
recommend_sensitivity:
    n_points: 50 # Resolution/number of objective values to plot when the tagdict doesn't specify a constraint set of values.
    unique_ids: # The unique columns which help to identify a set of recommendations.
        - run_id
        - timestamp
    objective_kwargs: {} # When performing counterfactuals with neural networks, these may need to be specified.
    sensitivity_app_data_mapping: # Datasets to map when using the streamlit application.
        features: test_static_features # Name of the dataset holding features to load.
        model: train_trainset_model # Name of the dataset holding the objective to load.
        recs: test_recommendations # Name of the dataset holding the recommendations to load.
        sensitivity_data: test_sensitivity_plot_df # Dataframe of sensitivity data to load.
        timestamp_col: timestamp # Column id of the timestamp column 
```
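How the simulator consumes `sensitivity_app_data_mapping` is not shown in this README. As a rough sketch of the idea, assuming the mapping simply translates generic keys into catalog dataset names, the lookup could work as below. The `fake_catalog` dictionary and the `resolve_app_data` helper are hypothetical stand-ins, not part of the package.

```python
# Hypothetical stand-in for a data catalog: dataset name -> loaded object.
fake_catalog = {
    "test_static_features": "features dataframe",
    "train_trainset_model": "fitted model",
    "test_recommendations": "recommendations dataframe",
    "test_sensitivity_plot_df": "sensitivity dataframe",
}

# Same keys and values as `sensitivity_app_data_mapping` in the YAML above.
app_data_mapping = {
    "features": "test_static_features",
    "model": "train_trainset_model",
    "recs": "test_recommendations",
    "sensitivity_data": "test_sensitivity_plot_df",
    "timestamp_col": "timestamp",
}


def resolve_app_data(mapping, catalog, non_dataset_keys=("timestamp_col",)):
    """Look up each mapped dataset name in the catalog, skipping plain settings."""
    dataset_items = {k: v for k, v in mapping.items() if k not in non_dataset_keys}
    missing = [name for name in dataset_items.values() if name not in catalog]
    if missing:
        raise KeyError(f"datasets not found in catalog: {missing}")
    return {key: catalog[name] for key, name in dataset_items.items()}


app_data = resolve_app_data(app_data_mapping, fake_catalog)
```

Failing early on a missing dataset name makes a mis-typed mapping easier to spot than an error raised deep inside the application.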

## Data Sets
The sensitivity pipeline must be run *after* the optimization/recommend pipeline. This is because of the dependency the sensitivity pipeline has on the current recommendations. These must be known in order to properly evaluate the counterfactual of what would happen if the control values are changed independently.

### Inputs
The sensitivity pipeline requires the following input:

- `sensitivity.td`: The tag dictionary.
- `sensitivity.input_model`: The predictive model which serves as objective function for optimization.
- `sensitivity.input_data`: The dataframe of features to be used for the optimization/counterfactuals. These data were not used for model building.
- `sensitivity.recommendations`: The set of recommendations that were found when running the `recommend` modular pipeline on `sensitivity.input_data`.


### Outputs
The pipeline creates the following outputs:
- `sensitivity.sensitivity_plot_df`: A dataframe that holds data for creating sensitivity charts for every recommendation-control combination present in the input data. Running this pipeline for one recommendation that covers two controls yields a sensitivity plot dataframe that can be used to draw two curves: `objective` vs. `control_1` and `objective` vs. `control_2`.

### Intermediate Outputs
The sensitivity pipeline has only one node, and as such has no intermediate outputs.

## Example Usage

The next cell provides a code sample that applies the sensitivity pipeline to an example catalog and data.

In [95]:
### Pipeline Execution
from kedro.pipeline import Pipeline, pipeline
from optimus_pkg.modular_pipelines import sensitivity

sensitivity_pipeline = Pipeline(
    [
        pipeline(
            pipe=sensitivity.create_pipeline(),
            inputs={"sensitivity.input_data": "reading_data",
                    "sensitivity.td" : "tag_dict",
                    "sensitivity.input_model": "mock_model",
                    "sensitivity.recommendations": "mock_recommendations"
                   },
            parameters={"params:sensitivity": "params:example_sensitivity"},
            namespace="build",
        ),
    ]
)

In [96]:
from kedro.runner import SequentialRunner

# Use a distinct name so the imported `sensitivity` module is not overwritten.
sensitivity_run_output = SequentialRunner().run(pipeline=sensitivity_pipeline, catalog=catalog)

## Pipeline Nodes

### Generate Sensitivity Data (`create_sensitivity_plot_data`)
Users must run `create_sensitivity_plot_data` to generate the data that can later be used to create sensitivity charts. The example in this readme generates a dataframe with 1000 rows: it produces sensitivity curves for __10__ recommendations, each of which contains __2__ controllable variables, and by default it evaluates the objective at __50__ points per curve (10 × 2 × 50 = 1000).

This function generates sensitivity data by evaluating the objective at different values of the control tag, while keeping the other state and control values constant at those given by the observed optimization data and recommendations.

The generated dataframe has a simple structure: for a given (`run_id`, `timestamp`) combination, each row holds the value and tag id of the target (`target_value`, `target_tag_id`) and of the control tag being varied (`control_value`, `control_tag_id`).
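To make the procedure concrete, here is a minimal sketch of a one-control-at-a-time sweep. It is not the implementation of `create_sensitivity_plot_data`: `toy_objective`, `sweep_controls` and the input rows are hypothetical, and the grid (evenly spaced, upper bound excluded) is an assumption chosen only for illustration.

```python
import numpy as np
import pandas as pd


def toy_objective(row):
    """Hypothetical stand-in for the predictive model (any callable on a feature row)."""
    return 2.0 * row["mill_a_power"] - 0.5 * row["mill_b_power"] + row["load_a"]


def sweep_controls(rows, control_ranges, n_points, objective):
    """Vary one control at a time over a grid while holding every other value fixed."""
    records = []
    for row in rows:
        for control, (low, high) in control_ranges.items():
            # Assumption: an evenly spaced grid that excludes the upper bound.
            for value in np.linspace(low, high, n_points, endpoint=False):
                # Copy the row and overwrite only the control being varied.
                counterfactual = dict(row, **{control: float(value)})
                records.append(
                    {
                        "target_value": objective(counterfactual),
                        "control_value": float(value),
                        "control_tag_id": control,
                        "run_id": row["run_id"],
                    }
                )
    return pd.DataFrame.from_records(records)


# Ten made-up recommendation rows and two controls, mirroring the counts quoted above.
rows = [
    {"run_id": i, "mill_a_power": 100.0, "mill_b_power": 200.0, "load_a": 650.0 + i}
    for i in range(10)
]
control_ranges = {"mill_a_power": (0.0, 500.0), "mill_b_power": (0.0, 500.0)}

sweep_df = sweep_controls(rows, control_ranges, n_points=50, objective=toy_objective)
print(len(sweep_df))  # 10 recommendations x 2 controls x 50 points
```

Each (`run_id`, `control_tag_id`) pair contributes exactly `n_points` rows, which is where the row count quoted above comes from.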

In [102]:
from optimus_pkg.modular_pipelines.sensitivity.nodes import create_sensitivity_plot_data
sensitivity_output = create_sensitivity_plot_data(params=catalog.load('params:example_sensitivity'),
                                                  td=catalog.load('tag_dict'),
                                                  opt_df=catalog.load('reading_data'),
                                                  model=catalog.load('mock_model'),
                                                  recommendations=catalog.load('mock_recommendations'))

# Inspect output
sensitivity_output['sensitivity_plot_df']

Unnamed: 0,target_value,control_value,control_tag_id,target_tag_id,run_id,timestamp
0,-2.329794,0.0,mill_a_power,out_quantity,90,2020-01-17 01:00:00+00:00
1,-2.329794,10.0,mill_a_power,out_quantity,90,2020-01-17 01:00:00+00:00
2,-2.329794,20.0,mill_a_power,out_quantity,90,2020-01-17 01:00:00+00:00
3,-2.329794,30.0,mill_a_power,out_quantity,90,2020-01-17 01:00:00+00:00
4,-2.329794,40.0,mill_a_power,out_quantity,90,2020-01-17 01:00:00+00:00
...,...,...,...,...,...,...
995,566.114624,450.0,mill_b_power,out_quantity,99,2020-01-18 13:00:00+00:00
996,566.114624,460.0,mill_b_power,out_quantity,99,2020-01-18 13:00:00+00:00
997,566.114624,470.0,mill_b_power,out_quantity,99,2020-01-18 13:00:00+00:00
998,566.114624,480.0,mill_b_power,out_quantity,99,2020-01-18 13:00:00+00:00
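Since the output is in long format, a common first step before charting is to split it into one curve per recommendation and control. The sketch below uses a few made-up rows with the same column names as the output above; the values are illustrative and do not come from the pipeline.

```python
import pandas as pd

# A few made-up rows shaped like `sensitivity_plot_df` (values are illustrative only).
sensitivity_plot_df = pd.DataFrame(
    {
        "target_value": [10.0, 12.0, 11.0, 5.0, 5.0, 7.0],
        "control_value": [0.0, 10.0, 20.0, 0.0, 10.0, 20.0],
        "control_tag_id": ["mill_a_power"] * 3 + ["mill_b_power"] * 3,
        "target_tag_id": ["out_quantity"] * 6,
        "run_id": [90] * 6,
    }
)

# One curve per (run_id, control_tag_id): control values on the index, objective as values.
curves = {
    key: group.set_index("control_value")["target_value"].sort_index()
    for key, group in sensitivity_plot_df.groupby(["run_id", "control_tag_id"])
}

# Control value at which each curve reaches its highest objective.
best_control = {key: curve.idxmax() for key, curve in curves.items()}
```

Each entry in `curves` is a pandas Series, so it can be handed directly to whatever plotting library the project already uses.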


## Interactive App
Bundled with this modular pipeline is a `streamlit` application that data scientists can use to understand and query their predictive model. In particular, it allows users to sanity check the model by manipulating state and control variables directly. Users can generate a sensitivity curve (varying a chosen control variable while holding the other variables constant) and then check that the objective function captures the expected physical relationships.

Users are encouraged to leverage this application when working directly with subject matter experts, either internal or from the client. Once the dependency is installed (streamlit, via `pip install streamlit`), users can start the application by running:

```bash
streamlit run <path/to/streamlit_application.py>
```

## Set Up

Run all of the cells below to set up a dummy catalog and parameters file so that this README can be executed.

(This cell is hidden in built docs.)

In [1]:
# HIDDEN
%reload_kedro
import logging, sys
logging.disable(sys.maxsize)


2020-10-01 13:45:11,196 - root - INFO - ** Kedro project Project Clisham
2020-10-01 13:45:11,227 - root - INFO - Defined global variable `context` and `catalog`
2020-10-01 13:45:11,241 - root - INFO - Registered line magic `run_viz`


### Datasets

The following cells set up all datasets needed by the readme and add them to the Kedro catalog.

In [2]:
import numpy as np
import pandas as pd
from kedro.io import MemoryDataSet

# `context` is defined by the `%reload_kedro` magic in the first Set Up cell.
catalog = context.catalog

# Dataset
reading_data = pd.DataFrame(
    [
        ["2020-01-02 01:00:00+00:00",687,0.08,696,0.5188,1,225,150,312.0532,674.4625,1],
        ["2020-01-02 05:00:00+00:00",1008,0.0801,995,0.5485,1,150,100,311.8781,675.2811,1],
        ["2020-01-02 09:00:00+00:00",1022,0.0803,1001,0.5467,1,150,100,314.8943,679.4983,1],
        ["2020-01-02 13:00:00+00:00",919,0.0805,912,0.5671,1,100,110,312.1971,674.2039,1],
        ["2020-01-02 17:00:00+00:00",862,0.0804,862,0.5944,1,200,70,314.6743,672.8459,1],
        ["2020-01-02 21:00:00+00:00",970,0.0802,969,0.6146,1,250,90.1181252218157,307.657,682.2695,1],
        ["2020-01-03 01:00:00+00:00",974,0.0799,988,0.6178,1,175,90,306.5857,675.7084,1],
        ["2020-01-03 05:00:00+00:00",0,0.0798,0,0.6508,0,0,3.03164900590976E-14,327.9345,698.3604,0],
        ["2020-01-03 09:00:00+00:00",803,0.0805,800,0.657,1,0,90,310.9423,681.5325,0],
        ["2020-01-03 13:00:00+00:00",961,0.0802,979,0.6259,1,175,110,310.0354,674.3029,1],
        ["2020-01-03 17:00:00+00:00",658,0.0804,660,0.6501,1,200,150,313.3239,676.1373,1],
        ["2020-01-03 21:00:00+00:00",0,0.0813,0,0.6508,0,0,0,317.5846,707.0803,0],
        ["2020-01-04 01:00:00+00:00",650,0.0807,667,0.654,1,200,150,316.2123,672.7933,1],
        ["2020-01-04 05:00:00+00:00",686,0.0805,682,0.65,1,200.416666666666,140,313.1902,669.526,1],
        ["2020-01-04 09:00:00+00:00",532,0.0796,536,0.5673,1,250,150,320.9448,675.5863,1],
        ["2020-01-04 13:00:00+00:00",652,0.0795,649,0.5685,1,150,140,308.2203,674.5019,1],
        ["2020-01-04 17:00:00+00:00",0,0.0788,0,0.6508,0,0,0,338.9611,690.3064,0],
        ["2020-01-04 21:00:00+00:00",917,0.0792,918,0.6027,1,150,120,309.0135,680.4741,1],
        ["2020-01-05 01:00:00+00:00",714,0.0805,698,0.6244,1,0,80,308.7847,673.3415,0],
        ["2020-01-05 05:00:00+00:00",829,0.081,827,0.6226,1,300,119.999999999999,313.1086,672.6314,1],
        ["2020-01-05 09:00:00+00:00",936,0.0819,933,0.6421,1,150,90,307.0394,672.8082,1],
        ["2020-01-05 13:00:00+00:00",861,0.0821,865,0.6334,1,102.083333333333,90.5,310.1825,670.6424,1],
        ["2020-01-05 17:00:00+00:00",969,0.0824,961,0.6014,1,248.693877320413,110,309.7602,678.7276,1],
        ["2020-01-05 21:00:00+00:00",858,0.0822,871,0.61,1,200,70.4514585551491,310.7353,677.3865,1],
        ["2020-01-06 01:00:00+00:00",186,0.0819,185,0.6133,1,100,150,322.5813,677.7321,1],
        ["2020-01-06 05:00:00+00:00",909,0.0818,904,0.6478,1,225.416666666666,80.6666666666667,309.0367,678.4767,1],
        ["2020-01-06 09:00:00+00:00",725,0.0816,729,0.6609,1,249.999999999999,130,312.6725,678.0195,1],
        ["2020-01-06 13:00:00+00:00",908,0.081,918,0.6513,1,174.999999999999,100,316.2988,672.8651,1],
        ["2020-01-06 17:00:00+00:00",693,0.0812,692,0.6915,1,249.999999999999,130,313.826,669.6709,1],
        ["2020-01-06 21:00:00+00:00",831,0.0813,830,0.679,1,99.9999999999999,110,310.9738,671.7716,1],
        ["2020-01-07 01:00:00+00:00",508,0.0811,505,0.7231,1,149.999999999999,140,314.2215,680.3052,1],
        ["2020-01-07 05:00:00+00:00",982,0.0811,980,0.7086,1,199.999999999999,90.1181252218157,317.2909,671.0733,1],
        ["2020-01-07 09:00:00+00:00",528,0.0813,509,0.7175,1,299.999999999999,60.6181252218157,316.6879,681.8346,1],
        ["2020-01-07 13:00:00+00:00",141,0.0816,143,0.6899,1,299.999999999999,60,325.0202,669.8182,1],
        ["2020-01-07 17:00:00+00:00",746,0.0816,740,0.665,1,0,90.6181252218157,314.6654,674.2833,0],
        ["2020-01-07 21:00:00+00:00",856,0.0821,836,0.6829,1,249.999999999999,80,306.9396,678.6587,1],
        ["2020-01-08 01:00:00+00:00",865,0.082,858,0.6919,1,249.999999999999,120,311.0646,675.179,1],
        ["2020-01-08 05:00:00+00:00",0,0.082,0,0.6508,0,0,0,337.7422,686.6715,0],
        ["2020-01-08 09:00:00+00:00",873,0.0819,876,0.7345,1,224.999999999999,109.999999999999,308.5363,676.043,1],
        ["2020-01-08 13:00:00+00:00",588,0.0823,587,0.7716,1,249.999999999999,70,315.7485,675.3029,1],
        ["2020-01-08 17:00:00+00:00",450,0.082,447,0.7723,1,249.999999999999,140,321.2094,672.5368,1],
        ["2020-01-08 21:00:00+00:00",900,0.0817,900,0.7466,1,174.999999999999,90,312.7723,673.4908,1],
        ["2020-01-09 01:00:00+00:00",592,0.0823,589,0.7768,1,99.9999999999999,80,313.3557,676.8744,1],
        ["2020-01-09 05:00:00+00:00",605,0.0823,602,0.8284,1,200,140,313.4357,670.8157,1],
        ["2020-01-09 09:00:00+00:00",83,0.0819,79,0.8308,1,0,140,317.4701,676.8344,0],
        ["2020-01-09 13:00:00+00:00",844,0.0827,843,0.8124,1,200,119.999999999999,311.4123,673.174,1],
        ["2020-01-09 17:00:00+00:00",822,0.0826,821,0.8134,1,200.416666666666,120.118125221815,309.6565,675.2802,1],
        ["2020-01-09 21:00:00+00:00",592,0.0826,592,0.8113,1,225,149.118125221815,319.2571,675.2868,1],
        ["2020-01-10 01:00:00+00:00",553,0.0826,557,0.8108,1,225,150,317.0447,675.6524,1],
        ["2020-01-10 05:00:00+00:00",532,0.0826,535,0.8091,1,225,150,315.7294,672.3293,1],
        ["2020-01-10 09:00:00+00:00",801,0.0828,802,0.8365,1,250,89.9999999999999,311.1699,676.1043,1],
        ["2020-01-10 13:00:00+00:00",576,0.0831,578,0.805,1,225,60,314.5984,675.91,1],
        ["2020-01-10 17:00:00+00:00",859,0.0831,861,0.8043,1,250,100,309.9465,675.2635,1],
        ["2020-01-10 21:00:00+00:00",602,0.0835,603,0.7936,1,175,140,310.1047,678.958,1],
        ["2020-01-11 01:00:00+00:00",138,0.0843,137,0.8837,1,300,129.451458555149,327.8637,671.8796,1],
        ["2020-01-11 05:00:00+00:00",637,0.0849,647,0.8738,1,150,80,314.3259,674.7014,1],
        ["2020-01-11 09:00:00+00:00",672,0.086,672,0.8464,1,100,90.1181252218157,315.5031,673.696,1],
        ["2020-01-11 13:00:00+00:00",709,0.0864,699,0.8285,1,149.166666666666,119.333333333333,312.6024,674.983,1],
        ["2020-01-11 17:00:00+00:00",449,0.0867,445,0.836,1,150,60.6181252218157,314.7537,673.8767,1],
        ["2020-01-11 21:00:00+00:00",648,0.0873,672,0.826,1,0,110,316.8285,675.3172,0],
        ["2020-01-12 01:00:00+00:00",476,0.0875,478,0.8222,1,0,120,316.4472,676.5403,0],
        ["2020-01-12 05:00:00+00:00",787,0.0875,796,0.8304,1,150.360543987079,110,309.547,673.1431,1],
        ["2020-01-12 09:00:00+00:00",509,0.0883,514,0.8148,1,100,80.2847918884824,315.5023,677.5161,1],
        ["2020-01-12 13:00:00+00:00",68,0.089,69,0.823,1,300,60.6181252218157,321.1251,676.8117,1],
        ["2020-01-12 17:00:00+00:00",735,0.0894,737,0.8205,1,99.9999999999999,99.9514585551491,309.8675,675.2958,1],
        ["2020-01-12 21:00:00+00:00",69,0.089,72,0.8508,1,101.25,60,327.7026,672.4248,1],
        ["2020-01-13 01:00:00+00:00",49,0.0892,52,0.8217,1,300,60.6181252218157,323.7028,674.8497,1],
        ["2020-01-13 05:00:00+00:00",532,0.0894,530,0.8328,1,248.693877320413,130,320.9106,677.6884,1],
        ["2020-01-13 09:00:00+00:00",542,0.0895,545,0.8008,1,175,60,307.7974,677.1192,1],
        ["2020-01-13 13:00:00+00:00",54,0.0903,53,0.7687,1,300,140,328.6859,671.6881,1],
        ["2020-01-13 17:00:00+00:00",431,0.0913,431,0.7718,1,250,140,319.522,678.7672,1],
        ["2020-01-13 21:00:00+00:00",717,0.0925,707,0.7436,1,174.943877320413,70.4514585551491,315.5843,679.195,1],
        ["2020-01-14 01:00:00+00:00",913,0.0928,913,0.7238,1,150,109.784791888482,311.3711,677.2516,1],
        ["2020-01-14 05:00:00+00:00",402,0.0926,393,0.7219,1,297.860543987079,150,317.3057,676.5348,1],
        ["2020-01-14 09:00:00+00:00",595,0.0931,586,0.6666,1,150,150,312.3444,676.34,1],
        ["2020-01-14 13:00:00+00:00",490,0.0933,501,0.6652,1,150,150,314.4505,678.3199,1],
        ["2020-01-14 17:00:00+00:00",897,0.0931,902,0.6536,1,175,120,309.7077,680.2626,1],
        ["2020-01-14 21:00:00+00:00",949,0.0929,951,0.6581,1,250,110,312.9609,677.4517,1],
        ["2020-01-15 01:00:00+00:00",868,0.0917,866,0.6445,1,250,80,312.0066,674.2676,1],
        ["2020-01-15 05:00:00+00:00",627,0.0919,621,0.5736,1,300,70,312.5692,674.1235,1],
        ["2020-01-15 09:00:00+00:00",756,0.0932,755,0.5805,1,175,60,311.1696,673.588,1],
        ["2020-01-15 13:00:00+00:00",988,0.093,987,0.5546,1,250,110,311.9389,675.2769,1],
        ["2020-01-15 17:00:00+00:00",945,0.0929,941,0.5625,1,250,120,308.5106,678.2621,1],
        ["2020-01-15 21:00:00+00:00",1015,0.0936,1018,0.5149,1,250,110,308.3757,675.4695,1],
        ["2020-01-16 01:00:00+00:00",1017,0.0948,1037,0.515,1,150,100,306.6239,677.5196,1],
        ["2020-01-16 05:00:00+00:00",789,0.094,773,0.5035,1,225,60,306.8268,674.2809,1],
        ["2020-01-16 09:00:00+00:00",933,0.0939,932,0.5274,1,300,90,309.6046,681.5638,1],
        ["2020-01-16 13:00:00+00:00",883,0.0937,904,0.5372,1,298.333333333333,90.6666666666667,308.7355,672.3064,1],
        ["2020-01-16 17:00:00+00:00",1000,0.0941,1005,0.52,1,224.166666666666,119.166666666666,308.1909,679.8845,1],
        ["2020-01-16 21:00:00+00:00",376,0.0942,375,0.5421,1,0,60,315.1672,678.5088,0],
        ["2020-01-17 01:00:00+00:00",328,0.0943,330,0.5324,1,0,60,321.0558,674.1672,0],
        ["2020-01-17 05:00:00+00:00",795,0.0936,779,0.5097,1,225,60,310.7517,677.3963,1],
        ["2020-01-17 09:00:00+00:00",724,0.0938,721,0.4969,1,150,60,312.123,673.7257,1],
        ["2020-01-17 13:00:00+00:00",807,0.094,802,0.4652,1,100,70,311.8593,674.259,1],
        ["2020-01-17 17:00:00+00:00",1058,0.095,1060,0.3816,1,224.110543987079,79.9999999999999,310.4457,677.123,1],
        ["2020-01-17 21:00:00+00:00",898,0.0949,894,0.372,1,175,59.9999999999999,307.9332,675.768,1],
        ["2020-01-18 01:00:00+00:00",1079,0.0954,1079,0.3133,1,175,119.618125221815,308.2923,679.2404,1],
        ["2020-01-18 05:00:00+00:00",1053,0.0955,1047,0.3428,1,150,79.9999999999999,307.1569,675.3588,1],
        ["2020-01-18 09:00:00+00:00",993,0.096,994,0.3254,1,0,90,303.3249,676.3484,0],
        ["2020-01-18 13:00:00+00:00",853,0.0966,854,0.3359,1,300,60.6181252218157,312.9771,678.0764,1],

    ],
    columns = ["timestamp","in_quantity","ratio","out_quantity","perc_score","on_off_ind_a","mill_b_power","mill_a_power","load_b","load_a","on_off_ind_b"]
)
reading_data["timestamp"] = pd.to_datetime(reading_data["timestamp"])
reading_data = reading_data.reset_index().rename(columns={'index':'run_id'})
catalog.add("reading_data",
            MemoryDataSet(data=reading_data),
            replace = True
           )
catalog.load("reading_data")

Unnamed: 0,run_id,timestamp,in_quantity,ratio,out_quantity,perc_score,on_off_ind_a,mill_b_power,mill_a_power,load_b,load_a,on_off_ind_b
0,0,2020-01-02 01:00:00+00:00,687,0.0800,696,0.5188,1,225.0,150.000000,312.0532,674.4625,1
1,1,2020-01-02 05:00:00+00:00,1008,0.0801,995,0.5485,1,150.0,100.000000,311.8781,675.2811,1
2,2,2020-01-02 09:00:00+00:00,1022,0.0803,1001,0.5467,1,150.0,100.000000,314.8943,679.4983,1
3,3,2020-01-02 13:00:00+00:00,919,0.0805,912,0.5671,1,100.0,110.000000,312.1971,674.2039,1
4,4,2020-01-02 17:00:00+00:00,862,0.0804,862,0.5944,1,200.0,70.000000,314.6743,672.8459,1
...,...,...,...,...,...,...,...,...,...,...,...,...
95,95,2020-01-17 21:00:00+00:00,898,0.0949,894,0.3720,1,175.0,60.000000,307.9332,675.7680,1
96,96,2020-01-18 01:00:00+00:00,1079,0.0954,1079,0.3133,1,175.0,119.618125,308.2923,679.2404,1
97,97,2020-01-18 05:00:00+00:00,1053,0.0955,1047,0.3428,1,150.0,80.000000,307.1569,675.3588,1
98,98,2020-01-18 09:00:00+00:00,993,0.0960,994,0.3254,1,0.0,90.000000,303.3249,676.3484,0


### Tag Dictionary

This cell sets up the tag dictionary needed by the readme and adds it to the Kedro catalog.

In [20]:
from project_clisham.optimus_core.tag_management import TagDict

tags = pd.DataFrame(
    [
        ["perc_score","Percentage score","input","numeric","%",0,1,0,1, np.nan,np.nan, np.nan,np.nan,"TRUE",np.nan,],
        ["in_quantity","Input quantity","input","numeric","Units",0, np.nan,np.nan,np.nan, np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,],
        ["ratio","Ratio","input","numeric","%",0,1,0,1,  np.nan,np.nan,np.nan,np.nan,"TRUE",np.nan,],
        ["out_quantity","Output quantity","output","numeric","Units",0, np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,"TRUE",],
        ["load_a","Load A","state","numeric","Units",0,1000,0,1000, np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,],
        ["mill_a_power","Mill A power","control","numeric","Units",0,500,0,500, np.nan,np.nan,"on_off_ind_a",np.nan,"TRUE",np.nan,],
        ["load_b","Load B","state","numeric","Units",0,1000,0,1000, np.nan,np.nan,np.nan,np.nan,"TRUE",np.nan,],
        ["mill_b_power","Mill B power","control","numeric","Units",0,500,0,500, np.nan,np.nan,"on_off_ind_a, on_off_ind_b",np.nan,"TRUE",np.nan,],
        ["on_off_ind_a","On Off Tag A","on_off","boolean","on/off",np.nan, np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,"TRUE",np.nan,],
        ["on_off_ind_b","On Off Tag B","on_off","boolean","on/off",np.nan, np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,"TRUE","TRUE",np.nan,],

],
    columns = ["tag","name","tag_type","data_type","unit","range_min","range_max","op_min","op_max","max_delta",'constraint_set', "on_off_dependencies","derived","model_feature","target"],

)

catalog.add("tag_dict", 
            MemoryDataSet( data=TagDict(tags)),
            replace = True
           )
catalog.load("tag_dict").to_frame()


Unnamed: 0,tag,name,tag_type,data_type,unit,range_min,range_max,op_min,op_max,max_delta,constraint_set,on_off_dependencies,derived,model_feature,target
0,perc_score,Percentage score,input,numeric,%,0.0,1.0,0.0,1.0,,,,,True,
1,in_quantity,Input quantity,input,numeric,Units,0.0,,,,,,,,,
2,ratio,Ratio,input,numeric,%,0.0,1.0,0.0,1.0,,,,,True,
3,out_quantity,Output quantity,output,numeric,Units,0.0,,,,,,,,,True
4,load_a,Load A,state,numeric,Units,0.0,1000.0,0.0,1000.0,,,,,,
5,mill_a_power,Mill A power,control,numeric,Units,0.0,500.0,0.0,500.0,,,on_off_ind_a,,True,
6,load_b,Load B,state,numeric,Units,0.0,1000.0,0.0,1000.0,,,,,True,
7,mill_b_power,Mill B power,control,numeric,Units,0.0,500.0,0.0,500.0,,,"on_off_ind_a, on_off_ind_b",,True,
8,on_off_ind_a,On Off Tag A,on_off,boolean,on/off,,,,,,,,,True,
9,on_off_ind_b,On Off Tag B,on_off,boolean,on/off,,,,,,,,True,True,


### Parameters

This cell sets up all parameters needed by the readme, and adds them to the Kedro catalog.

In [21]:
import yaml
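# `yaml.safe_load` needs no explicit Loader argument, unlike `yaml.load` in recent PyYAML releases.
params = yaml.safe_load("""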
n_points: 50
unique_ids:
    - run_id
    - timestamp
objective_kwargs: {}
sensitivity_app_data_mapping:
    features: test_static_features
    model: train_trainset_model
    recs: test_recommendations
    sensitivity_data: test_sensitivity_plot_df
    timestamp_col: timestamp
""")
catalog.add_feed_dict({'params:example_sensitivity': params}, replace=True)

catalog.load("params:example_sensitivity")


{'n_points': 50,
 'unique_ids': ['run_id', 'timestamp'],
 'objective_kwargs': {},
 'sensitivity_app_data_mapping': {'features': 'test_static_features',
  'model': 'train_trainset_model',
  'recs': 'test_recommendations',
  'sensitivity_data': 'test_sensitivity_plot_df',
  'timestamp_col': 'timestamp'}}

### Model
This cell creates a mock predictive model, which serves as the objective function over which control sensitivities are calculated.

In [23]:
from xgboost import XGBRegressor as xgb
from project_clisham.optimus_core.transformers import SelectColumns
from sklearn.pipeline import Pipeline as SklearnPipeline

xgb_regressor = xgb(random_state=42,
           objective='reg:squarederror',
           n_estimators= 25,
           verbose=False,
           n_jobs=1)

td = catalog.load('tag_dict')
train_data = catalog.load('reading_data').iloc[:90,:]
test_data = catalog.load('reading_data').iloc[90:,:]
features = td.select('model_feature')
target = td.select('target')

model = SklearnPipeline([
    ("select_columns", SelectColumns(features)),
    ('regressor', xgb_regressor)
])
model.fit(X=train_data, y=train_data[target])

catalog.add("mock_model", 
            MemoryDataSet(data=model),
            replace = True
           )
catalog.add("mock_opt_data", 
            MemoryDataSet(data=test_data),
            replace = True
           )


### Optimization Recs
This creates a (small) set of recommendations that is needed for calculating sensitivities of objective values to control values.

In [28]:
from project_clisham.pipelines.optimization.recommendation.recommendation_nodes import bulk_optimize
opt_params = yaml.safe_load(
"""
datetime_col: 'timestamp'
model_features: ['model_feature']
opt_target: "TRUE"
solver:
    class: optimizer.solvers.DifferentialEvolutionSolver
    kwargs:
        sense: "maximize"
        seed: 0
        maxiter: 100
        mutation: [0.5, 1.0]
        recombination: 0.7
        strategy: "best1bin"
stopper:
    class: optimizer.stoppers.NoImprovementStopper
    kwargs:
        patience: 10
        sense: "maximize"
        min_delta: 0.1
n_jobs: 6
objective_kwargs: {}

""")
opt_output = bulk_optimize(opt_params, td, test_data, model, {})
recommendations = opt_output['recommendations']
recommended_controls = opt_output['recommended_controls']
projected_optimization = opt_output['projected_optimization']
catalog.add("mock_recommendations", 
            MemoryDataSet(data=recommendations),
            replace = True
           )

100%|██████████| 10/10 [00:00<00:00, 14.16it/s]
