<a href="https://colab.research.google.com/github/seldonian-toolkit/Tutorials/blob/main/tutorial_c_running_the_seldonian_engine.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Table of contents
> Introduction 

> Outline

> An example Seldonian machine learning problem

> Install Seldonian Engine library

> Running the Seldonian Engine

> Extracting important quantities

> Summary



<a name="introduction"></a>
## Introduction
The Seldonian Engine library is one of the components of the Seldonian Toolkit: it is the core library that implements a basic Seldonian algorithm. The Experiments library, another component of the toolkit, runs many trials of a Seldonian algorithm and calls the engine many times in doing so. Because the Experiments library depends on the Engine library, but not vice versa, we present the Engine first in these tutorials.

Once you are more familiar with these libraries and with Seldonian algorithms in general, you will find that the typical workflow starts with the Experiments library. After a Seldonian model has been vetted there, you run the engine a single time to obtain a safe or fair model. This is analogous to the development/deployment process: the Experiments library is used for development, and the Engine library is used when it is time to deploy the model.


<a name="outline"></a>
## Outline
In this tutorial, you will learn how to:

* Use the engine to set up a (quasi)-Seldonian machine learning algorithm (QSA).
* Run the algorithm using the engine and understand its output.

Note that due to the choice of confidence-bound method used in this tutorial (Student's $t$-test), the algorithms in this tutorial are technically quasi-Seldonian algorithms (QSAs). See <a href="https://seldonian.cs.umass.edu/Tutorials/overview/">the overview</a> for more details.







<a name="example"></a>
## An example Seldonian machine learning problem
Consider a simple supervised regression problem with two continuous random variables $X$ and $Y$. Let the goal be to predict the label $Y$ using the single feature $X$. One approach to this problem is to use gradient descent on a linear regression model with the mean squared error (MSE) as the objective function. Recall that the mean squared error of predictions $\hat Y$ is the expected squared difference between the actual value of $Y$ and the prediction $\hat Y$, i.e., $\mathbf{E}[(Y-\hat Y)^2]$. We can approximate an optimal solution by minimizing the objective function with respect to the weights of the model, $\theta$, which in this case are just the intercept and slope of the line.
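
As a rough illustration of this unconstrained approach, the sketch below runs plain gradient descent on the sample MSE of a line using only the Python standard library. The data-generating process ($X \sim N(0,1)$, $Y = X + N(0,1)$), the learning rate, and the iteration count are assumptions chosen for illustration, and the helper names `mse` and `mse_gradient` are hypothetical; this is not how the engine performs its optimization.

```python
import random

random.seed(0)

# Assumed synthetic data for illustration: X ~ N(0,1), Y = X + N(0,1)
n = 1000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [x + random.gauss(0.0, 1.0) for x in xs]


def mse(theta, xs, ys):
    """Sample mean squared error of the line y_hat = theta[0] + theta[1] * x."""
    intercept, slope = theta
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_gradient(theta, xs, ys):
    """Gradient of the sample MSE with respect to (intercept, slope)."""
    intercept, slope = theta
    residuals = [(intercept + slope * x) - y for x, y in zip(xs, ys)]
    d_intercept = 2.0 * sum(residuals) / len(xs)
    d_slope = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / len(xs)
    return d_intercept, d_slope


theta = (0.0, 0.0)    # start from intercept = 0, slope = 0
learning_rate = 0.1   # assumed step size
for _ in range(200):
    g_intercept, g_slope = mse_gradient(theta, xs, ys)
    theta = (theta[0] - learning_rate * g_intercept,
             theta[1] - learning_rate * g_slope)

print("intercept, slope:", theta)
print("training MSE:", mse(theta, xs, ys))
```

With data generated this way, the fitted slope should land near 1 and the intercept near 0, since that is the line that generated the labels.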

Now, let's suppose we want to add the following two constraints into the problem:


1.  Ensure that the MSE is less than or equal to $2.0$ with a probability of at least $0.9$. 
2. Ensure that the MSE is <i>greater than or equal to</i> $1.25$ with a probability of at least $0.9$.

Notice that this second constraint conflicts with the primary objective of minimizing the MSE. Though this particular constraint is contrived, it models the common setting of interest wherein safety and fairness constraints conflict with the primary objective.

This problem can now be fully formulated as a Seldonian machine learning problem:

Minimize the MSE, subject to the constraints:

*  $g_{1}: \mathrm{Mean\_Squared\_Error} \leq 2.0$, and ${\delta}_1=0.1$.  
*  $g_{2}: \mathrm{Mean\_Squared\_Error} \geq 1.25$, and ${\delta}_2=0.1$.

First, notice that the values of ${\delta}_1$ and ${\delta}_2$ are both $0.1$. This is because constraints are enforced with a probability of at least $1-{\delta}$, and we stated that the constraints should be enforced with a probability of at least $0.9$. The Seldonian algorithm will attempt to satisfy both of these constraints simultaneously, while also minimizing the primary objective, the MSE. If it cannot find a solution that satisfies the constraints at the confidence levels provided, it will return "NSF", i.e., "No Solution Found". 

Next, notice that here the MSE is <i>not</i> just the average squared error on the available training data. These constraints are much stronger: they are constraints on the MSE when the learned model is applied to <i>new data</i>. This is important because we don't just want machine learning models that appear to be safe or fair on the training data. We want machine learning models that are safe or fair when used to make decisions or predictions in the future.



<a name="install"></a>
## Install Seldonian Engine library

In [None]:
# first check Python version. Needs to be Python >= 3.8
!python --version

Python 3.8.16


In [None]:
!pip install seldonian-engine==0.7.7

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting seldonian-engine==0.7.7
  Downloading seldonian_engine-0.7.7-py3-none-any.whl (116 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 116.1/116.1 KB 8.6 MB/s eta 0:00:00
Installing collected packages: seldonian-engine
  Attempting uninstall: seldonian-engine
    Found existing installation: seldonian-engine 0.7.6
    Uninstalling seldonian-engine-0.7.6:
      Successfully uninstalled seldonian-engine-0.7.6
Successfully installed seldonian-engine-0.7.7


<a name="running_the_engine"></a>
## Running the Seldonian Engine


To code this example using the engine, we need to follow these steps.


1.  Define the data — we will generate some synthetic data for X and Y for this case.
2.  Create parse trees from the behavioral constraints.
3.  Define the underlying machine learning model. 
4.  Create a spec object containing all of this information and some hyperparameters — we can ignore many of these in this example. For a full list of parameters and their default values, see the API docs for <a href="https://seldonian-toolkit.github.io/Engine/build/html/_autosummary/seldonian.spec.SupervisedSpec.html#seldonian.spec.SupervisedSpec">SupervisedSpec</a>.
5. Run the Seldonian algorithm using the spec object. 

Let's write out the code to do this. Each step above is enumerated in comments in the code below. We will make heavy use of helper functions with many hidden defaults. In the tutorials that follow, we will explore how to customize running the engine.

In [None]:
# Imports first
import autograd.numpy as np   # Thinly-wrapped version of Numpy
from seldonian.models.models import LinearRegressionModel
from seldonian.spec import SupervisedSpec
from seldonian.seldonian_algorithm import SeldonianAlgorithm
from seldonian.utils.tutorial_utils import (
    make_synthetic_regression_dataset)
from seldonian.parse_tree.parse_tree import (
    make_parse_trees_from_constraints)

In [None]:
np.random.seed(0)
num_points=1000  
# 1. Define the data - X ~ N(0,1), Y ~ X + N(0,1)
dataset = make_synthetic_regression_dataset(
    num_points=num_points)

In [None]:
# 2. Create parse trees from the behavioral constraints 
# constraint strings:
constraint_strs = ['Mean_Squared_Error >= 1.25','Mean_Squared_Error <= 2.0']
# deltas (each constraint is enforced with probability >= 1 - delta):
deltas = [0.1,0.1] 

parse_trees = make_parse_trees_from_constraints(
    constraint_strs,deltas)

In [None]:
# 3. Define the underlying machine learning model
model = LinearRegressionModel()

In [None]:
# 4. Create a spec object, using some
# hidden defaults we won't worry about here
spec = SupervisedSpec(
    dataset=dataset,
    model=model,
    parse_trees=parse_trees,
    sub_regime='regression',
)

In [None]:
# 5. Run seldonian algorithm using the spec object
SA = SeldonianAlgorithm(spec)
passed_safety,solution = SA.run()

Safety dataset has 600 datapoints
Candidate dataset has 400 datapoints
Have 200 epochs and 1 batches of size 400 for a total of 200 iterations
Epoch: 0, batch iteration 0
Epoch: 1, batch iteration 0
Epoch: 2, batch iteration 0
Epoch: 3, batch iteration 0
Epoch: 4, batch iteration 0
Epoch: 5, batch iteration 0
Epoch: 6, batch iteration 0
Epoch: 7, batch iteration 0
Epoch: 8, batch iteration 0
Epoch: 9, batch iteration 0
Epoch: 10, batch iteration 0
Epoch: 11, batch iteration 0
Epoch: 12, batch iteration 0
Epoch: 13, batch iteration 0
Epoch: 14, batch iteration 0
Epoch: 15, batch iteration 0
Epoch: 16, batch iteration 0
Epoch: 17, batch iteration 0
Epoch: 18, batch iteration 0
Epoch: 19, batch iteration 0
Epoch: 20, batch iteration 0
Epoch: 21, batch iteration 0
Epoch: 22, batch iteration 0
Epoch: 23, batch iteration 0
Epoch: 24, batch iteration 0
Epoch: 25, batch iteration 0
Epoch: 26, batch iteration 0
Epoch: 27, batch iteration 0
Epoch: 28, batch iteration 0
Epoch: 29, batch iteration

In [None]:
print(passed_safety,solution)

True [0.16911355 0.1738146 ]


The output shows some of the default values that were hidden in the script. For example, we are running gradient descent in "batch" mode, i.e., putting all of our candidate data (400 data points) in at once and running for 200 epochs. These settings can be changed, but we won't cover that in this tutorial.
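
The numbers in that log can be reproduced with a little arithmetic. The sketch below assumes a 60/40 safety/candidate split and a batch size equal to the full candidate set, both inferred from the printed output rather than taken from the engine's source; the variable names are hypothetical.

```python
import math

num_points = 1000        # total data points, as in the dataset above
safety_fraction = 0.6    # assumed from "Safety dataset has 600 datapoints"
n_epochs = 200           # from "Have 200 epochs ..."

n_safety = round(num_points * safety_fraction)
n_candidate = num_points - n_safety

# "Batch" mode: one batch containing all of the candidate data
batch_size = n_candidate
n_batches = math.ceil(n_candidate / batch_size)
total_iterations = n_epochs * n_batches

print(f"Safety: {n_safety}, candidate: {n_candidate}")
print(f"{n_epochs} epochs x {n_batches} batch(es) = {total_iterations} iterations")
```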

Notice that `SA.run()` returns two values. `passed_safety` is a Boolean indicating whether the candidate solution found during candidate selection passed the safety test. If `passed_safety==False`, then `solution="NSF"`, i.e., "No Solution Found". If `passed_safety==True`, then `solution` is the array of model weights that resulted in the safety test passing. In this example, we got `passed_safety=True` and a solution of `[0.16911355 0.1738146]`, which are the intercept and slope of the line that candidate selection found. 
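
Because `solution` is either the string `"NSF"` or an array of weights, downstream code should check `passed_safety` before using it. A minimal sketch of that check follows; `describe_result` is a hypothetical helper written for this example and is not part of the engine.

```python
def describe_result(passed_safety, solution):
    """Summarize the pair returned by SA.run() for a one-feature linear model."""
    if not passed_safety:
        # In this case solution is the string "NSF"; there are no weights to unpack
        return "No Solution Found"
    intercept, slope = solution
    return f"y_hat = {intercept:.3f} + {slope:.3f} * x"


# The values printed above
print(describe_result(True, [0.16911355, 0.1738146]))
# The failure case
print(describe_result(False, "NSF"))
```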

<a name="extracting"></a>
## Extracting important quantities
There are a few quantities of interest that are not automatically returned by `SA.run()`. One such quantity is the value of the primary objective function (the MSE) evaluated on the safety data for the model weights returned by the algorithm, $\hat{f}(\theta_{\text{cand}},D_{\text{safety}})$. Given that the solution passed the safety test, we expect $\hat{f}(\theta_{\text{cand}},D_{\text{safety}})$ to lie between $1.25$ and $2.0$ (and the actual MSE on future data will be in this range with high probability). The <code class="highlight">SA</code> object provides the introspection we need to extract this information through the <a href="https://seldonian-toolkit.github.io/Engine/build/html/_autosummary/seldonian.seldonian_algorithm.SeldonianAlgorithm.html#seldonian.seldonian_algorithm.SeldonianAlgorithm.evaluate_primary_objective">SA.evaluate_primary_objective()</a> method:

In [None]:
st_primary_objective = SA.evaluate_primary_objective(
    theta=solution,
    branch='safety_test')
print(st_primary_objective)

1.6118814175141167


This is indeed between $1.25$ and $2.0$, which is consistent with the behavioral constraints. We can use the same method to check the value of the primary objective function evaluated on the candidate data at this solution:

In [None]:
cs_primary_objective = SA.evaluate_primary_objective(
    theta=solution,
    branch='candidate_selection')
print(cs_primary_objective)

1.5566336763115234


While we know in this case that the safety test passed, i.e., the high-confidence upper bounds on the constraints were less than or equal to zero, we might be interested in what the actual values of those upper bounds were during the safety test. We can use the <a href="https://seldonian-toolkit.github.io/Engine/build/html/_autosummary/seldonian.seldonian_algorithm.SeldonianAlgorithm.html#seldonian.seldonian_algorithm.SeldonianAlgorithm.get_st_upper_bounds">SA.get_st_upper_bounds()</a> method for this.

In [None]:
print(SA.get_st_upper_bounds())

{'1.25-(Mean_Squared_Error)': -0.2448558988476761, 'Mean_Squared_Error-(2.0)': -0.2710930638194431}


This returns a dictionary where the keys are the constraint strings and the values are the upper bounds. The values you see should be close to the values above, but may differ slightly. Here are some things to note about this dictionary:


*   Both upper bounds are less than or equal to zero, as expected. 
*   The keys of this dictionary show the constraint strings in a slightly different form than how we originally defined them. They are written in the form: $g_i \leq 0$, where $g_i$ here represents the $i$th constraint function. For example, $1.25-(\mathrm{Mean\_Squared\_Error})\leq0$ is mathematically equivalent to $\mathrm{Mean\_Squared\_Error} \geq 1.25$, the form we used to specify our second constraint at the beginning of the tutorial. This rearrangement is done for consistency in interpreting the upper bounds.
*   Because this information is returned in a dictionary, look up each bound by its key rather than by position: the order of the entries is not guaranteed to be the same as the order in which we specified our constraints originally.
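
To make the idea of a high-confidence upper bound on $g_i$ concrete, here is a minimal standard-library sketch. It is not the engine's implementation. It assumes freshly generated data ($X \sim N(0,1)$, $Y = X + N(0,1)$) in place of the actual safety set, and it uses a normal quantile as a large-sample stand-in for the Student's $t$ quantile, so its numbers will differ from the ones printed above. The helper `upper_bound` is hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(0)

# Assumed stand-in for the safety data: X ~ N(0,1), Y = X + N(0,1)
n_safety = 600
xs = [random.gauss(0.0, 1.0) for _ in range(n_safety)]
ys = [x + random.gauss(0.0, 1.0) for x in xs]

theta = (0.16911355, 0.1738146)  # intercept and slope reported earlier
delta = 0.1

# Per-point squared errors of the line y_hat = intercept + slope * x
squared_errors = [(y - (theta[0] + theta[1] * x)) ** 2 for x, y in zip(xs, ys)]


def upper_bound(samples, delta):
    """One-sided (1 - delta) upper confidence bound on the mean of samples.

    Normal approximation: mean + z * s / sqrt(n), where z is the
    (1 - delta) standard normal quantile (assumed in place of Student's t).
    """
    z = NormalDist().inv_cdf(1.0 - delta)
    return mean(samples) + z * stdev(samples) / math.sqrt(len(samples))


# Each constraint rewritten as g <= 0 and evaluated per data point
g_samples = {
    "1.25-(Mean_Squared_Error)": [1.25 - se for se in squared_errors],
    "Mean_Squared_Error-(2.0)": [se - 2.0 for se in squared_errors],
}
upper_bounds = {key: upper_bound(vals, delta) for key, vals in g_samples.items()}

# A safety test of this kind passes only if every upper bound is <= 0
passed = all(ub <= 0.0 for ub in upper_bounds.values())
print(upper_bounds, passed)
```

Each bound is looked up by its constraint string, which mirrors how the dictionary returned by `SA.get_st_upper_bounds()` is keyed.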

Further introspection of the `SA` object is possible, but it is beyond the scope of this tutorial.

<a name="summary"></a>
## Summary
In this tutorial, we demonstrated how to:

*  Use the engine to set up a Seldonian machine learning algorithm.
*  Run the algorithm using the engine.
*  Extract and understand important quantities generated by the engine.