# Experiments

A large number of variables can change the behavior and outcome of a machine learning (ML) model. The choices a data scientist makes for these variables can mean the difference between the model's success and failure. These choices rarely generalize from one problem to the next, so each new problem requires revisiting the choice of algorithm and all the parameters associated with it.

The Cortex SDK provides a facility to track the choices a data scientist makes to improve the performance of an ML model. The main structure for tracking model performance is the `experiment`. An `experiment` is a container for `run`s. A `run` associates the parameters, metrics, and artifacts created in the process of identifying the best algorithm for modeling a skill.
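
To make the relationship between the two concrete, here is a minimal, hypothetical sketch of an experiment as a container of runs. The class names `SketchExperiment` and `SketchRun` and their fields are illustrative assumptions; this is not how the Cortex SDK implements them.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SketchRun:
    """Hypothetical record of one attempt: what was tried and how it scored."""
    params: Dict[str, Any] = field(default_factory=dict)
    metrics: Dict[str, float] = field(default_factory=dict)
    artifacts: Dict[str, Any] = field(default_factory=dict)


@dataclass
class SketchExperiment:
    """Hypothetical container that groups runs under one name."""
    name: str
    runs: List[SketchRun] = field(default_factory=list)

    def new_run(self) -> SketchRun:
        # Create a run and register it with this experiment.
        run = SketchRun()
        self.runs.append(run)
        return run


sketch_exp = SketchExperiment('example/sample-experiment')
first = sketch_exp.new_run()
first.params['criterion'] = 'gini'
first.metrics['score'] = 0.9  # placeholder value, not a measured result
```

The point of the structure is that every run stays attached to the experiment that produced it, so runs can later be listed and compared side by side.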

## Creating an Experiment

Experiments are created through the client:

In [None]:
from cortex import Cortex

client = Cortex.local()
exp = client.experiment('example/sample-experiment')
exp

Executing the cell above displays the experiment's runs as a table. The __ID__ is generated by the `run` and is a [cuid](https://github.com/ericelliott/cuid). The __Date__ is the time of the `run`, down to the second, in GMT. Each `run` is timed, and the __Took__ column displays its elapsed time. __Params__ and __Metrics__ are keyword arguments that you can use to configure a `run`. The table starts out empty and is populated as we create and execute `run`s.

## Experiments Depend on Data

Experiments are run on datasets. This example uses the [UCI Iris dataset](https://archive.ics.uci.edu/ml/datasets/Iris).

In [None]:
import pandas as pd

df = pd.read_csv('./data/iris.data')

We need to create a training set and a test set from this one data source. We'll use scikit-learn's `train_test_split` to do this:

In [None]:
from sklearn.model_selection import train_test_split

all_inputs = df[['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']].values
all_classes = df['Class'].values

(train_inputs, test_inputs, train_classes, test_classes) = train_test_split(all_inputs, all_classes, test_size=0.333, train_size=0.667)
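
Conceptually, a random split shuffles the row indices and cuts them at the requested proportion. The sketch below illustrates that idea with the standard library only. `simple_split` is a hypothetical helper, not scikit-learn's implementation, and it uses a fixed seed so that the split is repeatable.

```python
import random
from typing import List, Sequence, Tuple


def simple_split(rows: Sequence, test_fraction: float, seed: int = 0) -> Tuple[List, List]:
    """Shuffle row indices, then cut them into a training part and a test part."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_test = round(len(rows) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


rows = list(range(150))  # stand-in for 150 rows of data
train_rows, test_rows = simple_split(rows, test_fraction=0.333)
print(len(train_rows), len(test_rows))  # prints: 100 50
```

In scikit-learn, passing `random_state` to `train_test_split` serves the same purpose as the seed here: without it, each execution produces a different split, which makes scores from separate runs harder to compare.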

## Creating Runs

Two runs are created for this experiment using a [decision tree classifier](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html). The first uses [gini impurity](https://en.wikipedia.org/wiki/Decision_tree_learning#Gini_impurity) as the criterion for measuring the quality of a split:

In [None]:
from sklearn.tree import DecisionTreeClassifier

dtc_g = DecisionTreeClassifier(criterion='gini')

dtc_g_run = exp.start_run()

dtc_g_run.start()
dtc_g.fit(train_inputs, train_classes)
dtc_g_run.stop()
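
For intuition about what the `criterion` argument measures, the sketch below computes Gini impurity and entropy for a list of class labels. These are the textbook formulas written with the standard library. The function names are illustrative, and this is not scikit-learn's internal code.

```python
import math
from collections import Counter
from typing import Sequence


def gini_impurity(labels: Sequence) -> float:
    """1 - sum(p_k^2): chance that two random draws have different classes."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def entropy(labels: Sequence) -> float:
    """sum(-p_k * log2(p_k)): average bits needed to encode a class label."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(labels).values())


pure = ['setosa'] * 4
mixed = ['setosa', 'setosa', 'virginica', 'virginica']

print(gini_impurity(pure), gini_impurity(mixed))  # prints: 0.0 0.5
print(entropy(mixed))                             # prints: 1.0
```

Both measures are zero for a node that contains a single class and largest when the classes are evenly mixed. A decision tree picks the split that reduces the chosen measure the most.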

Now create a second run using [information gain](https://en.wikipedia.org/wiki/Information_gain_in_decision_trees) (specified by `criterion='entropy'`) as the split criterion. Here the run is used as a context manager, which handles starting and stopping the run and makes the code simpler.

In [None]:
dtc_e = DecisionTreeClassifier(criterion='entropy')

with exp.start_run() as run:
    dtc_e.fit(train_inputs, train_classes)

dtc_e_run = run
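
The pattern behind a context-managed run can be sketched in a few lines. `TimedRun` below is a hypothetical stand-in, not the Cortex `Run` class: entering the `with` block records a start time and leaving it records a stop time, even if the body raises an exception.

```python
import time
from typing import Optional


class TimedRun:
    """Hypothetical stand-in showing how a context manager brackets start and stop."""

    def __init__(self) -> None:
        self.started: Optional[float] = None
        self.stopped: Optional[float] = None

    def start(self) -> None:
        self.started = time.perf_counter()

    def stop(self) -> None:
        self.stopped = time.perf_counter()

    @property
    def took(self) -> float:
        """Elapsed seconds between start and stop."""
        return self.stopped - self.started

    def __enter__(self) -> "TimedRun":
        self.start()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.stop()
        return False  # do not swallow exceptions raised inside the block


with TimedRun() as timed:
    sum(range(100_000))  # stand-in for model training

print(f"took {timed.took:.6f}s")
```

Compared with calling `start()` and `stop()` by hand, the `with` form cannot forget the `stop()` call, which is why it is usually the safer choice around training code that may fail.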

## Run Logging

Runs carry parameters, metrics, metadata, and artifacts that can be used to track and manage experiment results. The cell below logs each of these for both runs:

In [None]:
dtc_g_run.set_meta('model', 'DecisionTreeClassifier')
dtc_g_run.log_param('criterion', 'gini')
dtc_g_run.log_artifact('model', dtc_g)
dtc_g_run.log_metric('score', dtc_g.score(test_inputs, test_classes))

dtc_e_run.set_meta('model', 'DecisionTreeClassifier')
dtc_e_run.log_param('criterion', 'entropy')
dtc_e_run.log_artifact('model', dtc_e)
dtc_e_run.log_metric('score', dtc_e.score(test_inputs, test_classes))
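
Once each run carries its parameters and metrics, choosing between runs is a matter of comparing a metric. The sketch below uses plain dictionaries as hypothetical run records. The `score` values are placeholders for illustration, not results from the cells above, and `best_run` is an illustrative helper rather than part of the Cortex SDK.

```python
from typing import Dict, List


def best_run(runs: List[Dict], metric: str) -> Dict:
    """Return the run record with the highest value for the given metric."""
    return max(runs, key=lambda run: run['metrics'][metric])


# Hypothetical records; the scores are placeholders, not measured results.
logged_runs = [
    {'params': {'criterion': 'gini'}, 'metrics': {'score': 0.90}},
    {'params': {'criterion': 'entropy'}, 'metrics': {'score': 0.92}},
]

winner = best_run(logged_runs, 'score')
print(winner['params'])
```

Logging the parameter next to the metric is what makes this comparison meaningful: the winning record shows which setting produced the score as well as the score itself.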

Display the experiment again to see the runs in the table:

In [None]:
exp

The runs can also be examined:

In [None]:
dtc_g_run.to_json()