# Basic Training Loop

This notebook shows the kind of code you would expect to see in the post-EDA modelling phase, where cyclical experimentation is the name of the game.

This experiment uses the iris dataset with a tree-based approach.

## Dependencies

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn import preprocessing
from sklearn.metrics import classification_report
from sklearn.metrics import accuracy_score
from mlflow import log_metric, log_param
from mlflow.sklearn import log_model
import mlflow
import os

## Training Loop Setup

In [2]:
url = "https://gist.githubusercontent.com/curran/a08a1080b88344b0c8a7/raw/639388c2cbc2120a14dcf466e85730eb8be498bb/iris.csv"
df = pd.read_csv(url)
df.sample(2)

     sepal_length  sepal_width  petal_length  petal_width    species
136           6.3          3.4           5.6          2.4  virginica
16            5.4          3.9           1.3          0.4     setosa


In [3]:
le = preprocessing.LabelEncoder()
df['species'] = le.fit_transform(df['species'])

In [4]:
y = df['species']
x = df[['sepal_length', 'petal_length']]
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=2)

In [9]:
MAX_DEPTH = 5

Read the remote tracking server URI from the environment and point MLflow at it.

In [14]:
REMOTE_MLFLOW_SERVER = os.environ['REMOTE_TRACKING_SERVER']
mlflow.set_tracking_uri(REMOTE_MLFLOW_SERVER)
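
The lookup above raises a `KeyError` when `REMOTE_TRACKING_SERVER` is not set. A minimal sketch of a more forgiving lookup follows, assuming a local fallback is acceptable for your setup. The helper name `resolve_tracking_uri` and the fallback value `file:./mlruns` are hypothetical choices for illustration, not part of the original notebook.

```python
import os

# Hypothetical fallback: a local file-based store. Adjust to your setup.
DEFAULT_TRACKING_URI = "file:./mlruns"


def resolve_tracking_uri(env=os.environ, default=DEFAULT_TRACKING_URI):
    """Return the tracking URI from the environment, or a default if unset or blank."""
    uri = env.get("REMOTE_TRACKING_SERVER", "").strip()
    return uri or default


tracking_uri = resolve_tracking_uri()
print(f"Using tracking URI: {tracking_uri}")
# The result would then be passed to mlflow.set_tracking_uri(tracking_uri).
```

Passing `env` as a parameter keeps the helper easy to check without touching the real environment. Whether a silent fallback is preferable to failing fast depends on how much you want to guard against runs being logged to the wrong place.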

In [15]:
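try:
    mlflow.create_experiment("iris_decision_tree")
except mlflow.exceptions.MlflowException:
    # create_experiment raises if an experiment with this name already exists.
    print('The experiment may already exist.')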

The experiment may already exist.


## Training Loop

In [16]:
mlflow.set_experiment("iris_decision_tree")

with mlflow.start_run(nested=True):

    log_param("MAX_DEPTH", MAX_DEPTH)

    clf = DecisionTreeClassifier(random_state=1, max_depth=MAX_DEPTH)
    clf.fit(x_train, y_train)

    y_pred = clf.predict(x_test)
    acc = accuracy_score(y_test, y_pred)

    log_metric("Accuracy", acc)
    log_model(clf, "Model")
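
The run above records a single `MAX_DEPTH`. Since the point of this stage is repeated experimentation, here is a rough sketch of sweeping several depths in one go. It makes two assumptions that differ from the notebook: scikit-learn's bundled iris data stands in for the CSV so the block runs offline, and the grid of depths is an arbitrary illustration. MLflow calls are left out so the block runs without a tracking server.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: the bundled iris data replaces the CSV loaded earlier.
iris = load_iris()
# Columns 0 and 2 are sepal length and petal length, the two features used above.
x = iris.data[:, [0, 2]]
y = iris.target

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.33, random_state=2
)

results = {}
for depth in (1, 2, 3, 5, 8):  # hypothetical grid of depths to try
    clf = DecisionTreeClassifier(random_state=1, max_depth=depth)
    clf.fit(x_train, y_train)
    results[depth] = accuracy_score(y_test, clf.predict(x_test))
    # In the notebook, each iteration would sit inside its own
    # `with mlflow.start_run():` block and call log_param / log_metric.

for depth, acc in results.items():
    print(f"max_depth={depth}: accuracy={acc:.2f}")
```

Logging each depth as a separate run keeps the parameter and its metric paired in the tracking UI, which makes the runs straightforward to compare side by side.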

In [13]:
print(classification_report(y_test, y_pred, target_names=le.classes_))

              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        20
  versicolor       0.83      0.94      0.88        16
   virginica       0.92      0.79      0.85        14

    accuracy                           0.92        50
   macro avg       0.92      0.91      0.91        50
weighted avg       0.92      0.92      0.92        50
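
To make the report's columns concrete, the sketch below computes precision, recall, F1, and support per class in plain Python. The function name `per_class_scores` and the toy labels are invented for illustration; they are not the notebook's predictions, and `classification_report` remains the tool to use in practice.

```python
def per_class_scores(y_true, y_pred):
    """Compute precision, recall, F1, and support for each label seen."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        # Support is the number of true samples of this class.
        scores[label] = {"precision": precision, "recall": recall,
                         "f1": f1, "support": tp + fn}
    return scores


# Toy labels, invented for illustration only.
toy_true = ["a", "a", "a", "b", "b", "c"]
toy_pred = ["a", "a", "b", "b", "b", "c"]

for label, s in per_class_scores(toy_true, toy_pred).items():
    print(label, s)
```

Read against the report above: precision is the share of predictions for a class that were right, recall is the share of that class's true samples that were found, and support is the number of true samples per class in the test split.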

