# Basic Training Loop

This notebook shows the kind of code you would expect to see in the modelling stage that follows EDA, where cyclical experimentation is the name of the game.

This experiment runs on the iris dataset, using a logistic regression (LR) based approach.

This experiment uses one feature, rather than the two used in the tree example.

## Dependencies

In [11]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.metrics import classification_report
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from mlflow import log_metric, log_param
from mlflow.sklearn import log_model
import mlflow
import os

## Training Loop Setup

In [12]:
url = "https://gist.githubusercontent.com/curran/a08a1080b88344b0c8a7/raw/639388c2cbc2120a14dcf466e85730eb8be498bb/iris.csv"
df = pd.read_csv(url)
df.sample(2)

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species
135,7.7,3.0,6.1,2.3,virginica
18,5.7,3.8,1.7,0.3,setosa


In [13]:
le = preprocessing.LabelEncoder()
df['species'] = le.fit_transform(df['species'])
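
`LabelEncoder` replaces each species name with an integer. As far as scikit-learn's documented behaviour goes, the classes are sorted and numbered from 0, which is why `le.classes_` can be used later to recover the names. A minimal pure-Python sketch of that mapping follows; `encode_labels` is a hypothetical helper for illustration, not part of scikit-learn.

```python
def encode_labels(labels):
    """Map each distinct label to an integer, in sorted order.

    Illustrative stand-in for LabelEncoder.fit_transform (hypothetical helper).
    """
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return [index[name] for name in labels], classes


codes, classes = encode_labels(["virginica", "setosa", "versicolor", "setosa"])
print(codes)    # [2, 0, 1, 0]
print(classes)  # ['setosa', 'versicolor', 'virginica']
```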

In [14]:
y = df['species']
x = df[['sepal_length']]
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=2)
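
With `test_size=0.33`, the number of rows in each split depends on how the fraction is rounded. The sketch below assumes that scikit-learn rounds the test size up and gives the remainder to training, and that the iris file has 150 rows. `split_sizes` is a hypothetical helper that only does the arithmetic.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size (hypothetical helper)."""
    # Assumption: the test share is rounded up, the rest goes to training.
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test


n_train, n_test = split_sizes(150, 0.33)
print(n_train, n_test)  # 100 50
```

Under those assumptions the test set has 50 rows, which is consistent with the total support of 50 in the classification report at the end of the notebook.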

In [15]:
MAX_ITER = 100

Read the remote tracking server URI from the environment and point MLflow at it.

In [16]:
REMOTE_MLFLOW_SERVER = os.environ['REMOTE_TRACKING_SERVER']
mlflow.set_tracking_uri(REMOTE_MLFLOW_SERVER)
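
`os.environ['REMOTE_TRACKING_SERVER']` raises a `KeyError` if the variable is not set. One possible way to handle that is to fall back to a local path, sketched below. The helper name `resolve_tracking_uri` and the `./mlruns` default are assumptions for illustration, not part of the original notebook; the returned value would then be passed to `mlflow.set_tracking_uri`.

```python
import os


def resolve_tracking_uri(default="./mlruns"):
    """Return REMOTE_TRACKING_SERVER if set, otherwise a local fallback.

    Hypothetical helper; the default path is an assumption.
    """
    return os.environ.get("REMOTE_TRACKING_SERVER", default)


tracking_uri = resolve_tracking_uri()
print(tracking_uri)
```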

In [17]:
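try:
    mlflow.create_experiment("iris_lr")
except mlflow.exceptions.MlflowException:
    # Catch only MLflow errors (e.g. the name is already taken) rather than everything.
    print('The experiment may already exist.')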

The experiment may already exist.


## Training Loop

In [18]:
mlflow.set_experiment("iris_lr")

with mlflow.start_run(nested=True):

    log_param("MAX_ITER", MAX_ITER)
    
    clf = LogisticRegression(max_iter=MAX_ITER)
    clf.fit(x_train, y_train)

    y_pred = clf.predict(x_test)
    acc = accuracy_score(y_test, y_pred)

    log_metric("Accuracy", acc)
    log_model(clf, "Model")

In [29]:
print(classification_report(y_test, y_pred, target_names=le.classes_))

              precision    recall  f1-score   support

      setosa       1.00      0.95      0.97        20
  versicolor       0.69      0.56      0.62        16
   virginica       0.61      0.79      0.69        14

    accuracy                           0.78        50
   macro avg       0.77      0.77      0.76        50
weighted avg       0.79      0.78      0.78        50
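
The per-class figures in the report are derived from counts of true positives, false positives and false negatives. A minimal sketch of those formulas is shown below on a small made-up label list (illustrative data, not the notebook's predictions). `class_metrics` is a hypothetical helper, not a scikit-learn function.

```python
def class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1) for one class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # predicted label, was label
    fp = sum(t != label and p == label for t, p in pairs)  # predicted label, was not
    fn = sum(t == label and p != label for t, p in pairs)  # missed the label
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up labels for illustration only.
y_true_demo = [0, 0, 1, 1, 2, 2]
y_pred_demo = [0, 0, 1, 2, 2, 2]

precision, recall, f1 = class_metrics(y_true_demo, y_pred_demo, label=2)
print(round(precision, 2), round(recall, 2), round(f1, 2))  # 0.67 1.0 0.8
```

Here class 2 is predicted three times but is correct only twice, so precision is 2/3, while both true class 2 rows are found, so recall is 1.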

