# MLOps Example with Iris Dataset

The **api.ifood.mlops** package is the Python SDK for interacting with the apps of the ML platform: sandbox, database, pipeline and serving. It lets data scientists develop and deploy ML models through experiments.

## 1. Setup

Let's start by importing the packages we need, including **api.ifood.mlops**.

In [1]:
import json

import requests
import pandas as pd
from sklearn import linear_model, metrics, model_selection, svm

from api.ifood import mlops

The first thing to do is to create a project! Let's create one called **iris** using the *create_project* method.

In [2]:
project_id = mlops.create_project(name='iris')

print(project_id)

1


You can easily get all projects using the *get_projects* method.

In [3]:
projects = mlops.get_projects()

for project in projects:
    for k, v in project.items():
        print(f"{k}: {v}")
    print("")

id: 1
name: iris
created_at: 2021-04-27 22:19:43
updated_at: 2021-04-27 22:19:43
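
Once several projects exist, you will usually want the id of one of them by name. Here is a minimal sketch, assuming *get_projects* returns a list of dicts shaped like the output above. The `find_project_id` helper is hypothetical and not part of the SDK.

```python
# Records shaped like the get_projects() output printed above.
projects = [
    {
        "id": 1,
        "name": "iris",
        "created_at": "2021-04-27 22:19:43",
        "updated_at": "2021-04-27 22:19:43",
    },
]


def find_project_id(projects, name):
    """Return the id of the first project called `name`, or None if absent."""
    return next((p["id"] for p in projects if p["name"] == name), None)


iris_id = find_project_id(projects, "iris")
missing_id = find_project_id(projects, "wine")  # no such project in the list
print(iris_id, missing_id)
```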



## 2. Experiment

Models can be trained and tested through experiments. Let's load the **iris** dataset from the simulated feature store and tweak its data a little.

In [4]:
dataset = pd.read_csv('./feature-store/iris.csv')

In [5]:
dataset["target"] = dataset["class"].astype("category").cat.codes
dataset.drop("class", axis=1, inplace=True)
dataset.head()

Unnamed: 0,sepal-length,sepal-width,petal-length,petal-width,target
0,5.1,3.5,1.4,0.2,0
1,4.9,3.0,1.4,0.2,0
2,4.7,3.2,1.3,0.2,0
3,4.6,3.1,1.5,0.2,0
4,5.0,3.6,1.4,0.2,0
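
To see what the `cat.codes` conversion does, here is a small self-contained sketch. It assumes the `class` column holds the species names used later in this notebook (Iris-setosa, Iris-versicolor, Iris-virginica); the toy series is made up for illustration.

```python
import pandas as pd

# Toy labels in arbitrary order (not the real dataset).
classes = pd.Series(
    ["Iris-virginica", "Iris-setosa", "Iris-versicolor", "Iris-setosa"]
)
as_category = classes.astype("category")

# Categories are sorted lexically, and each code is an index into that list.
code_to_class = dict(enumerate(as_category.cat.categories))
codes = as_category.cat.codes.tolist()

print(code_to_class)
print(codes)
```

Because the categories are sorted, the integer assigned to a class does not depend on the order in which rows appear in the CSV.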


Let's then divide the dataset into train and test datasets.

In [6]:
train, test = model_selection.train_test_split(dataset, stratify=dataset["target"], random_state=42)
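
The `stratify` argument keeps the class proportions the same in both splits. The sketch below illustrates this on scikit-learn's bundled copy of iris, which is an assumption: it stands in for the feature-store CSV and may not match it exactly.

```python
from sklearn import datasets, model_selection

# Bundled iris data: four feature columns plus an integer "target" column.
frame = datasets.load_iris(as_frame=True).frame

train, test = model_selection.train_test_split(
    frame, stratify=frame["target"], random_state=42
)

# With the default test_size of 0.25, 150 rows split into 112 and 38.
print(len(train), len(test))

# Per-class row counts in the test split; stratification keeps them balanced.
counts = test["target"].value_counts()
print(counts.sort_index().tolist())
```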

Now it's time to create our first experiments using the *run_experiment* method. The method requires the following:

 - **project_id**: the id of the project we have just created;
 - **engine**: the Python ML engine used to develop the model (currently only sklearn is available);
 - **model**: the engine model object, initialized with its hyperparameters;
 - **metrics**: a dict mapping metric names to the engine's metric functions;
 - **target_col**: the target column on the dataset;
 - **train_data**: the train dataset;
 - **test_data**: the test dataset.
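
To make the role of each argument concrete, here is a rough local equivalent of what an experiment computes: fit the model on the train data, then apply every metric to the test data. This is only a sketch under assumptions. The `run_local_experiment` function is hypothetical, it is not how the pipeline app is implemented, and it uses scikit-learn's bundled iris data instead of the feature store.

```python
from sklearn import datasets, linear_model, metrics, model_selection


def run_local_experiment(model, metric_fns, target_col, train_data, test_data):
    """Hypothetical local stand-in: fit on train, score each metric on test."""
    model.fit(train_data.drop(columns=target_col), train_data[target_col])
    predicted = model.predict(test_data.drop(columns=target_col))
    return {
        name: fn(test_data[target_col], predicted)
        for name, fn in metric_fns.items()
    }


frame = datasets.load_iris(as_frame=True).frame
train, test = model_selection.train_test_split(
    frame, stratify=frame["target"], random_state=42
)

scores = run_local_experiment(
    # max_iter is raised here only to avoid convergence warnings.
    model=linear_model.LogisticRegression(max_iter=1000),
    metric_fns=dict(accuracy=metrics.accuracy_score),
    target_col="target",
    train_data=train,
    test_data=test,
)
print(scores)
```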

We will create two experiments to predict the iris class: a **Logistic Regression** model and a **Perceptron** model. Both will be assessed by their classification accuracy.

In [7]:
lr_experiment_id = mlops.run_experiment(
    project_id=project_id,
    engine='sklearn',
    model=linear_model.LogisticRegression(),
    metrics=dict(accuracy=metrics.accuracy_score),
    target_col='target',
    train_data=train,
    test_data=test
)

print(lr_experiment_id)

1


In [8]:
nn_experiment_id = mlops.run_experiment(
    project_id=project_id,
    engine='sklearn',
    model=linear_model.Perceptron(),
    metrics=dict(accuracy=metrics.accuracy_score),
    target_col='target',
    train_data=train,
    test_data=test
)

print(nn_experiment_id)

2


After *run_experiment* is called, the **api.ifood.mlops** package submits the request to the pipeline app. You can check the experiment progress directly on the pipeline [dashboard](http://localhost:8080/) or with the *get_experiment* or *get_experiments* method. Run the cell below repeatedly until both experiments have the status 'finished', then continue.

In [9]:
experiments = mlops.get_experiments()

for experiment in experiments:
    for k, v in experiment.items():
        print(f"{k}: {v}")
    print("")

id: 1
project_id: 1
engine: sklearn
hyperparams: b'{"C": 1.0, "class_weight": null, "dual": false, "fit_intercept": true, "intercept_scaling": 1, "l1_ratio": null, "max_iter": 100, "multi_class": "auto", "n_jobs": null, "penalty": "l2", "random_state": null, "solver": "lbfgs", "tol": 0.0001, "verbose": 0, "warm_start": false}'
metrics: b'{"accuracy": 0.9473684210526315}'
status: finished
created_at: 2021-04-27 22:19:53
updated_at: 2021-04-27 22:21:20

id: 2
project_id: 1
engine: sklearn
hyperparams: b'{"alpha": 0.0001, "class_weight": null, "early_stopping": false, "eta0": 1.0, "fit_intercept": true, "l1_ratio": 0.15, "max_iter": 1000, "n_iter_no_change": 5, "n_jobs": null, "penalty": null, "random_state": 0, "shuffle": true, "tol": 0.001, "validation_fraction": 0.1, "verbose": 0, "warm_start": false}'
metrics: b'{"accuracy": 0.6578947368421053}'
status: finished
created_at: 2021-04-27 22:19:54
updated_at: 2021-04-27 22:21:32
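
Instead of re-running a cell by hand, you could poll until the status changes. The helper below is a sketch, not part of the SDK: `wait_for_status` is a hypothetical name, and the fake fetcher stands in for *get_experiment* so that the example runs without the platform. The notebook does not show what status a failed experiment reports, so the timeout is the only exit besides success.

```python
import time


def wait_for_status(fetch, target, interval=5.0, timeout=600.0):
    """Call fetch() until its 'status' equals target; raise on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        record = fetch()
        if record["status"] == target:
            return record
        if time.monotonic() >= deadline:
            raise TimeoutError(f"status is still {record['status']!r}")
        time.sleep(interval)


# Stand-in for mlops.get_experiment: reports 'finished' on the third call.
# 'running' is a placeholder status, not one taken from the platform.
_statuses = iter(["running", "running", "finished"])


def fake_get_experiment():
    return {"id": 1, "status": next(_statuses)}


result = wait_for_status(fake_get_experiment, "finished", interval=0.01)
print(result)
```

With the real SDK the call would look like `wait_for_status(lambda: mlops.get_experiment(experiment_id=lr_experiment_id), "finished")`.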



Let's compare them both:

In [10]:
lr_experiment = mlops.get_experiment(experiment_id=lr_experiment_id)
print(f"LR status: {lr_experiment['status']}")
print(f"LR metric: {lr_experiment['metrics']}")

LR status: finished
LR metric: b'{"accuracy": 0.9473684210526315}'


In [11]:
nn_experiment = mlops.get_experiment(experiment_id=nn_experiment_id)
print(f"NN status: {nn_experiment['status']}")
print(f"NN metric: {nn_experiment['metrics']}")

NN status: finished
NN metric: b'{"accuracy": 0.6578947368421053}'
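
The `metrics` field is printed as a JSON-encoded bytes string, so it has to be decoded before the accuracies can be compared numerically. A minimal sketch, assuming the values really are `bytes` as the `b'...'` output suggests:

```python
import json

# Metric payloads copied from the outputs above (JSON-encoded bytes).
raw_metrics = {
    "lr": b'{"accuracy": 0.9473684210526315}',
    "nn": b'{"accuracy": 0.6578947368421053}',
}

# json.loads accepts bytes directly, so no explicit .decode() is needed.
accuracy = {
    name: json.loads(raw)["accuracy"] for name, raw in raw_metrics.items()
}

# Pick the experiment with the highest accuracy.
best = max(accuracy, key=accuracy.get)
print(best, accuracy[best])
```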


Since the **Logistic Regression** accuracy is higher, let's deploy it!

## 3. Deploy

To deploy an experiment, you can use the *deploy_experiment* method.

In [12]:
mlops.deploy_experiment(experiment_id=lr_experiment_id)

True

After *deploy_experiment* is called, the **api.ifood.mlops** package submits the request to the pipeline app. You can check the deployment progress directly on the pipeline [dashboard](http://localhost:8080/) or with the *get_experiment* or *get_experiments* method. Run the cell below repeatedly until the experiment's status is 'deployed', then continue.

In [13]:
lr_experiment = mlops.get_experiment(experiment_id=lr_experiment_id)
print(f"LR status: {lr_experiment['status']}")

LR status: deployed


You can check the deployed experiments with the *get_deployments* method.

In [14]:
deployments = mlops.get_deployments()

for deployment in deployments:
    for k, v in deployment.items():
        print(f"{k}: {v}")
    print("")

id: 1
project_id: 1
engine: sklearn
hyperparams: b'{"C": 1.0, "class_weight": null, "dual": false, "fit_intercept": true, "intercept_scaling": 1, "l1_ratio": null, "max_iter": 100, "multi_class": "auto", "n_jobs": null, "penalty": "l2", "random_state": null, "solver": "lbfgs", "tol": 0.0001, "verbose": 0, "warm_start": false}'
metrics: b'{"accuracy": 0.9473684210526315}'
status: deployed
created_at: 2021-04-27 22:19:53
updated_at: 2021-04-27 22:22:58



## 4. Predict

Let's predict some cases! You can check the serving API docs [here](http://localhost:8000/docs).

 - Iris-setosa:

In [15]:
data = dict(model="iris", features={"sepal-length": 5.7, "sepal-width": 3.8, "petal-length": 1.7, "petal-width": 0.3})

In [16]:
response = requests.post(
    url='http://localhost:8000/predictions/',
    data=json.dumps(data),
    headers={'Content-Type': 'application/json', 'x-api-key': 'FfNxK6NF9L'},
)
response.raise_for_status()  # raises requests.HTTPError on a 4xx/5xx response
print(response.text)  # expecting "prediction":0 for Iris-setosa

{"id":1,"prediction":0,"timestamp":"2021-04-27T22:23:49+00:00"}


 - Iris-versicolor:

In [17]:
data = dict(model="iris", features={"sepal-length": 5.8, "sepal-width": 2.7, "petal-length": 4.1, "petal-width": 1.0})

In [18]:
response = requests.post(
    url='http://localhost:8000/predictions/',
    data=json.dumps(data),
    headers={'Content-Type': 'application/json', 'x-api-key': 'FfNxK6NF9L'},
)
response.raise_for_status()  # raises requests.HTTPError on a 4xx/5xx response
print(response.text)  # expecting "prediction":1 for Iris-versicolor

{"id":2,"prediction":1,"timestamp":"2021-04-27T22:23:54+00:00"}


 - Iris-virginica:

In [19]:
data = dict(model="iris", features={"sepal-length": 7.7, "sepal-width": 3.0, "petal-length": 6.1, "petal-width": 2.3})

In [20]:
response = requests.post(
    url='http://localhost:8000/predictions/',
    data=json.dumps(data),
    headers={'Content-Type': 'application/json', 'x-api-key': 'FfNxK6NF9L'},
)
response.raise_for_status()  # raises requests.HTTPError on a 4xx/5xx response
print(response.text)  # expecting "prediction":2 for Iris-virginica

{"id":3,"prediction":2,"timestamp":"2021-04-27T22:24:00+00:00"}
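
The serving API answers with a JSON body, so the integer prediction can be turned back into a species name. The sketch below assumes the category codes follow the alphabetical order of the class names (0 = Iris-setosa, 1 = Iris-versicolor, 2 = Iris-virginica), which matches the expectations in the cells above. The `describe` helper is hypothetical.

```python
import json

# Assumed mapping: category codes follow the alphabetical order of class names.
CLASS_NAMES = {0: "Iris-setosa", 1: "Iris-versicolor", 2: "Iris-virginica"}


def describe(response_text):
    """Decode a serving response body and name the predicted class."""
    body = json.loads(response_text)
    return body["id"], CLASS_NAMES[body["prediction"]]


# Response bodies copied from the outputs above.
responses = [
    '{"id":1,"prediction":0,"timestamp":"2021-04-27T22:23:49+00:00"}',
    '{"id":2,"prediction":1,"timestamp":"2021-04-27T22:23:54+00:00"}',
    '{"id":3,"prediction":2,"timestamp":"2021-04-27T22:24:00+00:00"}',
]

labels = [describe(text) for text in responses]
for prediction_id, species in labels:
    print(prediction_id, species)
```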


You can get all the predictions for a project with the *get_predictions* method.

In [21]:
predictions = mlops.get_predictions(project_id=project_id)

for prediction in predictions:
    for k, v in prediction.items():
        print(f"{k}: {v}")
    print("")

id: 1
project_id: 1
experiment_id: 1
payload: b'0'
api_key: FfNxK6NF9L
created_at: 2021-04-27 22:23:49
updated_at: 2021-04-27 22:23:49

id: 2
project_id: 1
experiment_id: 1
payload: b'1'
api_key: FfNxK6NF9L
created_at: 2021-04-27 22:23:54
updated_at: 2021-04-27 22:23:54

id: 3
project_id: 1
experiment_id: 1
payload: b'2'
api_key: FfNxK6NF9L
created_at: 2021-04-27 22:24:00
updated_at: 2021-04-27 22:24:00



## 5. Deploy a new model

The **Logistic Regression** model is outdated, so let's train and deploy a new **Support Vector Machine** model.

In [22]:
svm_experiment_id = mlops.run_experiment(
    project_id=project_id,
    engine='sklearn',
    model=svm.SVC(),
    metrics=dict(accuracy=metrics.accuracy_score),
    target_col='target',
    train_data=train,
    test_data=test
)

print(svm_experiment_id)

3


Run the cell below repeatedly until the experiment's status is 'finished', then continue.

In [23]:
experiment = mlops.get_experiment(experiment_id=svm_experiment_id)

for k, v in experiment.items():
    print(f"{k}: {v}")

id: 3
project_id: 1
engine: sklearn
hyperparams: b'{"C": 1.0, "break_ties": false, "cache_size": 200, "class_weight": null, "coef0": 0.0, "decision_function_shape": "ovr", "degree": 3, "gamma": "scale", "kernel": "rbf", "max_iter": -1, "probability": false, "random_state": null, "shrinking": true, "tol": 0.001, "verbose": false}'
metrics: b'{"accuracy": 0.9210526315789473}'
status: finished
created_at: 2021-04-27 22:24:11
updated_at: 2021-04-27 22:25:00
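
It can help to read the reported accuracies as counts of correctly classified test rows. The arithmetic below assumes a test split of 38 rows, which is what the default `test_size` of 0.25 gives if the CSV holds the standard 150-row iris data. The notebook does not state the row count, so treat that number as an assumption.

```python
TEST_ROWS = 38  # assumption: 150 rows, default test_size=0.25, ceil(37.5) = 38

# Accuracies copied from the experiment outputs in this notebook.
accuracies = {
    "LogisticRegression": 0.9473684210526315,
    "Perceptron": 0.6578947368421053,
    "SVC": 0.9210526315789473,
}

# accuracy = correct / total, so correct = accuracy * total.
correct = {name: round(acc * TEST_ROWS) for name, acc in accuracies.items()}
print(correct)
```

Under that assumption, the SVM and the Logistic Regression differ by a single test row.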


Then deploy the model.

In [24]:
mlops.deploy_experiment(experiment_id=svm_experiment_id)

True

As usual, run the cell below repeatedly until the experiment's status is 'deployed', then continue.

In [25]:
svm_experiment = mlops.get_experiment(experiment_id=svm_experiment_id)
print(f"SVM status: {svm_experiment['status']}")

SVM status: deployed


Let's get the deployed models again.

In [26]:
deployments = mlops.get_deployments()

for deployment in deployments:
    for k, v in deployment.items():
        print(f"{k}: {v}")
    print("")

id: 3
project_id: 1
engine: sklearn
hyperparams: b'{"C": 1.0, "break_ties": false, "cache_size": 200, "class_weight": null, "coef0": 0.0, "decision_function_shape": "ovr", "degree": 3, "gamma": "scale", "kernel": "rbf", "max_iter": -1, "probability": false, "random_state": null, "shrinking": true, "tol": 0.001, "verbose": false}'
metrics: b'{"accuracy": 0.9210526315789473}'
status: deployed
created_at: 2021-04-27 22:24:11
updated_at: 2021-04-27 22:26:01



Nice! Now let's predict some data.

In [27]:
data = dict(model="iris", features={"sepal-length": 5.7, "sepal-width": 3.8, "petal-length": 1.7, "petal-width": 0.3})

In [28]:
response = requests.post(
    url='http://localhost:8000/predictions/',
    data=json.dumps(data),
    headers={'Content-Type': 'application/json', 'x-api-key': 'FfNxK6NF9L'},
)
response.raise_for_status()  # raises requests.HTTPError on a 4xx/5xx response
print(response.text)  # expecting "prediction":0 for Iris-setosa

{"id":4,"prediction":0,"timestamp":"2021-04-27T22:26:40+00:00"}


To finish, let's get all predictions again.

In [29]:
predictions = mlops.get_predictions(project_id=project_id)

for prediction in predictions:
    for k, v in prediction.items():
        print(f"{k}: {v}")
    print("")

id: 1
project_id: 1
experiment_id: 1
payload: b'0'
api_key: FfNxK6NF9L
created_at: 2021-04-27 22:23:49
updated_at: 2021-04-27 22:23:49

id: 2
project_id: 1
experiment_id: 1
payload: b'1'
api_key: FfNxK6NF9L
created_at: 2021-04-27 22:23:54
updated_at: 2021-04-27 22:23:54

id: 3
project_id: 1
experiment_id: 1
payload: b'2'
api_key: FfNxK6NF9L
created_at: 2021-04-27 22:24:00
updated_at: 2021-04-27 22:24:00

id: 4
project_id: 1
experiment_id: 3
payload: b'0'
api_key: FfNxK6NF9L
created_at: 2021-04-27 22:26:40
updated_at: 2021-04-27 22:26:40
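
Each prediction record carries the `experiment_id` of the model that served it, so the records show when the deployed model changed. A small sketch that counts predictions per experiment, using only the relevant fields from the records above:

```python
from collections import Counter

# Only the fields needed here, copied from the records printed above.
predictions = [
    {"id": 1, "experiment_id": 1},
    {"id": 2, "experiment_id": 1},
    {"id": 3, "experiment_id": 1},
    {"id": 4, "experiment_id": 3},
]

# Count how many predictions each experiment served.
served_by = Counter(p["experiment_id"] for p in predictions)
print(dict(served_by))
```

The first three predictions came from the Logistic Regression experiment (id 1) and the fourth from the SVM experiment (id 3).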

