# Exercise 4: registry: register a retrained model in MLflow

**In this exercise, you will have to:**
- Locally load the data from: `bucket = "exam-assets"`, `minio_path = 'datasets/wine/wine.parquet'`
- Locally load the model from: `bucket = "exam-assets"`, `minio_path = 'model/wine/model.joblib'`
- Retrain the model in an MLflow context.
- Register the model bound to this context.

The URL for MLflow is: https://ml-registry.aiengineer.polytech.sandbox-atos.com/

**This exercise is worth 5 points, awarded as follows:**


| Criteria | Description | Score |
|---|---|---|
| MLflow experiment | Retraining information is available in the student's MLflow experiment | 3 |
| Register model | The new model is stored in the registry, bound to the run | 2 |


![expml](./images/expml.png)

![run](./images/rundtl.png)

![mdl](./images/mdl.png)

In [None]:
# You need to install mlflow
#!pip install mlflow boto boto3

In [None]:
# Here are the dependencies you might need
import numpy
import os
import urllib3
import pyarrow
import pandas as pd
import mlflow.sklearn
from io import BytesIO
from minio import Minio
from joblib import load, dump
from sklearn.metrics import accuracy_score, roc_curve
from sklearn import metrics, preprocessing, tree
from sklearn.model_selection import train_test_split
from mlflow.store.artifact.runs_artifact_repo import RunsArtifactRepository
from mlflow import MlflowClient

In [None]:
### MLflow config
os.environ["AWS_ACCESS_KEY_ID"] = "mlflow-storage"
os.environ["AWS_SECRET_ACCESS_KEY"] = "mlflow-storage"
os.environ["MLFLOW_S3_ENDPOINT_URL"] = "https://storage-api.aiengineer.polytech.sandbox-atos.com"
mlflow.set_tracking_uri('http://mlflow.mlflow.svc.cluster.local:5000')

### Local retrain

#### Create an MLflow experiment

You can start from notebook 2 of the ML governance practical session to get MLflow examples.

In [None]:
### example : 'john-doe'
username=''

In [2]:
### mlflow sa : mlflow-sa-storage
mlflow.sklearn.autolog()

In [None]:
experiment_name = f"{username} experiments"
experiment_id = ...

In [None]:
experiment = mlflow.get_experiment(experiment_id)
print("Name: {}".format(experiment.name))
print("Experiment_id: {}".format(experiment.experiment_id))
print("Artifact Location: {}".format(experiment.artifact_location))
print("Tags: {}".format(experiment.tags))
print("Lifecycle_stage: {}".format(experiment.lifecycle_stage))
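
One way to obtain `experiment_id` is a get-or-create lookup, so that rerunning the cell does not fail when the experiment already exists. The sketch below is only an illustration: the helper name `get_or_create_experiment_id` is hypothetical, and it assumes the object passed in exposes `get_experiment_by_name` and `create_experiment`, as both the `mlflow` module and `MlflowClient` do.

```python
def get_or_create_experiment_id(api, experiment_name):
    """Return the id of `experiment_name`, creating the experiment if needed.

    `api` is assumed to be the `mlflow` module or an `MlflowClient` instance.
    """
    # get_experiment_by_name returns None when no experiment has this name.
    experiment = api.get_experiment_by_name(experiment_name)
    if experiment is not None:
        return experiment.experiment_id
    # create_experiment returns the id of the newly created experiment.
    return api.create_experiment(experiment_name)
```

In this notebook, the helper could be called with the `mlflow` module and the `experiment_name` defined above.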

#### Retrain the model from Minio

In [None]:
## Create a Minio client with the given access key and secret key
client = ...

In [None]:
### get data locally
minio_path = 'datasets/wine/wine.parquet'
bucket = "exam-assets"
try:
    ...
finally:
    ...
    
### pass dataset to component output
winedata.to_parquet('wine.parquet')

In [None]:
### get model locally
minio_path = 'model/wine/model.joblib'
bucket = "exam-assets"
try:
    ...
finally:
    ...
    
### pass model to component output
dump(winemodel,'model.joblib')
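
The `try` / `finally` skeleton in the two cells above matches the usual pattern for reading an object with the `minio` client: read the payload inside `try`, then always close the response and release its connection in `finally`. A minimal sketch, assuming `client` is a `Minio` instance whose `get_object` returns a response with `read`, `close` and `release_conn` (the helper name `read_object_bytes` is hypothetical):

```python
def read_object_bytes(client, bucket, object_path):
    """Download one object and return its content as bytes."""
    response = None
    try:
        response = client.get_object(bucket, object_path)
        return response.read()
    finally:
        # Release the HTTP connection even if the download fails.
        if response is not None:
            response.close()
            response.release_conn()
```

The returned bytes can then be wrapped in `BytesIO` and passed to `pd.read_parquet` for the dataset, or to `joblib.load` for the model.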

To make the retraining useful, add a standard scaler before fitting:

```python
scaler = preprocessing.StandardScaler().fit(X_train)
X_scaled = scaler.transform(X_train)
```
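
Standardization rescales each feature to zero mean and unit variance using statistics learned from the training data only; the same statistics then have to be applied to `X_test` before calling `predict`. The plain-Python sketch below illustrates that computation for a single feature. It is an illustration of the idea rather than scikit-learn's implementation, and the function names and sample values are hypothetical.

```python
from statistics import fmean, pstdev


def fit_standardizer(train_values):
    """Learn the mean and population standard deviation from training data."""
    mean = fmean(train_values)
    std = pstdev(train_values)
    # Guard against a constant feature to avoid dividing by zero.
    return mean, (std if std > 0 else 1.0)


def standardize(values, mean, std):
    """Apply previously learned statistics to any split (train or test)."""
    return [(v - mean) / std for v in values]


train = [1.0, 2.0, 3.0]  # made-up feature values
test = [2.0, 4.0]

mean, std = fit_standardizer(train)
train_scaled = standardize(train, mean, std)
# The test split reuses the training statistics; it is never refitted.
test_scaled = standardize(test, mean, std)
```

With scikit-learn, the equivalent step for the test split is `scaler.transform(X_test)`, using the `scaler` fitted on `X_train`.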

In [None]:
### this context manager will bind the process to an experiment
with mlflow.start_run(experiment_id=experiment_id) as run:
    
    ### get data
    ...
    
    ### split data ###
    ...
    
    ## Here, we add a standard scaler, then we fit
    ...
    

    ## Now we predict the output by passing X_test, and store the real target in expected_y.

    ...
    predicted_y = model.predict(X_test)

Now the run is available in your MLflow experiment!

### Register the model

In [None]:
client = MlflowClient(tracking_uri='http://mlflow.mlflow.svc.cluster.local:5000')

In [None]:
username = ''  # john-doe
name = f"{username}-wine-exam"  # john-doe-wine-exam

In [None]:
client.create_registered_model(name)

Create a model version that will bind your model to your experiment run:

In [None]:
desc = "Wine color classification"
runs_uri = ...
model_src = ...
mv = ...
print("Name: {}".format(mv.name))
print("Version: {}".format(mv.version))
print("Description: {}".format(mv.description))
print("Status: {}".format(mv.status))
print("Stage: {}".format(mv.current_stage))
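
A model version is tied to a run through a `runs:/<run_id>/<artifact_path>` URI, plus the run id passed to `create_model_version`. The sketch below shows that wiring under stated assumptions: the artifact path `model` is assumed to be where the autologged model was stored, `source` is assumed to be the resolved artifact location of that URI (for example as returned by `RunsArtifactRepository.get_underlying_uri`), and both helper names are hypothetical.

```python
def build_runs_uri(run_id, artifact_path="model"):
    """Build the run-relative URI of a logged model (artifact path assumed)."""
    return f"runs:/{run_id}/{artifact_path}"


def create_version_for_run(client, name, run_id, source, description=None):
    """Create a version of registered model `name` that points back to `run_id`.

    `client` is assumed to be an `MlflowClient`; `source` is the resolved
    artifact location of the model logged by the run.
    """
    return client.create_model_version(
        name=name,
        source=source,
        run_id=run_id,
        description=description,
    )
```

In this notebook, `run_id` would come from `run.info.run_id` of the `with mlflow.start_run(...) as run:` block.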

## End of the exercise