## Scenario 1: Local to Remote with MLflow and AML CLI

Doc: https://microsoft-my.sharepoint.com/:w:/p/osomorog/Ed7l1SLKac9Irz_PY-XnXaQB90-WeAosazcQOT24PRd3-g?e=hKS09V 

1)	User trains locally with MLflow, using MLflow logging / autologging

2)	Move to AML by setting the tracking URI in the backend (not in the training code), packaging the code as a project, and submitting it with the AzureML CLI v2 or the MLflow CLI. The user doesn’t have to change their training or scoring code.

    a.	Help users set MLFLOW_TRACKING_URI via CLI (can we get TRACKING URI from the CLI?)

        i.	Use: az ml workspace show | grep "mlflow_tracking_uri"

    b.	User should be able to run this code from a local dir or from GitHub (if the repo has an MLproject file)

        i.	python train.py

        ii.	mlflow run https://github.com/mlflow/mlflow-example.git -P alpha=5.

    c.	The user can run that same training script locally and still have everything tracked in AML without updating code

3)	All AML parameters, metrics, models and logged artifacts should show up in AML run history

**NOTE** This only works if the user has set an experiment in their code; if not, it breaks. We need to be able to set a default experiment for users with no experiment set.
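Until a default experiment exists, a training script can guard against the missing-experiment failure itself. The sketch below resolves an experiment name from an explicit argument, then from MLflow's `MLFLOW_EXPERIMENT_NAME` environment variable, then from a fallback. The helper names and the fallback value `default-experiment` are hypothetical, not part of MLflow or AML.

```python
import os

FALLBACK_EXPERIMENT = "default-experiment"  # hypothetical fallback name


def resolve_experiment_name(explicit=None, environ=None):
    """Pick an experiment name: explicit argument, then env var, then fallback."""
    environ = os.environ if environ is None else environ
    return explicit or environ.get("MLFLOW_EXPERIMENT_NAME") or FALLBACK_EXPERIMENT


def ensure_experiment(explicit=None):
    """Call mlflow.set_experiment() with the resolved name before a run starts."""
    import mlflow  # imported lazily so the resolver above works without MLflow

    name = resolve_experiment_name(explicit)
    mlflow.set_experiment(name)
    return name
```

Calling `ensure_experiment()` before `mlflow.start_run()` would keep `train.py` working whether or not the user set an experiment.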

In [1]:
!cat train.py

# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.

import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
import mlflow
import mlflow.sklearn

import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

mlflow.set_experiment("test-abe")
with mlflow.start_run():
    X, y = load_diabetes(return_X_y=True)
    columns = ['age', 'gender', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    data = {
        "train": {"X": X_train, "y": y_train},
        "test": {"X": X_test, "y": y_test}}

    mlflow.log_metric("Training samples", len(data['train']['X']))
    mlflow.log_metric("Test samples", len(data['test']['X']))

    # Log the algorithm parameter alpha to the run
    mlflow.log_me

## Run MLflow Experiments in the Terminal
Run `python train.py` in the terminal

## Get and Set Tracking URI
Use the following command in the terminal:
```
export MLFLOW_TRACKING_URI=$(az ml workspace show --query mlflow_tracking_uri | sed 's/"//g')
```
or 

```
export MLFLOW_TRACKING_URI=$(python -c "from azureml.core import Workspace; print(Workspace.from_config().get_mlflow_tracking_uri())") 
```
Double check the MLFLOW_TRACKING_URI variable was assigned:
```
echo $MLFLOW_TRACKING_URI
```
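The same lookup can be done from Python when a shell is not convenient. This is a minimal sketch, assuming the JSON printed by `az ml workspace show` has a top-level `mlflow_tracking_uri` field (which is what the `--query` above relies on). The function names are hypothetical.

```python
import json
import os


def tracking_uri_from_workspace_json(workspace_json):
    """Return the mlflow_tracking_uri field from `az ml workspace show` output."""
    workspace = json.loads(workspace_json)
    uri = workspace.get("mlflow_tracking_uri")
    if not uri:
        raise KeyError("mlflow_tracking_uri not found in workspace description")
    return uri


def export_tracking_uri(workspace_json, environ=None):
    """Set MLFLOW_TRACKING_URI for this process and the children it starts."""
    environ = os.environ if environ is None else environ
    environ["MLFLOW_TRACKING_URI"] = tracking_uri_from_workspace_json(workspace_json)
    return environ["MLFLOW_TRACKING_URI"]
```

The JSON text could come from `subprocess.run(["az", "ml", "workspace", "show"], capture_output=True, text=True, check=True).stdout`. Unlike `export` in the shell, this only affects the current Python process and its child processes.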


## Run again with AzureML as Tracking Server

Run `python train.py` in the terminal

## Scenario 2: Train and deploy with MLflow and AML

1)	Train a baseline model, log / autolog with MLflow, and submit the job with the AML CLI or MLflow CLI
    
    a.	Normal Jobs

    b.	AutoML Jobs

    c.	Sweep Job   
2)	Test model locally with v2 CLI and manually validate results

    a.	[GAP] Retrieve model from a run

        i.	az ml job download -n <job_id> doesn’t work for local MLflow experiments, because these are tracked runs and not jobs. Every job has a run, but not every run is a job.
   
    b.	[GAP] Local online deployment for manual testing and NCD on v2
   
        i.	Not working; need to add support for --local and MLflow
3)	[GAP] Register the model from the run 
    
    a.	MLflow allows registering a model in 3 different ways (refer to the MLflow docs for how to achieve each). We need to support all 3 well
        i.	mlflow.sklearn.log_model()
        ii.	mlflow.register_model("runs:/<run_id>/<artifact_path>", "<model_name>")
        iii.	client.create_registered_model()
    b.	[GAP] Via CLI: Use v2 CLI to register from a run 
    c.	[GAP] Via UI: Right-click the model in a job [Output+Logs] and register as a model
4)	Test and validate your deployed model, discuss with the team, and change the model stage to “Production” before deploying to production
    
    a.	Easily validate model test inputs via AML UI, CLI or SDK (look at Databricks)
        i.	CLI: az ml online-endpoint invoke --name sklearn-deployment --request-file sample-request-sklearn.json
        ii.	UI: There is a test tab.
        iii.	[GAP] Get a sample input or signature from MLflow/AML via CLI or UI
    b.	[GAP] Change stage, via Model stages and discussions or organization-wide feed
5)	Once the user is satisfied with the model, deploy it on AML to generate predictions for a whole dataset.
   
    a.	Via Online or Batch endpoints and NCD 
        i.	Deploy to AML using the AML CLI v2, [GAP] Get the Studio UI link to deployment from CLI output
        ii.	[GAP] Deploy using MLflow CLI
    b.	Deploy to MIR, AML Arc [GAP], and Spark [GAP]
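The three registration routes from step 3 can be sketched with the open-source MLflow API as follows. This only illustrates the calls; whether each one works against the AML tracking backend is exactly what the [GAP] items question. The wrapper function names are hypothetical, and `client` is expected to be an `MlflowClient`.

```python
def model_uri_for_run(run_id, artifact_path="model"):
    """Build the runs:/ URI MLflow uses to address an artifact logged by a run."""
    return "runs:/{}/{}".format(run_id, artifact_path)


def register_at_log_time(sk_model, name):
    """Route i: register while logging the model (call inside an active run)."""
    import mlflow.sklearn  # lazy import keeps the URI helper dependency-free

    mlflow.sklearn.log_model(sk_model, "model", registered_model_name=name)


def register_from_run(run_id, name):
    """Route ii: register a model that a finished run already logged."""
    import mlflow

    return mlflow.register_model(model_uri_for_run(run_id), name)


def register_with_client(client, run_id, source, name):
    """Route iii: create the registered model, then attach a version to it."""
    client.create_registered_model(name)
    return client.create_model_version(name, source, run_id)
```

Route iii needs the artifact location (`source`) in addition to the run ID, which makes it the most verbose of the three.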


In [4]:
pip list

Package                                 Version
--------------------------------------- -------------------
absl-py                                 0.13.0
adal                                    1.2.7
adlfs                                   2021.9.1
aiohttp                                 3.7.4.post0
alembic                                 1.4.1
ansiwrap                                0.8.4
antlr4-python3-runtime                  4.7.2
applicationinsights                     0.11.10
arch                                    4.14
argcomplete                             1.12.3
argon2-cffi                             21.1.0
asgiref                                 3.4.1
astunparse                              1.6.3
async-timeout                           3.0.1
attrs                                   21.2.0
azure-appconfiguration                  1.1.1
azure-batch                             10.0.0
azure-cli                               2.24.0
azure-cli-core                          2.24.0
a

In [2]:
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.

import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
import mlflow
import mlflow.sklearn

import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

## User currently needs to add an experiment name. This should not be necessary in a future release
from azureml.core import Workspace
mlflow.set_tracking_uri(Workspace.from_config().get_mlflow_tracking_uri())
mlflow.set_experiment("test-abe")
########

with mlflow.start_run():
    X, y = load_diabetes(return_X_y=True)
    columns = ['age', 'gender', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    data = {
        "train": {"X": X_train, "y": y_train},
        "test": {"X": X_test, "y": y_test}}

    mlflow.log_metric("Training samples", len(data['train']['X']))
    mlflow.log_metric("Test samples", len(data['test']['X']))

    # Log the algorithm parameter alpha to the run
    mlflow.log_metric('alpha', 0.03)
    # Create, fit, and test the scikit-learn Ridge regression model
    regression_model = Ridge(alpha=0.03)
    regression_model.fit(data['train']['X'], data['train']['y'])
    preds = regression_model.predict(data['test']['X'])

    # Log mean squared error
    print('Mean Squared Error is', mean_squared_error(data['test']['y'], preds))
    mlflow.log_metric('mse', mean_squared_error(data['test']['y'], preds))

    # Save the model to the outputs directory for capture
    mlflow.sklearn.log_model(regression_model, "model")

    # Plot actuals vs predictions and save the plot within the run
    fig = plt.figure(1)
    idx = np.argsort(data['test']['y'])
    plt.plot(data['test']['y'][idx], preds[idx])
    fig.savefig("actuals_vs_predictions.png")
    mlflow.log_artifact("actuals_vs_predictions.png")

Mean Squared Error is 3424.900315896017


### Query last run from this experiment

In [9]:
print(mlflow.get_experiment_by_name("test-abe"))

<Experiment: artifact_location='', experiment_id='c7c757e4-cbd5-4eda-885d-6247016ec650', lifecycle_stage='active', name='test-abe', tags={}>


In [21]:
from mlflow.tracking.client import MlflowClient
from mlflow.entities import ViewType
mlflow.set_tracking_uri(Workspace.from_config().get_mlflow_tracking_uri())
experiment_name = "test-abe"
current_experiment = mlflow.get_experiment_by_name(experiment_name)
# experiment_ids is documented as a list of IDs, so wrap the single ID
runs = MlflowClient().search_runs(experiment_ids=[current_experiment.experiment_id], run_view_type=ViewType.ALL)


In [19]:
runs.tail(1)["run_id"]

8    075f4c08-c7d1-4f9b-aff3-be64b43d9e6d
Name: run_id, dtype: object
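Note that `MlflowClient().search_runs()` returns a list of `Run` objects, whereas `mlflow.search_runs()` returns a pandas DataFrame, so `.tail()` only applies to the latter. For the list form, a minimal sketch of picking the most recent run by start time is shown below. It assumes each item exposes `info.run_id` and `info.start_time` as MLflow `Run` objects do; the stand-in objects and their values are made up for illustration.

```python
from types import SimpleNamespace


def latest_run_id(runs):
    """Return the run_id of the run with the greatest start_time, or None."""
    if not runs:
        return None
    newest = max(runs, key=lambda run: run.info.start_time or 0)
    return newest.info.run_id


# Stand-ins shaped like mlflow.entities.Run (only the fields used here)
example_runs = [
    SimpleNamespace(info=SimpleNamespace(run_id="run-a", start_time=1000)),
    SimpleNamespace(info=SimpleNamespace(run_id="run-b", start_time=3000)),
    SimpleNamespace(info=SimpleNamespace(run_id="run-c", start_time=2000)),
]
print(latest_run_id(example_runs))  # run-b
```

Selecting by `start_time` avoids relying on the order in which the backend happens to return runs.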

## Download or retrieve the model from the run for testing

In [51]:
from mlflow.tracking.client import MlflowClient
client = MlflowClient()
#client.download_artifacts(runs[-1].info.run_id,"model",".")

'/mnt/batch/tasks/shared/LS_root/mounts/clusters/osomorog2/code/Users/osomorog/MLFlow_Scenarios/model'

In [22]:
# runs:/<run_id>/<artifact_path> addresses the model logged by the last run
artifact_uri = "runs:/" + runs[-1].info.run_id + "/model"
model = mlflow.sklearn.load_model(artifact_uri)

In [None]:
artifact_uri

In [66]:
#columns = ['age', 'gender', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
test_data = [[ 12,  1, 12,  0.07007254, -0.03596778,-0.0266789 , -0.02499266, -0.00259226,  0.00371174,  0.04034337]]

In [67]:
sample_dataset = test_data
model.predict(sample_dataset)

array([6282.61850114])
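The prediction above is much larger than the targets the model was trained on. A likely reason (an assumption, not something the notebook states) is that `load_diabetes` returns mean-centered, scaled features, while the test row contains raw-looking values such as `12`. A small pre-check like the sketch below could flag such rows before calling `predict`. The `MAX_ABS_VALUE` threshold and the function name are hypothetical.

```python
COLUMNS = ['age', 'gender', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
MAX_ABS_VALUE = 1.0  # hypothetical bound; tune to the scale of the training data


def check_rows(rows, columns=COLUMNS, max_abs=MAX_ABS_VALUE):
    """Return a list of human-readable problems found in the input rows."""
    problems = []
    for i, row in enumerate(rows):
        if len(row) != len(columns):
            problems.append(
                "row {}: expected {} values, got {}".format(i, len(columns), len(row)))
            continue
        for name, value in zip(columns, row):
            if abs(value) > max_abs:
                problems.append("row {}: {}={} looks unscaled".format(i, name, value))
    return problems


test_data = [[12, 1, 12, 0.07007254, -0.03596778, -0.0266789,
              -0.02499266, -0.00259226, 0.00371174, 0.04034337]]
for problem in check_rows(test_data):
    print(problem)
```

An empty list means the rows have the expected width and magnitude; it does not prove the values are meaningful.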

## Register Model with MLflow

In [71]:
mlflow.register_model(artifact_uri,"scenario_2_model")

Successfully registered model 'scenario_2_model'.
2021/10/27 19:17:39 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: scenario_2_model, version 1
Created version '1' of model 'scenario_2_model'.


<ModelVersion: creation_timestamp=1635362259614, current_stage='None', description='', last_updated_timestamp=1635362259614, name='scenario_2_model', run_id='13c5faef-788f-439d-ba6c-cb8d280e708d', run_link='', source='azureml://experiments/test-abe/runs/13c5faef-788f-439d-ba6c-cb8d280e708d/artifacts/model', status='READY', status_message='', tags={}, user_id='', version='1'>

4)	Discuss with other team members and stage the model before deployment. Change the model stage to “Production” only once the team has agreed, then deploy to production via AML CLI v2

a.	Via AML CLI, UI or SDK: [GAP] Change stage, via Model stages and discussions or organization-wide feed
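In open-source MLflow, a stage change is a single client call, `transition_model_version_stage`. The sketch below wraps it with a check against MLflow's stage names. Whether the AML backend accepts this call is the open [GAP] above, so treat it as an illustration of the intended experience and not as a confirmed AML feature. `change_stage` is a hypothetical helper, and `client` is expected to be an `MlflowClient`.

```python
VALID_STAGES = ("None", "Staging", "Production", "Archived")


def change_stage(client, name, version, stage):
    """Move one model version to a new stage after validating the stage name."""
    if stage not in VALID_STAGES:
        raise ValueError(
            "unknown stage {!r}; expected one of {}".format(stage, VALID_STAGES))
    return client.transition_model_version_stage(
        name=name, version=str(version), stage=stage)
```

For the model registered earlier, the call would look like `change_stage(MlflowClient(), "scenario_2_model", 1, "Production")`.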



### ----- USER SWITCHES TO CONTROL PLANE ACTIONS, HANDOFF FROM MLFLOW TO AML ONLY ---------

In [6]:
from mlflow.deployments import get_deploy_client
import mlflow
import mlflow.sklearn
from azureml.core import Workspace
mlflow.set_tracking_uri(Workspace.from_config().get_mlflow_tracking_uri())
# create the deployment client from the tracking URI
client = get_deploy_client(mlflow.get_tracking_uri())

# artifact path of the model within the run, and the run that logged it
model_path = "model"
run_id = "13c5faef-788f-439d-ba6c-cb8d280e708d"
# the "name" parameter below is used as the service name;
# the model gets registered automatically under a name autogenerated from it
# set the deployment config
deploy_path = "Scenario2/deployment_config.json"
test_config = {'deploy-config-file': deploy_path}

client.create_deployment(model_uri='runs:/{}/{}'.format(run_id, model_path),
                         config=test_config,
                         name="mlflow-test-mir")

Registered model 'mlflow-test-mir-model' already exists. Creating a new version of this model...
2021/10/29 00:03:39 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation.                     Model name: mlflow-test-mir-model, version 5
Created version '5' of model 'mlflow-test-mir-model'.


MlflowException: Failed to parse provided deployment config file
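The notebook does not show why the config file failed to parse. Two causes that are cheap to rule out before calling `create_deployment` are a path that does not resolve from the current working directory and a file that is not valid JSON. The sketch below checks both; `load_deploy_config` is a hypothetical helper, and it makes no assumption about which keys the file must contain.

```python
import json


def load_deploy_config(path):
    """Parse the deployment config file, raising a descriptive error if unusable."""
    try:
        with open(path, encoding="utf-8") as handle:
            config = json.load(handle)
    except FileNotFoundError:
        raise ValueError("deployment config not found: {}".format(path)) from None
    except json.JSONDecodeError as err:
        raise ValueError(
            "deployment config is not valid JSON ({}): {}".format(err, path)) from None
    if not isinstance(config, dict):
        raise ValueError("deployment config must be a JSON object: {}".format(path))
    return config
```

Running `load_deploy_config("Scenario2/deployment_config.json")` in the same notebook session would separate a local file problem from a problem inside the deployment plugin.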

`az ml online-endpoint invoke --name sklearn-deployment --request-file sample-request-sklearn.json`

## Scenario 2b: Collaboration and Compare - AML Run Analysis and Model comparison
Search runs with alpha > 0

1.	User A has access to User B’s training metrics and wants to compare them with their own training results, OR User A has done a sweep job and wants to find the best model
2.	Analyze runs (metrics, parameters, etc.) interactively
3.	Use mlflow.search_runs() and related commands to load runs into a Pandas dataframe and do common analysis

    a.	compare job A to job B

    b.	find best model in sweep based on primary metric

    c.	find best model in sweep based on some other logged metric etc
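Finding the best model in a sweep (items b and c) comes down to selecting the run with the best value of a chosen metric. This is a minimal sketch over `Run`-like objects, assuming each exposes `data.metrics` as a dict the way MLflow `Run` objects do. The stand-in runs and their metric values are made up for illustration, not results from this workspace.

```python
from types import SimpleNamespace


def best_run(runs, metric, higher_is_better=False):
    """Return the run with the best value of `metric`, skipping runs without it."""
    scored = [run for run in runs if metric in run.data.metrics]
    if not scored:
        return None
    pick = max if higher_is_better else min
    return pick(scored, key=lambda run: run.data.metrics[metric])


def _stub(run_id, **metrics):
    """Build a stand-in shaped like mlflow.entities.Run (illustrative only)."""
    return SimpleNamespace(info=SimpleNamespace(run_id=run_id),
                           data=SimpleNamespace(metrics=metrics))


sweep = [_stub("run-a", mse=3500.0), _stub("run-b", mse=3400.0), _stub("run-c")]
print(best_run(sweep, "mse").info.run_id)  # run-b
```

Passing a different metric name covers item c, and `higher_is_better=True` covers metrics such as accuracy.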




### Compare Runs

In [73]:
import mlflow
from azureml.core import Workspace
from mlflow.tracking.client import MlflowClient
from mlflow.entities import ViewType

mlflow.set_tracking_uri(Workspace.from_config().get_mlflow_tracking_uri())

all_experiments = [exp.experiment_id for exp in MlflowClient().list_experiments()]
query = "metrics.alpha > 0"
# mlflow.search_runs() returns a pandas DataFrame, which .head() below requires
runs = mlflow.search_runs(experiment_ids=["c3732977-6d76-4192-bccf-38cac682b7b7", "c24a63cd-22b3-482c-818a-e532d96362f6"], filter_string=query, run_view_type=ViewType.ALL)

# print(runs)
runs.head(10)
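For a direct comparison of job A to job B, the metrics of two individual runs can be lined up side by side. The sketch below assumes `Run`-like objects with a `data.metrics` dict, for example as returned by `MlflowClient().get_run(run_id)`. `compare_metrics` is a hypothetical helper.

```python
def compare_metrics(run_a, run_b):
    """Map each metric logged by both runs to (value_a, value_b, b minus a)."""
    metrics_a = run_a.data.metrics
    metrics_b = run_b.data.metrics
    shared = sorted(set(metrics_a) & set(metrics_b))
    return {
        name: (metrics_a[name], metrics_b[name], metrics_b[name] - metrics_a[name])
        for name in shared
    }
```

Metrics logged by only one of the two runs are left out, so the result contains only values that can actually be compared.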

### List Models

In [20]:
from pprint import pprint

client = MlflowClient()
for rm in client.list_registered_models():
    pprint(dict(rm), indent=4)

{   'creation_timestamp': 1633124474397,
    'description': '',
    'last_updated_timestamp': 1633124474397,
    'latest_versions': [   <ModelVersion: creation_timestamp=1633124474429, current_stage='None', description='', last_updated_timestamp=1633124474429, name='my-model', run_id='experiment_with_mlflow_remote_1633124216_c95519e1', run_link='', source='', status='READY', status_message='', tags={}, user_id='', version='1'>],
    'name': 'my-model',
    'tags': {}}
{   'creation_timestamp': 1633728453446,
    'description': '',
    'last_updated_timestamp': 1633728453446,
    'latest_versions': [   <ModelVersion: creation_timestamp=1633728453472, current_stage='None', description='', last_updated_timestamp=1633728453472, name='4d525968-a1af-4503-968b-061a46855da5', run_id='', run_link='', source='', status='READY', status_message='', tags={}, user_id='', version='1'>],
    'name': '4d525968-a1af-4503-968b-061a46855da5',
    'tags': {}}
{   'creation_timestamp': 1635204959134,
    'd

## Query artifacts from a run (including Models)

In [21]:
model = mlflow.sklearn.load_model("runs:/experiment_with_mlflow_remote_1633124216_c95519e1/model")



In [35]:
client = MlflowClient()

def print_artifact_info(artifact):
    print("artifact: {}".format(artifact.path))
    print("is_dir: {}".format(artifact.is_dir))
    print("size: {}".format(artifact.file_size))

#run
# Fetch the logged artifacts
artifacts = client.list_artifacts("experiment_with_mlflow_remote_1633124216_c95519e1")
for artifact in artifacts:
    #print_artifact_info(artifact)
    print(artifact)

client.download_artifacts("experiment_with_mlflow_remote_1633124216_c95519e1","model",".")

<FileInfo: file_size=-1, is_dir=False, path='actuals_vs_predictions.png'>
<FileInfo: file_size=-1, is_dir=True, path='azureml-logs'>
<FileInfo: file_size=-1, is_dir=True, path='azureml-logs'>
<FileInfo: file_size=-1, is_dir=True, path='azureml-logs'>
<FileInfo: file_size=-1, is_dir=True, path='azureml-logs'>
<FileInfo: file_size=-1, is_dir=True, path='azureml-logs'>
<FileInfo: file_size=-1, is_dir=True, path='azureml-logs'>
<FileInfo: file_size=-1, is_dir=True, path='logs'>
<FileInfo: file_size=-1, is_dir=True, path='logs'>
<FileInfo: file_size=-1, is_dir=True, path='logs'>
<FileInfo: file_size=-1, is_dir=True, path='model'>
<FileInfo: file_size=-1, is_dir=True, path='model'>
<FileInfo: file_size=-1, is_dir=True, path='model'>
<FileInfo: file_size=-1, is_dir=True, path='model'>


'/mnt/batch/tasks/shared/LS_root/mounts/clusters/osomorog2/code/Users/osomorog/MLFlow_Scenarios/model'
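The listing above repeats the same path several times. If the repeats get in the way, they can be collapsed on the client side. This is a minimal sketch, assuming each entry exposes a `path` attribute as MLflow `FileInfo` objects do; the stand-in entries mirror paths from the listing.

```python
from types import SimpleNamespace


def unique_artifacts(artifacts):
    """Drop repeated entries, keeping the first item seen for each path."""
    seen = set()
    unique = []
    for artifact in artifacts:
        if artifact.path not in seen:
            seen.add(artifact.path)
            unique.append(artifact)
    return unique


# Stand-ins shaped like mlflow.entities.FileInfo (only the fields used here)
listing = [
    SimpleNamespace(path="actuals_vs_predictions.png", is_dir=False),
    SimpleNamespace(path="azureml-logs", is_dir=True),
    SimpleNamespace(path="azureml-logs", is_dir=True),
    SimpleNamespace(path="logs", is_dir=True),
    SimpleNamespace(path="model", is_dir=True),
    SimpleNamespace(path="model", is_dir=True),
]
for artifact in unique_artifacts(listing):
    print(artifact.path)
```

This only tidies the output; it does not explain why the backend returns repeated entries in the first place.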