# Production deployment to AKS from Databricks using Azure ML SDK
## Train and register a model in Databricks, register it in Azure ML, and deploy the Azure ML model to AKS

The goal of this notebook is to show the steps involved in deploying a model once it is built. Although the example uses scikit-learn, the same methodology can be applied to other machine learning models. This Python notebook was tested on Databricks Runtime 7.3 LTS ML (Spark 3.0.1, Scala 2.12).

## Steps:
* Train a model (in this notebook, a scikit-learn model on a sample dataset).
* Use MLflow to log the model in Databricks.
* Download the MLflow model to the local Databricks environment.
* Register the MLflow model in the Azure ML workspace.
* Deploy the registered model to AKS.

## Setup
* If you are using a cluster running Databricks Runtime, you must install the mlflow library from PyPI. See the install cell below.
* If you are using a cluster running Databricks Runtime ML, the mlflow library is already installed.
* Create an Azure ML workspace.

Install the mlflow library.
This is required for Databricks Runtime clusters only. If you are using a cluster running Databricks Runtime ML, skip the install cell below.

In [0]:
# If you are running Databricks Runtime version 7.1 or above, uncomment these lines and run this cell:
# %pip install mlflow
# %pip install azureml-sdk[databricks]
# %pip install azureml-mlflow

# If you are running Databricks Runtime version 6.4 to 7.0, uncomment these lines and run this cell:
# dbutils.library.installPyPI("mlflow")
# dbutils.library.installPyPI("azureml-sdk[databricks]")
# dbutils.library.installPyPI("azureml-mlflow")

In [0]:
# Install this pinned version if you see errors or warnings when importing azureml-sdk:
# %pip install msrest==0.6.21
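
Before uncommenting the install lines, it can help to check which libraries the cluster already provides. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the check only looks at top-level module names, so it cannot tell which `azureml-*` distribution supplies the `azureml` namespace.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found on this cluster."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Top-level modules that this notebook imports later on.
required = ["mlflow", "azureml", "sklearn"]
print("Missing:", missing_packages(required))
```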

Import the required libraries.

In [0]:
import mlflow
import mlflow.sklearn
from mlflow.tracking.client import MlflowClient
from mlflow.entities import ViewType

import pandas as pd
import matplotlib.pyplot as plt

from numpy import savetxt

from sklearn.model_selection import train_test_split
from sklearn.datasets import load_diabetes

from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

* API reference: [Workspace Class](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.workspace.workspace?view=azure-ml-py)
* NOTE: To sign in, open https://microsoft.com/devicelogin in a web browser and enter the code shown in the cell output to authenticate.

In [0]:
import azureml
from azureml.core import Workspace

subscription_id = "xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx"  # you need the Owner or Contributor role on this subscription
resource_group = "xxxxxx"  # you need the Owner or Contributor role on this resource group
workspace_name = "xxxxx"  # your workspace name
workspace_region = "southeastasia"  # your region (used only if the workspace needs to be created)

workspace = Workspace.create(name = workspace_name,
                             location = workspace_region,
                             resource_group = resource_group,
                             subscription_id = subscription_id,
                             exist_ok=True)
# workspace.write_config()
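
Hard-coded identifiers are easy to leak when a notebook is shared. One option is to read them from environment variables set on the cluster. This is a minimal sketch under that assumption; the variable names (`AML_SUBSCRIPTION_ID` and so on) and the helper `workspace_settings` are hypothetical, not something Azure ML or Databricks defines.

```python
import os

# Hypothetical variable names; use whatever your cluster configuration defines.
REQUIRED_VARS = ("AML_SUBSCRIPTION_ID", "AML_RESOURCE_GROUP", "AML_WORKSPACE_NAME")


def workspace_settings(env=os.environ):
    """Collect the workspace settings from env, failing early if any are missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "subscription_id": env["AML_SUBSCRIPTION_ID"],
        "resource_group": env["AML_RESOURCE_GROUP"],
        "name": env["AML_WORKSPACE_NAME"],
    }


# In the notebook, the result could be passed on as keyword arguments:
# workspace = Workspace.create(location=workspace_region, exist_ok=True, **workspace_settings())
```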

Import the dataset from scikit-learn and create the training and test datasets.
* Docs link [sklearn diabetes dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_diabetes.html)

In [0]:
db = load_diabetes()
X = db.data
y = db.target
X_train, X_test, y_train, y_test = train_test_split(X, y)
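
`train_test_split` shuffles the rows and, by default, holds out 25% of them for testing. Because no `random_state` is passed, the split changes on every run. The sketch below illustrates the idea of a seeded, reproducible split in plain Python. The helper `seeded_split` is hypothetical and does not reproduce scikit-learn's exact shuffle; in the notebook itself, passing an integer `random_state` to `train_test_split` gives the same kind of reproducibility.

```python
import math
import random


def seeded_split(n_samples, test_size=0.25, seed=0):
    """Return (train_idx, test_idx) from a reproducible shuffle of range(n_samples)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n_samples * test_size)
    return indices[n_test:], indices[:n_test]


# The diabetes dataset has 442 rows.
train_idx, test_idx = seeded_split(442)
print(len(train_idx), len(test_idx))
```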

Create a random forest model and log parameters, metrics, and the model using `mlflow.sklearn.autolog()`.
* For details about what information is logged with `autolog()`, refer to the [MLflow documentation](https://mlflow.org/docs/latest/index.html).

In [0]:
mlflow.sklearn.autolog()

# With autolog() enabled, all model parameters, a model score, and the fitted model are automatically logged.  
with mlflow.start_run():
  
  # Set the model parameters. 
  n_estimators = 100
  max_depth = 6
  max_features = 3
  
  # Create and train model.
  rf = RandomForestRegressor(n_estimators = n_estimators, max_depth = max_depth, max_features = max_features)
  rf.fit(X_train, y_train)
  
  # Use the model to make predictions on the test dataset.
  predictions = rf.predict(X_test)
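
The cell above computes `predictions` but does not score them, and `mean_squared_error` is imported without being used. As a minimal sketch, this is the quantity that metric measures, written in plain Python so it runs anywhere. The function name `mean_squared_error_py` is hypothetical. In the notebook, `mean_squared_error(y_test, predictions)` returns the same value, and it could be logged inside the `with` block with `mlflow.log_metric` under a name of your choice.

```python
def mean_squared_error_py(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values for illustration only.
print(mean_squared_error_py([3.0, 5.0, 2.0], [2.0, 5.0, 4.0]))
```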

Get the current notebook path using the Scala dbutils API and pass it to a widget. The widget value can then be retrieved from Python.

In [0]:
%scala
val notebookPath = dbutils.notebook.getContext.notebookPath.getOrElse("")
dbutils.widgets.text("Current_Notebook_Path",notebookPath)

In [0]:
nb_path = dbutils.widgets.get("Current_Notebook_Path")
print(nb_path)

In [0]:
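# Get the current experiment ID; the default experiment is named after the notebook path.
notebook_path = nb_path
exp_id = mlflow.get_experiment_by_name(notebook_path).experiment_id

# The model is a regressor, so rank runs by the training score that autolog records
# rather than by a classification metric such as areaUnderROC.
best_run = MlflowClient().search_runs(
    experiment_ids=[exp_id],
    run_view_type=ViewType.ACTIVE_ONLY,
    max_results=1,
    order_by=["metrics.training_score DESC"]
)[0]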

In [0]:
best_run
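
Combining `order_by` with `max_results=1` and taking element `[0]` selects the top-ranked run. The sketch below mirrors that selection on plain dictionaries to make the logic explicit. The helper `pick_best`, the run IDs, and the metric values are all made up for illustration and are not MLflow objects.

```python
def pick_best(runs, metric, higher_is_better=True):
    """Return the run dict with the best value for metric, ignoring runs that lack it."""
    scored = [run for run in runs if metric in run["metrics"]]
    if not scored:
        raise ValueError(f"No run logged the metric {metric!r}")
    chooser = max if higher_is_better else min
    return chooser(scored, key=lambda run: run["metrics"][metric])


# Made-up runs; run "c" never logged the metric and is skipped.
runs = [
    {"run_id": "a", "metrics": {"training_score": 0.71}},
    {"run_id": "b", "metrics": {"training_score": 0.78}},
    {"run_id": "c", "metrics": {}},
]
print(pick_best(runs, "training_score")["run_id"])
```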

In [0]:
model_name = "my-model-ss"  # Any registered-model name of your choice
artifact_path = "model"  # Artifact folder where the model is saved; keeping it as "model" is recommended
model_stage = "Staging"  # Target stage; the default stage is None, and this example moves the model to Staging
model_uri = f"runs:/{best_run.info.run_id}/{artifact_path}"  # Use the run_id from the best run
# artifact_uri = best_run.info.artifact_uri
# image_name = f"{model_name}-image"
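
This notebook uses two MLflow URI schemes: `runs:/<run_id>/<artifact_path>` for a model logged by a run, and `models:/<name>/<stage>` for a model in the registry. As a small sketch, the two hypothetical helpers below build both forms; the run ID in the example is a placeholder, not a real run.

```python
def run_model_uri(run_id, artifact_path="model"):
    """URI of a model artifact logged under a specific run."""
    return f"runs:/{run_id}/{artifact_path}"


def registry_model_uri(name, stage):
    """URI of a registered model, addressed by stage (for example 'Staging')."""
    return f"models:/{name}/{stage}"


print(run_model_uri("0123456789abcdef"))  # placeholder run ID
print(registry_model_uri("my-model-ss", "Staging"))
```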

In [0]:
import os

import mlflow
from azureml.core.model import Model
from mlflow.store.artifact.models_artifact_repo import ModelsArtifactRepository
from mlflow.tracking.client import MlflowClient

client = MlflowClient()

# Register the model in the Databricks MLflow Model Registry.
# NOTE: Registration can take up to 300 seconds to complete.
databricks_mlflow_model = mlflow.register_model(model_uri=model_uri, name=model_name)

# Move the model version to Staging in the Databricks MLflow Model Registry
client.transition_model_version_stage(
  name=databricks_mlflow_model.name,
  version=databricks_mlflow_model.version,
  stage=model_stage,
) 

model_path = f"models:/{model_name}/{model_stage}"

os.makedirs("model", exist_ok=True)

# Do not change dst_path: score.py loads the model from the "model" folder.
local_path = ModelsArtifactRepository(model_path).download_artifacts("", dst_path="model")

azureml_model = Model.register(workspace=workspace,
                               model_path=local_path,
                               model_name=model_name,
                               description="Test model registry")
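
The scoring script later in this notebook loads `model.pkl` from a folder named `model`, so it can be worth confirming that the downloaded folder actually contains that file before registering it with Azure ML. A minimal sketch of such a check follows; the helper name `has_model_file` is hypothetical.

```python
import os


def has_model_file(model_dir, filename="model.pkl"):
    """Return True if model_dir contains the serialized model file."""
    return os.path.isfile(os.path.join(model_dir, filename))


# In the notebook, this could guard the registration step:
# if not has_model_file(local_path):
#     raise FileNotFoundError(f"model.pkl not found in {local_path}")
```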

Uncomment the cell below to create a new AKS cluster.

In [0]:
# Create new AKS compute
# from azureml.core.compute import AksCompute, ComputeTarget
# prov_config = AksCompute.provisioning_configuration()
# aks_name = 'aks-mlflow'

# # Create the cluster
# aks_target = ComputeTarget.create(workspace=workspace, 
#                                   name=aks_name, 
#                                   provisioning_configuration=prov_config)

# aks_target.wait_for_completion(show_output = True)

# print(aks_target.provisioning_state)
# print(aks_target.provisioning_errors)

Use an existing AKS cluster (NOTE: you need to attach this AKS cluster as an inference target in Azure ML).

In [0]:
# Connect to existing AKS
from azureml.core.compute import AksCompute, ComputeTarget

# Name of the AKS compute target as attached in Azure ML
aks_compute_name = "aks-mlflow"

aks_target = AksCompute(workspace, aks_compute_name)

Sample `score.py` script for the entry point to the container

In [0]:
score_py = """import joblib
import numpy as np
import os

from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType


# The init() method is called once, when the web service starts up.
#
# Typically you would deserialize the model file, as shown here using joblib,
# and store it in a global variable so your run() method can access it later.
def init():
    global model

    # The AZUREML_MODEL_DIR environment variable indicates
    # a directory containing the model file you registered.
    model_filename = 'model.pkl'
    model_path = os.path.join(os.environ['AZUREML_MODEL_DIR'],'model',model_filename)
    model = joblib.load(model_path)


# The run() method is called each time a request is made to the scoring API.
#
# Shown here are the optional input_schema and output_schema decorators
# from the inference-schema pip package. Using these decorators on your
# run() method parses and validates the incoming payload against
# the example input you provide here. This will also generate a Swagger
# API document for your web service.
@input_schema('data', NumpyParameterType(np.array([[0.1, 1.2, 2.3, 3.4, 4.5, 5.6, 6.7, 7.8, 8.9, 9.0]])))
@output_schema(NumpyParameterType(np.array([4429.929236457418])))
def run(data):
    # Use the model object loaded by init().
    result = model.predict(data)

    # You can return any JSON-serializable object.
    return result.tolist()
"""

with open('score.py','w') as f:
    f.write(score_py)
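
A syntax error in the entry script only surfaces after the container has been built, which costs several minutes. As a cheap pre-check, the script source can be parsed locally to confirm that it compiles and defines both `init` and `run`. This is a minimal sketch; the helper `top_level_functions` is hypothetical, and the sample source is a stand-in for the `score_py` string defined above.

```python
import ast


def top_level_functions(source):
    """Parse source and return the names of its top-level functions."""
    tree = ast.parse(source)  # raises SyntaxError if the script is malformed
    return {node.name for node in tree.body if isinstance(node, ast.FunctionDef)}


# Stand-in source; in the notebook, pass the score_py string instead.
sample = "def init():\n    pass\n\ndef run(data):\n    return data\n"
print({"init", "run"} <= top_level_functions(sample))
```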

Deployment to AKS can take up to 10 minutes. If this step takes longer than that, troubleshoot using the API ([link](https://aka.ms/debugimage#dockerlog)) or the Azure ML deployment logs.

In [0]:
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AksWebservice

# Start from the curated AzureML-Minimal environment and customize it as required
# (see the cell below that lists the available environments).
env = Environment.get(workspace=workspace, name="AzureML-Minimal")
curated_clone = env.clone("customize_curated")

conda_dep_pkgs=['joblib','scikit-learn']
pip_pkgs=['azureml-defaults', 'inference-schema']
conda_dep = CondaDependencies()

# Install additional packages as required
for conda_dep_pkg in conda_dep_pkgs:
  conda_dep.add_conda_package(conda_package=conda_dep_pkg)

for pip_pkg in pip_pkgs:
  conda_dep.add_pip_package(pip_package=pip_pkg)

curated_clone.python.conda_dependencies=conda_dep

prod_webservice_name = "diabetes-model-prod"
prod_webservice_deployment_config = AksWebservice.deploy_configuration()

# NOTE: score.py is created in the previous cell and saved to the driver's local path
inference_config = InferenceConfig(entry_script='score.py', environment=curated_clone)

service = Model.deploy(workspace=workspace,
                      name=prod_webservice_name,
                      models=[azureml_model],
                      inference_config=inference_config,
                      deployment_config=prod_webservice_deployment_config,
                      deployment_target = aks_target,
                      overwrite=True)

service.wait_for_deployment(show_output=True)
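
After `wait_for_deployment` returns, it is worth confirming that the service reports the `Healthy` state and surfacing the container logs if it does not. The sketch below relies only on the `state` attribute and the `get_logs()` method of the service object; the helper name `ensure_healthy` is hypothetical.

```python
def ensure_healthy(service):
    """Return the service if it is Healthy, otherwise raise with its container logs."""
    if service.state != "Healthy":
        raise RuntimeError(f"Service state is {service.state!r}:\n{service.get_logs()}")
    return service


# In the notebook: ensure_healthy(service)
```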

In [0]:
print(service.get_logs())

Print the scoring URL (NOTE: this is also available in Azure ML studio under the Endpoints section).

In [0]:
service.scoring_uri
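
Because `score.py` declares its input as `data`, the endpoint expects a JSON body of the form `{"data": [[...]]}` with ten feature values per row. The sketch below builds such a request with the standard library without sending it. The URI and key are placeholders, and the helper `build_scoring_request` is hypothetical. In the notebook, `service.scoring_uri` supplies the URI and, when key authentication is enabled, `service.get_keys()` returns the primary and secondary keys.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, rows, api_key=None):
    """Build (but do not send) a POST request for the scoring endpoint."""
    body = json.dumps({"data": rows}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


# Placeholder URI and key, for illustration only.
request = build_scoring_request(
    "http://example.invalid/api/v1/service/diabetes-model-prod/score",
    [[0.1, 1.2, 2.3, 3.4, 4.5, 5.6, 6.7, 7.8, 8.9, 9.0]],
    api_key="<primary-key>",
)
# urllib.request.urlopen(request) would send it and return the predictions as JSON.
```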

In [0]:
# envs = Environment.list(workspace=workspace)
# # Print list of environments available in Azure ML
# for env in envs:
#     if env.startswith("AzureML"):
#         print("Name",env)
#         print("packages", envs[env].python.conda_dependencies.serialize_to_string())

In [0]:
# Remove the widgets
dbutils.widgets.removeAll()