## Deploy churn prediction model
This notebook demonstrates how to take the model generated [here]() and deploy it. The steps are:

- Get an already trained model
- Instantiate an Azure ML Workspace
- Build an image with the best model packaged
- Deploy the model to ACI (Azure Container Instance)
- Deploy the model to AKS (Azure Kubernetes Service)

## First, let's get the model
Retrieve the best model from the `churn-prediction` experiment. We run the same notebook, **model-churn-prediction**, which returns the `model_uri`.

In [0]:
%run ./model-churn-prediction

Then load the `xgboost` model using the `model_uri` returned from MLflow tracking.

In [0]:
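import mlflow
import mlflow.xgboost  # make sure the XGBoost flavor is loaded

# model_uri is defined by the %run of model-churn-prediction above
model = mlflow.xgboost.load_model(model_uri)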

## Get Azure Machine Learning Workspace
We will use Azure Machine Learning to deliver the API `endpoints` that serve the Machine Learning models. To interact with Azure ML we use the [Azure Machine Learning Python SDK](https://docs.microsoft.com/en-us/python/api/overview/azure/ml/?view=azure-ml-py), which makes it possible to create new workspaces (or use existing ones) to facilitate the deployment process.

Fill in the variables `workspace_name`, `resource_group` and `subscription_id` with your subscription data. A workspace location would only be needed when creating a new workspace, which this notebook does not do.

By default, `Interactive Login` authentication is used. For production scenarios, an app registration with a `Service Principal` is required. The [documentation](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-setup-authentication#set-up-service-principal-authentication) has more details about the different kinds of authentication.

First install the [`azureml-sdk`](https://pypi.org/project/azureml-sdk/):

In [0]:
%sh

pip install azureml-sdk

Now we can use it to instantiate the Azure ML Workspace:

In [0]:
import azureml
from azureml.core import Workspace
import mlflow.azureml

workspace_name = '<YOUR-WORKSPACE-NAME>'
resource_group = '<YOUR-RESOURCE-GROUP>'
subscription_id = '<YOUR-SUBSCRIPTION-ID>'

workspace = Workspace.get(name = workspace_name,
                          resource_group = resource_group,
                          subscription_id = subscription_id)
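
For production, the interactive login above can be swapped for service principal authentication. The following is a minimal sketch, not a definitive setup: the environment variable names (`AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`) and both helper names are assumptions made for illustration, and the service principal is assumed to already have access to the workspace. The SDK is imported inside the function so the block loads even before `azureml-sdk` is installed.

```python
import os

# Hypothetical environment variable names; use whatever your secret store provides
REQUIRED_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def read_credentials(environ=os.environ):
    """Collect the service principal settings, failing early if one is missing."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}


def get_workspace_with_service_principal(name, resource_group, subscription_id):
    """Return the workspace without an interactive login prompt."""
    # Imported here so this sketch loads even before azureml-sdk is installed
    from azureml.core import Workspace
    from azureml.core.authentication import ServicePrincipalAuthentication

    creds = read_credentials()
    auth = ServicePrincipalAuthentication(
        tenant_id=creds["AZURE_TENANT_ID"],
        service_principal_id=creds["AZURE_CLIENT_ID"],
        service_principal_password=creds["AZURE_CLIENT_SECRET"],
    )
    return Workspace.get(name=name,
                         auth=auth,
                         subscription_id=subscription_id,
                         resource_group=resource_group)
```

With the three variables set, `get_workspace_with_service_principal(workspace_name, resource_group, subscription_id)` would take the place of the `Workspace.get` call above.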

## Register the model
Now that we have instantiated the Azure ML Workspace, we can register the model. First we persist it to DBFS so that its path can be passed as a parameter to the Azure ML registration call.

In [0]:
import shutil

model_path = '/dbfs/models/churn-prediction'

# Delete old files if necessary
try:
    shutil.rmtree(model_path)
except FileNotFoundError:
    print("Cleaning model directory: {}\nThis directory hasn't been created yet.".format(model_path))
else:
    print("Cleaning model directory: {}\nDirectory successfully cleaned.".format(model_path))

# Persist the XGBoost model
mlflow.xgboost.save_model(model, model_path)

In [0]:
from azureml.core.model import Model

model_name = 'churn-model'
model_description = 'Churn prediction model using XGBoost'

model_azure = Model.register(model_path = model_path,
                             model_name = model_name,
                             description = model_description,
                             workspace = workspace,
                             tags={'Framework': "XGBoost", 'Type': "Classification"}
                             )

A new model version was generated in the Azure ML Workspace. We can use it to deploy an API with ACI or AKS.

## Deploy
With the model registered, we can choose between two deployment types: `ACI` (Azure Container Instance) or `AKS` (Azure Kubernetes Service).

For development scenarios `ACI` is the better fit, while for production `AKS` offers more options for scalability and security. See this [page](https://docs.microsoft.com/en-us/azure/architecture/reference-architectures/ai/mlops-python) for more details.

### Entry script
Before deploying the model, we need to define an **`entry script`** named `score.py`. It is responsible for loading the model when the deployed service starts, and for receiving data, passing it to the model, and returning the response (see this [link](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-existing-model#define-inference-configuration)).

In [0]:
%%writefile /dbfs/models/churn-prediction/score.py

import json
import os
import time

import mlflow.xgboost
import pandas as pd
import xgboost as xgb


# Called when the deployed service starts
def init():
    global model

    # Get the path where the deployed model can be found.
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'churn-prediction')

    # Load model
    model = mlflow.xgboost.load_model(model_path)


# Handle requests to the service
def run(data):
    # JSON request in pandas "split" orientation, for example:
    # {"columns": ["Idade", "RendaMensal"], "data": [[21, 9703]]}
    # (abbreviated; the model expects every feature column used in training)

    info = {"payload": data}
    print(json.dumps(info))

    data = pd.read_json(data, orient='split')
    data_xgb = xgb.DMatrix(data)

    # Return the prediction
    prediction = predict(data_xgb)
    print("Prediction created at: " + time.strftime("%H:%M:%S"))

    return prediction


def predict(data):
    # Only the first row of the request is scored
    prediction = model.predict(data)[0]
    return {"churn-prediction": str(prediction)}
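
The `init()`/`run()` contract used by an entry script can be tried locally before deploying. The sketch below is a simplified stand-in, not the script itself: `StubModel` is a hypothetical placeholder that always predicts 0.5, and the request is parsed with `json` instead of pandas and XGBoost so that it runs anywhere. It also wraps the request handling in a `try`/`except`, an addition that is not part of the original script, so that a malformed payload produces a readable error.

```python
import json

model = None  # populated once by init(), reused by every run() call


class StubModel:
    """Hypothetical stand-in for the XGBoost model; always predicts 0.5."""

    def predict(self, rows):
        return [0.5 for _ in rows]


def init():
    """Called once when the service starts: load the model into a global."""
    global model
    model = StubModel()


def run(raw_data):
    """Called once per request: parse the 'split'-oriented JSON and predict."""
    try:
        body = json.loads(raw_data)
        rows = body["data"]
        prediction = model.predict(rows)[0]
        return {"churn-prediction": str(prediction)}
    except Exception as exc:
        # Give the caller a readable message instead of an unhandled exception
        return {"error": str(exc)}


init()
print(run('{"columns": ["Idade"], "data": [[21]]}'))
print(run("not json"))
```

The first call goes through the normal path and the second one exercises the error branch.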

### Inference config
Next we define the inference configuration used by the endpoint: the required packages and an environment, which can be registered in the Azure ML Workspace.

Here we reuse the `conda.yaml` file that MLflow wrote next to the saved model, and add the `azureml-defaults` package, which Azure ML requires to host the model as a web service.

In [0]:
from azureml.core.model import InferenceConfig
from azureml.core.environment import Environment
from azureml.core.conda_dependencies import CondaDependencies

# Create the environment
env = Environment(name='xgboost_env')

conda_dep = CondaDependencies('/dbfs/models/churn-prediction/conda.yaml')

# Define the packages needed by the model and scripts
conda_dep.add_pip_package("azureml-defaults")

# Attach the dependencies to the Python section of env
env.python.conda_dependencies=conda_dep

inference_config = InferenceConfig(entry_script="/dbfs/models/churn-prediction/score.py",
                                   environment=env)

With the inference config in place, we can proceed with the deployment.

### ACI - Azure Container Instance
Next we create an `endpoint` from the registered model and the inference config defined above, and serve it with `ACI`.

In [0]:
from azureml.core.webservice import AciWebservice, Webservice
from azureml.exceptions import WebserviceException
from azureml.core.model import Model

endpoint_name = 'api-churn-dev'

deployment_config = AciWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1)
service = Model.deploy(workspace, endpoint_name, [model_azure], inference_config, deployment_config, overwrite=True)
service.wait_for_deployment(show_output = True)

print('The API {} was created in state {}'.format(service.scoring_uri, service.state))

## Load some data to test the endpoint
We will use the same dataset used to train the model only for testing purposes.

In [0]:
import requests

payload1='{"columns":["Idade","RendaMensal","PercentualUtilizacaoLimite","QtdTransacoesNegadas","AnosDeRelacionamentoBanco","JaUsouChequeEspecial","QtdEmprestimos","NumeroAtendimentos","TMA","IndiceSatisfacao","Saldo","CLTV"],"data":[[21,9703,1.0,5.0,12.0,0.0,1.0,100,300,2,6438,71]]}'

payload2='{"columns":["Idade","RendaMensal","PercentualUtilizacaoLimite","QtdTransacoesNegadas","AnosDeRelacionamentoBanco","JaUsouChequeEspecial","QtdEmprestimos","NumeroAtendimentos","TMA","IndiceSatisfacao","Saldo","CLTV"],"data":[[21,9703,1.0,5.0,12.0,0.0,1.0,1,5,5,6438,71]]}'
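
The payloads follow the `split` orientation that `pd.read_json(..., orient='split')` expects in the entry script: a `columns` list plus a `data` list of rows. As a sketch, the same string can be built with `json.dumps` rather than written by hand; the helper name `build_payload` is an assumption for illustration, and the values are those of `payload1`.

```python
import json

columns = ["Idade", "RendaMensal", "PercentualUtilizacaoLimite",
           "QtdTransacoesNegadas", "AnosDeRelacionamentoBanco",
           "JaUsouChequeEspecial", "QtdEmprestimos", "NumeroAtendimentos",
           "TMA", "IndiceSatisfacao", "Saldo", "CLTV"]

rows = [[21, 9703, 1.0, 5.0, 12.0, 0.0, 1.0, 100, 300, 2, 6438, 71]]


def build_payload(columns, rows):
    """Serialize rows into the 'split' layout, checking each row's width."""
    for row in rows:
        if len(row) != len(columns):
            raise ValueError(
                "Expected {} values per row, got {}".format(len(columns), len(row)))
    return json.dumps({"columns": columns, "data": rows})


payload = build_payload(columns, rows)
print(payload)
```

The resulting string differs from the handwritten one only in whitespace, so it parses to the same JSON document.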

## Call the API
Make a request to the API using `payload1` and `payload2`. The API URL is available as `service.scoring_uri` once the deployment has finished.

In [0]:
headers = {
  'Content-Type': 'application/json'
}

response1 = requests.request("POST", service.scoring_uri, headers=headers, data=payload1)
response2 = requests.request("POST", service.scoring_uri, headers=headers, data=payload2)

print(response1.text)
print(response2.text)

The API can also be called from any HTTP client (curl, Postman, etc.).
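
As one example of another client, the same request can be prepared with `urllib` from Python's standard library. This is a sketch only: the URL below is a placeholder (in the notebook it would be `service.scoring_uri`), the payload is an empty illustration, and `build_scoring_request` is a hypothetical helper name. The request is built but not sent.

```python
import urllib.request


def build_scoring_request(scoring_uri, payload):
    """Prepare a POST request carrying a JSON payload string."""
    return urllib.request.Request(
        scoring_uri,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL and payload, for illustration only
request = build_scoring_request("http://example.invalid/score",
                                '{"columns": [], "data": []}')
print(request.get_method(), request.full_url)

# To actually send it against a live endpoint:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```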

## Azure Kubernetes Service (AKS)
For production scenarios it is better to deploy to AKS, which offers more options for security and scalability.

There are two ways to do this: create a new AKS cluster or target an existing one. In this tutorial we will create a new cluster.

In [0]:
from azureml.core.webservice import AksWebservice
from azureml.core.compute import AksCompute, ComputeTarget

aks_name = 'aks-e2e-ds'

prov_config = AksCompute.provisioning_configuration()

aks_target = ComputeTarget.create(workspace = workspace, name = aks_name, provisioning_configuration = prov_config)

# To use an existing AKS cluster, comment out the previous line and uncomment the next one:
#aks_target = AksCompute(workspace, aks_name)

aks_target.wait_for_completion(show_output = True)


In [0]:
# Deleting aks cluster
#aks_target = ComputeTarget(workspace=workspace, name=aks_name)
#aks_target.delete()

In [0]:
endpoint_name = 'api-churn-prod'

aks_config = AksWebservice.deploy_configuration(compute_target_name=aks_name)

aks_service = Model.deploy(workspace=workspace,
                           name=endpoint_name,
                           models=[model_azure],
                           inference_config=inference_config,
                           deployment_config=aks_config,
                           deployment_target=aks_target)

aks_service.wait_for_deployment(show_output = True)
print(aks_service.state)

## Call the API (with AKS)

In [0]:
prod_service_key = aks_service.get_keys()[0] if len(aks_service.get_keys()) > 0 else None

headers["Authorization"] = "Bearer {service_key}".format(service_key=prod_service_key)

response1 = requests.request("POST", aks_service.scoring_uri, headers=headers, data=payload1)
response2 = requests.request("POST", aks_service.scoring_uri, headers=headers, data=payload2)

print(response1.text)
print(response2.text)
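
The header handling above can be made a little more defensive. The sketch below assumes key-based authentication on the service and uses a hypothetical helper name, `build_headers`: it returns a fresh dictionary on every call, so the headers used for the ACI requests are left untouched, and it leaves out the `Authorization` header when no key is available instead of sending `Bearer None`.

```python
def build_headers(service_key=None):
    """Return request headers, adding the bearer token only when a key exists."""
    headers = {"Content-Type": "application/json"}
    if service_key:
        headers["Authorization"] = "Bearer {}".format(service_key)
    return headers


# Illustrative key value only; in the notebook it would come from aks_service.get_keys()
print(build_headers("my-example-key"))
print(build_headers())
```

In the notebook this would be used as `requests.request("POST", aks_service.scoring_uri, headers=build_headers(prod_service_key), data=payload1)`.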