## Deploy churn prediction model
In this notebook we demonstrate how to take the model generated [here]() and deploy it. We will follow these steps:

- Get an already trained model
- Instantiate an Azure ML Workspace
- Build an image with the best model packaged
- Deploy the model to ACI (Azure Container Instance)
- Deploy the model to AKS (Azure Kubernetes Service)

## First, let's get the model
Return the best model from the `churn-prediction` experiment. We will run the same notebook, **model-churn-prediction**, which returns the `model_uri`.

In [None]:
%run ./model-churn-prediction

Then load the `xgboost` model using the `model_uri` returned from MLflow tracking.

In [None]:
import mlflow

model = mlflow.xgboost.load_model(model_uri)
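
For orientation, MLflow model URIs that point at a tracked run follow the `runs:/<run_id>/<artifact_path>` scheme. The sketch below shows how such a URI could be assembled. The helper name `build_model_uri`, the example run ID, and the artifact path `model` are assumptions for illustration only; the real `model_uri` comes from the **model-churn-prediction** notebook.

```python
def build_model_uri(run_id, artifact_path="model"):
    """Assemble an MLflow 'runs:/' URI (hypothetical helper, not part of MLflow)."""
    if not run_id:
        raise ValueError("run_id must be a non-empty string")
    # Strip stray slashes so the URI has exactly one separator per segment
    return "runs:/{}/{}".format(run_id, artifact_path.strip("/"))


# Example with a made-up run ID (assumption, not a real run)
example_uri = build_model_uri("0123456789abcdef", "model")
print(example_uri)
```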

## Get Azure Machine Learning Workspace
We will use Azure Machine Learning to deliver the API `endpoints` that serve the machine learning models. To interact with Azure ML we will use the [Azure Machine Learning Python SDK](https://docs.microsoft.com/en-us/python/api/overview/azure/ml/?view=azure-ml-py), which makes it possible to create new workspaces (or use existing ones) to facilitate the deployment process.

You must fill in the variables `workspace_name`, `resource_group` and `subscription_id` with your subscription data.

By default, `Interactive Login` authentication is used. For production scenarios, an app registration with a `Service Principal` is required. The [documentation](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-setup-authentication#set-up-service-principal-authentication) has more details about the different kinds of authentication.

First install the [`azureml-sdk`](https://pypi.org/project/azureml-sdk/)

In [None]:
!pip install azureml-sdk

And now we can use it to instantiate the Azure ML Workspace

In [None]:
import azureml
from azureml.core import Workspace
import mlflow.azureml

workspace_name = '<YOUR-WORKSPACE-NAME>'
resource_group = '<YOUR-RESOURCE-GROUP>'
subscription_id = '<YOUR-SUBSCRIPTION-ID>'

workspace = Workspace.get(name = workspace_name,
                          resource_group = resource_group,
                          subscription_id = subscription_id)
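
For the production scenario mentioned earlier, the workspace can be retrieved with a `Service Principal` instead of the interactive login. The following is a minimal sketch, not a hardened setup: the environment variable names (`AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`) and the helper names are assumptions chosen for illustration, while `ServicePrincipalAuthentication` and the `auth` argument of `Workspace.get` come from the Azure ML SDK.

```python
import os

# Hypothetical variable names; adapt them to however your secrets are stored
REQUIRED_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def read_sp_settings(env=os.environ):
    """Collect the service principal settings, failing early if any are missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


def get_workspace_with_sp(workspace_name, resource_group, subscription_id, env=os.environ):
    """Return the Azure ML Workspace, authenticating with a service principal."""
    # Imported here so the helpers above can be used without azureml-sdk installed
    from azureml.core import Workspace
    from azureml.core.authentication import ServicePrincipalAuthentication

    settings = read_sp_settings(env)
    auth = ServicePrincipalAuthentication(
        tenant_id=settings["AZURE_TENANT_ID"],
        service_principal_id=settings["AZURE_CLIENT_ID"],
        service_principal_password=settings["AZURE_CLIENT_SECRET"])
    return Workspace.get(name=workspace_name,
                         resource_group=resource_group,
                         subscription_id=subscription_id,
                         auth=auth)
```

Calling `get_workspace_with_sp(workspace_name, resource_group, subscription_id)` would then replace the `Workspace.get` call above.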

## Register the model
Now that we have instantiated the Azure ML Workspace, we can register the model. First we persist it to DBFS so that its path can be passed as a parameter to the Azure ML model registration.

In [None]:
import shutil
model_path = '/dbfs/models/churn-prediction'

# Delete old files if necessary
try:
    shutil.rmtree(model_path)
except FileNotFoundError:
    print("Cleaning model directory: {} \nThis directory hasn't been created yet.".format(model_path))
else:
    print("Cleaning model directory: {} \nDirectory successfully cleaned.".format(model_path))

# Persist the XGBoost model
mlflow.xgboost.save_model(model, model_path)
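
Before registering, it can be worth confirming that the directory was actually written. The sketch below checks for the `MLmodel` descriptor and the `conda.yaml` file that MLflow saves with the model (the latter is reused later for the inference environment). The helper name `missing_model_files` is an assumption for illustration.

```python
import os


def missing_model_files(model_dir, expected=("MLmodel", "conda.yaml")):
    """Return the expected file names that are not present in model_dir."""
    return [name for name in expected
            if not os.path.isfile(os.path.join(model_dir, name))]


# Usage with the path from the cell above:
# missing = missing_model_files('/dbfs/models/churn-prediction')
# if missing:
#     raise FileNotFoundError("Model directory is incomplete: {}".format(missing))
```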

In [None]:
from azureml.core.model import Model

model_name = 'churn-model'
model_description = 'Churn prediction model using XGBoost'

model_azure = Model.register(model_path = model_path,
                             model_name = model_name,
                             description = model_description,
                             workspace = workspace,
                             tags={'Framework': "XGBoost", 'Type': "Classification"}
                             )

A new model version was generated in the Azure ML Workspace. We can use it to deploy an API with ACI or AKS.

## Deploy
Now with the registered model we can choose between two deployment types: `ACI` (Azure Container Instance) or `AKS` (Azure Kubernetes Service).

For development scenarios it is better to use `ACI`; for production, `AKS` offers more options related to scalability and security. Please see more details on this [page](https://docs.microsoft.com/en-us/azure/architecture/reference-architectures/ai/mlops-python).

### Entry script
Before deploying the model, we need to define an **`entry script`** named `score.py`. It is responsible for loading the model when the deployed service starts, and for receiving data, passing it to the model, and returning the response (see this [link](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-existing-model#define-inference-configuration)).

The [`inference-schema`](https://github.com/Azure/InferenceSchema) package is used to set schema decorators. It is especially useful for automatically generating [**Swagger-ready**](https://en.wikipedia.org/wiki/Swagger_(software)) documentation.

In [None]:
!pip install inference-schema

In [None]:
%%writefile /dbfs/models/churn-prediction/score.py

import mlflow
import json
import pandas as pd
import os
import xgboost as xgb
import time
import numpy as np

from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType

# Called when the deployed service starts
def init():
    global model

    # Get the path where the deployed model can be found.
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'churn-prediction')

    # Load model
    model = mlflow.xgboost.load_model(model_path)

# Our sample payload (to be able to automatically generate swagger file)
input_sample = pd.DataFrame(data=[{"Idade": 21,
                                   "RendaMensal": 9703, 
                                   "PercentualUtilizacaoLimite": 1.0, 
                                   "QtdTransacoesNegadas": 5.0, 
                                   "AnosDeRelacionamentoBanco": 12.0, 
                                   "JaUsouChequeEspecial": 0.0, 
                                   "QtdEmprestimos": 1.0, 
                                   "NumeroAtendimentos": 100, 
                                   "TMA": 300, 
                                   "IndiceSatisfacao": 2, 
                                   "Saldo": 6438, 
                                   "CLTV": 71}])

# This is an integer type sample. Use the data type that reflects the expected result.
output_sample = np.array([0])

@input_schema('data', PandasParameterType(input_sample))
@output_schema(NumpyParameterType(output_sample))
def run(data):
    try:
        print("receiving input_data....")
        print(data.columns)

        data_xgb = xgb.DMatrix(data)
        
        print("predicting....")
        result = model.predict(data_xgb)

        print("result.....")
        print(result)
        # You can return any data type, as long as it can be serialized by JSON.
        return result.tolist()
    except Exception as e:
        error = str(e)
        return error

### Inference config
We must now define the inference configuration used by the endpoint. It specifies the required packages and an environment that can be registered in the Azure ML Workspace.

Here we reuse the `conda.yaml` file that MLflow already saved alongside the model, and add the `azureml-defaults` and `inference-schema` packages, which are needed by the inference process.

In [None]:
from azureml.core.model import InferenceConfig
from azureml.core.environment import Environment
from azureml.core.conda_dependencies import CondaDependencies

# Create the environment
env = Environment(name='xgboost_env')

conda_dep = CondaDependencies('/dbfs/models/churn-prediction/conda.yaml')

# Define the packages needed by the model and scripts
conda_dep.add_pip_package("azureml-defaults")
conda_dep.add_pip_package("inference-schema")

# Add the dependencies to the Python section of env
env.python.conda_dependencies = conda_dep

inference_config = InferenceConfig(entry_script="/dbfs/models/churn-prediction/score.py",
                                   environment=env)

With the inference config in place, we can proceed with the deployment.

### ACI - Azure Container Instance
Next we demonstrate how to create an `endpoint` from the registered model and the inference config, and serve it with `ACI`.

In [None]:
from azureml.core.webservice import AciWebservice, Webservice
from azureml.exceptions import WebserviceException
from azureml.core.model import Model

endpoint_name = 'api-churn-dev'

deployment_config = AciWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1, auth_enabled=True)
service = Model.deploy(workspace, endpoint_name, [model_azure], inference_config, deployment_config, overwrite=True)
service.wait_for_deployment(show_output = True)

print('The API {} was created in state {}'.format(service.scoring_uri, service.state))
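
If the deployment does not end in the `Healthy` state, the container logs usually explain why (for example, an exception raised inside `init()`). The sketch below wraps that check. The helper name `describe_service` is an assumption; `state`, `scoring_uri` and `get_logs()` are attributes of the deployed web service object in the Azure ML SDK.

```python
def describe_service(service):
    """Summarize a deployed web service and append its logs when it is not healthy."""
    summary = "Service at {} is in state {}".format(service.scoring_uri, service.state)
    if service.state != "Healthy":
        # get_logs() returns the container logs as a single string
        summary += "\n" + service.get_logs()
    return summary


# Usage after the deployment cell above:
# print(describe_service(service))
```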

## Load some data to test the endpoint
We will use the same dataset that was used to train the model, for testing purposes only.

In [None]:
import requests
import json

# A churn payload
payload1 = {
    "data":
    [
        {
            'Idade': "21",
            'RendaMensal': "9703",
            'PercentualUtilizacaoLimite': "1",
            'QtdTransacoesNegadas': "5",
            'AnosDeRelacionamentoBanco': "12",
            'JaUsouChequeEspecial': "0",
            'QtdEmprestimos': "1",
            'NumeroAtendimentos': "100",
            'TMA': "300",
            'IndiceSatisfacao': "2",
            'Saldo': "6438",
            'CLTV': "71",
        },
    ],
}

payload1 = str.encode(json.dumps(payload1))

# A non-churn payload
payload2 = {
    "data":
    [
        {
            'Idade': "48",
            'RendaMensal': "9703",
            'PercentualUtilizacaoLimite': "1",
            'QtdTransacoesNegadas': "5",
            'AnosDeRelacionamentoBanco': "12",
            'JaUsouChequeEspecial': "0",
            'QtdEmprestimos': "1",
            'NumeroAtendimentos': "1",
            'TMA': "300",
            'IndiceSatisfacao': "5",
            'Saldo': "6438",
            'CLTV': "71",
        },
    ],
}

payload2 = str.encode(json.dumps(payload2))
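
Both payloads share the same shape: a JSON object whose `data` key holds a list of records with the twelve feature columns. As a sketch, a small helper can build that body and catch a missing column before the request is sent. The names `FEATURE_COLUMNS` and `build_payload` are assumptions for illustration; the column names are the ones used in the payloads above.

```python
import json

# Feature columns expected by the entry script's input sample
FEATURE_COLUMNS = [
    "Idade", "RendaMensal", "PercentualUtilizacaoLimite", "QtdTransacoesNegadas",
    "AnosDeRelacionamentoBanco", "JaUsouChequeEspecial", "QtdEmprestimos",
    "NumeroAtendimentos", "TMA", "IndiceSatisfacao", "Saldo", "CLTV",
]


def build_payload(record):
    """Encode one record as the request body, validating the feature columns."""
    missing = [col for col in FEATURE_COLUMNS if col not in record]
    if missing:
        raise ValueError("Missing feature columns: " + ", ".join(missing))
    # Keep only the known columns, in a stable order
    row = {col: record[col] for col in FEATURE_COLUMNS}
    return json.dumps({"data": [row]}).encode("utf-8")
```

For example, `build_payload({...})` with the twelve keys of the churn record would produce the same kind of bytes object as `payload1`.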

## Call the API
Make two requests to the API using `payload1` and `payload2`. The API URL can be obtained through `service.scoring_uri`, which is generated by the deployment process.

In [None]:
dev_service_key = service.get_keys()[0] if len(service.get_keys()) > 0 else None

headers = {'Content-Type': 'application/json'}
headers["Authorization"] = "Bearer {service_key}".format(service_key=dev_service_key)

response1 = requests.request("POST", service.scoring_uri, headers=headers, data=payload1)
response2 = requests.request("POST", service.scoring_uri, headers=headers, data=payload2)

print(response1.text)
print(response2.text)
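
Because `run()` in `score.py` returns the exception message as a plain string when something fails, a successful HTTP status alone does not guarantee a prediction. The sketch below is one way to tell the two cases apart on the client side. The helper name `parse_scoring_response` is an assumption, and it assumes that a successful call returns a JSON list, as `result.tolist()` suggests.

```python
import json


def parse_scoring_response(status_code, body_text):
    """Return the list of predictions, or raise if the call or the script failed."""
    if status_code != 200:
        raise RuntimeError("Request failed with status {}: {}".format(status_code, body_text))
    try:
        parsed = json.loads(body_text)
    except ValueError:
        raise RuntimeError("Unexpected non-JSON response: " + body_text)
    if not isinstance(parsed, list):
        # score.py returns str(e) on errors, which does not decode to a list
        raise RuntimeError("Scoring script did not return predictions: {!r}".format(parsed))
    return parsed


# Usage with the responses above:
# predictions = parse_scoring_response(response1.status_code, response1.text)
```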

It is also possible to call the API with any HTTP client (curl, Postman, etc.).

## Azure Kubernetes Service (AKS)
For production scenarios it is better to deploy to AKS because it offers more security and scalability benefits.

There are two options in this scenario: creating a new AKS cluster or targeting an existing one. In this tutorial we will create a new cluster.

In [None]:
from azureml.core.webservice import AksWebservice
from azureml.core.compute import AksCompute, ComputeTarget

aks_name = 'aks-e2e-ds'

prov_config = AksCompute.provisioning_configuration()

aks_target = ComputeTarget.create(workspace = workspace, name = aks_name, provisioning_configuration = prov_config)

# If you want to use an existing AKS cluster, comment out the previous command and uncomment the next one:
#aks_target = AksCompute(workspace, aks_name)

aks_target.wait_for_completion(show_output = True)
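
The two options (new cluster or existing one) can also be combined so the cell is safe to re-run. This is a sketch under the assumption that looking up a compute target that does not exist raises `ComputeTargetException`, as in the Azure ML SDK samples; the helper name `get_or_create_aks` is hypothetical.

```python
def get_or_create_aks(workspace, aks_name):
    """Return the AKS compute target named aks_name, creating it if it does not exist."""
    # Imported here so the function can be defined before azureml-sdk is installed
    from azureml.core.compute import AksCompute, ComputeTarget
    from azureml.exceptions import ComputeTargetException

    try:
        # Reuse a cluster that is already registered in the workspace
        aks_target = AksCompute(workspace, aks_name)
    except ComputeTargetException:
        # Not found: provision a new cluster with the default configuration
        prov_config = AksCompute.provisioning_configuration()
        aks_target = ComputeTarget.create(workspace=workspace,
                                          name=aks_name,
                                          provisioning_configuration=prov_config)
        aks_target.wait_for_completion(show_output=True)
    return aks_target


# Usage: aks_target = get_or_create_aks(workspace, 'aks-e2e-ds')
```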


In [None]:
from azureml.core.compute import AksCompute, ComputeTarget

aks_name = 'aks-e2e-ds'
aks_target = ComputeTarget(workspace=workspace, name=aks_name)

# Uncomment the next line to delete the AKS cluster
#aks_target.delete()

In [None]:
from azureml.core.webservice import AksWebservice

endpoint_name = 'api-churn-prod'

aks_config = AksWebservice.deploy_configuration(compute_target_name=aks_name)

aks_service = Model.deploy(workspace=workspace,
                           name=endpoint_name,
                           models=[model_azure],
                           inference_config=inference_config,
                           deployment_config=aks_config,
                           deployment_target=aks_target,
                           overwrite=True
                          )

aks_service.wait_for_deployment(show_output = True)
print(aks_service.state)

## Call the API (with AKS)

In [None]:
prod_service_key = aks_service.get_keys()[0] if len(aks_service.get_keys()) > 0 else None

headers = {'Content-Type': 'application/json'}
headers["Authorization"] = "Bearer {service_key}".format(service_key=prod_service_key)

response1 = requests.request("POST", aks_service.scoring_uri, headers=headers, data=payload1)
response2 = requests.request("POST", aks_service.scoring_uri, headers=headers, data=payload2)

print(response1.text)
print(response2.text)