Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# Real-time Forecasting Webservice Deployment - Automated ML
---

In this notebook we deploy multiple webservices that forecast sales in real time, using the models we trained in the previous step.

Models are grouped by their tags, and each group is deployed to the same webservice. You can customize the grouping strategy by changing which model tags are used for grouping.

### Prerequisites 
At this point, you should have already: 
1. Created your AML Workspace using the [00_Setup_AML_Workspace notebook](../../00_Setup_AML_Workspace.ipynb)
2. Run [01_Data_Preparation.ipynb](../../01_Data_Preparation.ipynb) to create the dataset
3. Run [02_AutoML_Training_Pipeline.ipynb](../02_AutoML_Training_Pipeline/02_AutoML_Training_Pipeline.ipynb) to train the models

## 1.0 Connect to workspace

In [None]:
import azureml.core
from azureml.core import Workspace, Datastore
import pandas as pd

# set up workspace
ws = Workspace.from_config()

# Take a look at Workspace
ws.get_details()

# set up datastores
dstore = ws.get_default_datastore()

output = {}
output['SDK version'] = azureml.core.VERSION
output['Subscription ID'] = ws.subscription_id
output['Workspace'] = ws.name
output['Resource Group'] = ws.resource_group
output['Location'] = ws.location
output['Default datastore name'] = dstore.name
# Show full column contents (None replaces the deprecated -1 in pandas >= 1.0)
pd.set_option('display.max_colwidth', None)
outputDf = pd.DataFrame(data = output, index = [''])
outputDf.T

## 2.0 Get models to be deployed

### 2.1 Get models registered in the workspace that were trained by a run

In [None]:
from azureml.core import Model
runid = '<update pipeline run id>'  # ID of the training pipeline run
tags = [['ModelType', 'AutoML'], ['RunId', runid]]

models = Model.list(ws, tags=tags, latest=True)
print('Got '+str(len(models))+' models from the workspace.')

### 2.2 Group models by store

We will group the models by store, so each group contains three models, one for each of the orange juice brands, all corresponding to the same store.

You can change the grouping strategy by modifying the `grouping_tags` variable below and specifying the names of the tags you want to group by. For convenience, we have created two additional grouping tags you can use:
- `StoreGroup10`: groups stores in sets of 10
- `StoreGroup100`: groups stores in sets of 100

To create custom tags, modify the `tags_dict` object in the [training script](scripts/train.py) and run the training again.

In [None]:
grouping_tags = ['Store']

In [None]:
grouped_models = {}
for m in models:
    
    if m.tags['ModelType'] == '_meta_':
        continue
    
    group_name = '/'.join([m.tags[t] for t in grouping_tags])
    group = grouped_models.setdefault(group_name, [])
    group.append(m)
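
To see how the choice of `grouping_tags` changes the groups, here is a minimal sketch that runs the same logic on stand-in objects instead of registered models. The `group_by_tags` helper, the `fake_models` list and the `Brand` tag are assumptions made for illustration; only the `Store` and `ModelType` tags appear in this notebook.

```python
from types import SimpleNamespace

def group_by_tags(models, grouping_tags):
    """Group objects exposing a `tags` dict by the joined values of the chosen tags."""
    grouped = {}
    for m in models:
        # Skip the routing metadata model, as the loop above does
        if m.tags.get('ModelType') == '_meta_':
            continue
        group_name = '/'.join(m.tags[t] for t in grouping_tags)
        grouped.setdefault(group_name, []).append(m)
    return grouped

# Hypothetical stand-ins for registered models (only `name` and `tags` are used)
fake_models = [
    SimpleNamespace(name='m1', tags={'ModelType': 'AutoML', 'Store': '1002', 'Brand': 'dominicks'}),
    SimpleNamespace(name='m2', tags={'ModelType': 'AutoML', 'Store': '1002', 'Brand': 'tropicana'}),
    SimpleNamespace(name='m3', tags={'ModelType': 'AutoML', 'Store': '1003', 'Brand': 'dominicks'}),
]

by_store = group_by_tags(fake_models, ['Store'])
by_store_and_brand = group_by_tags(fake_models, ['Store', 'Brand'])
```

With a single tag the group name is just the tag value; with several tags the values are joined with `/`, so finer grouping produces more, smaller groups and therefore more webservices.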

## 3.0 Configure deployment

### 3.1 Define inference environment

In [None]:
from scripts.helper import get_automl_environment
forecast_env = get_automl_environment()

### 3.2 Define inference configuration

In [None]:
from azureml.core.model import InferenceConfig

inference_config = InferenceConfig(
    entry_script='forecast_webservice.py',
    source_directory='./scripts',
    environment=forecast_env
)

### 3.3 [Option A] Define deploy configuration using ACI (dev/test)

Use this option to deploy the models to Azure Container Instances, which is recommended for dev/test environments.

In [None]:
from azureml.core.webservice import AciWebservice

deployment_type = 'aci'
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
deployment_target = None

### 3.3 [Option B] Define deploy configuration using AKS (production)

Use this option to deploy the models to Azure Kubernetes Service, which is recommended for production environments.

In [None]:
aks_target_name = 'manymodels-aks'

In [None]:
from azureml.core.compute import AksCompute
from azureml.core.compute_target import ComputeTargetException

try:
    aks_target = AksCompute(ws, aks_target_name)
    print('AKS cluster already attached. Skip the optional step below and jump to "Configure AKS"')
except ComputeTargetException:
    print('AKS cluster not attached yet. Run the optional step below to do so')

#### [Optional] Attach AKS cluster

Attach an existing AKS cluster as a compute target in Azure Machine Learning. This only needs to be run the first time.

In [None]:
aks_resource_name = '<my-aks-name>'
aks_resource_group = '<my-aks-resource-group>'

In [None]:
from azureml.core.compute import ComputeTarget

attach_config = AksCompute.attach_configuration(
    resource_group=aks_resource_group,
    cluster_name=aks_resource_name
)

aks_target = ComputeTarget.attach(ws, aks_target_name, attach_config)
aks_target.wait_for_completion(show_output=True)

#### Configure AKS

In [None]:
from azureml.core.webservice import AksWebservice

deployment_type = 'aks'
deployment_config = AksWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
deployment_target = aks_target

## 4.0 Deploy the models

We will now deploy one webservice for each group of models. Each deployment takes several minutes to complete, so we'll request all of them first and then wait for them to finish.

In [None]:
deployments = []
for group_name, group_models in grouped_models.items():
    
    service_name = '{prefix}manymodels-{group}'.format(
        prefix='test-' if deployment_type == 'aci' else '',
        group=group_name
    ).lower()
    
    print('Launching deployment of {}...'.format(service_name))
    service = Model.deploy(
        workspace=ws,
        name=service_name,
        models=group_models,
        inference_config=inference_config,
        deployment_config=deployment_config,
        deployment_target=deployment_target,
        overwrite=True
    )
    print('Deployment of {} started'.format(service_name))
    
    deployments.append({ 'service': service, 'group': group_name, 'models': group_models })
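
If you group by more than one tag, the group name contains `/`, which may not be accepted in a webservice name. As far as we know, webservice names may contain only lowercase letters, digits and dashes, must start with a letter, and are limited to 32 characters; treat these limits as an assumption and check the SDK documentation. A minimal sketch of a sanitizing helper (`make_service_name` is a hypothetical name, not part of the SDK or of `scripts/helper.py`):

```python
import re

def make_service_name(group_name, prefix='', max_length=32):
    """Build a conservative webservice name from a group name (hypothetical helper)."""
    raw = '{}manymodels-{}'.format(prefix, group_name).lower()
    # Replace anything other than lowercase letters, digits and dashes with a dash
    cleaned = re.sub(r'[^a-z0-9-]+', '-', raw)
    # Trim to the assumed length limit and drop any leading or trailing dashes
    return cleaned[:max_length].strip('-')

example_name = make_service_name('1002/dominicks', prefix='test-')
```

Note that truncation can make two long group names collide, so keep group names short if you rely on a helper like this.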
    

In [None]:
models_deployed = {}
for deployment in deployments:
    
    service = deployment['service']
    print('Waiting for deployment of {} to finish...'.format(service.name))
    service.wait_for_deployment(show_output=True)
    if service.state != 'Healthy':
        print('DEPLOYMENT FAILED FOR SERVICE {}'.format(service.name))
    
    service_info = {
        'webservice': service.name,
        'state': service.state,
        'endpoint': service.scoring_uri if service.state == 'Healthy' else None,
        'key': service.get_keys()[0] if service.auth_enabled and service.state == 'Healthy' else None
    }

    # Store deployment info for each deployed model
    for m in deployment['models']:
        models_deployed[m.name] = {
            'version': m.version,
            'group': deployment['group'],
            **service_info
        }


### 4.2 Test the webservices

We can query multiple models in the same request, but all of them need to belong to the same store, as each endpoint only contains the models for one particular store.

In [None]:
from azureml.core import Datastore

# Please change the following to point to your own blob container and pass in account_key
blob_datastore_name = "automl_many_models"
container_name = "automl-sample-notebook-data"
account_name = "automlsamplenotebookdata"

oj_datastore = Datastore.register_azure_blob_container(workspace=ws, 
                                                       datastore_name=blob_datastore_name, 
                                                       container_name=container_name,
                                                       account_name=account_name,
                                                       create_if_not_exists=True)

In [None]:
from azureml.core.dataset import Dataset
inference_name_small = 'oj_inference_small'

inference_ds_small = Dataset.Tabular.from_delimited_files(path=oj_datastore.path(inference_name_small + '/'), validate=False)
all_df = inference_ds_small.to_pandas_dataframe()

In [None]:
from scripts.helper import get_model_name
store = 1002
brand = 'dominicks'
tags_dict = {'store':store, 'brand': brand}
model_name = get_model_name(tags_dict)

In [None]:
dominicks_test_data = all_df.loc[(all_df['Store']==store) & (all_df['Brand']==brand)]
print(dominicks_test_data.head(5))

In [None]:
dominicks_test_data_json = dominicks_test_data[:].to_json(orient='records', date_format='iso')

In [None]:
test_data = [{
        "group_column_names": ['Store', 'Brand'], # This is the same list that is passed in the training script
        "time_column_name": "WeekStarting", # This is the same value for time_column_name that is passed in the training script
        "data": dominicks_test_data_json
    }]
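
Because `to_json` returns a string, the `data` field travels as a JSON string nested inside the JSON request body. The sketch below illustrates that shape using only the standard library; the rows are hypothetical placeholders rather than real inference data, and how `forecast_webservice.py` actually decodes the request is not shown in this notebook.

```python
import json

# Hypothetical rows standing in for the output of DataFrame.to_json(orient='records')
records_json = json.dumps([
    {'WeekStarting': '1992-06-04', 'Store': 1002, 'Brand': 'dominicks'},
    {'WeekStarting': '1992-06-11', 'Store': 1002, 'Brand': 'dominicks'},
])

payload = [{
    'group_column_names': ['Store', 'Brand'],
    'time_column_name': 'WeekStarting',
    'data': records_json,  # a JSON string, not a list of rows
}]

# What is sent when the payload is serialized as the request body
body = json.dumps(payload)

# A receiver has to decode the body first, then the nested `data` string
decoded = json.loads(body)
rows = json.loads(decoded[0]['data'])
```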

In [None]:
import requests
import json

try:
    url = models_deployed[model_name]['endpoint']
    key = models_deployed[model_name]['key']    
except KeyError as e:
    # Chain the original error so the missing model name stays visible in the traceback
    raise ValueError(f'Model for store {store} and brand {brand} has not been deployed') from e

request_headers = {'Content-Type': 'application/json'}
if key:
    request_headers['Authorization'] = f'Bearer {key}'

response = requests.post(url, json=test_data, headers=request_headers)
response.json()

## 5.0 Group all models into a single routing endpoint

We can now group all the services behind a single entry point, so that we don't have to handle each endpoint separately.
For that, we'll register the `models_deployed` dictionary as a model and deploy it as a webservice. This webservice will receive the incoming requests and route them to the appropriate model service, acting as the single entry point for outside requests.
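
The routing idea can be sketched as a dictionary lookup followed by a forwarded request. This is only an illustration under stated assumptions: the real logic lives in `scripts/routing_webservice.py`, which is not shown here, and `route_request`, `lookup` and `fake_post` are hypothetical names. The transport is passed in as a callable so the sketch runs without any deployed service.

```python
def route_request(models_deployed, model_name, payload, post):
    """Forward `payload` to the service hosting `model_name`.

    `post` is any callable shaped like post(url, json=..., headers=...),
    for example `requests.post`.
    """
    try:
        info = models_deployed[model_name]
    except KeyError as e:
        raise ValueError('Model {} has not been deployed'.format(model_name)) from e
    if info['endpoint'] is None:
        raise ValueError('Service {} is not healthy'.format(info['webservice']))

    headers = {'Content-Type': 'application/json'}
    if info['key']:
        headers['Authorization'] = 'Bearer {}'.format(info['key'])
    return post(info['endpoint'], json=payload, headers=headers)


# Hypothetical lookup table with the same fields as `models_deployed`
lookup = {
    'model_a': {'webservice': 'svc-a', 'state': 'Healthy',
                'endpoint': 'http://example.invalid/score', 'key': 'secret',
                'version': 1, 'group': '1002'},
}

# Fake transport that records the call instead of sending it
calls = []

def fake_post(url, json, headers):
    calls.append((url, json, headers))
    return 'ok'

result = route_request(lookup, 'model_a', [{'data': '[]'}], fake_post)
```

Keeping the lookup table as a registered model means the routing service can be redeployed with a new table whenever the underlying model services change.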

### 5.1 Register endpoints dict as an AML model

In [None]:
import joblib

joblib.dump(models_deployed, 'models_deployed.pkl')

dep_model = Model.register(
    workspace=ws, 
    model_path='models_deployed.pkl',
    model_name='deployed_models_info',
    tags={'ModelType': '_meta_'},
    description='Dictionary of the service endpoint where each model is deployed'
)

### 5.2 Deploy routing webservice

In [None]:
from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.runconfig import DEFAULT_CPU_IMAGE
routing_env = Environment(name="many_models_routing_environment")
routing_env_deps = CondaDependencies.create(pip_packages=['azureml-defaults', 'joblib'])
routing_env.python.conda_dependencies = routing_env_deps

routing_infconfig = InferenceConfig(
    entry_script='routing_webservice.py',
    source_directory='./scripts',
    environment=routing_env
)

# Reuse deployment config with lower capacity
deployment_config.cpu_cores = 0.1
deployment_config.memory_gb = 0.5

routing_service = Model.deploy(
    workspace=ws,
    name='routing-manymodels',
    models=[dep_model],
    inference_config=routing_infconfig,
    deployment_config=deployment_config,
    deployment_target=deployment_target,
    overwrite=True
)
routing_service.wait_for_deployment(show_output=True)

assert routing_service.state == 'Healthy'

print('Routing endpoint deployed with URL: {}'.format(routing_service.scoring_uri))

### 5.3 Test the webservice

This new endpoint can be called with data from different stores or brands, and it will automatically route the request to the appropriate model endpoint.

In [None]:
import requests
import json
url = routing_service.scoring_uri

request_headers = {'Content-Type': 'application/json'}
if routing_service.auth_enabled:
    keys = routing_service.get_keys()
    request_headers['Authorization'] = 'Bearer {}'.format(keys[0])

response = requests.post(url, json=test_data, headers=request_headers)
response.json()