**<center><h1>Introduction</h1></center>**

After training your model on Azure Databricks Compute, you may want to deploy your model so that it can be consumed by your business or end user. You can easily deploy your model by using Azure Machine Learning. In this module, you will learn how to deploy models using Azure Databricks and Azure Machine Learning.


**<h2>Learning Objectives</h2>**

After completing this module, you'll be able to:

- Describe considerations for model deployment.
- Plan for deployment endpoints.
- Deploy a model as an inferencing web service.
- Troubleshoot model deployment.

<hr>

**<center><h1>Describe considerations for model deployment</h1></center>**

In machine learning, model deployment is the process of integrating a trained model into a production environment so that your business or end-user applications can use its predictions to make decisions or gain insights into your data. The most common way to deploy a model from Azure Databricks using Azure Machine Learning is as a real-time inferencing service. Here, inferencing means using a trained model to make predictions on new input data that the model was not trained on.

**<h2>What is Real-Time Inferencing?</h2>**

In real-time inferencing, the model is deployed as part of a service that lets applications request immediate (real-time) predictions for individual data observations or small numbers of them.

<img src="images/04-02-01-real-time.jpg" />

In Azure Machine Learning, you can create real-time inferencing solutions by deploying a model as a real-time service hosted on a containerized platform such as Azure Kubernetes Service (AKS).
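
To make the request pattern concrete, the following is a minimal sketch of how a client application might build a call to a real-time service over HTTP. It assumes the service accepts a JSON body with a `data` key (the shape the entry script later in this module parses). The endpoint URL, the feature values, the bearer-token header, and the helper name `build_scoring_request` are hypothetical placeholders, and the request is only constructed here, not sent.

```python
import json
import urllib.request

def build_scoring_request(scoring_uri, rows, api_key=None):
    """Build (but do not send) an HTTP POST request for a scoring endpoint.

    scoring_uri and api_key are placeholders supplied by the caller;
    rows is a list of feature lists, one per observation.
    """
    body = json.dumps({"data": rows}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # Assumption: key-based authentication passed as a bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        scoring_uri, data=body, headers=headers, method="POST"
    )

# Hypothetical endpoint and feature values, for illustration only.
request = build_scoring_request("http://localhost:8890/score", [[1.0, 2.5, 3.0]])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request would be a call to `urllib.request.urlopen(request)`, and the JSON response body could then be decoded with `json.loads`.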




<hr>

**<center><h1>Plan for Azure Machine Learning deployment endpoints</h1></center>**


After you have trained your machine learning model and evaluated it to the point where you are ready to use it outside your own development or test environment, you need to deploy it somewhere. Azure Machine Learning service simplifies this process. You can use the service components and tools to register your model and deploy it to one of the available **compute targets** so it can be made available as a web service in the Azure cloud, or on an IoT Edge device.


**<h2>Available compute targets</h2>**

You can use the following compute targets to host your web service deployment:


<img src="images/image5.png" />

**<h2>Deploy a model to Azure Machine Learning</h2>**

As discussed in the previous unit, you can deploy a model to several kinds of compute target, including local compute, an Azure Container Instance (ACI), an Azure Kubernetes Service (AKS) cluster, or an Internet of Things (IoT) module. Azure Machine Learning uses containers as its deployment mechanism: it packages the model and the code that uses it into an image that can be deployed to a container in your chosen compute target.

To deploy a model as an inferencing web service, you must perform the following tasks:

1. Register a trained model.
2. Define an Inference Configuration.
3. Define a Deployment Configuration.
4. Deploy the Model.


**<h2>1. Register a trained model</h2>**

After successfully training a model, you must register it in your Azure Machine Learning workspace. Your real-time service will then be able to load the model when required.

To register a model from a local file, you can use the **register** method of the **Model** class as shown here:

```
from azureml.core import Model

model = Model.register(workspace=ws, 
                       model_name='nyc-taxi-fare',
                       model_path='model.pkl', # local path
                       description='Model to predict taxi fares in NYC.')
```

**<h2>2. Define an Inference Configuration</h2>**

The model will be deployed as a service that consists of:

- A script to load the model and return predictions for submitted data.
- An environment in which the script will be run.

You must therefore define the script and environment for the service.

**<h3>Creating an Entry Script</h3>**

Create the entry script (sometimes referred to as the scoring script) for the service as a Python (.py) file. It must include two functions:

- **init():** Called when the service is initialized.
- **run(raw_data):** Called when new data is submitted to the service.

Typically, you use the **init** function to load the model from the model registry, and use the **run** function to generate predictions from the input data. The following example script shows this pattern:

```
import json
import joblib
import numpy as np
from azureml.core.model import Model

# Called when the service is loaded
def init():
    global model
    # Get the path to the registered model file and load it
    model_path = Model.get_model_path('nyc-taxi-fare')
    model = joblib.load(model_path)

# Called when a request is received
def run(raw_data):
    # Get the input data as a numpy array
    data = np.array(json.loads(raw_data)['data'])
    # Get a prediction from the model
    predictions = model.predict(data)
    # Return the predictions as any JSON serializable format
    return predictions.tolist()
```
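
One way to check the `init`/`run` contract before building a container is to exercise the same logic locally with a stand-in model. The sketch below is not the Azure Machine Learning runtime: `StubModel` is a hypothetical replacement for the registered model (it simply sums each row), and plain lists are used instead of numpy so that the example has no dependencies.

```python
import json

class StubModel:
    """Hypothetical stand-in for the trained model: predicts the sum of each row."""

    def predict(self, rows):
        return [float(sum(row)) for row in rows]

model = None

def init():
    # In the real entry script, this is where the registered model file is loaded.
    global model
    model = StubModel()

def run(raw_data):
    # Parse the JSON request body and pull out the list of observations
    data = json.loads(raw_data)["data"]
    # Return the predictions as a JSON-serializable list
    return model.predict(data)

init()
print(run(json.dumps({"data": [[1, 2], [3, 4]]})))  # prints [3.0, 7.0]
```

Because `run` receives a raw JSON string and returns a plain list, a check like this can catch payload-shape mistakes before any image is built.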

**<h3>Creating an Environment</h3>**

Azure Machine Learning environments encapsulate the environment in which your machine learning code runs, for training or, as here, for inference. They define Python packages, environment variables, Docker settings, and other attributes in a declarative fashion. The following code snippet shows one way to create an environment for your deployment by cloning the **AzureML-Minimal** environment and giving it a set of pinned pip packages:

```
from azureml.core import Environment
from azureml.core.environment import CondaDependencies

my_env_name="nyc-taxi-env"
myenv = Environment.get(workspace=ws, name='AzureML-Minimal').clone(my_env_name)
conda_dep = CondaDependencies()
conda_dep.add_pip_package("numpy==1.18.1")
conda_dep.add_pip_package("pandas==1.1.5")
conda_dep.add_pip_package("joblib==0.14.1")
conda_dep.add_pip_package("scikit-learn==0.24.1")
conda_dep.add_pip_package("sklearn-pandas==2.1.0")
myenv.python.conda_dependencies=conda_dep
```

**<h3>Combining the Script and Environment in an InferenceConfig</h3>**

After creating the entry script and environment, you can combine them in an **InferenceConfig** for the service like this:

```
from azureml.core.model import InferenceConfig

inference_config = InferenceConfig(entry_script='score.py',
                                   source_directory='.', 
                                   environment=myenv)
```

**<h2>3. Define a Deployment Configuration</h2>**

Now that you have the entry script and environment, you need to configure the compute to which the service will be deployed. If you are deploying to an AKS cluster, you must create the cluster and a compute target for it before deploying:

```
from azureml.core.compute import ComputeTarget, AksCompute

cluster_name = 'aks-cluster'
compute_config = AksCompute.provisioning_configuration(location='eastus')
production_cluster = ComputeTarget.create(ws, cluster_name, compute_config)
production_cluster.wait_for_completion(show_output=True)
```

With the compute target created, you can now define the deployment configuration, which sets the target-specific compute specification for the containerized deployment:

```
from azureml.core.webservice import AksWebservice

deploy_config = AksWebservice.deploy_configuration(cpu_cores = 1,
                                                   memory_gb = 1)
```

The code to configure an ACI deployment is similar, except that you do not need to explicitly create an ACI compute target, and you must use the **deploy_configuration** method of the **azureml.core.webservice.AciWebservice** class. Similarly, you can use the **azureml.core.webservice.LocalWebservice** class to configure a local Docker-based service.

**<h2>4. Deploy the Model</h2>**

After all of the configuration is prepared, you can deploy the model. The easiest way to do this is to call the deploy method of the **Model** class, like this:

```
from azureml.core.model import Model

service = Model.deploy(workspace=ws,
                       name = 'nyc-taxi-service',
                       models = [model],
                       inference_config = inference_config,
                       deployment_config = deploy_config,
                       deployment_target = production_cluster)
service.wait_for_deployment(show_output = True)
```

For ACI or local services, you can omit the **deployment_target** parameter (or set it to **None**).

<hr>

**<center><h1>Troubleshoot model deployment</h1></center>**

A service deployment has many elements, including the trained model, the runtime environment configuration, the scoring script, the container image, and the container host. Troubleshooting a failed deployment, or an error when consuming a deployed service, can therefore be complex.


**<h2>Check the service state</h2>**

As an initial troubleshooting step, you can check the status of a service by examining its **state:**

```
from azureml.core.webservice import AksWebservice

# Get the deployed service
service = AksWebservice(name='nyc-taxi-service', workspace=ws)

# Check its state
print(service.state)

```

<mark>**Note:** To view the state of a service, you must use the compute-specific service type (for example, AksWebservice) and not a generic Webservice object.</mark>

For an operational service, the state should be Healthy.
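
If a service has only just been deployed or updated, a single state check may not be enough. The following is a minimal sketch of a polling loop that waits for the `Healthy` state. It assumes that you supply `get_state`, a hypothetical function that returns the current state string; the helper name, the timeout values, and the simulated state sequence are illustrative and not part of the SDK.

```python
import time

def wait_until_healthy(get_state, timeout_s=300.0, poll_s=5.0, sleep=time.sleep):
    """Poll get_state() until it returns 'Healthy' or the timeout elapses.

    get_state is a caller-supplied function (hypothetical here) that returns
    the service's current state string. Returns the last state seen.
    """
    waited = 0.0
    state = get_state()
    while state != "Healthy" and waited < timeout_s:
        sleep(poll_s)
        waited += poll_s
        state = get_state()
    return state

# Simulated sequence of states, for illustration only.
states = iter(["Transitioning", "Transitioning", "Healthy"])
print(wait_until_healthy(lambda: next(states), poll_s=0.0))
```

With a real service, `get_state` could be a function that fetches the service again, as in the snippet above, and returns its `state`.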

**<h2>Review service logs</h2>**

If a service is not healthy, or you are experiencing errors when using it, you can review its logs:

```
print(service.get_logs())
```

The logs include detailed information about the provisioning of the service and the requests it has processed, and they can often provide insight into the cause of unexpected errors.
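
Service logs can be long, so it may help to narrow them down to the lines that are most likely to matter. The following is a minimal sketch that scans log text for a few keywords. The keyword list and the helper name `find_problem_lines` are assumptions, not an official log format, and the sample text is made up for illustration.

```python
def find_problem_lines(log_text, keywords=("error", "exception", "traceback")):
    """Return (line_number, line) pairs whose text contains any keyword.

    The keyword list is an assumption, not an official log format. Matching
    is case-insensitive and line numbers start at 1.
    """
    matches = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            matches.append((number, line))
    return matches

# Made-up log text, for illustration only.
sample = "starting service\nModuleNotFoundError: No module named 'sklearn'\nready"
for number, line in find_problem_lines(sample):
    print(number, line)
```

With a deployed service, the same helper could be applied to the string returned by `service.get_logs()`.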

**<h2>Deploy to a local container</h2>**

Deployment and runtime errors can be easier to diagnose if you deploy the service as a container in a local Docker instance, like this:

```
from azureml.core.model import Model
from azureml.core.webservice import LocalWebservice

deployment_config = LocalWebservice.deploy_configuration(port=8890)
service = Model.deploy(ws, 'test-svc', [model], inference_config, deployment_config)
```

You can then test the locally deployed service using the SDK:

```
print(service.run(input_data = json_data))
```

You can then troubleshoot runtime issues by making changes to the scoring file that is referenced in the inference configuration, and reloading the service without redeploying it (something you can only do with a local service):

```
service.reload()
print(service.run(input_data = json_data))
```
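
The `json_data` value passed to `service.run` in the snippets above is a JSON string. The following is a minimal sketch of one way to build it so that it matches the `data` key that the example entry script reads. The helper name `make_payload`, its validation rules, and the feature values are hypothetical.

```python
import json

def make_payload(rows):
    """Serialize observations into the {"data": [...]} JSON string that run() parses.

    Raises ValueError if rows is empty or the rows differ in length, which
    catches a common mistake before the request reaches the service.
    """
    if not rows:
        raise ValueError("at least one observation is required")
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("every observation must have the same number of features")
    return json.dumps({"data": rows})

# Hypothetical feature values, for illustration only.
json_data = make_payload([[1.0, 2.5, 3.0], [4.0, 0.5, 1.0]])
print(json_data)
```

Validating the shape on the client side keeps malformed input from surfacing later as a less obvious error inside the scoring script.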



<hr>

**<center><h1>Exercise - Deploy an Azure Databricks model in Azure Machine Learning</h1></center>**

Now, you will learn to train models in Azure Databricks and then deploy models in Azure Machine Learning.

In this exercise, you will:

- Register a Databricks-trained model in Azure Machine Learning.
- Deploy a service that uses the model.
- Consume the deployed service.

**<h2>Instructions</h2>**

Follow these instructions to complete the exercise:

1. Open the exercise instructions at https://aka.ms/mslearn-dp090.
2. Complete the **Deploying Models in Azure Machine Learning** exercises.




<hr>