# Exercise08 : Publish as a Web Service

Finally, we publish our trained model as a web service.

Before running this code, **complete the model registration in "[Exercise04 : Train on Remote GPU Virtual Machine](./exercise04_train_remote.ipynb)"**.

*back to [index](https://github.com/tsmatz/azureml-tutorial/)*

## Get workspace settings

Before starting, you must read your configuration settings. (See "[Exercise01 : Prepare Config Settings](./exercise01_prepare_config.ipynb)".)

In [1]:
from azureml.core import Workspace
import azureml.core

ws = Workspace.from_config()
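
`Workspace.from_config()` reads a `config.json` file that identifies your workspace. If the call fails, it can help to check that file first. The following is a minimal sketch, assuming the usual three keys (`subscription_id`, `resource_group`, `workspace_name`); the helper name `check_workspace_config` and the placeholder values are hypothetical and not part of the Azure ML SDK.

```python
import json
import os
import tempfile

# Keys assumed to be present in a workspace config.json
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")

def check_workspace_config(path):
    """Return the required keys that are missing or empty in the config file."""
    with open(path) as f:
        config = json.load(f)
    return [key for key in REQUIRED_KEYS if not config.get(key)]

# Demonstration with placeholder values (not a real workspace)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")
    with open(path, "w") as f:
        json.dump({"subscription_id": "<subscription-id>",
                   "resource_group": "<resource-group>"}, f)
    print(check_workspace_config(path))  # ['workspace_name']
```

An empty list means all three keys are present.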

## Get a registered model

Get the model trained on the MNIST dataset.<br>
**Before running this code, complete the model training in "[Exercise04 : Train on Remote GPU Virtual Machine](./exercise04_train_remote.ipynb)".**

In [2]:
from azureml.core.model import Model

registered_model = Model(ws, 'mnist_model_test')

## Create entry script (.py)

To deploy the model as a web service, we first write the scoring code (the entry script).<br>
An entry script in AML must define both ```init()``` (called once when the service starts, to load the model) and ```run()``` (called for each request).

In [3]:
import os
script_folder = './inference_script'
os.makedirs(script_folder, exist_ok=True)

In [4]:
%%writefile inference_script/score.py
import os
import json
import numpy as np
import tensorflow as tf
from azureml.core.model import Model

def init():
    global loaded_model
    model_path = Model.get_model_path(model_name='mnist_model_test')
    loaded_model = tf.keras.models.load_model(model_path)

def run(raw_data):
    try:
        data = json.loads(raw_data)["data"]
        pred_output = loaded_model(np.array(data))
        pred_list = tf.math.argmax(pred_output, axis=-1).numpy().tolist()
        return pred_list
    except Exception as e:
        result = str(e)
        return 'Internal Exception : ' + result

Writing inference_script/score.py
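
The ```init()``` / ```run()``` contract of the entry script can be tried locally before deploying. The following is a minimal sketch that mirrors the structure of score.py but swaps in a stand-in model, so it needs neither TensorFlow nor Azure ML. The stand-in (which returns each input row as its own class scores) and the pure-Python argmax are assumptions for illustration only, not the real model.

```python
import json

loaded_model = None

def init():
    """Called once at startup: load the model into a global variable."""
    global loaded_model
    # Hypothetical stand-in model: each input row is treated as class scores
    loaded_model = lambda batch: batch

def run(raw_data):
    """Called per request: parse JSON, predict, return a JSON-serializable result."""
    try:
        data = json.loads(raw_data)["data"]
        scores = loaded_model(data)
        # Index of the largest score in each row (argmax over the last axis)
        return [max(range(len(row)), key=row.__getitem__) for row in scores]
    except Exception as e:
        return 'Internal Exception : ' + str(e)

init()
print(run(json.dumps({"data": [[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]]})))  # [1, 0]
print(run("{}"))  # Internal Exception : 'data'
```

Because ```run()``` catches every exception and returns it as text, a malformed request produces an error string rather than an exception.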


## Deploy as web service

Here we deploy the registered model as a web service.

> Note : To build a container image without deploying it (for example, to run the model on edge devices), package the registered model with Docker. (See [here](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-package-models).)<br>
> For debugging purposes, you can also deploy the model on your local computer, provided it is running a Docker runtime.

First, create the deployment configuration.<br>
In this tutorial, we deploy the service to Azure Container Instances (ACI) with anonymous access (no authentication).

In [5]:
from azureml.core.webservice import AciWebservice, Webservice

aci_conf = AciWebservice.deploy_configuration(
    cpu_cores=1,
    memory_gb=1, 
    description='This is a tensorflow example.')

Next, create the inference configuration.

In [7]:
from azureml.core.model import InferenceConfig
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.environment import Environment

# Generate conda dependency
conda_dependency = CondaDependencies.create()
conda_dependency.add_pip_package('tensorflow==2.10.0')
conda_dependency.add_pip_package('numpy')
### Alternatively, pass the packages directly (make sure to include the 'azureml-defaults' package)
#conda_dependency = CondaDependencies.create(pip_packages=['azureml-defaults', 'tensorflow==2.10.0'])

# Create environment and set the previous conda dependency
myenv = Environment(name="myenv")
myenv.python.conda_dependencies = conda_dependency

# Create inference config with score.py
inf_conf = InferenceConfig(
    entry_script="score.py",
    source_directory='./inference_script',
    environment=myenv)

Now, deploy the model as a web service!

In [8]:
svc = Model.deploy(
    name='my-mnist-service',
    deployment_config=aci_conf,
    models=[registered_model],
    inference_config=inf_conf,
    workspace=ws)
svc.wait_for_deployment(show_output=True)

To leverage new model deployment capabilities, AzureML recommends using CLI/SDK v2 to deploy models as online endpoint, 
please refer to respective documentations 
https://docs.microsoft.com/azure/machine-learning/how-to-deploy-managed-online-endpoints /
https://docs.microsoft.com/azure/machine-learning/how-to-deploy-managed-online-endpoint-sdk-v2 /
https://docs.microsoft.com/azure/machine-learning/how-to-attach-kubernetes-anywhere 
For more information on migration, see https://aka.ms/acimoemigration. 
  svc = Model.deploy(


Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2022-10-05 06:54:16+00:00 Creating Container Registry if not exists.
2022-10-05 06:54:16+00:00 Registering the environment.
2022-10-05 06:54:17+00:00 Building image..
2022-10-05 07:04:31+00:00 Generating deployment configuration..
2022-10-05 07:04:32+00:00 Submitting deployment to compute.
2022-10-05 07:04:37+00:00 Checking the status of deployment my-mnist-service..
2022-10-05 07:07:18+00:00 Checking the status of inference endpoint my-mnist-service.
Succeeded
ACI service creation operation finished, operation "Succeeded"


In [7]:
# Show the container logs (useful if an error has occurred)
print(svc.get_logs())

2021-08-23T08:25:17,379725500+00:00 - iot-server/run 
2021-08-23T08:25:17,386434100+00:00 - rsyslog/run 
2021-08-23T08:25:17,395805700+00:00 - gunicorn/run 
Dynamic Python package installation is disabled.
Starting HTTP server
2021-08-23T08:25:17,398135200+00:00 - nginx/run 
EdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...
2021-08-23T08:25:17,764669100+00:00 - iot-server/finish 1 0
2021-08-23T08:25:17,770766800+00:00 - Exit code 1 is normal. Not restarting iot-server.
Starting gunicorn 20.1.0
Listening at: http://127.0.0.1:31311 (65)
Using worker: sync
worker timeout is set to 300
Booting worker with pid: 93
SPARK_HOME not set. Skipping PySpark Initialization.
Initializing logger
2021-08-23 08:25:23,364 | root | INFO | Starting up app insights client
logging socket was found. logging is available.
logging socket was found. logging is available.
2021-08-23 08:25:23,364 | root | INFO | Starting up request id generator
2021-08-23 08:25:23,364 | root | INFO | Star
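
Waiting for the deployment and fetching the logs can be combined, so that the logs are only retrieved when something goes wrong. The following is a minimal sketch, assuming that ```wait_for_deployment()``` raises an exception when the deployment fails. The helper name `wait_and_report` and the `FakeService` class are hypothetical; the fake only imitates the two methods used above, so the sketch runs without an Azure subscription.

```python
def wait_and_report(service):
    """Wait for the deployment; on failure, return the container logs."""
    try:
        service.wait_for_deployment(show_output=True)
        return None
    except Exception as error:
        print('Deployment failed :', error)
        return service.get_logs()

class FakeService:
    """Hypothetical stand-in exposing the two methods used on svc above."""
    def __init__(self, fail):
        self.fail = fail

    def wait_for_deployment(self, show_output=False):
        if self.fail:
            raise RuntimeError('container crashed')

    def get_logs(self):
        return 'fake container log'

print(wait_and_report(FakeService(fail=True)))
```

With the real service, you would call `wait_and_report(svc)` in place of the separate ```svc.wait_for_deployment()``` and ```svc.get_logs()``` calls.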

Check the service URL.

In [10]:
svc.scoring_uri

'http://43e695df-70bd-4d61-88e4-d15dd23d45ec.eastus.azurecontainer.io/score'

## Test your web service

Let's invoke your web service and check the returned results in Python.

In [11]:
import requests
import json

import tensorflow as tf

# Load the test data as a tf.data.Dataset
test_data = tf.data.Dataset.load("./data/test")

# Collect the first 3 images and labels as plain Python lists
image_arr = []
label_arr = []
for image, label in test_data.take(3):
    image_arr.append(image.numpy().tolist())
    label_arr.append(label.numpy().item())

# Invoke the web service
headers = {
    'Content-Type':'application/json'
}
# For an AKS deployment (in production), provide the service key in the header as below:
# api_key1, api_key2 = svc.get_keys()
# headers = {'Content-Type':'application/json',  'Authorization':('Bearer '+ api_key1)}
# Wrap the images in the {"data": ...} JSON body that score.py expects
input_data = json.dumps({"data": image_arr})
http_res = requests.post(
    svc.scoring_uri,
    input_data,
    headers = headers)
print('Predicted : ', http_res.text)
print('Actual    : ', label_arr)

2022-10-05 07:47:34.497224: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-05 07:47:34.648137: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-10-05 07:47:34.648169: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-10-05 07:47:34.681609: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2022-10-05 07:47:35.441531: W tensorflow/stream_executor/pla

Predicted :  [7, 2, 1]
Actual    :  [7, 2, 1]


2022-10-05 07:47:35.976473: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-10-05 07:47:35.976516: W tensorflow/stream_executor/cuda/cuda_driver.cc:263] failed call to cuInit: UNKNOWN ERROR (303)
2022-10-05 07:47:35.976547: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (client1005): /proc/driver/nvidia/version does not exist
2022-10-05 07:47:35.976776: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
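
The request body and the reply are both plain JSON, so the client side can be exercised without a running service. The following is a minimal sketch with hypothetical helper names (`build_headers`, `build_payload`, `parse_predictions`). It assumes that a successful reply is a JSON list such as `[7, 2, 1]`, and that an error string returned by ```run()``` arrives as a JSON string rather than a list.

```python
import json

def build_headers(api_key=None):
    """Build request headers; add a bearer key only when authentication is enabled."""
    headers = {'Content-Type': 'application/json'}
    if api_key:
        headers['Authorization'] = 'Bearer ' + api_key
    return headers

def build_payload(images):
    """Wrap the images in the {"data": ...} body that score.py reads."""
    return json.dumps({"data": images})

def parse_predictions(response_text):
    """Decode the reply; raise if the service returned an error string instead of a list."""
    result = json.loads(response_text)
    if not isinstance(result, list):
        raise ValueError('Service returned an error : ' + str(result))
    return result

print(build_payload([[0.0, 1.0]]))     # {"data": [[0.0, 1.0]]}
print(parse_predictions('[7, 2, 1]'))  # [7, 2, 1]
```

Against the deployed service, this would be used as `requests.post(svc.scoring_uri, build_payload(image_arr), headers=build_headers())`, followed by `parse_predictions(http_res.text)`.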


## Remove service

In [12]:
svc.delete()