# Prerequisites

* After training a custom estimator, save the model in an exportable format using the `tf.estimator.Estimator.export_saved_model()` method (see "Tensorflow Estimator/Custom Estimator.ipynb" for more details).
* SavedModel directory structure:
![SegmentLocal](resources/1.png)
* Model inputs and outputs:

`The given SavedModel SignatureDef contains the following input(s):
  inputs['PetalLength'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: Placeholder_2:0
  inputs['PetalWidth'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: Placeholder_3:0
  inputs['SepalLength'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: Placeholder:0
  inputs['SepalWidth'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: Placeholder_1:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['classes'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: ArgMax:0
  outputs['probabilities'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 3)
      name: Softmax:0
Method name is: tensorflow/serving/predict`
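
The signature above takes one 1-D float tensor per feature (shape `(-1)`), so a batch of flowers is sent column-wise: one list per feature, all of the same length. The following is a minimal sketch of that conversion. The helper name `rows_to_columns` is hypothetical, and the second row of measurements is made up for illustration.

```python
# Feature names taken from the SignatureDef inputs listed above
FEATURES = ["SepalLength", "SepalWidth", "PetalLength", "PetalWidth"]


def rows_to_columns(rows):
    """Turn a list of per-flower dicts into one list of floats per feature."""
    return {name: [float(row[name]) for row in rows] for name in FEATURES}


rows = [
    {"SepalLength": 6.4, "SepalWidth": 2.8, "PetalLength": 5.6, "PetalWidth": 2.2},
    # Illustrative values only, not taken from the original project
    {"SepalLength": 5.0, "SepalWidth": 2.3, "PetalLength": 3.3, "PetalWidth": 1.0},
]

payload = rows_to_columns(rows)
print(payload["PetalLength"])  # [5.6, 3.3]
```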

# Overview

The following is an overview of the deployment process:
* Step 1: Create a workspace
* Step 2: Register the model
* Step 3: Define image/inference configuration
* Step 4: Define deploy configuration
* Step 5: Deploy the model

# Create a machine learning service workspace

See the official guide: https://docs.microsoft.com/en-us/azure/machine-learning/service/setup-create-workspace

1. Sign in to the Azure portal (https://portal.azure.com).
2. Create and configure your new workspace by providing the workspace name, subscription, resource group, and location.
 * workspace name: deploy-tensorflow-iris
 * resource group: deploy-tensorflow-iris
3. In the newly created workspace, select <b>Overview</b> and download the `config.json` file. Use this file to load the workspace configuration in your Azure ML SDK notebook or Python script.
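
Before uploading `config.json`, a quick sanity check can catch a truncated or wrong download. The sketch below assumes the file is a small JSON object with the keys `subscription_id`, `resource_group`, and `workspace_name`; the helper name `missing_config_keys` and the placeholder values are hypothetical.

```python
import json

# Keys the workspace config file is assumed to contain
REQUIRED_KEYS = {"subscription_id", "resource_group", "workspace_name"}


def missing_config_keys(config_text):
    """Return the required keys absent from the config JSON text, sorted."""
    config = json.loads(config_text)
    return sorted(REQUIRED_KEYS - set(config))


# Placeholder values for illustration only
example = json.dumps({
    "subscription_id": "<your-subscription-id>",
    "resource_group": "deploy-tensorflow-iris",
    "workspace_name": "deploy-tensorflow-iris",
})

print(missing_config_keys(example))  # []
```

To check the real file, pass its contents instead of `example`, for instance `missing_config_keys(open("config.json").read())`.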

# Register the model

Registering a model allows you to store, version, and track metadata about models in your workspace.

1. Go to the Azure Notebooks server (https://notebooks.azure.com) and create a new project.
 * project name: deploy-tensorflow-iris
2. Upload the `config.json` file and your model (the entire `export` directory).
3. Use the following script to register your model:

```python
from azureml.core import Workspace
from azureml.core.model import Model

ws = Workspace.from_config() # load workspace from config.json file
model = Model.register(model_path = "export", # the path on the local file system where the model assets are located.
                       model_name = "tensorflow-iris", # the name to register the model with
                       description = "Iris tensorflow custom estimator trained outside Azure Machine Learning service",
                       workspace = ws) # the workspace to register the model under
```
4. After registering the model, go back to the portal. Select your workspace and then select `Assets/Models` to confirm that the model was registered successfully.

# Define image/inference configuration

The image/inference configuration describes how to configure the model to make predictions. This configuration specifies the runtime, the entry script, and (optionally) the conda environment file.

The configuration references the following items, which are used to run the model once it is deployed:
* The runtime. The only valid value for runtime currently is Python.
* An entry script. This file (named `score.py`) loads the model when the deployed service starts. It is also responsible for receiving data, passing it to the model, and then returning a response.

  The script contains two functions that load and run the model:

  * `init()`: Typically this function loads the model into a global object. It runs only once, when the Docker container for your web service starts.

  * `run(raw_data)`: This function uses the model to predict a value based on the input data. The inputs and outputs of `run()` typically use JSON for serialization and deserialization, but you can also work with raw binary data. You can transform the data before sending it to the model or before returning it to the client.

  The following is the Python code for the entry script `score.py`. 

```python
import json
import numpy as np
import tensorflow as tf
from azureml.core.model import Model

# Called when the deployed service starts
def init():
    """
    `inputs` and `outputs` are dictionaries of tensors.
    `inputs` has the following format:
    {
        "SepalWidth": corresponding placeholder,
        "SepalLength": corresponding placeholder,
        "PetalLength": corresponding placeholder,
        "PetalWidth": corresponding placeholder
    }
    `outputs` has the following format:
    {
        "probabilities": corresponding tensor,
        "classes": corresponding tensor
    }
    """
    global inputs, outputs, sess # These variables are global because the `run()` function uses them 
    model_dir = Model.get_model_path('tensorflow-iris') # return the path to the model 
                                                        # registered with the name `tensorflow-iris`
    # load the model, and then load the input and output tensors into `inputs` and `outputs` variables
    sess = tf.Session()
    meta_graph_def = tf.saved_model.load(sess, [tf.saved_model.tag_constants.SERVING], model_dir)
    signature_def = meta_graph_def.signature_def['serving_default']
    inputs = {}
    for k in signature_def.inputs.keys():
        inputs[k] = tf.get_default_graph().get_tensor_by_name(signature_def.inputs[k].name)
    outputs = {}
    for k in signature_def.outputs.keys():
        outputs[k] = tf.get_default_graph().get_tensor_by_name(signature_def.outputs[k].name)

# Handle requests to the service
def run(raw_data):
    """
    `raw_data` has the following format:
    {
        "SepalWidth": [1d-array],
        "SepalLength": [1d-array],
        "PetalLength": [1d-array],
        "PetalWidth": [1d-array]
    }
    """
    data_dict = json.loads(raw_data)
    feed_dict = {}
    for k in data_dict.keys():
        feed_dict[inputs[k]] = np.array(data_dict[k])
    # make prediction
    probs, classes = sess.run([outputs['probabilities'], outputs['classes']], feed_dict=feed_dict)
    # you can return any datatype as long as it is JSON-serializable
    return {'probs': probs.tolist(), 'classes': classes.tolist()} # NumPy arrays are not JSON-serializable,
                                                                  # so convert them to lists first
```
* A conda environment file. This file defines the Python packages needed to run the model and entry script.

  The following YAML describes the conda environment needed to run the model and entry script:
  
  `myenv.yml`
  
  ```yaml
  name: project_environment
  dependencies:
  - python=3.6.2
  - pip:
    - azureml-defaults
  - tensorflow=1.13
  - numpy
  channels:
  - conda-forge
  ```

  Python script to generate `myenv.yml`:
  
```python
from azureml.core.conda_dependencies import CondaDependencies

myenv = CondaDependencies()
myenv.add_tensorflow_conda_package(core_type='cpu', version='1.13')
myenv.add_conda_package('numpy')

with open('myenv.yml', 'w') as f:
    f.write(myenv.serialize_to_string())
```

Finally, define the image configuration.

```python
from azureml.core.image import ContainerImage
image_config = ContainerImage.image_configuration(execution_script='score.py',
                                                 runtime='python',
                                                 conda_file='myenv.yml')
```
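
One thing to consider before building the image: `run()` in `score.py` indexes `inputs[k]` directly, so a request with a missing or misspelled feature name fails with an opaque error. The following is a minimal sketch of a validation step that could run first. The helper name `parse_request` and the error messages are hypothetical and not part of the Azure ML SDK.

```python
import json

# Feature names taken from the SavedModel signature
EXPECTED_FEATURES = ("SepalLength", "SepalWidth", "PetalLength", "PetalWidth")


def parse_request(raw_data):
    """Decode a JSON request body and check it against the expected features.

    Returns the decoded dict, or raises ValueError with a readable message.
    """
    data = json.loads(raw_data)
    if not isinstance(data, dict):
        raise ValueError("request body must be a JSON object")

    missing = [k for k in EXPECTED_FEATURES if k not in data]
    unknown = [k for k in data if k not in EXPECTED_FEATURES]
    if missing or unknown:
        raise ValueError(
            "missing features: %s; unknown features: %s" % (missing, unknown)
        )

    if not all(isinstance(v, list) for v in data.values()):
        raise ValueError("every feature must be a list of numbers")

    # Every feature feeds a tensor of shape (-1), so the batch sizes must agree
    if len({len(v) for v in data.values()}) != 1:
        raise ValueError("all feature arrays must have the same length")

    return data
```

Inside `run()`, this could replace the bare `json.loads(raw_data)` call, with a `try`/`except ValueError` that returns something like `{"error": str(e)}` to the client.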

# Define deploy configuration

Before deploying, you must define the deployment configuration. The deployment configuration is specific to the compute target that will host the web service.

The following table provides an example of creating a deployment configuration for each compute target:

| Compute target | Deployment configuration example |
| --- | --- |
| Local | `deployment_config = LocalWebservice.deploy_configuration(port=8890)` |
| Azure Container Instances | `deployment_config = AciWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1)` |
| Azure Kubernetes Service | `deployment_config = AksWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1)` |

In this example, we will deploy the model to Azure Container Instances:

```python
from azureml.core.webservice import AciWebservice
deployment_config = AciWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1)
```

# Deploy the model

During deployment, the image/inference configuration and deployment configuration are used to create and configure the service environment.

```python
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core.webservice import Webservice

ws = Workspace.from_config()
model = Model(workspace=ws, name='tensorflow-iris')
service = Webservice.deploy_from_model(workspace=ws,
                                      name='tensorflow-iris',
                                      deployment_config=deployment_config,
                                      models=[model],
                                      image_config=image_config)
```
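
The deploy call returns before the service is ready, so it is usually worth waiting for the deployment to finish and checking its state. The sketch below wraps that in a hypothetical helper, `wait_and_report`. It assumes the `service` object exposes `wait_for_deployment()`, `state`, `get_logs()`, and `scoring_uri`, as the SDK's `Webservice` class does, and that `"Healthy"` is the state reported for a successful deployment.

```python
def wait_and_report(service):
    """Block until deployment finishes, then return (state, scoring_uri)."""
    # Streams deployment progress to stdout while waiting
    service.wait_for_deployment(show_output=True)

    if service.state != "Healthy":
        # The container logs usually show errors raised by init() in score.py
        print(service.get_logs())

    return service.state, service.scoring_uri
```

After the deployment code above, this could be called as `state, uri = wait_and_report(service)`.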

# Request-response consumption

* After deployment, the scoring URI is displayed. This URI can be used by clients to submit requests to the service.

* Note: to find the scoring URI in the portal:
 * Go to the portal
 * Select your machine learning service workspace
 * Select `Assets/Deployments`
 * Choose your deployment and copy the scoring URI

* Create a POST request with the following body:
```json
{
    "SepalWidth": [
        2.8
    ],
    "SepalLength": [
        6.4
    ],
    "PetalLength": [
        5.6
    ],
    "PetalWidth": [
        2.2
    ]
}
```

* Use Postman to test the API:
![SegmentLocal](resources/2.png)

* Result:
![SegmentLocal](resources/3.png)
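
As an alternative to Postman, the same request can be sent from Python using only the standard library. This is a minimal sketch: the helper name `build_scoring_request` is hypothetical, and the URL is a placeholder to be replaced with your own scoring URI. The network call is left commented out so the snippet runs without a live service.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, payload):
    """Build a JSON POST request for the deployed web service."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "SepalWidth": [2.8],
    "SepalLength": [6.4],
    "PetalLength": [5.6],
    "PetalWidth": [2.2],
}

# Placeholder URL: replace it with the scoring URI of your deployment
request = build_scoring_request("http://example.invalid/score", payload)

# Uncomment to send the request to a live service:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read().decode("utf-8")))
```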