## Testing Deployment

In this notebook, we access the deployed model via its REST endpoint and make predictions using a small subset of test data. 

### Set-up
---
To begin, import all necessary modules. These should come installed with the virtual environment provided by [`environment.yml`](../environment.yml).

If any module is missing, install it with either of the following commands:

```bash
pip install <module_name>
```

or 

```bash
conda install <module_name>
```
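
To see which modules are missing before installing anything, a quick check along the following lines may help. This is only a sketch: the `REQUIRED` list is an assumption based on the imports in the next cell, and it holds import names, which can differ from the pip or conda package names (for example, `cv2` is provided by an OpenCV package).

```python
import importlib.util

# Top-level import names used by this notebook (assumed from the import cell;
# these are import names, not necessarily pip/conda package names).
REQUIRED = ["numpy", "pandas", "cv2", "imageio", "tensorflow",
            "mlflow", "requests", "yaml", "tqdm"]


def missing_modules(names):
    """Return the top-level module names that cannot be found on the current path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Missing modules:", missing_modules(REQUIRED) or "none")
```

Any name that is printed can then be installed with one of the commands above.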

In [1]:
import json
import sys
from tqdm.auto import trange, tqdm
import concurrent.futures
import yaml
import os
from json import JSONEncoder
import requests

import numpy as np
import pandas as pd
import cv2
import imageio

from azure.storage.blob import BlobServiceClient, ContainerClient, BlobClient
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    Model as AzureMLModel,
    Environment,
    CodeConfiguration,
)
from azureml.core.webservice import AciWebservice, Webservice
from azureml.core import Workspace, Model, Experiment, Run
from azureml.core.model import InferenceConfig

import tensorflow as tf
from tensorflow.keras.models import Model as KerasModel, model_from_json
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import Callback

import mlflow
import mlflow.keras

from utils import model_utils
from utils import sql_utils

2024-02-06 19:32:51.130703: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


### 1. Log a new model on MLflow and connect to AzureML

In [None]:
# Load the model architecture from JSON file
json_file_path = './ml-workflow/model_ckpt/model-cnn-v1-b3.json'
with open(json_file_path, 'r') as json_file:
    loaded_model_json = json_file.read()

loaded_model = model_from_json(loaded_model_json)

# Load the model weights from H5 file
h5_file_path = './ml-workflow/model_ckpt/model-cnn-v1-b3.h5'
loaded_model.load_weights(h5_file_path)

ml_client = MLClient.from_config(credential=DefaultAzureCredential())
run_name = 'basemodel' # change for each run

mlflow_log.log_model(loaded_model, ml_client, run_name) 

### 2. Deploy logged model as stream endpoint on Azure ML

In [None]:
# Add a tag to model to associate with model_id

scoring_file = './model_serving/score.py' # update this
deploy.deploy(model_name, scoring_file)

# Optional: create a test deployment and test that deployment is active
# test_deployment.test_deployment(model_name)

#### 2.1 Get AzureML model info from model_id (get endpoint_name from model_id)

Look up the AzureML model information using the model ID stored in the database.

In [8]:
# MLClient expects the keyword `resource_group_name` (not `resource_group`)
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="156ffac2-0545-4d4e-aab3-f89b83635d04",
    resource_group_name="defaultresourcegroup-wus2",
    workspace_name="pivot",
)
ws = Workspace.from_config('../model_serving/config.json')

In [9]:
experiment_name = config['experiment_name']
run_name = 'basemodel'

experiment = Experiment(workspace=ws, name=experiment_name)
# run = Run(experiment=experiment, run_id=run_name)

# Collect every model registered from a run in this experiment.
# `Run` has no `get_models()` method; `Model.list` can filter by run ID instead.
runs = experiment.get_runs()
models = []
for run in runs:
    models.extend(Model.list(workspace=ws, run_id=run.id))

In [10]:
experiment

Name,Workspace,Report Page,Docs Page
adt-pivot,pivot,Link to Azure Machine Learning studio,Link to Documentation


In [11]:
# get_runs() returns all runs in reverse chronological order (first is most recent)
for i in experiment.get_runs():
    print(i)

Run(Experiment: adt-pivot,
Id: 3ce8d4e5-6343-4d44-be1e-e9c99b3df968,
Type: None,
Status: Completed)
Run(Experiment: adt-pivot,
Id: 0f5b3a2b-fde9-401b-bdaa-31ed72ecc95c,
Type: None,
Status: Completed)


### 3. Call endpoint and make predictions

The input is a `pd.DataFrame` with two columns, `IMAGE_ID` and `BLOB_FILEPATH`. Here we manually create a small subset of data for testing.

In [3]:
# sample input
df = pd.DataFrame({'IMAGE_ID': [1, 2, 3], 
                   'BLOB_FILEPATH': ['D20160524T225721_IFCB107/IFCB107D20160524T225721P00213.png', 
                                     'D20160524T225721_IFCB107/IFCB107D20160524T225721P00575.png', 
                                     'D20160524T225721_IFCB107/IFCB107D20160524T225721P00561.png']})

In [4]:
df

Unnamed: 0,IMAGE_ID,BLOB_FILEPATH
0,1,D20160524T225721_IFCB107/IFCB107D20160524T2257...
1,2,D20160524T225721_IFCB107/IFCB107D20160524T2257...
2,3,D20160524T225721_IFCB107/IFCB107D20160524T2257...


In [22]:
# Build the full blob URL for each image from the template stored in the config
df['cloud_urls'] = df.BLOB_FILEPATH.apply(lambda x: config['cloud_url'].format(filepath=x))
df

Unnamed: 0,IMAGE_ID,BLOB_FILEPATH,cloud_urls
0,1,D20160524T225721_IFCB107/IFCB107D20160524T2257...,https://ifcb.blob.core.windows.net/naames/NAAM...
1,2,D20160524T225721_IFCB107/IFCB107D20160524T2257...,https://ifcb.blob.core.windows.net/naames/NAAM...
2,3,D20160524T225721_IFCB107/IFCB107D20160524T2257...,https://ifcb.blob.core.windows.net/naames/NAAM...


### 4. Get predictions 

This is done in `model_utils.predict()`.

In [None]:
# endpoint_name = get_model_info(m_id)
endpoint_name = 'basemodel-endpoint'

scoring_uri = f'https://{endpoint_name}.westus2.inference.ml.azure.com/score'
api_key = CONFIG['api_key']

cloud_urls = df.cloud_urls.values
data = []
for c_url in cloud_urls:
    data.append(du.preprocess_input(np.expand_dims(imageio.v2.imread(c_url), axis=-1)))

data_dic = {"input_data": [i.reshape((128, 128)).tolist() for i in data]}
json_payload = json.dumps(data_dic, cls=NumpyArrayEncoder)

# The azureml-model-deployment header will force the request to go to a specific deployment.
headers = {'Content-Type':'application/json',
           'Authorization':('Bearer '+ api_key),
           'azureml-model-deployment': CONFIG['deployment_name']}

# Make the prediction request
response = requests.post(scoring_uri,
                         data=json_payload,
                         headers=headers,
                         timeout=10)

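# Check the response status code; stop on failure so `result` is never undefined
if response.status_code == 200:
    result = response.json()
else:
    print("Prediction request failed with status code:", response.status_code)
    print(response.text)
    raise RuntimeError(f"Prediction request failed with status code {response.status_code}")

# The sample DataFrame stores image IDs in the IMAGE_ID column
preds = pd.DataFrame({'i_id': df.IMAGE_ID.values,
                      'probs': result})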
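
The request cell passes `cls=NumpyArrayEncoder` to `json.dumps`, but the encoder is not defined in this notebook. A minimal sketch of what such an encoder might look like is given below; it is an assumption, not the project's actual class. It relies on the fact that NumPy arrays and NumPy scalars expose `.tolist()`, so it needs no NumPy import of its own.

```python
import json
from json import JSONEncoder


class NumpyArrayEncoder(JSONEncoder):
    """Hypothetical stand-in for the encoder referenced in the request cell."""

    def default(self, obj):
        # json calls default() only for objects it cannot serialize itself.
        # NumPy arrays and scalars provide .tolist(), which yields plain
        # Python lists and numbers.
        if hasattr(obj, "tolist"):
            return obj.tolist()
        # Anything else raises the standard TypeError.
        return super().default(obj)


# Plain lists pass straight through, so the encoder is harmless when the
# payload has already been converted with .tolist().
payload = json.dumps({"input_data": [[0.0, 1.0], [1.0, 0.0]]}, cls=NumpyArrayEncoder)
print(payload)
```

Because the payload in the request cell is already built with `.tolist()`, the encoder acts mainly as a safety net for any NumPy values that slip through.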

In [30]:
response

<Response [200]>
[[0.0031898769084364176, 2.0788617355327332e-11, 8.588602213421836e-05, 0.00017344176012557, 8.921755068058701e-08, 0.0032710987143218517, 8.093526048469357e-06, 0.9575332403182983, 0.03550129383802414, 0.00023710746609140188], [0.007550970651209354, 5.126950100020622e-07, 0.00035927206045016646, 0.002601270331069827, 2.5344577352370834e-06, 0.20422795414924622, 0.0010179296368733048, 0.7188515663146973, 0.06120515614748001, 0.004182853270322084], [0.00013524151290766895, 1.200628485520383e-11, 1.257793337572366e-05, 0.00022038634051568806, 2.6521347535890527e-05, 0.00010345028567826375, 1.8474192131634481e-07, 0.390915185213089, 0.00037548429099842906, 0.6082109212875366]]


Unnamed: 0,i_id,probs
0,1,"[0.0031898769084364176, 2.0788617355327332e-11..."
1,2,"[0.007550970651209354, 5.126950100020622e-07, ..."
2,3,"[0.00013524151290766895, 1.200628485520383e-11..."


### 5. Convert predictions to correct format

Reformat the predictions so that they can be inserted into the database. Each prediction becomes a dictionary with the keys `m_id`, `i_id`, `class_prob`, and `predlabel`.

This is done in `model_utils.get_predictions()`.

In [None]:
classes = ['Chloro',
          'Cilliate',
          'Crypto',
          'Diatom',
          'Dictyo',
          'Dinoflagellate',
          'Eugleno',
          'Other',
          'Prymnesio',
          'Unidentifiable']

# Highest class probability and its label for each image
preds['class_prob'] = preds.probs.apply(max)
preds['predlabel'] = preds.probs.apply(lambda x: classes[int(np.argmax(x))])
preds['m_id'] = m_id  # same model ID for every row

In [43]:
preds = model_utils.get_predictions(df, 1)

In [44]:
preds

[{'i_id': 1,
  'class_prob': 0.9575332403182983,
  'predlabel': 'Other',
  'm_id': 1},
 {'i_id': 2,
  'class_prob': 0.7188515663146973,
  'predlabel': 'Other',
  'm_id': 1},
 {'i_id': 3,
  'class_prob': 0.6082109212875366,
  'predlabel': 'Unidentifiable',
  'm_id': 1}]
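
The conversion from raw probability vectors to database-ready records can be sketched without pandas as follows. This is an illustration under stated assumptions, not the actual implementation of `model_utils.get_predictions()`: the function name `to_prediction_records` and the toy probabilities are hypothetical, while the class list is copied from the cell above.

```python
# Class labels in the order of the model's output vector (copied from the notebook).
CLASSES = ['Chloro', 'Cilliate', 'Crypto', 'Diatom', 'Dictyo',
           'Dinoflagellate', 'Eugleno', 'Other', 'Prymnesio', 'Unidentifiable']


def to_prediction_records(image_ids, probs, m_id, classes=CLASSES):
    """Turn one probability vector per image into a list of record dicts."""
    records = []
    for i_id, p in zip(image_ids, probs):
        # Index of the largest probability in this image's vector.
        best = max(range(len(p)), key=p.__getitem__)
        records.append({
            'i_id': i_id,
            'class_prob': p[best],
            'predlabel': classes[best],
            'm_id': m_id,
        })
    return records


# Toy probabilities for two images (made up for illustration).
toy_probs = [
    [0.1, 0, 0, 0, 0, 0, 0, 0.9, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0.4, 0, 0.6],
]
records = to_prediction_records([1, 2], toy_probs, m_id=1)
print(records)
```

A list of dictionaries like this maps directly onto rows with one column per key, which is convenient for a bulk insert.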

### 6. Getting test data

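Next, pull the test images from the database and use them to test further. The `test_images` subquery selects every image that has metrics recorded for `m_id = 0`.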

In [23]:
test = sql_utils.run_sql_query(
"""
WITH test_images AS (
    SELECT DISTINCT I_Id
    FROM metrics
    WHERE m_id = 0
)
SELECT I.I_ID, I.filepath
FROM images AS I
INNER JOIN test_images AS TI ON TI.I_Id = I.I_ID;
"""
)

In [24]:
test

Unnamed: 0,I_ID,filepath
0,399781,NAAMES_ml/D20160524T225721_IFCB107/IFCB107D201...
1,399802,NAAMES_ml/D20160524T225721_IFCB107/IFCB107D201...
2,399803,NAAMES_ml/D20160524T225721_IFCB107/IFCB107D201...
3,399808,NAAMES_ml/D20160524T225721_IFCB107/IFCB107D201...
4,399815,NAAMES_ml/D20160524T225721_IFCB107/IFCB107D201...
...,...,...
99995,1382617,NAAMES_ml/D20151105T224631_IFCB107/IFCB107D201...
99996,1382633,NAAMES_ml/D20151105T224631_IFCB107/IFCB107D201...
99997,1382642,NAAMES_ml/D20151105T224631_IFCB107/IFCB107D201...
99998,1382654,NAAMES_ml/D20151105T224631_IFCB107/IFCB107D201...
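
To see what the query does on a small scale, the sketch below runs the same SQL against an in-memory SQLite database. The table and column names follow the query, but the rows are made-up toy data, and SQLite is only a stand-in for the project's actual database, which `sql_utils.run_sql_query` connects to.

```python
import sqlite3

# In-memory stand-in for the project database (toy rows, hypothetical values).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE images (I_ID INTEGER PRIMARY KEY, filepath TEXT);
    CREATE TABLE metrics (I_ID INTEGER, m_id INTEGER);
    INSERT INTO images VALUES (1, 'a.png'), (2, 'b.png'), (3, 'c.png');
    -- image 1 has two metric rows for m_id = 0; image 3 belongs to another model
    INSERT INTO metrics VALUES (1, 0), (1, 0), (2, 0), (3, 5);
    """
)

query = """
WITH test_images AS (
    SELECT DISTINCT I_ID
    FROM metrics
    WHERE m_id = 0
)
SELECT I.I_ID, I.filepath
FROM images AS I
INNER JOIN test_images AS TI ON TI.I_ID = I.I_ID
ORDER BY I.I_ID;
"""
rows = conn.execute(query).fetchall()
conn.close()
print(rows)
```

The `DISTINCT` in the subquery keeps an image from appearing twice in the result when it has several metric rows for the same model.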


In [89]:
test['cloud_urls'] = test.filepath.apply(lambda x: "https://ifcb.blob.core.windows.net/naames/{filepath}".format(filepath=x))

# Preprocess only the first few test images; reading every row over HTTP would be slow
cloud_urls = test.cloud_urls.values[:3]
data = []
for c_url in cloud_urls:
    data.append(preprocess_input(np.expand_dims(imageio.v2.imread(c_url), axis=-1)))

In [97]:
data[0].shape

(128, 128, 1)

In [51]:
# Pass an explicit copy so that columns added inside get_predictions()
# do not trigger pandas' SettingWithCopyWarning
model_utils.get_predictions(test[:3].copy(), 1)

[{'i_id': 399781,
  'class_prob': 0.6453595161437988,
  'predlabel': 'Other',
  'm_id': 1},
 {'i_id': 399802,
  'class_prob': 0.6450332403182983,
  'predlabel': 'Other',
  'm_id': 1},
 {'i_id': 399803,
  'class_prob': 0.9911853671073914,
  'predlabel': 'Dinoflagellate',
  'm_id': 1}]