# Test deployed web application

This notebook sends a few duplicate questions to the web application deployed on AKS and checks the responses.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import requests
import json

from utilities import text_to_json

Get the external IP address of the web application running on the AKS cluster.

In [None]:
service_json = !kubectl get service azure-ml -o json
service_dict = json.loads(''.join(service_json))
app_url = service_dict['status']['loadBalancer']['ingress'][0]['ip']
app_url
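
While the load balancer is still being provisioned, the `ingress` entry may be missing from the service JSON, in which case the lookup above raises a `KeyError`. The following is a minimal sketch of a more defensive lookup, assuming the same JSON layout as in the cell above. The helper name `extract_ingress_ip` and the sample dictionaries are illustrative and not part of the original notebook.

```python
def extract_ingress_ip(service_dict):
    """Return the first load balancer ingress IP, or None if not yet assigned."""
    ingress = (
        service_dict.get('status', {})
        .get('loadBalancer', {})
        .get('ingress')
    )
    if not ingress:
        return None
    return ingress[0].get('ip')


# Illustrative inputs (made up, using a documentation-range address).
pending = {'status': {'loadBalancer': {}}}
ready = {'status': {'loadBalancer': {'ingress': [{'ip': '203.0.113.10'}]}}}

print(extract_ingress_ip(pending))  # None
print(extract_ingress_ip(ready))    # 203.0.113.10
```

In the notebook, `extract_ingress_ip(service_dict)` could replace the chained indexing; a `None` result would suggest waiting and running the `kubectl` cell again.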

Quickly check if the web application is working.

In [None]:
scoring_url = 'http://{}/score'.format(app_url)
version_url = 'http://{}/version'.format(app_url)
health_url = 'http://{}/'.format(app_url)

In [None]:
scoring_url

In [None]:
!curl $health_url

In [None]:
!curl $version_url # Reports the lightgbm version

Let's use one of the duplicate questions to test our web service.

In [None]:
dupes_test_path = 'dupes_test.tsv'
dupes_test = pd.read_csv(dupes_test_path, sep='\t', encoding='latin1')
text_to_score = dupes_test.iloc[0,4]
text_to_score

In [None]:
jsontext = text_to_json(text_to_score)
jsontext[:100]
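
`text_to_json` comes from the `utilities` module, which is not shown in this notebook. To give a rough idea of what such a helper could look like, the sketch below wraps the question text in a JSON object under an `input` key. Both the key name and the function body are assumptions made for illustration, not the actual implementation.

```python
import json


def text_to_json_sketch(text):
    """Hypothetical stand-in for utilities.text_to_json."""
    # Assumption: the scoring endpoint expects {"input": "<question text>"}.
    return json.dumps({'input': text})


question = 'How do I "clone" an object?'
payload = text_to_json_sketch(question)
print(payload[:100])

# Round trip: json.dumps escapes the quotes, json.loads restores them.
assert json.loads(payload)['input'] == question
```

Using `json.dumps` rather than manual string formatting keeps quotes and other special characters in the question correctly escaped.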

In [None]:
headers = {'content-type': 'application/json'}
# Send the request twice: the first call takes a little longer because the model is loaded.
r = requests.post(scoring_url, data=jsontext, headers=headers)
%time r = requests.post(scoring_url, data=jsontext, headers=headers)
print(r)
r.json()

Let's try a few more duplicate questions and display their top 3 original matches.

In [None]:
dupes_to_score = dupes_test.iloc[:5,4]

In [None]:
results = [requests.post(scoring_url, data=text_to_json(text), headers=headers) for text in dupes_to_score]

Print the top 3 matches for each duplicate question.

In [None]:
[res.json()['result'][0][:3] for res in results]
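
The indexing above relies on the response body holding a `result` list whose first element is the ranked list of matches. A small helper can make that assumption explicit and tolerate an empty result. This is a sketch only: the name `top_matches` is hypothetical and the sample payload is made up, so real entries will have whatever shape the service returns.

```python
def top_matches(payload, k=3):
    """Return the first k matches from a decoded scoring response."""
    # Fall back to a single empty match list if 'result' is missing or empty.
    results = payload.get('result') or [[]]
    return results[0][:k]


# Made-up payload for illustration only.
sample = {'result': [['match_a', 'match_b', 'match_c', 'match_d']]}
print(top_matches(sample))          # ['match_a', 'match_b', 'match_c']
print(top_matches({'result': []}))  # []
```

With the responses collected earlier, this could be applied as `[top_matches(res.json()) for res in results]`.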

Next, let's quickly check the request-response latency of the model deployed on the AKS cluster.

In [None]:
text_data = list(map(text_to_json, dupes_to_score)) # Convert each question to a JSON payload

In [None]:
timer_results = list()
for text in text_data:
    res = %timeit -r 1 -o -q requests.post(scoring_url, data=text, headers=headers)
    timer_results.append(res.best)
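
`%timeit` is an IPython magic, so the loop above only runs inside a notebook. Outside IPython, a comparable measurement can be sketched with `time.perf_counter`. The helper `best_time` and its defaults are assumptions for illustration; unlike `%timeit`, it does not choose the number of loops automatically.

```python
import time


def best_time(func, repeat=1, number=3):
    """Return the best average duration (in seconds) of calling func()."""
    averages = []
    for _ in range(repeat):
        start = time.perf_counter()
        for _ in range(number):
            func()
        averages.append((time.perf_counter() - start) / number)
    return min(averages)


# Stand-in workload; in the notebook this would be something like
# lambda: requests.post(scoring_url, data=text, headers=headers)
elapsed = best_time(lambda: time.sleep(0.01))
print('{0:4.2f} ms'.format(1e3 * elapsed))
```

Each repeat averages over `number` calls and the smallest average is returned, which roughly mirrors what `res.best` reports.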

In [None]:
timer_results

In [None]:
print('Average time taken: {0:4.2f} ms'.format(10**3 * np.mean(timer_results)))
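
With only a handful of requests, a single slow call can dominate the mean, so it may help to look at the median and the extremes as well. The sketch below uses the standard library `statistics` module. The helper name `summarize_latencies` is hypothetical and the input durations are made up for illustration.

```python
import statistics


def summarize_latencies(seconds):
    """Summarize per-request durations (given in seconds) in milliseconds."""
    ms = sorted(1e3 * s for s in seconds)
    return {
        'mean': statistics.mean(ms),
        'median': statistics.median(ms),
        'min': ms[0],
        'max': ms[-1],
    }


# Made-up durations for illustration; in the notebook, pass timer_results.
summary = summarize_latencies([0.10, 0.20, 0.30])
for name, value in summary.items():
    print('{0}: {1:4.2f} ms'.format(name, value))
```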

We have tested that the model works, so we can now move on to the [next notebook to get a sense of its throughput](07_Speed_Test_WebApp.ipynb).