![logo](../imgs/MLU_Logo.png)

---

# Stress Test

This notebook tests how the [production endpoint](https://console.aws.amazon.com/sagemaker/home?region=us-west-2#/endpoints/iris-model-production) behaves when a **large number** of requests arrive at the same time.


We will simulate many concurrent users (up to 150 threads) requesting predictions at the same time.

In [1]:
import threading
import boto3
import numpy as np
import time
import math

from multiprocessing.pool import ThreadPool
from sklearn import datasets

In [2]:
sm = boto3.client("sagemaker-runtime")

endpoint_name_mask='iris-model-%s'

iris = datasets.load_iris()
# Prepend the target as column 0, so each row is [label, feature_1, ..., feature_4]
dataset = np.insert(iris.data, 0, iris.target, axis=1)

In [3]:
from sagemaker.serializers import CSVSerializer

# Create the serializer once instead of on every request
csv_serializer = CSVSerializer()

def predict(payload):
    # Column 0 holds the label; the remaining columns are the features
    X = payload[1:]
    y = payload[0]

    # Time only the call to the endpoint
    start_time = time.time()
    resp = sm.invoke_endpoint(
        EndpointName=endpoint_name_mask % env,
        ContentType='text/csv',
        Accept='text/csv',
        Body=csv_serializer.serialize(X)
    )
    elapsed_time = time.time() - start_time
    # The endpoint returns the predicted class as CSV text
    prediction = float(resp['Body'].read().decode('utf-8').strip())
    return (prediction == y, elapsed_time)

In [4]:
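def run_test(max_threads, max_requests):
    # Build the request list from shuffled copies of the dataset.
    # The total is rounded up to a whole number of copies, so slightly
    # more than max_requests requests may be sent.
    num_batches = math.ceil(max_requests / len(dataset))
    requests = []
    for _ in range(num_batches):
        batch = dataset.copy()
        np.random.shuffle(batch)
        requests += batch.tolist()

    # Send the requests concurrently from a pool of max_threads workers
    pool = ThreadPool(max_threads)
    result = pool.map(predict, requests)
    pool.close()
    pool.join()

    correct_random_forest = 0
    elapsedtime_random_forest = 0
    for is_correct, elapsed in result:
        correct_random_forest += is_correct
        elapsedtime_random_forest += elapsed
    print("Score classifier: {}".format(correct_random_forest/len(result)))

    # Sum of per-request latencies across all threads, not wall-clock time
    print("Elapsed time: {}s".format(elapsedtime_random_forest))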
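
The "Elapsed time" printed by `run_test` is the sum of all per-request latencies, which hides how individual requests behave. A minimal sketch of a per-request summary follows. It assumes the list of `(is_correct, elapsed_seconds)` tuples produced by `pool.map(predict, requests)` is made available, which `run_test` as written does not do. The helper names `percentile` and `summarize_latencies` and the sample data are hypothetical.

```python
import math
import statistics

def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]

def summarize_latencies(results):
    """Summarize (is_correct, elapsed_seconds) tuples like those returned by predict."""
    latencies = sorted(elapsed for _, elapsed in results)
    return {
        "count": len(latencies),
        "mean_s": statistics.mean(latencies),
        "p50_s": statistics.median(latencies),
        "p95_s": percentile(latencies, 95),
        "max_s": latencies[-1],
    }

# Hypothetical sample data, not measurements from the endpoint
sample = [(True, 0.010), (True, 0.020), (False, 0.030), (True, 0.040), (True, 0.500)]
summary = summarize_latencies(sample)
print(summary)
```

A single slow request barely moves the median but dominates the 95th percentile and the maximum, which is why tail percentiles are usually reported alongside the mean in a stress test.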

In [5]:
env='production'

### Test 1: 1000 requests

In [6]:
%%time
print("Starting test 1")
run_test(10, 1000)

Starting test 1
Score classifier: 0.96
Elapsed time: 18.000831127166748s
CPU times: user 2.33 s, sys: 161 ms, total: 2.49 s
Wall time: 2.39 s


### Test 2: 10,000 requests

In [7]:
%%time
print("Starting test 2")
run_test(100, 10000)

Starting test 2
Score classifier: 0.96
Elapsed time: 2403.5203816890717s
CPU times: user 30.9 s, sys: 2.13 s, total: 33 s
Wall time: 25.4 s


### Test 3: 100,000 requests

Note: this test may take around **5 minutes**.

In [8]:
%%time
print("Starting test 3")
run_test(150, 100000)

Starting test 3
Score classifier: 0.96
Elapsed time: 35163.71126294136s
CPU times: user 6min 43s, sys: 38.7 s, total: 7min 22s
Wall time: 5min 14s
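
The three outputs above can be compared on a per-request basis. The sketch below derives mean latency and throughput from the printed numbers. It assumes the `%%time` wall time is a fair approximation of the test duration, and it uses the fact that `run_test` sends whole copies of the 150-row Iris dataset, so the request count is rounded up. The helper name `describe` is hypothetical.

```python
import math

DATASET_SIZE = 150  # rows in the Iris dataset

# (max_requests, summed latency in s, wall time in s), copied from the outputs above
runs = {
    "test 1": (1000, 18.000831127166748, 2.39),
    "test 2": (10000, 2403.5203816890717, 25.4),
    "test 3": (100000, 35163.71126294136, 5 * 60 + 14),
}

def describe(max_requests, latency_sum_s, wall_s):
    # run_test sends whole shuffled copies of the dataset, so the count rounds up
    sent = math.ceil(max_requests / DATASET_SIZE) * DATASET_SIZE
    return {
        "requests_sent": sent,
        "mean_latency_ms": 1000 * latency_sum_s / sent,
        "throughput_rps": sent / wall_s,
    }

stats = {name: describe(*run) for name, run in runs.items()}
for name, s in stats.items():
    print(f"{name}: {s['requests_sent']} requests, "
          f"{s['mean_latency_ms']:.1f} ms mean latency, "
          f"{s['throughput_rps']:.0f} requests/s")
```

Read this way, the mean per-request latency rises as the thread count grows from 10 to 150, while throughput does not increase with it.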


## CloudWatch Monitoring

> **Action**: While this test is running, go to [**SageMaker Endpoints**](https://console.aws.amazon.com/sagemaker/home?region=us-west-2#/endpoints/iris-model-production), then click `View invocation metrics` to see the endpoint's behavior in CloudWatch.

<img src="../imgs/sagemaker_endpoints.png" alt="Drawing" style="width: 400px;"/>

In CloudWatch, check the following three checkboxes:

<img src="../imgs/all_metrics.png" alt="Drawing" style="width: 600px;"/>


Then, change the configuration (marked in red) as follows:


<img src="../imgs/invocation_point.png" alt="Drawing" style="width: 600px;"/>
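
The same kind of metric can also be read programmatically instead of through the console. The sketch below only builds the request parameters for CloudWatch's `get_metric_statistics` call. It assumes the `Invocations` metric is of interest (the text does not name the three metrics selected in the screenshots) and that the endpoint's variant is called `AllTraffic`, which should be checked against your endpoint configuration. The helper name `build_invocations_query` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

def build_invocations_query(endpoint_name, variant_name, minutes=15):
    """Build keyword arguments for cloudwatch.get_metric_statistics."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/SageMaker",
        "MetricName": "Invocations",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,           # one datapoint per minute
        "Statistics": ["Sum"],  # total invocations in each period
    }

# "AllTraffic" is an assumed variant name
query = build_invocations_query("iris-model-production", "AllTraffic")
print(query)

# With AWS credentials configured, the query could be sent like this:
# import boto3
# cloudwatch = boto3.client("cloudwatch")
# datapoints = cloudwatch.get_metric_statistics(**query)["Datapoints"]
```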

### Auto Scaling Alarm

While stress test 3 is still running, the **Auto Scaling Alarm** will fire as shown below, after 3 datapoints above 750 invocations per instance:

<img src="../imgs/alarm.png" alt="Drawing" style="width: 600px;"/>
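
The alarm rule described above can be illustrated with a small sketch. It assumes the 3 datapoints must be consecutive and uses made-up metric values. It mimics the rule only and does not query CloudWatch; the function name `alarm_fires` is hypothetical.

```python
def alarm_fires(datapoints, threshold=750, required=3):
    """Return True once `required` consecutive datapoints exceed `threshold`."""
    streak = 0
    for value in datapoints:
        # Extend the streak on a breach, reset it otherwise
        streak = streak + 1 if value > threshold else 0
        if streak >= required:
            return True
    return False

# Hypothetical invocations-per-instance values, one per datapoint
quiet = [120, 300, 800, 640, 790, 810]  # never 3 breaches in a row
busy = [400, 760, 900, 1200, 1100]      # 3 breaches in a row

print(alarm_fires(quiet), alarm_fires(busy))  # prints: False True
```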

When this happens, endpoint **autoscaling** starts adding instances to your cluster. In the graph from the previous image, you can see that the invocations metric grows after the new instances are added.

<img src="../imgs/autoscaling.png" alt="Drawing" style="width: 600px;"/>


## Bonus Resources

For more information about CloudWatch metrics, check the [documentation](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/working_with_metrics.html).

