# Stress Test

The idea of this code is to see how the production endpoint behaves when a **bunch** of requests arrive at it.
Let's simulate several users requesting predictions at the same time.

In [None]:
import threading
import boto3
import numpy as np
import time
import math

from multiprocessing.pool import ThreadPool
from sklearn import datasets

In [None]:
sm = boto3.client("sagemaker-runtime")

endpoint_name_mask='iris-model-%s'

iris = datasets.load_iris()
# Prepend the label as column 0, so each row is [label, feature_1, ..., feature_4]
dataset = np.insert(iris.data, 0, iris.target, axis=1)

In [None]:
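from sagemaker.serializers import CSVSerializer

# The serializer holds no per-request state, so one instance is shared by all threads.
csv_serializer = CSVSerializer()

def predict(payload):
    # Each row is [label, feature_1, ..., feature_4] (see `dataset` above).
    y = payload[0]
    X = payload[1:]

    # Time only the endpoint call, not the response parsing.
    start = time.time()
    resp = sm.invoke_endpoint(
        # `env` is a global set in a later cell, before run_test() is called.
        EndpointName=endpoint_name_mask % env,
        ContentType='text/csv',
        Accept='text/csv',
        Body=csv_serializer.serialize(X)
    )
    elapsed_time = time.time() - start
    # The endpoint is expected to return the predicted class as a single number.
    prediction = float(resp['Body'].read().decode('utf-8').strip())
    return (prediction == y, elapsed_time)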

In [None]:
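def run_test(max_threads, max_requests):
    # Repeat the shuffled dataset until there are at least max_requests rows.
    num_batches = math.ceil(max_requests / len(dataset))
    requests = []
    for _ in range(num_batches):
        batch = dataset.copy()
        np.random.shuffle(batch)
        requests += batch.tolist()
    # Trim the overshoot so that exactly max_requests requests are sent.
    requests = requests[:max_requests]

    # Each worker thread calls predict() on one row at a time.
    pool = ThreadPool(max_threads)
    result = pool.map(predict, requests)
    pool.close()
    pool.join()

    correct = 0
    total_request_time = 0
    for is_correct, elapsed in result:
        correct += is_correct
        total_request_time += elapsed
    print("Score classifier: {}".format(correct / len(result)))

    # Latencies are summed over all threads, so this value exceeds the
    # wall-clock time reported by %%time.
    print("Total request time (summed over threads): {}s".format(total_request_time))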
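
A summed latency hides how individual requests behaved under load. As a rough sketch (not part of the original notebook), the list of `(is_correct, elapsed_time)` tuples returned by `predict` could also be reduced to an accuracy, a mean and a few percentiles. The helper names `percentile` and `summarize_results`, the nearest-rank percentile method and the sample values are all assumptions made for illustration; `np.percentile` would work as well.

```python
import math

def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(p / 100 * len(sorted_values))
    return sorted_values[max(rank, 1) - 1]

def summarize_results(results):
    """results: list of (is_correct, elapsed_seconds) tuples, as predict() returns."""
    latencies = sorted(elapsed for _, elapsed in results)
    n_correct = sum(1 for is_correct, _ in results if is_correct)
    return {
        "accuracy": n_correct / len(results),
        "mean_s": sum(latencies) / len(latencies),
        "p50_s": percentile(latencies, 50),
        "p95_s": percentile(latencies, 95),
        "max_s": latencies[-1],
    }

# Hypothetical results, standing in for the output of pool.map(predict, requests).
sample_results = [(True, 0.10), (True, 0.20), (False, 0.30), (True, 0.40)]
summary = summarize_results(sample_results)
print(summary)
```

Inside `run_test`, the same helper could be called on `result` right after the pool is joined.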

In [None]:
env='production'

In [None]:
%%time
print("Starting test 1")
run_test(10, 1000)

In [None]:
%%time
print("Starting test 2")
run_test(100, 10000)

In [None]:
%%time
print("Starting test 3")
run_test(150, 100000000)

> While this test is running, go to the **AWS Console** -> **SageMaker**, click on the **Endpoint**, and then open the **CloudWatch** monitoring logs to see how the endpoint behaves.

## In CloudWatch, mark the following three checkboxes
![CloudWatchA](../../imgs/CloudWatchA.png)

## Then, change the following config, marked in RED

![CloudWatchB](../../imgs/CloudWatchB.png)

## Now, while your stress test is still running, you will see the Auto Scaling alarm fire like this, after 3 datapoints above 750 invocations per instance

![CloudWatchC](../../imgs/CloudWatchC.png)
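
To make the "3 datapoints above 750" rule concrete, here is a minimal sketch of that kind of check in plain Python. It is only an illustration, not how CloudWatch evaluates alarms internally: the function name `first_breach`, the requirement that the three datapoints be consecutive, and the sample series are all assumptions.

```python
def first_breach(datapoints, threshold=750, required=3):
    """Return the index of the datapoint that completes `required`
    consecutive values above `threshold`, or None if that never happens."""
    streak = 0
    for i, value in enumerate(datapoints):
        # A value at or below the threshold resets the streak.
        streak = streak + 1 if value > threshold else 0
        if streak >= required:
            return i
    return None

# Hypothetical per-minute "invocations per instance" readings.
invocations_per_instance = [400, 800, 820, 700, 900, 950, 1000, 1100]
breach_index = first_breach(invocations_per_instance)
print(breach_index)
```

In this made-up series, the dip to 700 resets the count, so the alarm condition is only met on the third consecutive reading above the threshold.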

When this happens, the endpoint's autoscaling starts adding more instances to your cluster. In the graph from the previous image, you can see that the **Invocations** metric grows after the new instances are added.

## Well done!