# Test analysis

Statistical analysis of 10 repetitions of a 5 minute test and 10 repetitions of a 1 minute test of the **Robot-Shop** application.

### Workloads

- **5 minute test**:
    - simulated day length: 5 minutes = 300 seconds
    - 20 requests per second
    - about 6000 total requests

- **1 minute test** (compressed test):
    - simulated day length: 1 minute = 60 seconds
    - 100 requests per second
    - about 6000 total requests
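
Both workloads issue roughly the same number of requests; only the request rate and the duration differ. The arithmetic can be checked with a short sketch (the dictionary and variable names are illustrative and not part of `alyslib`):

```python
# Workload parameters taken from the list above; names are illustrative.
workloads = {
    "5 minute test": {"duration_s": 300, "requests_per_s": 20},
    "1 minute test": {"duration_s": 60, "requests_per_s": 100},
}

# Total requests = duration (s) * request rate (req/s).
totals = {
    name: w["duration_s"] * w["requests_per_s"] for name, w in workloads.items()
}

# Compression factor of the 1 minute test relative to the 5 minute test.
compression = (
    workloads["5 minute test"]["duration_s"]
    / workloads["1 minute test"]["duration_s"]
)

for name, total in totals.items():
    print(f"{name}: {total} requests")
print(f"compression factor: {compression:.0f}x")
```

The 1 minute test therefore compresses the same nominal load into one fifth of the time.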

### Imports and functions

In [1]:
import os
import pathlib
import sys

import numpy as np
import pandas as pd
from IPython.display import HTML, display, display_html

mod_path = os.path.abspath(os.path.join("../../src/alyslib"))
if mod_path not in sys.path:
    sys.path.append(mod_path)

import alyslib

In [2]:
# function that returns a list containing the mean TimeDelta
# of every DataFrame in the list `dfs`
def get_means(dfs):
    return [d.TimeDelta.mean() for d in dfs]


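# function that calculates the confidence interval of the mean of `data`
# (with `z_score`=1.96, returns a 95% confidence interval
# under a normal approximation)
# note: np.std defaults to the population standard deviation (ddof=0)
def conf_interval(data, z_score=1.96):
    mean = np.mean(data)
    std = np.std(data)
    size = len(data)
    err = z_score * (std / np.sqrt(size))  # half-width of the interval
    return (mean - err, mean + err)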
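
With only 10 repetitions per test, the normal approximation behind `z_score=1.96` tends to produce a slightly narrow interval. As a point of comparison only (this is not what the notebook uses), here is a minimal sketch of a Student's t interval using the standard library. The function name `t_conf_interval`, the constant `T_CRIT_95_DF9`, and the sample data are assumptions made for illustration; the critical value 2.262 applies only to 10 samples (9 degrees of freedom).

```python
import math
import statistics

# Two-sided 95% critical value of Student's t with 9 degrees of freedom
# (valid for exactly 10 samples). Hard-coded: an assumption of this sketch.
T_CRIT_95_DF9 = 2.262


def t_conf_interval(data, t_crit=T_CRIT_95_DF9):
    """Hypothetical t-based counterpart of `conf_interval` for 10 samples."""
    mean = statistics.fmean(data)
    # Sample standard deviation (divides by n - 1), unlike np.std's default.
    std = statistics.stdev(data)
    err = t_crit * std / math.sqrt(len(data))
    return (mean - err, mean + err)


# Made-up sample of 10 values, only to show the call.
sample = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 1.0, 0.9, 1.1]
low, high = t_conf_interval(sample)
print(low, high)
```

For the same data this interval is wider than the z-based one, because both the critical value and the standard deviation estimate are larger.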

### Datasets - import

In [3]:
l = alyslib.import_data("./data", "net.gen")

### DataFrames - building

In [4]:
d0, d1 = alyslib.build_dfs(l)

In [5]:
dfmerge = d0 + d1

### DataFrames - cleaning network noise

The analysis cannot tolerate **SendIP** and **RecvIP** differences, so we clean the network noise for every pair of tests.

In [6]:
alyslib.clean_network_noise(dfmerge)

dataframe 140487927738144, removed 140 items, [2262, 2263, 2265, 2267, 2269]
dataframe 140487927738144, removed 120 items, [2264, 2266, 2268, 2271, 2272]
dataframe 140487930215632, removed 140 items, [3537, 3538, 3540, 3542, 3544]
dataframe 140487930215632, removed 121 items, [3539, 3541, 3543, 3546, 3547]
dataframe 140487930208192, removed 3 items, [13476, 13477, 13479]
dataframe 140487930208192, removed 1 items, [13478]
dataframe 140487930209632, removed 3 items, [3304, 3306, 3307]
dataframe 140487930209632, removed 1 items, [3305]
dataframe 140487930214288, removed 3 items, [16847, 16849, 16848]
dataframe 140487930214288, removed 1 items, [16850]
dataframe 140487927738144, removed 3 items, [24474, 24476, 24477]
dataframe 140487927738144, removed 1 items, [24475]
dataframe 140487930208528, removed 3 items, [8572, 8574, 8573]
dataframe 140487930208528, removed 1 items, [8575]
dataframe 140487927740304, removed 3 items, [4677, 4679, 4678]
dataframe 140487927740304, removed 1 items, [46

### DataFrames - sorting by Timestamp

In [7]:
alyslib.sort_by_key(dfmerge, "Timestamp")

### DataFrames - generating the Elapsed time column

In [8]:
alyslib.cmp_elapsed(dfmerge)

### Analysis - 5 minute test

In [9]:
d0m = get_means(d0)

In [10]:
display(pd.DataFrame(d0m, columns=["means"]))
display(
    pd.DataFrame(
        [[np.mean(d0m), np.std(d0m), conf_interval(d0m)]],
        columns=["mean of means", "std of means", "95% conf interval"],
    )
)

|   | means |
|---|---|
| 0 | 0.018329 |
| 1 | 0.01874 |
| 2 | 0.01798 |
| 3 | 0.01899 |
| 4 | 0.018123 |
| 5 | 0.018425 |
| 6 | 0.019003 |
| 7 | 0.018031 |
| 8 | 0.018411 |
| 9 | 0.018879 |

|   | mean of means | std of means | 95% conf interval |
|---|---|---|---|
| 0 | 0.018491 | 0.00037 | (0.0182618713257612, 0.01872026641485766) |


1. We calculated the **TimeDelta mean** for every DataFrame generated for the current test (10 repetitions, hence 10 DataFrames).
2. We calculated the **mean of the means** from step 1.
3. We calculated the **standard deviation of the means** from step 1.
4. We calculated the **95% confidence interval of the means** from step 1.
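
As a sanity check, the steps above can be reproduced from the printed per-repetition means using only the standard library. This is a sketch, not the notebook's own code: it starts from the rounded values shown in the output above, so the results agree with the notebook only up to rounding.

```python
import math
import statistics

# Per-repetition TimeDelta means, copied from the output above (rounded).
means = [
    0.018329, 0.01874, 0.01798, 0.01899, 0.018123,
    0.018425, 0.019003, 0.018031, 0.018411, 0.018879,
]

mean_of_means = statistics.fmean(means)
# Population standard deviation, matching np.std's default (ddof=0).
std_of_means = statistics.pstdev(means)
# Half-width of the 95% interval: z * std / sqrt(n), with z = 1.96.
err = 1.96 * std_of_means / math.sqrt(len(means))
interval = (mean_of_means - err, mean_of_means + err)

print(f"mean of means: {mean_of_means:.6f}")
print(f"std of means:  {std_of_means:.6f}")
print(f"95% interval:  ({interval[0]:.6f}, {interval[1]:.6f})")
```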

### Analysis - 1 minute test

In [11]:
d1m = get_means(d1)

In [12]:
display(pd.DataFrame(d1m, columns=["means"]))
display(
    pd.DataFrame(
        [[np.mean(d1m), np.std(d1m), conf_interval(d1m)]],
        columns=["mean of means", "std of means", "95% conf interval"],
    )
)

|   | means |
|---|---|
| 0 | 0.006616 |
| 1 | 0.004599 |
| 2 | 0.004792 |
| 3 | 0.004981 |
| 4 | 0.005199 |
| 5 | 0.005243 |
| 6 | 0.004381 |
| 7 | 0.005071 |
| 8 | 0.004641 |
| 9 | 0.00491 |

|   | mean of means | std of means | 95% conf interval |
|---|---|---|---|
| 0 | 0.005043 | 0.000585 | (0.0046807860864946736, 0.005405812325444093) |


We repeated the calculations from the previous test for the 1 minute test (again 10 repetitions, hence 10 DataFrames).

### Analysis - comparison between the tests

In [13]:
display(
    pd.DataFrame(
        [
            [np.mean(d0m), np.std(d0m), conf_interval(d0m)],
            [np.mean(d1m), np.std(d1m), conf_interval(d1m)],
        ],
        columns=["mean of means", "std of means", "95% conf interval"],
        index=["5m test", "1m test"],
    )
)

|   | mean of means | std of means | 95% conf interval |
|---|---|---|---|
| 5m test | 0.018491 | 0.00037 | (0.0182618713257612, 0.01872026641485766) |
| 1m test | 0.005043 | 0.000585 | (0.0046807860864946736, 0.005405812325444093) |


In conclusion, the table above places the results of both tests side by side so that they can be compared directly.
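
To make the comparison concrete, the sketch below checks whether the two 95% confidence intervals overlap and computes the ratio of the two means. The values are copied from the output above; the helper `intervals_overlap` and the `results` dictionary are illustrative names, not part of `alyslib`. Non-overlapping intervals are only an informal indication that the two means differ, not a formal hypothesis test.

```python
# Summary values copied from the comparison output above.
results = {
    "5m test": {
        "mean": 0.018491,
        "ci": (0.0182618713257612, 0.01872026641485766),
    },
    "1m test": {
        "mean": 0.005043,
        "ci": (0.0046807860864946736, 0.005405812325444093),
    },
}


def intervals_overlap(a, b):
    """Return True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


overlap = intervals_overlap(results["5m test"]["ci"], results["1m test"]["ci"])
ratio = results["5m test"]["mean"] / results["1m test"]["mean"]

print(f"intervals overlap: {overlap}")
print(f"5m mean / 1m mean: {ratio:.2f}")
```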