In [68]:
import pandas as pd
import numpy as np
import scipy.stats

In [69]:
sheets = ['Single-pod-1', 'Single-pod-2', 'Scale-bound-1', 'Scale-bound-2',
          'Scale-to-zero-1', 'Scale-to-zero-2', 'Kn-Scale-Bound', 'Kn-Target-Bound']

In [70]:
excel_filename = 'data.xlsx'

We want to check whether the means of the two data sets are equal, and we use the two-sample $t$-test for that purpose. The null hypothesis of the $t$-test is $H_0$: 'the two means are equal'. If the $p$-value of the test is greater than the chosen significance level (0.05 here), we cannot reject $H_0$. That is, the test concludes that there is not enough evidence to suspect that the two means are different from each other. Note that this is not proof that the means are equal; it only says that the data are consistent with equal means.

The standard (Student's) [$t$-test](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html) assumes that the two data sets have equal variances, so we first check that assumption with [Levene's test](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.levene.html#scipy.stats.levene). If Levene's test rejects equality of variances, we pass `equal_var=False` to `ttest_ind`, which then performs Welch's $t$-test instead.
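The two-step procedure can be sketched on synthetic data before applying it to the spreadsheet. This is only an illustration: the helper name `compare_means`, its `alpha` parameter, and the arrays `x` and `y` are assumptions made up for this example and are not part of the measured data.

```python
import numpy as np
import scipy.stats


def compare_means(x, y, alpha=0.05):
    """Levene's test first, then Student's or Welch's t-test accordingly.

    Hypothetical helper for illustration; `alpha` is the significance level.
    """
    # Levene's test: H0 is that the two samples have equal variances.
    levene_p = scipy.stats.levene(x, y).pvalue
    equal_var = bool(levene_p > alpha)

    # equal_var=True -> Student's t-test, equal_var=False -> Welch's t-test.
    t_p = scipy.stats.ttest_ind(x, y, equal_var=equal_var).pvalue
    reject_equal_means = bool(t_p <= alpha)
    return equal_var, t_p, reject_equal_means


# Synthetic samples (made up): identical spread, means 100 apart.
x = np.arange(10.0)
y = x + 100.0
equal_var, t_p, reject = compare_means(x, y)
print(equal_var, reject)
```

Returning the decisions instead of printing them makes the same logic easy to reuse or test; the notebook function below prints its conclusions directly.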

In [75]:
def do_t_test(sheet_name, alpha=0.05):
    """Compare the means of the two data columns of a sheet with a two-sample t-test."""
    data = pd.read_excel(excel_filename, sheet_name=sheet_name, engine='openpyxl')
    # Columns 1 and 2 hold the two samples to compare (column 0 is not used).
    col1 = data.columns[1]
    col2 = data.columns[2]
    print(f'Analyzing data in the sheet {sheet_name}.')
    print(f'Carrying out t-test between {col1} and {col2}.')

    X = data.iloc[:, 1].to_numpy()
    Y = data.iloc[:, 2].to_numpy()

    mean_X = np.mean(X)
    mean_Y = np.mean(Y)
    print('The two means are {:0.3f} and {:0.3f}'.format(mean_X, mean_Y))

    # Levene's test: the null hypothesis is that the two variances are equal.
    levene_p_value = scipy.stats.levene(X, Y).pvalue

    equal_var = True
    if levene_p_value > alpha:
        print('The two data sets have the same variance.')
    else:
        print('The two data sets have different variance.')
        equal_var = False

    # Student's t-test if the variances are equal, Welch's t-test otherwise.
    p_value = scipy.stats.ttest_ind(X, Y, equal_var=equal_var).pvalue
    if p_value > alpha:
        print('We cannot reject the null hypothesis that the means of the two data sets are equal.')
    else:
        print('We reject the null hypothesis that the means of the two data sets are equal.')

In [76]:
for s in sheets:
    do_t_test(s)
    print()
    print('-' * 80)
    print()

Analyzing data in the sheet Single-pod-1.
Carrying out t-test between Knative and Openfaas.
The two means are 317.600 and 314.400
The two data sets have the same variance.
We cannot reject the null hypothesis that the means of the two data sets are equal.

--------------------------------------------------------------------------------

Analyzing data in the sheet Single-pod-2.
Carrying out t-test between Knative and Openfaas.
The two means are 643.200 and 636.000
The two data sets have the same variance.
We cannot reject the null hypothesis that the means of the two data sets are equal.

--------------------------------------------------------------------------------

Analyzing data in the sheet Scale-bound-1.
Carrying out t-test between Knative and Openfaas.
The two means are 309.600 and 310.100
The two data sets have the same variance.
We cannot reject the null hypothesis that the means of the two data sets are equal.

----------------------------------------------------------------