# Hypothesis Testing

This notebook captures the hypothesis-testing logic we use to compare NATS configuration changes. Every time we make a change, we benchmark the system before and after it. We then set the group names below to match the corresponding ```tmp``` directory names and run this notebook to compare the two sets of results.
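
Because the group names must match directory names under ```tmp```, it can help to list the candidates before editing the variables below. The following is only a sketch: `list_groups` is a hypothetical helper, not part of the benchmark tooling, and it assumes the `../tmp/<group>/dataset.csv` layout that the later cells read from.

```python
from pathlib import Path


def list_groups(tmp_dir="../tmp"):
    """Return names of sub-directories of tmp_dir that contain a dataset.csv."""
    root = Path(tmp_dir)
    if not root.is_dir():
        # nothing to list if the directory is missing
        return []
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and (entry / "dataset.csv").is_file()
    )


# candidate values for GROUP_A / GROUP_B
print(list_groups())
```

Any two names it prints can be used as `GROUP_A` and `GROUP_B`.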

In [24]:
# notebook variables
GROUP_A="pull-client.1d" # change this
GROUP_B="push-client.b2" # change this

P_VALUE_BOUND = 0.05

In [25]:
# import the libraries used below
import pandas as pd
from scipy.stats import ttest_ind

In [26]:
# read each group's CSV file into its own DataFrame
dfA = pd.read_csv(f'../tmp/{GROUP_A}/dataset.csv')
dfB = pd.read_csv(f'../tmp/{GROUP_B}/dataset.csv')

dfA.head()

,pub-stats,sub-stats,overall-stats
0,404.19,560.73,808.39
1,405.27,561.46,810.55
2,422.85,593.15,845.69
3,420.13,599.29,840.27
4,390.53,594.87,781.05


In [27]:
# describe datasets
dfA.describe()

,pub-stats,sub-stats,overall-stats
count,10.0,10.0,10.0
mean,400.203,561.239,800.407
std,23.679675,44.874121,47.359996
min,338.95,442.97,677.9
25%,397.0125,560.9125,794.0275
50%,405.63,569.995,811.265
75%,410.315,588.305,820.625
max,422.85,599.29,845.69


In [28]:
# hypothesis-testing logic: takes one column from each
# group and compares the two samples
def hypo_test(groupA, groupB):
    # sample means of both groups
    mean_a = groupA.mean()
    mean_b = groupB.mean()

    # two-sided independent two-sample t-test (SciPy's default assumes equal variances)
    t_statistic, p_value = ttest_ind(groupA, groupB)
    
    print(f"\tmean of `{GROUP_A}`: {mean_a}")
    print(f"\tmean of `{GROUP_B}`: {mean_b}")
    print(f"\tt-statistic: {t_statistic}")
    print(f"\tp-value: {p_value}")
    
    if p_value < P_VALUE_BOUND:
        print("the difference is statistically significant at 95% confidence level.")
        # report the relative gap; this assumes that a higher mean is better
        if mean_a > mean_b:
            print(f"`{GROUP_A}` is better by {100 * float((mean_a-mean_b)/mean_a)}%.")
        else:
            print(f"`{GROUP_B}` is better by {100 * float((mean_b-mean_a)/mean_b)}%.")
    else:
        print("the difference is not statistically significant at 95% confidence level.")
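
To make the `ttest_ind` call in `hypo_test` less of a black box, the sketch below computes the pooled-variance (Student's) two-sample t-statistic by hand, which is the statistic `ttest_ind` returns with its default `equal_var=True`. It uses only the standard library, so it stops at the statistic and the degrees of freedom and does not compute a p-value. The function name `pooled_t_statistic` and the toy samples are illustrative assumptions, not benchmark data.

```python
import math
import statistics


def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for Student's two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # sample variances (divide by n - 1)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    dof = n_a + n_b - 2
    # pooled variance: the two variances weighted by their degrees of freedom
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / dof
    # standard error of the difference between the two means
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, dof


# toy samples, for illustration only (not measured values)
toy_a = [10.0, 12.0, 11.0, 13.0]
toy_b = [8.0, 9.0, 7.0, 10.0]
t_stat, dof = pooled_t_statistic(toy_a, toy_b)
print(f"t-statistic: {t_stat:.4f}, degrees of freedom: {dof}")
```

A large absolute t-statistic means the gap between the means is large relative to the spread of the samples, which is what drives the small p-values in this kind of comparison.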

In [29]:
# dataset columns to compare
columns = ["pub-stats", "sub-stats", "overall-stats"]

In [30]:
# run the hypothesis test on every column of the two groups
for column in columns:
    print(f"\ntesting `{column}` field:")
    hypo_test(dfA[column], dfB[column])


testing `pub-stats` field:
	mean of `pull-client.1d`: 400.203
	mean of `push-client.b2`: 253.776
	t-statistic: 12.792186297733219
	p-value: 1.791119486687319e-10
the difference is statistically significant at 95% confidence level.
`pull-client.1d` is better by 36.58818149788981%.

testing `sub-stats` field:
	mean of `pull-client.1d`: 561.239
	mean of `push-client.b2`: 325.448
	t-statistic: 11.832787005661107
	p-value: 6.323474368276264e-10
the difference is statistically significant at 95% confidence level.
`pull-client.1d` is better by 42.0125828746755%.

testing `overall-stats` field:
	mean of `pull-client.1d`: 800.4069999999999
	mean of `push-client.b2`: 507.55
	t-statistic: 12.79240040987597
	p-value: 1.7906307563517974e-10
the difference is statistically significant at 95% confidence level.
`pull-client.1d` is better by 36.588510595234666%.
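
The percentages printed by `hypo_test` divide the gap by the larger mean. As a worked check, the sketch below redoes that arithmetic for the `pub-stats` means reported above and also shows the same gap relative to the smaller mean. Which baseline to report is a convention the notebook does not state, so the second figure is offered only as an alternative reading.

```python
# means of the `pub-stats` column, copied from the output above
mean_a = 400.203  # pull-client.1d
mean_b = 253.776  # push-client.b2

gap = mean_a - mean_b

# what hypo_test prints: gap relative to the larger mean
relative_to_a = 100 * gap / mean_a
# alternative reading: gap relative to the smaller mean
relative_to_b = 100 * gap / mean_b

print(f"gap: {gap:.3f}")
print(f"relative to pull-client.1d: {relative_to_a:.2f}%")
print(f"relative to push-client.b2: {relative_to_b:.2f}%")
```

The first percentage matches the "better by" figure in the output; the second is larger because the same gap is divided by a smaller baseline.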
