# Hypothesis Testing
A statistical hypothesis test is a procedure for deciding whether the available data provide enough evidence to reject a null hypothesis in favor of an alternative. It usually involves computing a test statistic from the data. A decision is then made either by comparing the test statistic to a critical value or by comparing the p-value derived from it to a chosen significance level.
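The two decision routes described above can be sketched for a one-sample t-test. This is a minimal illustration, not part of the analysis that follows: the sample, the hypothesized mean `mu0`, and the significance level `alpha` are all made-up assumptions.

```python
import numpy as np
from scipy import stats

# Hypothetical sample: 40 measurements tested against an assumed mean of 50.
rng = np.random.default_rng(0)
sample = rng.normal(loc=53.0, scale=8.0, size=40)
mu0 = 50.0
alpha = 0.05

n = sample.size
# Test statistic: distance of the sample mean from mu0 in standard-error units.
t_stat = (sample.mean() - mu0) / (sample.std(ddof=1) / np.sqrt(n))

# Route 1: compare |t| with the two-sided critical value.
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
reject_by_critical_value = abs(t_stat) > t_crit

# Route 2: compare the two-sided p-value with alpha.
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)
reject_by_p_value = p_value < alpha

print("t =", t_stat, "critical value =", t_crit, "p =", p_value)
print("reject H0:", reject_by_critical_value, reject_by_p_value)
```

Both routes use the same distribution and the same `alpha`, so they lead to the same decision.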

## T-test
The t-test assesses the difference in means between two groups of data, using random samples from each group. It lets analysts determine whether a treatment yields consistent results in both groups or whether the group means differ. When both sets of measurements come from the same subjects (for example, before and after a treatment), the paired version of the test is used.

The hypotheses are:
- H0: There is no difference between the group means.
- Ha: The group means differ.
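As a rough sketch of the two-group case, the snippet below runs an independent-samples t-test with `scipy.stats.ttest_ind`. The two groups are synthetic and purely illustrative. Using `equal_var=False` (Welch's variant) is a choice made here, not something the text above prescribes.

```python
import numpy as np
from scipy import stats

# Two hypothetical groups drawn from the same distribution (H0 is true here).
rng = np.random.default_rng(1)
group_a = rng.normal(loc=60.0, scale=15.0, size=50)
group_b = rng.normal(loc=60.0, scale=15.0, size=50)

# equal_var=False selects Welch's t-test, which does not assume equal variances.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)

print("t =", t_stat, "p =", p_value)
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```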

## Z-test
The z-test compares means or proportions between two groups when the sample size is large (typically n > 30) and the population standard deviation is known. It is akin to the t-test but relies on the standard normal distribution (Z-distribution) instead of Student's t-distribution.

The hypotheses are:
- H0: There is no significant difference between the group means.
- Ha: A significant difference exists between the group means.
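A two-sample z-test with known population standard deviations can be written in a few lines. The helper `two_sample_z_test` below is a hypothetical name introduced for illustration, and the samples and sigma values are assumptions. For comparison, `ztest` in `statsmodels.stats.weightstats` estimates the variance from the samples rather than taking it as known.

```python
import numpy as np
from scipy import stats

def two_sample_z_test(x, y, sigma_x, sigma_y):
    """Two-sided z-test for equal means, given known population std devs."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standard error of the difference in means under known variances.
    se = np.sqrt(sigma_x**2 / x.size + sigma_y**2 / y.size)
    z = (x.mean() - y.mean()) / se
    # Two-sided p-value from the standard normal distribution.
    p = 2 * stats.norm.sf(abs(z))
    return z, p

# Hypothetical large samples with an assumed known sigma of 15.
rng = np.random.default_rng(2)
x = rng.normal(loc=58.0, scale=15.0, size=100)
y = rng.normal(loc=61.0, scale=15.0, size=100)

z_stat, p_value = two_sample_z_test(x, y, sigma_x=15.0, sigma_y=15.0)
print("z =", z_stat, "p =", p_value)
```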

In [1]:
import pandas as pd
from scipy import stats
from statsmodels.stats import weightstats as stests

In [2]:
df=pd.read_csv('Islander_data.csv')

The data come from an experiment on the `effects of anti-anxiety medicine on memory recall when being primed with happy or sad memories`. The participants were novel Islanders, who mimic real-life humans in their response to external factors.

Drugs of interest (known-as) [Dosage 1, 2, 3]:

A - Alprazolam (Xanax, Long-term) [1mg/3mg/5mg]

T - Triazolam (Halcion, Short-term) [0.25mg/0.5mg/0.75mg]

S - Sugar Tablet (Placebo) [1 tab/2 tabs/3 tabs]

In [3]:
df.head()

Unnamed: 0,first_name,last_name,age,Happy_Sad_group,Dosage,Drug,Mem_Score_Before,Mem_Score_After,Diff
0,Bastian,Carrasco,25,H,1,A,63.5,61.2,-2.3
1,Evan,Carrasco,52,S,1,A,41.6,40.7,-0.9
2,Florencia,Carrasco,29,H,1,A,59.7,55.1,-4.6
3,Holly,Carrasco,50,S,1,A,51.7,51.2,-0.5
4,Justin,Carrasco,52,H,1,A,47.0,47.1,0.1


- `first_name` : First name of Islander
- `last_name` : Last name of Islander
- `age` : Age of Islander
- `Happy_Sad_group` : Happy or Sad memory priming block
- `Dosage` : 1-3 to indicate the level of dosage (low - medium - over recommended daily intake)
- `Drug` : Type of Drug administered to Islander
- `Mem_Score_Before` : Seconds - how long it took to finish a memory test before drug exposure
- `Mem_Score_After` : Seconds - how long it took to finish a memory test after addiction achieved
- `Diff` : Seconds - difference between memory score before and after

In [4]:
df[[ 'Mem_Score_Before','Mem_Score_After']].describe()

Unnamed: 0,Mem_Score_Before,Mem_Score_After
count,198.0,198.0
mean,57.967677,60.922222
std,15.766007,18.133851
min,27.2,27.1
25%,46.525,47.175
50%,54.8,56.75
75%,68.4,73.25
max,110.0,120.0


In [6]:
df.shape

(198, 9)

## T-test: memory scores before vs. after

In [7]:
# Paired t-test: each Islander contributes both a before and an after score
ttest, pval = stats.ttest_rel(df['Mem_Score_Before'], df['Mem_Score_After'])
print(pval)

0.00015035646624295915
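A paired t-test is equivalent to a one-sample t-test on the per-subject differences, tested against a mean of zero. The sketch below checks this on synthetic before/after scores. The data are made up and do not reproduce the Islander results.

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements for 30 subjects.
rng = np.random.default_rng(3)
before = rng.normal(loc=58.0, scale=15.0, size=30)
after = before + rng.normal(loc=3.0, scale=10.0, size=30)

# Manual computation on the paired differences.
diff = before - after
n = diff.size
t_manual = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# Same test through SciPy.
t_scipy, p_scipy = stats.ttest_rel(before, after)

print("statistics match:", np.isclose(t_manual, t_scipy))
print("p-values match:", np.isclose(p_manual, p_scipy))
```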


In [8]:
# Compare the p-value with the 5% significance level
if pval < 0.05:
    print("reject null hypothesis")
else:
    print("fail to reject null hypothesis")

reject null hypothesis


##### At a significance level of 5%, the time required to complete the memory test differs significantly before and after drug exposure (p ≈ 0.00015).
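The test above pools all participants, including those who received the placebo. One possible follow-up, sketched here on a synthetic stand-in, is to run the paired test separately for each value of `Drug`. The `toy` DataFrame and its effect sizes are invented for illustration. Only the column names mirror the dataset.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for the dataset; the shifts are made-up effects per drug.
rng = np.random.default_rng(4)
n_per_drug = 30
frames = []
for drug, shift in [("A", 8.0), ("T", 0.0), ("S", 0.0)]:
    before = rng.normal(loc=58.0, scale=15.0, size=n_per_drug)
    after = before + rng.normal(loc=shift, scale=8.0, size=n_per_drug)
    frames.append(pd.DataFrame({
        "Drug": drug,
        "Mem_Score_Before": before,
        "Mem_Score_After": after,
    }))
toy = pd.concat(frames, ignore_index=True)

# Paired t-test within each drug group.
results = {}
for drug, grp in toy.groupby("Drug"):
    t_stat, p_value = stats.ttest_rel(grp["Mem_Score_Before"], grp["Mem_Score_After"])
    results[drug] = p_value
    print(drug, "p =", p_value)
```

Running several tests inflates the chance of a false positive, so a multiple-comparison correction would be worth considering before drawing conclusions from such a breakdown.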