<a href="https://colab.research.google.com/github/slegro97/hypothesis-testing/blob/main/Hypothesis_Testing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Libraries and data

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
%cd /content/drive/MyDrive/PythonProjects/Statistics with Python/Inferential Statistics/Hypothesis Testing

/content/drive/MyDrive/PythonProjects/Statistics with Python/Inferential Statistics/Hypothesis Testing


In [None]:
# Libraries
import pandas as pd
import numpy as np
import scipy.stats as st
import statsmodels.stats.weightstats as sm

# Useful functions

In [None]:
# Function to interpret the p-value
def p_value_interpreter(p_value, alpha):
  if p_value < alpha:
    print('Reject the Null Hypothesis')
  else:
    print('Fail to reject the Null Hypothesis')

In [None]:
# Function to run a z-test
def z_test(sample_mean, pop_mean, pop_sd, sample_size, tails, alpha):

  # Calculate Z-score
  z_score = (sample_mean - pop_mean) / (pop_sd / np.sqrt(sample_size))
  print(f'The z-score of the sample statistic is: {z_score}')

  # Calculate p-value
  p_value = st.norm.sf(abs(z_score)) * tails
  print(f'The p-value is: {p_value}')

  # Interpret p-value
  p_value_interpreter(p_value, alpha)

In [None]:
# 2 sample t-test function that depends on the outcome of Levene's Test
def ttest_2sample(sample1, sample2, alpha=0.05, alternative='two-sided'):

  # Perform Levene's Test
  stat, pvalue = st.levene(sample1, sample2)
  print('Levene\'s Test:')
  print(f'The test statistic is: {stat}')
  print(f'The p-value is: {pvalue}')

  # Determine if Welch's test or Independent Two Sample Test will be used
  if pvalue < alpha:
    # Set parameter for Welch's Test
    print('\nVariances are significantly different. Performing Welch\'s Test.')
    equal_var = False

  else:
    # Set parameter for Independent Two Sample Test
    print('\nVariances are not significantly different. Performing Independent Two Sample T-test.')
    equal_var = True

  tscore, pvalue = st.ttest_ind(sample1,
                                sample2,
                                equal_var = equal_var,
                                alternative = alternative)
  print(f'The t-score is: {tscore}')
  print(f'The p-value is: {pvalue}')
  p_value_interpreter(pvalue, alpha)

# 2-Tailed tests with known variance

You have invested thousands of dollars per employee to improve their satisfaction and productivity. Your goal is to improve on the historical average of 54 cars produced, which has a known population standard deviation of 2.
Bruno believes the opposite: that the benefits, along with other factors such as the constant rain, are hurting production because employees are frequently sick. You both agree on a confidence level of 95%.

**Null Hypothesis**: The average number of cars produced is 54

**Alternative Hypothesis**: The average number of cars produced is not 54

In [None]:
# Load data
df_main = pd.read_csv('tesla_main.csv')
df_main.head()

Unnamed: 0,Production Date,Defects Found,Cars Produced,Weather Condition,Workers on Shift
0,2023-01-01,3,55,Rainy,20
1,2023-01-02,2,57,Rainy,19
2,2023-01-03,1,54,Rainy,21
3,2023-01-04,0,56,Rainy,22
4,2023-01-05,2,59,Rainy,20


In [None]:
# Info
mean_pop = 54
sd_pop = 2
confidence = 0.95
alpha = 1 - confidence
mean_sample = df_main['Cars Produced'].mean()
sample_size = df_main['Cars Produced'].count()

print(f'The sample mean is {mean_sample}')
print(f'The sample size is {sample_size}')

The sample mean is 55.10909090909091
The sample size is 55


In [None]:
# Z-score formula: (sample mean - population mean) / (population SD / sqrt(sample size))
z_score = (mean_sample - mean_pop) / (sd_pop / np.sqrt(sample_size))
print(f'The z-score of the sample statistic is: {z_score}')

The z-score of the sample statistic is: 4.112619161025777


In [None]:
# Calculate the p-value of the z-score (two tails)
tails = 2
p_value = st.norm.sf(abs(z_score)) * tails
print(f'The p-value is: {p_value}')

The p-value is: 3.9119543361101206e-05


In [None]:
# Interpret the p-value
p_value_interpreter(p_value, alpha)

Reject the Null Hypothesis


In [None]:
# Run the full z-test function
z_test(mean_sample, mean_pop, sd_pop, sample_size, 2, 0.05)

The z-score of the sample statistic is: 4.112619161025777
The p-value is: 3.9119543361101206e-05
Reject the Null Hypothesis
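The same decision can also be reached by comparing the z-score with a critical value rather than comparing the p-value with alpha. The sketch below is only an illustration: it hard-codes the summary figures printed above (sample mean, sample size, population mean and SD), and the names `z_critical` and `reject_null` are introduced here and are not part of the original notebook.

```python
import numpy as np
import scipy.stats as st

# Summary figures copied from the outputs above
sample_mean = 55.10909090909091
pop_mean = 54
pop_sd = 2
sample_size = 55
alpha = 0.05

# Z-score of the sample mean under the null hypothesis
z_score = (sample_mean - pop_mean) / (pop_sd / np.sqrt(sample_size))

# Two-tailed critical value: alpha is split evenly between the two tails
z_critical = st.norm.ppf(1 - alpha / 2)

# Reject the null when the z-score falls beyond the critical value in either tail
reject_null = abs(z_score) > z_critical
print(f'|z| = {abs(z_score):.3f}, critical value = {z_critical:.3f}')
print('Reject the Null Hypothesis' if reject_null else 'Fail to reject the Null Hypothesis')
```

Both approaches are equivalent: the p-value is below alpha exactly when the absolute z-score exceeds the critical value.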


# 2-Tailed tests with unknown variance

Social media has been all over Tesla: the engines of a couple of cars started to smoke and, even worse, the cars belonged to high-profile customers. Your employees tell you that the number of defects is within the normal average of 2.2. Bruno asks you to investigate the situation yourself. Because car production has undergone many changes in the past few months, there is no data about the population.

**Null Hypothesis**: The average number of defects is 2.2

**Alternative Hypothesis**: The average number of defects is not 2.2

In [None]:
# Data
df_main.head()

Unnamed: 0,Production Date,Defects Found,Cars Produced,Weather Condition,Workers on Shift
0,2023-01-01,3,55,Rainy,20
1,2023-01-02,2,57,Rainy,19
2,2023-01-03,1,54,Rainy,21
3,2023-01-04,0,56,Rainy,22
4,2023-01-05,2,59,Rainy,20


In [None]:
# Information
target_mean = 2.2
sample_mean = df_main['Defects Found'].mean()
sample_sd = df_main['Defects Found'].std()
sample_size = df_main['Defects Found'].count()
confidence = 0.95
alpha = 1 - confidence

print(f'The sample mean is {sample_mean}')
print(f'The sample SD is {sample_sd}')
print(f'The sample size is {sample_size}')

The sample mean is 2.3636363636363638
The sample SD is 1.0777829844714388
The sample size is 55


In [None]:
# Calculate the t-score
t_score = (sample_mean - target_mean) / (sample_sd / np.sqrt(sample_size))
print(f'The t-score is: {t_score}')

The t-score is: 1.1259778359082033


In [None]:
# Calculate the p-value
tails = 2
p_value = st.t.sf(abs(t_score), df = (sample_size - 1)) * tails
print(f'The p-value is {p_value}')

The p-value is 0.2651542493629725


In [None]:
# Interpret the p_value
p_value_interpreter(p_value, alpha)

Fail to reject the Null Hypothesis


In [None]:
# How to do a 2-tailed t-test using scipy function
t_score, p_value = st.ttest_1samp(a = df_main['Defects Found'],
                                  popmean = target_mean,
                                  alternative = 'two-sided')
print(f'T-score: {t_score}')
print(f'P-value: {p_value}')
p_value_interpreter(p_value, alpha)

T-score: 1.1259778359082033
P-value: 0.2651542493629725
Fail to reject the Null Hypothesis
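A two-tailed t-test at level alpha agrees with checking whether the hypothesized mean lies inside the (1 - alpha) confidence interval for the mean. The following is a minimal sketch under that framing, reusing the sample statistics printed above as hard-coded values; the names `ci_lower`, `ci_upper`, and `target_inside_ci` are illustrative and do not appear in the original notebook.

```python
import numpy as np
import scipy.stats as st

# Sample statistics copied from the outputs above
sample_mean = 2.3636363636363638
sample_sd = 1.0777829844714388
sample_size = 55
target_mean = 2.2
alpha = 0.05

# Standard error of the mean and two-tailed critical t-value
standard_error = sample_sd / np.sqrt(sample_size)
t_critical = st.t.ppf(1 - alpha / 2, df = sample_size - 1)

# 95% confidence interval for the population mean
ci_lower = sample_mean - t_critical * standard_error
ci_upper = sample_mean + t_critical * standard_error
print(f'95% CI: ({ci_lower:.3f}, {ci_upper:.3f})')

# If the hypothesized mean is inside the interval, we fail to reject the null
target_inside_ci = ci_lower <= target_mean <= ci_upper
print('Fail to reject the Null Hypothesis' if target_inside_ci else 'Reject the Null Hypothesis')
```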


# 2-Tailed Paired T-Test

The Sales department has been very critical of Tesla recently, saying that they have been getting many complaints from customers who claim that their cars are taking longer than expected. They have even voiced their concerns to Bruno!

Your department, on the other hand, says that production has been stable over time and is going according to plan. They fire back that the Sales department has been selling too much. You decide to take the initiative and check whether production has been stable over the last 2 months.







**Null Hypothesis**: Production of month 1 equals production of month 2

**Alternative Hypothesis**: Production of month 1 does not equal production of month 2

In [None]:
# Data
df_paired = pd.read_csv('tesla_paired.csv')
differences = df_paired['Month 1'] - df_paired['Month 2']

In [None]:
# Info
tails = 2
confidence = 0.95
alpha = 1 - confidence
mean_differences = differences.mean()
sd_differences = differences.std()
sample_size = differences.count()
dof = sample_size - 1

print(f'The mean difference is: {mean_differences}')
print(f'The SD of the differences is: {sd_differences}')
print(f'The sample_size is: {sample_size}')
print(f'The degrees of freedom is: {dof}')

The mean difference is: 1.1
The SD of the differences is: 1.9000907419347652
The sample_size is: 30
The degrees of freedom is: 29


In [None]:
# Computing the t-score
t_score = mean_differences / (sd_differences / np.sqrt(sample_size))
print(f'The t-score is: {t_score}')

The t-score is: 3.1708738954340316


In [None]:
# Calculate the p-value
tails = 2
p_value = st.t.sf(abs(t_score), df = dof) * tails
print(f'The p-value is {p_value}')

The p-value is 0.0035743342552951936


In [None]:
# Interpret the p-value
p_value_interpreter(p_value, alpha)

Reject the Null Hypothesis


In [None]:
# How to do a 2-tailed paired t-test using scipy function
t_score, p_value = st.ttest_rel(a = df_paired['Month 1'],
                                b = df_paired['Month 2'],
                                alternative = 'two-sided')
print(f'T-score: {t_score}')
print(f'P-value: {p_value}')
p_value_interpreter(p_value, alpha)

T-score: 3.170873895434031
P-value: 0.0035743342552951992
Reject the Null Hypothesis
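The manual calculation and `st.ttest_rel` agree because a paired t-test is a one-sample t-test on the pairwise differences against a mean of 0. The sketch below illustrates this with synthetic data, since the contents of `tesla_paired.csv` are not reproduced here; the arrays `month1` and `month2` are made up for the example.

```python
import numpy as np
import scipy.stats as st

# Synthetic paired measurements (hypothetical, not the notebook's data)
rng = np.random.default_rng(0)
month1 = rng.normal(loc = 55, scale = 1.5, size = 30)
month2 = month1 - rng.normal(loc = 1, scale = 2, size = 30)

# Paired t-test on the two columns
t_paired, p_paired = st.ttest_rel(month1, month2, alternative = 'two-sided')

# One-sample t-test on the differences against a mean of 0
t_diff, p_diff = st.ttest_1samp(month1 - month2, popmean = 0, alternative = 'two-sided')

print(f'Paired test:     t = {t_paired}, p = {p_paired}')
print(f'Difference test: t = {t_diff}, p = {p_diff}')
```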


# 2-Tailed Two Sample T-Test

The Tesla factory you manage has two shifts during the day. You are present during shift 2, but not shift 1. Your second in command tells you that shift 2 is doing great.

To prove to Bruno that both shifts are similarly productive, you need to show him numbers, and nothing is better for that than hypothesis testing.

**Null Hypothesis**: There is no difference in mean production between shifts

**Alternative Hypothesis**: There is a difference in mean production between shifts

In [None]:
# Data
df_2sample = pd.read_csv('tesla_2sample.csv')
df_2sample.head()

Unnamed: 0,Day,Shift 1,Shift 2
0,1,53,49.0
1,2,61,57.0
2,3,72,68.0
3,4,59,47.0
4,5,62,60.0


In [None]:
# Summary statistics
df_2sample.describe()

Unnamed: 0,Day,Shift 1,Shift 2
count,30.0,30.0,29.0
mean,15.5,61.166667,55.0
std,8.803408,6.664799,8.647873
min,1.0,51.0,42.0
25%,8.25,55.25,48.0
50%,15.5,61.0,57.0
75%,22.75,66.75,62.0
max,30.0,72.0,72.0


In [None]:
# Isolate the samples
sample1 = df_2sample['Shift 1'].dropna()
sample2 = df_2sample['Shift 2'].dropna()

In [None]:
# Perform Levene's test to check if variances are significantly different
stat, pvalue = st.levene(sample1, sample2)
print(f'The test statistic is: {stat}')
print(f'The p-value is: {pvalue}')

The test statistic is: 4.214392047138876
The p-value is: 0.044682721966871876


In [None]:
# Interpret p-value
alpha = 0.05
if pvalue < alpha:
  print('Variances are significantly different. Perform Welch\'s Test.')
else:
  print('Variances are not significantly different. Perform Two Sample T-test.')

Variances are significantly different. Perform Welch's Test.


In [None]:
# Perform Welch's T-test
tscore, pvalue = st.ttest_ind(sample1,
                              sample2,
                              equal_var = False,
                              alternative = 'two-sided')
print(f'The t-score is: {tscore}')
print(f'The p-value is: {pvalue}')
p_value_interpreter(pvalue, alpha)

The t-score is: 3.0606654074470834
The p-value is: 0.0034724013986656174
Reject the Null Hypothesis


In [None]:
# Perform Two Sample T-test (independent samples)
tscore, pvalue = st.ttest_ind(sample1,
                              sample2,
                              equal_var = True,
                              alternative = 'two-sided')
print(f'The t-score is: {tscore}')
print(f'The p-value is: {pvalue}')
p_value_interpreter(pvalue, alpha)

The t-score is: 3.074142672919775
The p-value is: 0.003237334319433138
Reject the Null Hypothesis


In [None]:
# Perform test using function which automatically differentiates between Welch's Test and Independent test
ttest_2sample(sample1,
              sample2)

Levene's Test:
The test statistic is: 4.214392047138876
The p-value is: 0.044682721966871876

Variances are significantly different. Performing Welch's Test.
The t-score is: 3.0606654074470834
The p-value is: 0.0034724013986656174
Reject the Null Hypothesis
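Welch's test differs from the pooled two-sample test in its standard error and in its degrees of freedom, which come from the Welch-Satterthwaite approximation. As a rough check, the sketch below recomputes the test from the rounded summary statistics in the `describe()` output above, so the results match the outputs only to a few decimals; the variable names are illustrative.

```python
import numpy as np
import scipy.stats as st

# Summary statistics copied (rounded) from df_2sample.describe()
mean1, sd1, n1 = 61.166667, 6.664799, 30
mean2, sd2, n2 = 55.0, 8.647873, 29

# Per-sample variance of the mean
v1 = sd1 ** 2 / n1
v2 = sd2 ** 2 / n2

# Welch t-statistic and Welch-Satterthwaite degrees of freedom
t_manual = (mean1 - mean2) / np.sqrt(v1 + v2)
welch_df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
p_manual = 2 * st.t.sf(abs(t_manual), df = welch_df)

# The same test computed by scipy directly from summary statistics
t_welch, p_welch = st.ttest_ind_from_stats(mean1, sd1, n1,
                                           mean2, sd2, n2,
                                           equal_var = False)

print(f'Manual: t = {t_manual}, df = {welch_df}, p = {p_manual}')
print(f'Scipy:  t = {t_welch}, p = {p_welch}')
```

The degrees of freedom are not an integer here, which is expected for Welch's test.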


# 1-Tailed Tests with Known Variance

You have invested thousands of dollars per employee to improve their satisfaction and productivity. Your goal is to improve from the average of 54.5 cars produced so far, with a corresponding standard deviation (of the population) of 2.

Bruno does not believe it and asks for proof. Statistical proof of course :)

**Null Hypothesis**: There is no improvement in productivity (the mean number of cars produced is at most 54.5)

**Alternative Hypothesis**: There is an improvement (the mean number of cars produced is greater than 54.5)

In [None]:
# Data
df_main = pd.read_csv('tesla_main.csv')
df_main.head()

Unnamed: 0,Production Date,Defects Found,Cars Produced,Weather Condition,Workers on Shift
0,2023-01-01,3,55,Rainy,20
1,2023-01-02,2,57,Rainy,19
2,2023-01-03,1,54,Rainy,21
3,2023-01-04,0,56,Rainy,22
4,2023-01-05,2,59,Rainy,20


In [None]:
# Info
pop_mean = 54.5
pop_sd = 2
confidence = 0.95
alpha = 1 - confidence
sample_mean = df_main['Cars Produced'].mean()
sample_size = df_main['Cars Produced'].count()

print(f'The sample mean is {sample_mean}')
print(f'The sample size is {sample_size}')

The sample mean is 55.10909090909091
The sample size is 55


In [None]:
# Apply Z-test function
z_test(sample_mean, pop_mean, pop_sd, sample_size, 1, alpha)

The z-score of the sample statistic is: 2.258569539251862
The p-value is: 0.011955087194577932
Reject the Null Hypothesis
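The `z_test` function above takes the absolute value of the z-score, so for a one-tailed test it reports a small p-value even when the sample mean lies on the wrong side of the hypothesized mean. A direction-aware variant could look like the sketch below; `z_test_p_value` and its `alternative` argument are hypothetical names modeled on scipy's convention, not part of the original notebook.

```python
import numpy as np
import scipy.stats as st

def z_test_p_value(z_score, alternative = 'two-sided'):
  """Return the p-value of a z-score for the given alternative hypothesis."""
  if alternative == 'greater':
    # Upper tail: evidence that the true mean is above the hypothesized mean
    return st.norm.sf(z_score)
  if alternative == 'less':
    # Lower tail: evidence that the true mean is below the hypothesized mean
    return st.norm.cdf(z_score)
  if alternative == 'two-sided':
    return 2 * st.norm.sf(abs(z_score))
  raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")

# Figures copied from the outputs above
z_score = (55.10909090909091 - 54.5) / (2 / np.sqrt(55))

p_greater = z_test_p_value(z_score, 'greater')
p_less = z_test_p_value(z_score, 'less')
print(f'p-value (greater): {p_greater}')
print(f'p-value (less): {p_less}')
```

With the sample mean above 54.5, the 'greater' p-value matches the output of `z_test`, while the 'less' p-value is close to 1.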


# 1-Tailed Test with Unknown Variance

Social media has been all over Tesla. People say that more and more customers are complaining about defects and that improvements are urgently needed. Your employees tell you that the number of defects is within the normal average of 2.4, maybe even better than that.

You decide to investigate the situation yourself. Since the car production has suffered many changes in the past few months, there is no data about the population.

**Null Hypothesis**: The average number of defects is less than or equal to 2.4

**Alternative Hypothesis**: The average number of defects is greater than 2.4

In [None]:
# Data
df_main.head()

Unnamed: 0,Production Date,Defects Found,Cars Produced,Weather Condition,Workers on Shift
0,2023-01-01,3,55,Rainy,20
1,2023-01-02,2,57,Rainy,19
2,2023-01-03,1,54,Rainy,21
3,2023-01-04,0,56,Rainy,22
4,2023-01-05,2,59,Rainy,20


In [None]:
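# One sample, one tailed t-test
# Note: target_mean still holds 2.2 from the two-tailed section, which is what the output below reflects;
# assign target_mean = 2.4 beforehand to test against the 2.4 stated in the hypotheses above
t_score, p_value = st.ttest_1samp(a = df_main['Defects Found'],
                                  popmean = target_mean,
                                  alternative = 'greater')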
print(f'T-score: {t_score}')
print(f'P-value: {p_value}')
p_value_interpreter(p_value, alpha)

T-score: 1.1259778359082033
P-value: 0.13257712468148625
Fail to reject the Null Hypothesis


# 1-Tailed Paired T-Test

The Sales department has been very critical of Tesla recently, saying that they have been getting many complaints from customers who claim that their cars are taking longer than expected and that production slowed down last month. They have even voiced their concerns to Bruno.

Your department, on the other hand, says that production is at least as good as before. The historical pattern is that productivity increases over time.

**Null Hypothesis**: Production from month 2 is equal to or better than month 1

**Alternative Hypothesis**: Production from month 2 is worse than month 1

In [None]:
# Data
df_paired.describe()

Unnamed: 0,Day,Month 1,Month 2
count,30.0,30.0,30.0
mean,15.5,55.6,54.5
std,8.803408,1.588754,1.196259
min,1.0,54.0,52.0
25%,8.25,54.0,54.0
50%,15.5,55.0,55.0
75%,22.75,57.0,55.0
max,30.0,58.0,56.0


In [None]:
# Info
tails = 1
confidence = 0.95
alpha = 1 - confidence
mean_differences = differences.mean()
sd_differences = differences.std()
sample_size = differences.count()
dof = sample_size - 1

print(f'The mean difference is: {mean_differences}')
print(f'The SD of the differences is: {sd_differences}')
print(f'The sample_size is: {sample_size}')
print(f'The degrees of freedom is: {dof}')

The mean difference is: 1.1
The SD of the differences is: 1.9000907419347652
The sample_size is: 30
The degrees of freedom is: 29


In [None]:
# Perform a 1-tailed, paired t-test
t_score, p_value = st.ttest_rel(a = df_paired['Month 1'],
                                b = df_paired['Month 2'],
                                alternative = 'greater')
print(f'T-score: {t_score}')
print(f'P-value: {p_value}')
p_value_interpreter(p_value, alpha)

T-score: 3.170873895434031
P-value: 0.0017871671276475996
Reject the Null Hypothesis
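The one-tailed p-value here is half of the two-tailed p-value from the earlier paired test, which holds whenever the t-score points in the direction of the alternative hypothesis. The sketch below shows the relationship using the t-score and degrees of freedom printed above as hard-coded inputs; the variable names are illustrative.

```python
import scipy.stats as st

# Values copied from the outputs above
t_score = 3.170873895434031
dof = 29

# Two-tailed: probability of a t-score at least this extreme in either direction
p_two_sided = 2 * st.t.sf(abs(t_score), df = dof)

# One-tailed 'greater': probability in the upper tail only
p_greater = st.t.sf(t_score, df = dof)

# One-tailed 'less': probability in the lower tail (the opposite direction)
p_less = st.t.cdf(t_score, df = dof)

print(f'Two-sided p-value: {p_two_sided}')
print(f'Greater p-value: {p_greater}')
print(f'Less p-value: {p_less}')
```

Had the alternative been 'less' instead, the same positive t-score would give a p-value close to 1, so the direction must be chosen before looking at the data.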


# 1-Tailed Two-Sample T-Test

The Tesla factory you manage has two shifts during the day. You are present during shift 2, but not shift 1. Your second in command tells you that shift 1 is doing great. In fact, she says her recent incentives led to higher efficiency during the first shift. Let's see if that is true.

**Null Hypothesis**: Shift 1 production is less than or equal to Shift 2 production

**Alternative Hypothesis**: Shift 1 production is greater than Shift 2 production

In [None]:
df_2sample.describe()

Unnamed: 0,Day,Shift 1,Shift 2
count,30.0,30.0,29.0
mean,15.5,61.166667,55.0
std,8.803408,6.664799,8.647873
min,1.0,51.0,42.0
25%,8.25,55.25,48.0
50%,15.5,61.0,57.0
75%,22.75,66.75,62.0
max,30.0,72.0,72.0


In [None]:
# Isolate the samples
sample1 = df_2sample['Shift 1'].dropna()
sample2 = df_2sample['Shift 2'].dropna()

In [None]:
# Perform 1-tailed, 2-sample t-test
ttest_2sample(sample1,
              sample2,
              alternative = 'greater')

Levene's Test:
The test statistic is: 4.214392047138876
The p-value is: 0.044682721966871876

Variances are significantly different. Performing Welch's Test.
The t-score is: 3.0606654074470834
The p-value is: 0.0017362006993328087
Reject the Null Hypothesis


# Chi-square Test

You are a car production manager in charge of two factories, Factory A and Factory B. Both factories have recently implemented different quality control measures to reduce the number of defective cars produced in each of the three car categories: sedan, SUV, and truck. You want to determine if the quality control measures have had any significant impact on reducing the proportion of defective cars across the three categories.

**Null Hypothesis**: The distribution of defective cars across the three categories is the same in Factory A and Factory B

**Alternative Hypothesis**: The distribution of defective cars across the three categories differs between Factory A and Factory B

In [None]:
# Load the Data
df_chisquare = pd.read_csv('tesla_chisquare.csv')
df_chisquare.head()

Unnamed: 0.1,Unnamed: 0,Day,Factory,Category,Count
0,0,1,Factory A,Sedan,48
1,1,2,Factory A,Sedan,38
2,2,3,Factory A,Sedan,24
3,3,4,Factory A,Sedan,17
4,4,5,Factory A,Sedan,30


In [None]:
# Actual Frequency
observed_table = pd.crosstab(index = df_chisquare['Factory'],
                             columns = df_chisquare['Category'],
                             values = df_chisquare['Count'],
                             aggfunc = 'sum')


In [None]:
# Perform Chi-square test
stat, pvalue, dof, expected_freq = st.chi2_contingency(observed = observed_table)

In [None]:
# Print and Interpret P-value
print(f'The test statistic is: {stat}')
print(f'The p-value is: {pvalue}')
p_value_interpreter(pvalue, alpha)

The test statistic is: 2.174705796739059
The p-value is: 0.33710767193470104
Fail to reject the Null Hypothesis
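The chi-square test compares the observed table with the frequencies expected if factory and category were independent: each expected cell is its row total times its column total divided by the grand total. The sketch below shows this on a small made-up table, since the aggregated counts from `tesla_chisquare.csv` are not printed above; the counts and the name `manual_expected` are hypothetical.

```python
import numpy as np
import pandas as pd
import scipy.stats as st

# Hypothetical observed counts (not the notebook's data)
observed = pd.DataFrame({'SUV': [30, 35],
                         'Sedan': [40, 30],
                         'Truck': [20, 25]},
                        index = ['Factory A', 'Factory B'])

# Chi-square test of independence
stat, pvalue, dof, expected = st.chi2_contingency(observed)

# Expected frequencies by hand: row total * column total / grand total
row_totals = observed.sum(axis = 1).to_numpy()
col_totals = observed.sum(axis = 0).to_numpy()
manual_expected = np.outer(row_totals, col_totals) / observed.to_numpy().sum()

print(f'Degrees of freedom: {dof}')
print(f'Test statistic: {stat}')
print(f'P-value: {pvalue}')
print(pd.DataFrame(expected, index = observed.index, columns = observed.columns))
```

For a table with 2 factories and 3 categories, the degrees of freedom are (2 - 1) * (3 - 1) = 2.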
