We will now implement our first real-life problem in Python: a paired t-test on blood pressure readings taken before and after treatment with a new drug.

In [None]:
import numpy as np
from scipy import stats

# Data
before_treatment = np.array([120, 122, 118, 130, 125, 128, 115, 121, 123, 119])
after_treatment = np.array([115, 120, 112, 128, 122, 125, 110, 117, 119, 114])

In [None]:
# Step 1: Null and Alternate Hypotheses
null_hypothesis = "The new drug has no effect on blood pressure."
alternate_hypothesis = "The new drug has an effect on blood pressure."

In [None]:
# Step 2: Significance Level
alpha = 0.05

In [None]:
# Step 3: Paired T-test
t_statistic, p_value = stats.ttest_rel(after_treatment, before_treatment)

In [None]:
# Step 4: Calculate T-statistic manually
m = np.mean(after_treatment - before_treatment)
s = np.std(after_treatment - before_treatment, ddof=1)  # using ddof=1 for sample standard deviation
n = len(before_treatment)
t_statistic_manual = m / (s / np.sqrt(n))

In [None]:
# Step 5: Decision
if p_value <= alpha:
    decision = "Reject"
else:
    decision = "Fail to reject"

# Conclusion
if decision == "Reject":
    conclusion = "There is statistically significant evidence that the average blood pressure before and after treatment with the new drug is different."
else:
    conclusion = "There is insufficient evidence to claim a significant difference in average blood pressure before and after treatment with the new drug."

In [None]:
# Display results
print("T-statistic (from scipy):", t_statistic)
print("P-value (from scipy):", p_value)
print("T-statistic (calculated manually):", t_statistic_manual)
print(f"Decision: {decision} the null hypothesis at alpha={alpha}.")
print("Conclusion:", conclusion)

T-statistic (from scipy): -9.0
P-value (from scipy): 8.538051223166285e-06
T-statistic (calculated manually): -9.0
Decision: Reject the null hypothesis at alpha=0.05.
Conclusion: There is statistically significant evidence that the average blood pressure before and after treatment with the new drug is different.
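
As a check on the manual calculation, the paired t-statistic can be worked out by hand from the differences $d_i$ (after minus before). In this sketch, $\bar{d}$, $s_d$ and $n$ correspond to `m`, `s` and `n` in the code; the intermediate values are computed from the data above and rounded.

$$
t = \frac{\bar{d}}{s_d / \sqrt{n}}, \qquad
\bar{d} = \frac{-39}{10} = -3.9, \qquad
s_d = \sqrt{\frac{16.9}{9}} \approx 1.370, \qquad
t \approx \frac{-3.9}{1.370 / \sqrt{10}} \approx -9.0
$$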


Here's a breakdown of what this means in the context of hypothesis testing:

Null Hypothesis ($H_0$): This is a statement of no effect or no difference. It is the hypothesis that the test aims to provide evidence against.

Alternative Hypothesis ($H_A$): This is what you might believe to be true or hope to prove true. It is the statement that there is an effect or a difference.

Significance Level ($\alpha$): When you set $\alpha = 0.05$, you are stating that you are willing to accept a 5% chance of incorrectly rejecting the null hypothesis when it is in fact true (a Type I error). In other words, if there is truly no effect or difference, there is a 5% chance of concluding that there is one.

P-value: After conducting the t-test, you'll compute a p-value, which is the probability of observing your data (or something more extreme) if the null hypothesis is true.
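
To make the p-value definition concrete, the following sketch recomputes the two-sided p-value for the blood pressure example directly from the t distribution, rather than taking it from `stats.ttest_rel`. The variable names `differences`, `t_stat` and `p_two_sided` are chosen here for illustration.

```python
import numpy as np
from scipy import stats

# Same readings as in the example above
before_treatment = np.array([120, 122, 118, 130, 125, 128, 115, 121, 123, 119])
after_treatment = np.array([115, 120, 112, 128, 122, 125, 110, 117, 119, 114])

differences = after_treatment - before_treatment
n = len(differences)
t_stat = differences.mean() / (differences.std(ddof=1) / np.sqrt(n))

# Two-sided p-value: probability, under H0, of a t-statistic at least as
# extreme as the observed one in either tail (n - 1 degrees of freedom)
p_two_sided = 2 * stats.t.sf(abs(t_stat), df=n - 1)

print("t-statistic:", t_stat)
print("two-sided p-value:", p_two_sided)
```

The result should agree with the p-value reported by `stats.ttest_rel`, which performs a two-sided test by default.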

The significance level ($\alpha$) is then used as a benchmark to compare against the p-value:

If $p \le \alpha$, you reject the null hypothesis in favor of the alternative hypothesis. This means that the observed data is unlikely under the assumption that the null hypothesis is true, and there is evidence to suggest an effect or difference exists at the specified significance level.

If $p > \alpha$, you fail to reject the null hypothesis. This does not prove the null hypothesis is true, only that there isn't enough evidence to conclude a significant effect or difference at the specified significance level.

Choosing a significance level of 0.05 is conventional, but the appropriate level can depend on the context of the test and the consequences of making a Type I error. In fields where the cost of a Type I error is particularly high, a more stringent significance level (like 0.01) might be chosen.
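
The link between the significance level and the Type I error rate can be illustrated with a small simulation. This is only a sketch under stated assumptions: the paired differences are drawn from a normal distribution with true mean 0 (so the null hypothesis holds in every simulated study), and the spread, seed and number of simulated studies are arbitrary choices, not values from the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily for repeatability

n_experiments = 5000   # hypothetical number of simulated studies
n_patients = 10        # same sample size as the example above

# Simulated paired differences with true mean 0, so H0 holds in every study.
# The normal distribution and its spread (scale=5) are illustrative assumptions.
differences = rng.normal(loc=0.0, scale=5.0, size=(n_experiments, n_patients))

# One-sample t-test on each row of differences (equivalent to a paired t-test)
_, p_values = stats.ttest_1samp(differences, popmean=0.0, axis=1)

for alpha in (0.05, 0.01):
    false_positive_rate = np.mean(p_values <= alpha)
    print(f"alpha={alpha}: rejected a true H0 in {false_positive_rate:.1%} of studies")
```

The fraction of false rejections should land close to each chosen alpha, which is why a stricter level such as 0.01 reduces Type I errors.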

In the above example, the t-statistic of -9.0 and the p-value of about 8.5e-06, far below 0.05, give strong grounds to reject the null hypothesis at a significance level of 0.05.
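
The same decision can also be reached by comparing the t-statistic with a critical value instead of comparing the p-value with alpha. The sketch below assumes the t-statistic of -9.0 and the sample size of 10 from the example; `t_critical` is a name introduced here for illustration.

```python
from scipy import stats

alpha = 0.05
n = 10               # number of patients in the example
t_statistic = -9.0   # value reported by stats.ttest_rel above

# Two-tailed critical value from the t distribution with n - 1 degrees of freedom
t_critical = stats.t.ppf(1 - alpha / 2, df=n - 1)

print("Critical value:", round(t_critical, 3))
if abs(t_statistic) > t_critical:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
```

For a two-tailed test, rejecting when the absolute t-statistic exceeds the critical value is equivalent to rejecting when the p-value falls below alpha.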


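The second example applies a two-tailed one-sample z-test, which assumes the population standard deviation is known, to check whether the average cholesterol level differs from 200 mg/dL.

In [None]: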
import scipy.stats as stats
import math
import numpy as np

# Given data
sample_data = np.array(
    [205, 198, 210, 190, 215, 205, 200, 192, 198, 205, 198, 202, 208, 200, 205, 198, 205, 210, 192, 205, 198, 205, 210, 192, 205])
population_std_dev = 5
population_mean = 200
sample_size = len(sample_data)

# Step 1: Define the Hypotheses
# Null Hypothesis (H0): The average cholesterol level in a population is 200 mg/dL.
# Alternate Hypothesis (H1): The average cholesterol level in a population is different from 200 mg/dL.

# Step 2: Define the Significance Level
alpha = 0.05  # Two-tailed test

# Critical values for a significance level of 0.05 (two-tailed)
critical_value_left = stats.norm.ppf(alpha/2)
critical_value_right = -critical_value_left

# Step 3: Compute the test statistic
sample_mean = sample_data.mean()
z_score = (sample_mean - population_mean) / \
    (population_std_dev / math.sqrt(sample_size))

# Step 4: Result
# Check whether the absolute value of the test statistic exceeds the critical value
# (the two critical values are symmetric, so comparing against the right one is enough)
if abs(z_score) > critical_value_right:
    print("Reject the null hypothesis.")
    print("There is statistically significant evidence that the average cholesterol level in the population is different from 200 mg/dL.")
else:
    print("Fail to reject the null hypothesis.")
    print("There is not enough evidence to conclude that the average cholesterol level in the population is different from 200 mg/dL.")


Reject the null hypothesis.
There is statistically significant evidence that the average cholesterol level in the population is different from 200 mg/dL.
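
The z-test above decides by comparing the test statistic with critical values. As a complementary sketch, the same data can be summarized with a two-sided p-value from the standard normal distribution; the name `p_value` is introduced here and does not appear in the cell above.

```python
import math
import numpy as np
from scipy import stats

# Same data and assumptions as the z-test example above
sample_data = np.array(
    [205, 198, 210, 190, 215, 205, 200, 192, 198, 205, 198, 202, 208,
     200, 205, 198, 205, 210, 192, 205, 198, 205, 210, 192, 205])
population_std_dev = 5
population_mean = 200
alpha = 0.05

z_score = (sample_data.mean() - population_mean) / (
    population_std_dev / math.sqrt(len(sample_data)))

# Two-sided p-value: area in both tails beyond |z| under the standard normal
p_value = 2 * stats.norm.sf(abs(z_score))

print("z-score:", z_score)
print("p-value:", p_value)
print("Reject H0" if p_value <= alpha else "Fail to reject H0")
```

Because the p-value is only slightly below 0.05, the rejection here is much weaker than in the blood pressure example and would not hold at a stricter level such as 0.01.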
