# Step 1: Setup and Data Simulation

We'll use the `NumPy` library for creating numerical data and `SciPy` for the statistical calculations.


First, we'll set up our environment and create simulated data for our clinical trial. Let's assume we have two groups of 50 patients each. We'll simulate the reduction in blood pressure for each patient.

**The Scenario:**

  * **Old Drug Group:** The average blood pressure reduction is 10 points, with a standard deviation of 3 points to represent natural variation between patients.
  * **New Drug Group:** We'll simulate this group with a slightly better average reduction of 12 points and the same standard deviation, to see if our test can detect the difference.

In [None]:
import numpy as np
from scipy import stats

np.random.seed(42) # for reproducible results
sample_size = 50

# Old Drug: Assume it reduces BP by an average of 10 points
mean_old_drug = 10
std_dev = 3 # Assumed standard deviation for both groups

# New Drug: Assume it's slightly better, reducing BP by 12 points
mean_new_drug = 12

# Blood pressure reduction data for 50 patients on the old drug
old_drug_results = np.random.normal(loc=mean_old_drug, scale=std_dev, size=sample_size)

# Blood pressure reduction data for 50 patients on the new drug
new_drug_results = np.random.normal(loc=mean_new_drug, scale=std_dev, size=sample_size)

print("Sample Mean (Old Drug):", np.mean(old_drug_results))
print("Sample Mean (New Drug):", np.mean(new_drug_results))

Sample Mean (Old Drug): 9.32357828423158
Sample Mean (New Drug): 12.053342611403854


  * We use `np.random.normal` to generate data that follows a normal distribution, which is a common assumption for this type of test.
  * `loc` is the mean (average reduction), `scale` is the standard deviation (variability), and `size` is the number of patients.
  * Even though we set the "true" population means to 10 and 12, the sample means we generate will be slightly different due to random chance, just like in a real experiment.
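To get a feel for how far a sample mean can drift from the true mean by chance alone, the sketch below repeats the old-drug simulation many times and compares the spread of the resulting sample means with the theoretical standard error, $\sigma / \sqrt{n}$. The number of repetitions (`n_repeats`) and the seed are arbitrary choices made for illustration; they are not part of the original setup.

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed, separate from the tutorial's seed

true_mean = 10      # true mean reduction for the old drug
std_dev = 3         # assumed standard deviation
sample_size = 50    # patients per group
n_repeats = 10_000  # hypothetical number of repeated experiments

# Each row is one simulated experiment with 50 patients
samples = rng.normal(loc=true_mean, scale=std_dev, size=(n_repeats, sample_size))
sample_means = samples.mean(axis=1)

empirical_se = sample_means.std(ddof=1)           # spread of the simulated sample means
theoretical_se = std_dev / np.sqrt(sample_size)   # sigma / sqrt(n)

print(f"Spread of sample means (empirical): {empirical_se:.3f}")
print(f"Standard error (sigma / sqrt(n)):   {theoretical_se:.3f}")
```

With a standard error of roughly 0.42 points, a sample mean of 9.32 for a true mean of 10 sits within about two standard errors, so it is not an unusual draw.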

# Step 2: Stating the Hypotheses and Significance Level

Before we test, we must define what we're testing and our standard for success.

  * **Null Hypothesis ($H_0$):** This is the default assumption of no effect. It states that the average blood pressure reduction for the new drug is the same as for the old drug, so any difference we observe in our sample is just random variation. Mathematically, $\mu_{new} = \mu_{old}$.

  * **Alternative Hypothesis ($H_1$):** This is the company's claim. It states that the average blood pressure reduction for the new drug is greater than for the old drug. Mathematically, $\mu_{new} > \mu_{old}$.

  * **Significance Level ($\alpha$):** This is our threshold of proof. We set it to `0.05` (or 5%). This means that, when the null hypothesis is actually true, we accept a 5% chance of rejecting it anyway, which is a Type I error (a false positive).
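One way to make the significance level concrete is to convert it into a critical value for the test statistic. The sketch below assumes a pooled two-sample t-test with 50 patients per group, which has $n_1 + n_2 - 2$ degrees of freedom; the variable names are illustrative.

```python
from scipy import stats

alpha = 0.05
sample_size = 50               # patients per group
df = 2 * sample_size - 2       # degrees of freedom for a pooled two-sample t-test

# For a one-sided (right-tailed) test, H0 is rejected when t exceeds this value
t_critical = stats.t.ppf(1 - alpha, df)

print(f"Degrees of freedom: {df}")
print(f"Critical t-value (one-sided, alpha={alpha}): {t_critical:.4f}")
```

Rejecting $H_0$ when the t-statistic exceeds this critical value is equivalent to rejecting it when the one-sided p-value falls below $\alpha$.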


# Step 3: Calculating the Test Statistic and P-value

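Now, we'll perform a **two-sample independent t-test**, which compares the means of two independent groups. Because our alternative hypothesis is one-sided ($\mu_{new} > \mu_{old}$), we pass `alternative='greater'` and put the new drug's results first. The test gives us a **t-statistic** and a **p-value**.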

In [None]:
# We use the ttest_ind function from scipy.stats
t_statistic, p_value = stats.ttest_ind(new_drug_results, old_drug_results, alternative='greater')

print(f"T-statistic: {t_statistic:.4f}")
print(f"P-value: {p_value:.10f}")

T-statistic: 5.0301
P-value: 0.0000011098


  * The **t-statistic** measures how far apart the two sample means are relative to the variability in the data. A larger absolute value indicates a bigger difference relative to the noise. Our value of 5.0301 is far from zero, indicating a substantial difference. (Note: It's positive because the function calculates `mean1 - mean2`, and we passed the new drug's results, which have the larger mean, first.)
  * The **p-value** is the probability of seeing a difference at least this large in favor of the new drug if the null hypothesis were true. Our p-value is very small (0.0000011098), so a result like this would be extremely unlikely if the two drugs were equally effective.
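To show where these two numbers come from, here is a minimal sketch that computes the pooled-variance t-statistic and the one-sided p-value by hand and compares them with `stats.ttest_ind`. It assumes the default equal-variance form of the test and regenerates the data with the same seed and draw order as the setup cell; the intermediate names (`pooled_var`, `std_error`, `t_manual`, `p_manual`) are illustrative.

```python
import numpy as np
from scipy import stats

np.random.seed(42)  # same seed and draw order as the setup cell
sample_size = 50
std_dev = 3
old_drug_results = np.random.normal(loc=10, scale=std_dev, size=sample_size)
new_drug_results = np.random.normal(loc=12, scale=std_dev, size=sample_size)

n_new, n_old = len(new_drug_results), len(old_drug_results)
var_new = np.var(new_drug_results, ddof=1)  # sample variances (ddof=1)
var_old = np.var(old_drug_results, ddof=1)

# Pooled variance: weighted average of the two sample variances
df = n_new + n_old - 2
pooled_var = ((n_new - 1) * var_new + (n_old - 1) * var_old) / df
std_error = np.sqrt(pooled_var * (1 / n_new + 1 / n_old))

# t = (difference in sample means) / (standard error of that difference)
t_manual = (new_drug_results.mean() - old_drug_results.mean()) / std_error

# One-sided p-value: area under the t-distribution to the right of t
p_manual = stats.t.sf(t_manual, df)

t_scipy, p_scipy = stats.ttest_ind(new_drug_results, old_drug_results, alternative='greater')

print(f"Manual: t = {t_manual:.4f}, p = {p_manual:.10f}")
print(f"SciPy:  t = {t_scipy:.4f}, p = {p_scipy:.10f}")
```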

# Step 4: Making a Decision

We compare our p-value to our significance level ($\alpha$) to make a final decision.


In [None]:
alpha = 0.05
print(f"Significance Level (alpha): {alpha}")

if p_value < alpha:
    print("Decision: Reject the null hypothesis.")
    print("Conclusion: The new drug is significantly more effective than the old drug.")
else:
    print("Decision: Fail to reject the null hypothesis.")
    print("Conclusion: We do not have enough evidence to say the new drug is more effective.")

Significance Level (alpha): 0.05
Decision: Reject the null hypothesis.
Conclusion: The new drug is significantly more effective than the old drug.


  * The rule is simple: **If $p < \alpha$, reject $H_0$.**
  * Since our p-value (about 0.000001) is much smaller than our alpha (0.05), we reject the null hypothesis.
  * This means our test provides strong statistical evidence to support the company's claim that the new drug is more effective.

# Step 5: Understanding Type I and Type II Errors

Because our data is simulated, we control the true means. That lets us deliberately set up situations where each type of error can occur and see them in action.

## Type I Error (False Positive)

To simulate this, let's assume the **null hypothesis is true**—the new drug has no extra effect. We'll set both means to 10. We expect to be wrong about 5% of the time because our alpha is 0.05.

In [None]:
# Here, we assume the null hypothesis is TRUE (no difference in means)
mean_old = 10
mean_new_same = 10 # << The new drug has NO real effect

# Generate new data where there's no real difference
old_drug_fake = np.random.normal(loc=mean_old, scale=std_dev, size=sample_size)
new_drug_fake_same = np.random.normal(loc=mean_new_same, scale=std_dev, size=sample_size)

# Perform the test (new drug first, matching H1: mean_new > mean_old)
t_stat_type1, p_value_type1 = stats.ttest_ind(new_drug_fake_same, old_drug_fake, alternative='greater')

print("\n--- Simulating a Type I Error ---")
print(f"P-value when H0 is true: {p_value_type1:.4f}")

if p_value_type1 < alpha:
    print("Decision: Reject the null hypothesis. (This is a Type I Error!)")
else:
    print("Decision: Correctly fail to reject the null hypothesis.")


--- Simulating a Type I Error ---
P-value when H0 is true: 0.2606
Decision: Correctly fail to reject the null hypothesis.


  * In this specific run, our p-value (0.2606) was greater than alpha, so we correctly failed to reject the null hypothesis.
  * However, if you were to run this specific simulation block many times, you would find that roughly 5% of the time, the p-value would fall below 0.05 by pure chance, causing you to make a **Type I error**.
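The "roughly 5%" claim can be checked directly. The following sketch repeats the no-effect experiment many times and counts how often the test rejects $H_0$. The number of trials (`n_trials`) and the seed are arbitrary illustrative choices, and the estimate will vary somewhat from run to run.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(123)  # illustrative seed
alpha = 0.05
sample_size = 50
std_dev = 3
n_trials = 2000                   # hypothetical number of repeated experiments

false_positives = 0
for _ in range(n_trials):
    # H0 is true: both groups share the same mean of 10
    old = rng.normal(loc=10, scale=std_dev, size=sample_size)
    new = rng.normal(loc=10, scale=std_dev, size=sample_size)
    _, p = stats.ttest_ind(new, old, alternative='greater')
    if p < alpha:
        false_positives += 1      # rejected a true H0: a Type I error

type1_rate = false_positives / n_trials
print(f"Estimated Type I error rate: {type1_rate:.3f} (alpha = {alpha})")
```

The estimated rate should land close to `alpha`, because the significance level is by definition the rejection probability when the null hypothesis holds.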

## Type II Error (False Negative)

To simulate this, let's assume the **alternative hypothesis is true**, but the effect is very small and hard to detect. Let's say the new drug is only slightly better (mean reduction of 10.5 vs 10).

In [None]:
# Here, H0 is FALSE, but the effect is small and we might miss it
mean_old = 10
mean_new_small_effect = 10.5 # << A very small real effect

# Generate data with a small true difference
old_drug_small = np.random.normal(loc=mean_old, scale=std_dev, size=sample_size)
new_drug_small_effect = np.random.normal(loc=mean_new_small_effect, scale=std_dev, size=sample_size)

# Perform the test (new drug first, matching H1: mean_new > mean_old)
t_stat_type2, p_value_type2 = stats.ttest_ind(new_drug_small_effect, old_drug_small, alternative='greater')

print("\n--- Simulating a Type II Error ---")
print(f"P-value for a small effect: {p_value_type2:.4f}")

if p_value_type2 < alpha:
    print("Decision: Correctly reject the null hypothesis.")
else:
    print("Decision: Fail to reject the null hypothesis. (This is a Type II Error!)")


--- Simulating a Type II Error ---
P-value for a small effect: 0.5100
Decision: Fail to reject the null hypothesis. (This is a Type II Error!)


  * Here, a real effect existed, but it was small relative to the natural variation between patients. With only 50 patients per group, our test did not have enough power to detect it reliably.
  * The resulting p-value (0.5100) was greater than alpha (0.05), so we failed to reject the null hypothesis. This is a **Type II error**: we missed a real effect, and a drug that is genuinely (if only slightly) better would be treated as unproven.
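A single run cannot show how often this kind of miss happens. As a rough sketch, the simulation below repeats the small-effect experiment many times to estimate the test's power (the fraction of runs that correctly reject $H_0$) and the Type II error rate (one minus power). The number of trials (`n_trials`) and the seed are arbitrary illustrative choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)  # illustrative seed
alpha = 0.05
sample_size = 50
std_dev = 3
n_trials = 2000                 # hypothetical number of repeated experiments

rejections = 0
for _ in range(n_trials):
    # H0 is false: the new drug is truly better by 0.5 points
    old = rng.normal(loc=10, scale=std_dev, size=sample_size)
    new = rng.normal(loc=10.5, scale=std_dev, size=sample_size)
    _, p = stats.ttest_ind(new, old, alternative='greater')
    if p < alpha:
        rejections += 1         # correctly detected the real effect

power = rejections / n_trials
type2_rate = 1 - power          # fraction of runs that missed the real effect

print(f"Estimated power:              {power:.3f}")
print(f"Estimated Type II error rate: {type2_rate:.3f}")
```

With an effect this small and only 50 patients per group, the test misses the real difference in most runs. Increasing `sample_size` in the sketch is one way to see the power rise.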