Z-Test

A Z-test is a statistical test used to determine whether a sample mean differs significantly from a known population mean, or whether the means of two samples differ significantly from each other. It is typically used when:

    The sample size is large (a common rule of thumb is n > 30).
    The population variance is known, or the sample is large enough that the sample variance is a close estimate of the population variance.

When to Use a Z-Test:

    One-Sample Z-Test: To compare the sample mean to a known population mean.
    Two-Sample Z-Test: To compare the means of two independent samples.

Example 1: One-Sample Z-Test

Let’s say we're analyzing alcohol content in white wine.

    Scenario: The average alcohol content of wines from a known region is 10%. You collect a sample of 40 white wines and find a sample mean alcohol content of 10.3% with a sample standard deviation of 0.5 percentage points. You want to test whether the sample mean differs significantly from the known average. The null hypothesis is that the sample comes from a population whose mean is 10%.

![Z_TEST.png](attachment:4da908de-f25f-4ef8-b2c2-7cbbb61a3bae.png)
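
For reference, the one-sample statistic computed in the next cell can be written out with the scenario's numbers plugged in. The symbols are notation chosen for this note and do not come from the original figure: $\bar{x}$ is the sample mean, $\mu_0$ the known population mean, $s$ the sample standard deviation (used here in place of the population standard deviation), and $n$ the sample size.

$$
z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{10.3 - 10}{0.5 / \sqrt{40}} \approx 3.79
$$

The denominator $s / \sqrt{n} \approx 0.079$ is the standard error of the mean, so the Z-score measures the gap between the two means in units of standard error.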

In [1]:
import numpy as np
from scipy.stats import norm

# Given values
population_mean = 10     # Known population mean
sample_mean = 10.3       # Sample mean
std_dev = 0.5            # Standard deviation of sample
sample_size = 40         # Number of observations

# Calculate Z-score
z_score = (sample_mean - population_mean) / (std_dev / np.sqrt(sample_size))

# Calculate p-value for a two-tailed test
p_value = 2 * (1 - norm.cdf(abs(z_score)))

print(f"Z-score: {z_score}")
print(f"P-value: {p_value}")


Z-score: 3.794733192202064
P-value: 0.00014780231033451052


Interpretation:

    Z-score: The Z-score is 3.79, meaning the sample mean is 3.79 standard errors above the population mean.

    P-value: The p-value is 0.00015, which is much smaller than the common significance level of 0.05.

    Conclusion: Since the p-value is less than 0.05, we reject the null hypothesis. This suggests that the average alcohol content in the sample is significantly different from the known population mean.
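
The same decision can be reached without a p-value, by comparing the Z-score to a critical value or by checking whether the population mean falls inside a confidence interval around the sample mean. The cell below is a minimal sketch of both checks for the scenario above. It assumes a significance level of 0.05 (the `alpha` variable is introduced here for illustration) and uses the standard library's `statistics.NormalDist` instead of `scipy.stats.norm`, so that it runs without extra dependencies.

```python
from math import sqrt
from statistics import NormalDist

# Values from the one-sample scenario above
population_mean = 10.0
sample_mean = 10.3
std_dev = 0.5
sample_size = 40
alpha = 0.05  # assumed significance level (two-tailed)

# Standard error of the mean and the Z-score
standard_error = std_dev / sqrt(sample_size)
z_score = (sample_mean - population_mean) / standard_error

# Critical value: the point that leaves alpha/2 in each tail of N(0, 1)
z_critical = NormalDist().inv_cdf(1 - alpha / 2)
reject_null = abs(z_score) > z_critical

# Confidence interval for the mean at the (1 - alpha) level
margin = z_critical * standard_error
ci_low, ci_high = sample_mean - margin, sample_mean + margin

print(f"Critical value: {z_critical:.3f}")
print(f"Reject null hypothesis: {reject_null}")
print(f"95% confidence interval: ({ci_low:.3f}, {ci_high:.3f})")
```

If the interval excludes the population mean of 10, the two-tailed test rejects the null hypothesis at the same significance level, so the two checks always agree.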

Example 2: Two-Sample Z-Test

Let’s say you want to compare the alcohol content between white wines and red wines.

    Scenario:
        White wine sample mean: 10.3% (std dev = 0.5, n = 40)
        Red wine sample mean: 9.8% (std dev = 0.6, n = 35)

![Two_SAMple_Z_TEST.png](attachment:d5bcad0e-aea1-4e8a-8876-dc84671c98cd.png)
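
For reference, the two-sample statistic computed in the next cell can be written out as follows. The subscripts are notation chosen for this note (1 for white wine, 2 for red wine) and do not come from the original figure; $\bar{x}$, $s$ and $n$ denote each sample's mean, standard deviation and size.

$$
z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{10.3 - 9.8}{\sqrt{\dfrac{0.5^2}{40} + \dfrac{0.6^2}{35}}} \approx 3.89
$$

The denominator is the standard error of the difference between the two sample means, which assumes the samples are independent.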

In [2]:
# Given values for White Wine
mean_white = 10.3
std_dev_white = 0.5
n_white = 40

# Given values for Red Wine
mean_red = 9.8
std_dev_red = 0.6
n_red = 35

# Calculate Z-score
z_score_two_sample = (mean_white - mean_red) / np.sqrt((std_dev_white**2 / n_white) + (std_dev_red**2 / n_red))

# Calculate p-value for a two-tailed test
p_value_two_sample = 2 * (1 - norm.cdf(abs(z_score_two_sample)))

print(f"Z-score (Two-Sample): {z_score_two_sample}")
print(f"P-value (Two-Sample): {p_value_two_sample}")


Z-score (Two-Sample): 3.8882888905995987
P-value (Two-Sample): 0.00010095342414007114


Interpretation:

    Z-score: The Z-score is 3.89, indicating a large difference between the two sample means relative to its standard error.

    P-value: The p-value is 0.00010, much smaller than 0.05.

    Conclusion: Since the p-value is less than 0.05, we reject the null hypothesis that the two population means are equal. This suggests a significant difference between the alcohol content of white wines and red wines.
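
Both examples follow the same pattern, so the calculations can be wrapped in small helper functions that work from summary statistics. The sketch below is one possible way to do this. The function names `two_tailed_p`, `one_sample_z` and `two_sample_z` are hypothetical helpers introduced here, not part of SciPy or any other library, and the normal CDF comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_p(z):
    """Two-tailed p-value for a Z-score under the standard normal distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def one_sample_z(sample_mean, population_mean, std_dev, n):
    """Z-score and two-tailed p-value for a sample mean vs. a known population mean."""
    z = (sample_mean - population_mean) / (std_dev / sqrt(n))
    return z, two_tailed_p(z)


def two_sample_z(mean_1, std_1, n_1, mean_2, std_2, n_2):
    """Z-score and two-tailed p-value for the means of two independent samples."""
    z = (mean_1 - mean_2) / sqrt(std_1**2 / n_1 + std_2**2 / n_2)
    return z, two_tailed_p(z)


# Reproduce Example 1 (white wine sample vs. known regional mean)
z_one, p_one = one_sample_z(10.3, 10, 0.5, 40)
print(f"One-sample: z = {z_one:.4f}, p = {p_one:.6f}")

# Reproduce Example 2 (white wine vs. red wine)
z_two, p_two = two_sample_z(10.3, 0.5, 40, 9.8, 0.6, 35)
print(f"Two-sample: z = {z_two:.4f}, p = {p_two:.6f}")
```

The helpers take summary statistics and not raw observations because that is all the two scenarios provide. With raw data, the means and standard deviations would be computed first and then passed in.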