### **Hypothesis Testing**


1. Null Hypothesis (H₀): Assumes there is no effect or difference.
2. Alternative Hypothesis (H₁): Assumes there is an effect or difference.
3. Type I Error: Rejecting H₀ when it is true (false positive).
4. Type II Error: Failing to reject H₀ when it is false (false negative).
5. p-value: Probability of observing results at least as extreme as the sample results, assuming H₀ is true.
6. Significance Level (α): Threshold for rejecting H₀ (common value: 0.05).
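
The link between α and the Type I error rate can be checked with a small simulation. The sketch below is illustrative only: the number of trials, the sample size of 20, the standard normal population, and the seed are all assumptions chosen for the demonstration. Because every sample is drawn from a population where H₀ is true, each rejection is a false positive, and the rejection rate should land close to α.

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
alpha = 0.05
n_trials = 2000                 # number of simulated experiments (assumed)
sample_size = 20                # observations per experiment (assumed)

# Every sample comes from a population whose true mean is 0,
# so H0 (mean == 0) is true by construction.
samples = rng.normal(loc=0.0, scale=1.0, size=(n_trials, sample_size))

# Run one t-test per row against the true mean.
_, p_values = ttest_1samp(samples, popmean=0.0, axis=1)

# Any rejection here is a Type I error (false positive).
type_i_rate = np.mean(p_values < alpha)
print(f"Observed Type I error rate: {type_i_rate:.3f} (alpha = {alpha})")
```

Lowering `alpha` reduces the false positive rate, but for a fixed sample size it also makes Type II errors more likely.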

### **1. t-test**

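A t-test compares a sample mean with a hypothesized value (one-sample) or the means of two groups (two-sample). It is used when the population standard deviation is unknown and has to be estimated from the sample, which matters most when the sample size is small.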



---



**Code for One-Sample t-test:**

```python
# Import necessary library
from scipy.stats import ttest_1samp

# Sample data (e.g., test scores)
data = [75, 80, 85, 70, 95, 90, 85, 78, 88, 92]

# Hypothesized population mean
pop_mean = 85

# Null Hypothesis (H₀): The population mean is equal to 85.
# Alternative Hypothesis (H₁): The population mean is not equal to 85.

# Perform the one-sample t-test (two-sided by default)
t_stat, p_value = ttest_1samp(data, pop_mean)

# Print results
print("One-Sample t-test:")
print(f"Sample Data: {data}")
print(f"Hypothesized Population Mean: {pop_mean}")
print(f"t-statistic: {t_stat:.3f}")
print(f"p-value: {p_value:.3f}")

# Decision-making at a 5% significance level (α = 0.05)
alpha = 0.05
if p_value < alpha:
    print("Decision: Reject the null hypothesis. The sample mean is significantly different from the population mean.")
else:
    print("Decision: Fail to reject the null hypothesis. The sample mean is not significantly different from the population mean.")
```

```text
One-Sample t-test:
Sample Data: [75, 80, 85, 70, 95, 90, 85, 78, 88, 92]
Hypothesized Population Mean: 85
t-statistic: -0.478
p-value: 0.644
Decision: Fail to reject the null hypothesis. The sample mean is not significantly different from the population mean.
```
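
To see where the t-statistic comes from, it can be recomputed by hand as t = (x̄ − μ₀) / (s / √n), where s is the sample standard deviation with n − 1 in the denominator. The following is a minimal sketch using the same data; the names `t_manual` and `p_manual` are introduced here only for illustration.

```python
import math
from statistics import mean, stdev

from scipy.stats import t, ttest_1samp

data = [75, 80, 85, 70, 95, 90, 85, 78, 88, 92]
pop_mean = 85

n = len(data)
sample_mean = mean(data)
sample_std = stdev(data)               # sample standard deviation (divides by n - 1)
std_error = sample_std / math.sqrt(n)  # standard error of the mean

# t-statistic: distance between sample mean and hypothesized mean, in standard errors
t_manual = (sample_mean - pop_mean) / std_error

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual = 2 * t.sf(abs(t_manual), df=n - 1)

# Cross-check against SciPy's implementation
t_scipy, p_scipy = ttest_1samp(data, pop_mean)
print(f"manual: t = {t_manual:.3f}, p = {p_manual:.3f}")
print(f"scipy:  t = {t_scipy:.3f}, p = {p_scipy:.3f}")
```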


**Code for Two-Sample t-test:**

```python
# Import necessary library
from scipy.stats import ttest_ind

# Data for two independent groups (e.g., scores of two classes)
group1 = [55, 60, 65, 70, 68, 72, 75]
group2 = [50, 52, 54, 56, 58, 60, 63]

# Null Hypothesis (H₀): The means of the two groups are equal.
# Alternative Hypothesis (H₁): The means of the two groups are not equal.

# Perform the two-sample t-test
# (by default ttest_ind assumes both groups have equal variances)
t_stat, p_value = ttest_ind(group1, group2)

# Print results
print("\nTwo-Sample t-test:")
print(f"Group 1 Data: {group1}")
print(f"Group 2 Data: {group2}")
print(f"t-statistic: {t_stat:.3f}")
print(f"p-value: {p_value:.3f}")

# Decision-making at a 5% significance level
alpha = 0.05
if p_value < alpha:
    print("Decision: Reject the null hypothesis. The means of the two groups are significantly different.")
else:
    print("Decision: Fail to reject the null hypothesis. The means of the two groups are not significantly different.")
```


```text
Two-Sample t-test:
Group 1 Data: [55, 60, 65, 70, 68, 72, 75]
Group 2 Data: [50, 52, 54, 56, 58, 60, 63]
t-statistic: 3.258
p-value: 0.007
Decision: Reject the null hypothesis. The means of the two groups are significantly different.
```
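
The test above pools the variances of the two groups. When that assumption is doubtful, Welch's t-test is a common alternative, and SciPy exposes it through the `equal_var=False` argument of `ttest_ind`. The sketch below compares the two variants on the same data; it is an illustration, not a recommendation for which variant suits any particular dataset.

```python
from scipy.stats import ttest_ind

group1 = [55, 60, 65, 70, 68, 72, 75]
group2 = [50, 52, 54, 56, 58, 60, 63]

# Student's t-test: assumes equal variances (SciPy's default)
t_pooled, p_pooled = ttest_ind(group1, group2)

# Welch's t-test: does not assume equal variances
t_welch, p_welch = ttest_ind(group1, group2, equal_var=False)

print(f"Pooled: t = {t_pooled:.3f}, p = {p_pooled:.4f}")
print(f"Welch:  t = {t_welch:.3f}, p = {p_welch:.4f}")
```

Because both groups have the same size here, the two t-statistics coincide; only the degrees of freedom differ, so Welch's p-value is slightly larger.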


### **2. z-test**

A z-test compares a sample mean with a hypothesized population mean when the population standard deviation is known and the sample size is large (a common rule of thumb is n ≥ 30).

```python
import numpy as np
from scipy.stats import norm

# Summary statistics
sample_mean = 75
pop_mean = 80  # Hypothesized population mean
pop_std = 10  # Known population standard deviation
n = 30  # Sample size

# Null Hypothesis (H₀): The population mean is equal to 80.
# Alternative Hypothesis (H₁): The population mean is not equal to 80.

# Calculate z-score: difference in means divided by the standard error
z_score = (sample_mean - pop_mean) / (pop_std / np.sqrt(n))

# p-value for two-tailed test (sf is the survival function, 1 - cdf)
p_value = 2 * norm.sf(abs(z_score))

print("Z-test:")
print(f"z-score: {z_score:.3f}")
print(f"p-value: {p_value:.3f}")

# Decision-making at a 5% significance level
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: Mean is significantly different.")
else:
    print("Fail to reject the null hypothesis: No significant difference.")
```

```text
Z-test:
z-score: -2.739
p-value: 0.006
Reject the null hypothesis: Mean is significantly different.
```
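
The same calculation can be wrapped in a small helper and checked against the critical value instead of the p-value. This is a sketch: `one_sample_z_test` is a hypothetical helper written for this example, not a SciPy function.

```python
import math

from scipy.stats import norm


def one_sample_z_test(sample_mean, pop_mean, pop_std, n):
    """Return (z, two-sided p-value) for a one-sample z-test."""
    std_error = pop_std / math.sqrt(n)
    z = (sample_mean - pop_mean) / std_error
    p = 2 * norm.sf(abs(z))  # two-tailed area beyond |z|
    return z, p


z, p = one_sample_z_test(sample_mean=75, pop_mean=80, pop_std=10, n=30)

# Critical value for a two-sided test at alpha = 0.05 (about 1.96)
alpha = 0.05
z_crit = norm.ppf(1 - alpha / 2)

print(f"z = {z:.3f}, p = {p:.3f}, critical value = ±{z_crit:.3f}")
print("Reject H0" if abs(z) > z_crit else "Fail to reject H0")
```

Comparing |z| with the critical value and comparing the p-value with α always lead to the same decision; they are two views of the same test.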


### **3. One-Way ANOVA**


ANOVA compares means across three or more groups to determine if at least one mean is different.

```python
# Import necessary library
from scipy.stats import f_oneway

# Data for three groups (e.g., scores from three teaching methods)
group1 = [88, 90, 85, 87, 89]
group2 = [92, 95, 93, 91, 94]
group3 = [85, 84, 86, 88, 87]

# Null Hypothesis (H₀): All group means are equal.
# Alternative Hypothesis (H₁): At least one group mean differs from the others.

# Perform one-way ANOVA
f_stat, p_value = f_oneway(group1, group2, group3)

# Print results
print("\nOne-Way ANOVA:")
print(f"Group 1 Data: {group1}")
print(f"Group 2 Data: {group2}")
print(f"Group 3 Data: {group3}")
print(f"F-statistic: {f_stat:.3f}")
print(f"p-value: {p_value:.3f}")

# Decision-making at a 5% significance level
alpha = 0.05
if p_value < alpha:
    print("Decision: Reject the null hypothesis. At least one group mean is significantly different.")
else:
    print("Decision: Fail to reject the null hypothesis. No significant difference between group means.")
```


```text
One-Way ANOVA:
Group 1 Data: [88, 90, 85, 87, 89]
Group 2 Data: [92, 95, 93, 91, 94]
Group 3 Data: [85, 84, 86, 88, 87]
F-statistic: 22.782
p-value: 0.000
Decision: Reject the null hypothesis. At least one group mean is significantly different.
```
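
The F-statistic is the ratio of between-group variability to within-group variability. The sketch below rebuilds it from the sums of squares with NumPy and compares it with `f_oneway`; the variable names (`ss_between`, `ss_within`, and so on) are chosen here for illustration.

```python
import numpy as np
from scipy.stats import f, f_oneway

groups = [
    np.array([88, 90, 85, 87, 89]),
    np.array([92, 95, 93, 91, 94]),
    np.array([85, 84, 86, 88, 87]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)            # number of groups
n_total = all_values.size  # total number of observations

# Between-group sum of squares: how far each group mean is from the grand mean
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: spread of observations around their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Mean squares and F ratio
ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_manual = ms_between / ms_within

# p-value: upper tail of the F distribution with (k - 1, n_total - k) degrees of freedom
p_manual = f.sf(f_manual, k - 1, n_total - k)

f_scipy, p_scipy = f_oneway(*groups)
print(f"manual: F = {f_manual:.3f}, p = {p_manual:.6f}")
print(f"scipy:  F = {f_scipy:.3f}, p = {p_scipy:.6f}")
```

A significant result only says that some mean differs; identifying which groups differ requires a follow-up (post hoc) comparison.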


### **4. Chi-Square Test**


A Chi-Square Test checks if there is an association between two categorical variables.

```python
# Import necessary library
from scipy.stats import chi2_contingency

# Contingency table (e.g., survey data with two variables: gender and preference)
# Rows: Male and Female, Columns: Prefer A, Prefer B
data = [[50, 30],  # Male responses
        [20, 80]]  # Female responses

# Null Hypothesis (H₀): The variables (gender and preference) are independent.
# Alternative Hypothesis (H₁): The variables are dependent.

# Perform the Chi-Square test
# (for a 2x2 table, chi2_contingency applies Yates' continuity correction by default)
chi2_stat, p_value, dof, expected = chi2_contingency(data)

# Print results
print("\nChi-Square Test:")
print(f"Contingency Table:\n{data}")
print(f"Chi-square statistic: {chi2_stat:.3f}")
print(f"p-value: {p_value:.3f}")
print(f"Degrees of freedom: {dof}")
print(f"Expected Frequencies:\n{expected}")

# Decision-making at a 5% significance level
alpha = 0.05
if p_value < alpha:
    print("Decision: Reject the null hypothesis. The variables are dependent.")
else:
    print("Decision: Fail to reject the null hypothesis. There is no significant evidence that the variables are dependent.")
```


```text
Chi-Square Test:
Contingency Table:
[[50, 30], [20, 80]]
Chi-square statistic: 32.015
p-value: 0.000
Degrees of freedom: 1
Expected Frequencies:
[[31.11111111 48.88888889]
 [38.88888889 61.11111111]]
Decision: Reject the null hypothesis. The variables are dependent.
```
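
The expected frequencies in the output come from the rule (row total × column total) / grand total. The sketch below recomputes them by hand and also shows that the reported statistic includes Yates' continuity correction, which `chi2_contingency` applies by default when the table has one degree of freedom. Names such as `expected_manual` and `chi2_plain` are illustrative.

```python
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[50, 30],   # Male responses
                     [20, 80]])  # Female responses

# Expected counts under independence: (row total * column total) / grand total
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected_manual = row_totals * col_totals / observed.sum()

# Pearson chi-square statistic without any continuity correction
chi2_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()

# SciPy with the default Yates correction, and with the correction switched off
chi2_yates, p_yates, dof, expected = chi2_contingency(observed)
chi2_plain, p_plain, _, _ = chi2_contingency(observed, correction=False)

print(f"Expected (manual):\n{expected_manual}")
print(f"Chi-square with Yates correction: {chi2_yates:.3f}")
print(f"Chi-square without correction:    {chi2_plain:.3f}")
print(f"Chi-square computed by hand:      {chi2_manual:.3f}")
```

The correction shrinks each |observed − expected| difference by 0.5 before squaring, so the corrected statistic is always the smaller of the two.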


**Summary:**
1. t-test: Compare a sample mean with a hypothesized value, or the means of two groups, when the population standard deviation is unknown (typically small samples).
2. z-test: Compare means when the population standard deviation is known (large sample).
3. ANOVA: Compare means of three or more groups.
4. Chi-Square Test: Check for associations between categorical variables.






---



**Key Features of Each Test:**
1. One-Sample t-test: Tests whether a single sample mean differs from a known or hypothesized value.
2. Two-Sample t-test: Compares means between two independent groups.
3. z-test: Similar to the t-test but used when the population standard deviation is known and the sample is large.
4. One-Way ANOVA: Compares means across three or more groups.
5. Chi-Square Test: Tests independence between categorical variables.