# Chi-Square Tests Lab Manual
## Objective
This lab applies the chi-square test in two settings: testing whether two categorical variables are independent, and testing whether observed frequencies match an expected distribution (goodness of fit). In each exercise, students formulate the hypotheses, compute the expected frequencies, and interpret the test statistic and p-value.

## Question 1: Goodness of Fit - Are the frequencies of balls with different colors equal in our bag?

In [11]:
from scipy.stats import chisquare
import numpy as np

# Observed counts per color
observed_q1 = np.array([50, 30, 30, 10])
# Expected counts: 25% per color, scaled to the observed total of 120 (30 per color)
expected_q1 = np.array([25, 25, 25, 25]) * sum(observed_q1) / 100

# Chi-square test
chi2_stat_q1, p_value_q1 = chisquare(f_obs=observed_q1, f_exp=expected_q1)
print(f"Chi2 Statistic: {chi2_stat_q1}, P-Value: {p_value_q1}")

Chi2 Statistic: 26.666666666666668, P-Value: 6.914913279310525e-06
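
As a cross-check, the goodness-of-fit statistic can be computed by hand as the sum of (O - E)^2 / E over the categories, with the p-value taken from a chi-square distribution with k - 1 degrees of freedom. The sketch below repeats the Question 1 data; the names `stat_manual` and `p_manual` are illustrative and not part of the original lab.

```python
import numpy as np
from scipy.stats import chi2, chisquare

observed = np.array([50, 30, 30, 10])
# Equal proportions: each of the 4 colors gets a quarter of the observed total
expected = np.full(observed.size, observed.sum() / observed.size)

# Pearson statistic: sum over categories of (O - E)^2 / E
stat_manual = ((observed - expected) ** 2 / expected).sum()
# Degrees of freedom for goodness of fit: number of categories minus 1
dof = observed.size - 1
# Upper-tail probability of the chi-square distribution
p_manual = chi2.sf(stat_manual, dof)

# Reference values from scipy for comparison
stat_scipy, p_scipy = chisquare(f_obs=observed, f_exp=expected)
print(f"manual: {stat_manual}, {p_manual}")
print(f"scipy:  {stat_scipy}, {p_scipy}")
```

Both lines should agree up to floating-point rounding, which confirms that `chisquare` uses k - 1 degrees of freedom here.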


## Question 2: Independence - Are support levels for 'None,' 'Obama,' and 'McCain' independent of gender?

In [12]:
from scipy.stats import chi2_contingency

# Contingency table (rows: gender; columns: None, Obama, McCain)
table_q2 = np.array([
    [10, 50, 35],  # Male
    [15, 60, 40]   # Female
])

# Chi-square test
chi2_stat_q2, p_value_q2, dof_q2, expected_q2 = chi2_contingency(table_q2)
print(f"Chi2 Statistic: {chi2_stat_q2}, P-Value: {p_value_q2}")

Chi2 Statistic: 0.3407530684418557, P-Value: 0.843347207721
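
The expected counts that `chi2_contingency` returns can be reproduced from the row and column totals. The sketch below does this for the Question 2 table, assuming the usual independence formula E[i, j] = (row total i) x (column total j) / (grand total); the variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([
    [10, 50, 35],  # Male
    [15, 60, 40],  # Female
])

row_totals = table.sum(axis=1)
col_totals = table.sum(axis=0)
grand_total = table.sum()

# Expected count under independence for every cell
expected_manual = np.outer(row_totals, col_totals) / grand_total
# Degrees of freedom: (rows - 1) * (columns - 1)
dof_manual = (table.shape[0] - 1) * (table.shape[1] - 1)

# Compare with the values scipy computes
_, _, dof_scipy, expected_scipy = chi2_contingency(table)
print(expected_manual.round(2))
print(dof_manual, dof_scipy)
```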


## Question 3: Independence - Does hand preference depend on gender?

In [13]:
# Contingency table
table_q3 = np.array([
    [12, 108],  # Female
    [24, 156]   # Male
])

# Chi-square test (for a 2x2 table, chi2_contingency applies Yates' continuity correction by default)
chi2_stat_q3, p_value_q3, dof_q3, expected_q3 = chi2_contingency(table_q3)
print(f"Chi2 Statistic: {chi2_stat_q3}, P-Value: {p_value_q3}")

Chi2 Statistic: 0.4748000841750843, P-Value: 0.4907871540801906
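
Because this table is 2x2 (one degree of freedom), `chi2_contingency` applies Yates' continuity correction unless `correction=False` is passed. The sketch below runs the Question 3 table both ways to show how much the correction changes the result; it is a comparison aid, not a recommendation for either setting.

```python
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([
    [12, 108],  # Female
    [24, 156],  # Male
])

# Default behavior: Yates' continuity correction is applied when dof == 1
stat_corrected, p_corrected, _, _ = chi2_contingency(table)
# correction=False gives the uncorrected Pearson statistic
stat_plain, p_plain, _, _ = chi2_contingency(table, correction=False)

print(f"With correction:    {stat_corrected}, {p_corrected}")
print(f"Without correction: {stat_plain}, {p_plain}")
```

The corrected statistic is the smaller of the two, so its p-value is larger.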


## Question 4: Independence - Does class standing influence meal plan selection?

In [14]:
# Contingency table
table_q4 = np.array([
    [24, 32, 14],  # Freshman
    [22, 26, 12],  # Sophomore
    [10, 14, 6],   # Junior
    [14, 16, 10]   # Senior
])

# Chi-square test
chi2_stat_q4, p_value_q4, dof_q4, expected_q4 = chi2_contingency(table_q4)
print(f"Chi2 Statistic: {chi2_stat_q4}, P-Value: {p_value_q4}")

Chi2 Statistic: 0.7093382807668522, P-Value: 0.9942873142017763
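
The chi-square approximation is usually considered reliable only when the expected counts are not too small. The sketch below checks the Question 4 table against the common rule of thumb that every expected count should be at least 5; that threshold is an assumption here, not something the lab specifies.

```python
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([
    [24, 32, 14],  # Freshman
    [22, 26, 12],  # Sophomore
    [10, 14, 6],   # Junior
    [14, 16, 10],  # Senior
])

_, _, _, expected = chi2_contingency(table)

# Rule-of-thumb threshold (assumed): expected count of at least 5 in every cell
min_expected = expected.min()
all_cells_ok = bool((expected >= 5).all())

print(f"Smallest expected count: {min_expected}")
print(f"All expected counts >= 5: {all_cells_ok}")
```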


## Question 5: Independence - Is there a relationship between award preference and dominant SAT section (Math vs. Verbal)?

In [15]:
# Contingency table
table_q5 = np.array([
    [21, 68, 116],  # Math
    [10, 79, 61]    # Verbal
])

# Chi-square test
chi2_stat_q5, p_value_q5, dof_q5, expected_q5 = chi2_contingency(table_q5)
print(f"Chi2 Statistic: {chi2_stat_q5}, P-Value: {p_value_q5}")

Chi2 Statistic: 13.622609647147335, P-Value: 0.0011012550185620991
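
When the test indicates an association, Pearson residuals, (O - E) / sqrt(E), show which cells contribute most to the statistic. The sketch below computes them for the Question 5 table; the residual analysis is an addition for illustration and is not part of the original exercise.

```python
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([
    [21, 68, 116],  # Math
    [10, 79, 61],   # Verbal
])

stat, p_value, dof, expected = chi2_contingency(table)

# Pearson residual per cell; the squared residuals sum to the chi-square statistic
residuals = (table - expected) / np.sqrt(expected)
# Position (row, column) of the cell with the largest absolute residual
largest_cell = np.unravel_index(np.abs(residuals).argmax(), residuals.shape)

print(residuals.round(2))
print(f"Largest contribution at row {largest_cell[0]}, column {largest_cell[1]}")
```

A positive residual means the cell was observed more often than independence predicts, and a negative one means less often.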


## Question 6: Independence - Does ice cream flavor preference depend on age group?

In [16]:
# Contingency table
table_q6 = np.array([
    [35, 25, 15],  # Children
    [20, 30, 20],  # Teenagers
    [25, 20, 25]   # Adults
])

# Chi-square test
chi2_stat_q6, p_value_q6, dof_q6, expected_q6 = chi2_contingency(table_q6)
print(f"Chi2 Statistic: {chi2_stat_q6}, P-Value: {p_value_q6}")

Chi2 Statistic: 8.595734126984128, P-Value: 0.07203791043005359
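
An equivalent way to read this result is to compare the statistic with a critical value instead of comparing the p-value with a significance level. The sketch below does this for the Question 6 table, assuming a significance level of 0.05 (the lab does not state one).

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

table = np.array([
    [35, 25, 15],  # Children
    [20, 30, 20],  # Teenagers
    [25, 20, 25],  # Adults
])

alpha = 0.05  # assumed significance level

stat, p_value, dof, _ = chi2_contingency(table)
# Critical value: the (1 - alpha) quantile of the chi-square distribution
critical_value = chi2.ppf(1 - alpha, dof)
reject_h0 = bool(stat > critical_value)

print(f"Statistic: {stat}, critical value: {critical_value}, dof: {dof}")
print(f"Reject independence at alpha={alpha}: {reject_h0}")
```

The two decision rules always agree: the statistic exceeds the critical value exactly when the p-value is below `alpha`.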


## Question 7: Goodness of Fit - Are the outcomes of a six-sided die roll equally distributed?

In [17]:
from scipy.stats import chisquare

# Observed counts for the six faces
observed_q7 = np.array([15, 20, 10, 18, 12, 25])
total_observed = sum(observed_q7)
# Fair die: equal weight per face (16.67 approximates 100/6 percent)
expected_q7 = np.array([16.67] * 6)
# Rescale so the expected counts sum to the observed total
expected_q7 = expected_q7 * total_observed / sum(expected_q7)

# Chi-square test
chi2_stat_q7, p_value_q7 = chisquare(f_obs=observed_q7, f_exp=expected_q7)
print(f"Chi2 Statistic: {chi2_stat_q7}, P-Value: {p_value_q7}")


Chi2 Statistic: 9.079999999999998, P-Value: 0.10591545867098323
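
When the null hypothesis is that all categories are equally likely, `chisquare` does not need `f_exp` at all, because equal expected frequencies are its default. The sketch below shows that the default call and an explicit uniform `f_exp` give the same result for the Question 7 data.

```python
import numpy as np
from scipy.stats import chisquare

observed = np.array([15, 20, 10, 18, 12, 25])

# Default: all categories are assumed equally likely
stat_default, p_default = chisquare(observed)

# Explicit uniform expectation: total rolls split evenly over the six faces
uniform_expected = np.full(observed.size, observed.sum() / observed.size)
stat_explicit, p_explicit = chisquare(observed, f_exp=uniform_expected)

print(f"Default f_exp:  {stat_default}, {p_default}")
print(f"Explicit f_exp: {stat_explicit}, {p_explicit}")
```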


## Question 8: Independence - Does the choice of transportation vary by region?

In [18]:
# Contingency table
table_q8 = np.array([
    [40, 30, 20],  # Urban
    [50, 20, 10],  # Suburban
    [30, 10, 60]   # Rural
])

# Chi-square test
chi2_stat_q8, p_value_q8, dof_q8, expected_q8 = chi2_contingency(table_q8)
print(f"Chi2 Statistic: {chi2_stat_q8}, P-Value: {p_value_q8}")

Chi2 Statistic: 57.64583333333333, P-Value: 9.055583808766023e-12
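
A very small p-value says the association is unlikely to be due to chance, but not how strong it is. One common effect-size measure is Cramér's V, sqrt(chi2 / (n * (min(rows, columns) - 1))), which ranges from 0 to 1. The sketch below computes it by hand for the Question 8 table; it is an addition for illustration and not part of the original exercise.

```python
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([
    [40, 30, 20],  # Urban
    [50, 20, 10],  # Suburban
    [30, 10, 60],  # Rural
])

stat, p_value, dof, _ = chi2_contingency(table)

n = table.sum()                # total number of observations
k = min(table.shape) - 1       # smaller table dimension minus 1
cramers_v = np.sqrt(stat / (n * k))

print(f"Cramér's V: {cramers_v}")
```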


## Question 9: Goodness of Fit - Are the observed and expected distributions of grades in a class significantly different?

In [19]:
# Observed and expected frequencies
observed_q9 = np.array([15, 30, 40, 10, 5])
expected_q9 = np.array([20, 30, 30, 10, 10]) * sum(observed_q9) / 100

# Chi-square test
chi2_stat_q9, p_value_q9 = chisquare(f_obs=observed_q9, f_exp=expected_q9)
print(f"Chi2 Statistic: {chi2_stat_q9}, P-Value: {p_value_q9}")

Chi2 Statistic: 7.083333333333334, P-Value: 0.1315494286414895
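
To compare the nine results side by side, the sketch below applies one decision rule to the p-values printed above. The significance level of 0.05 is an assumption (the lab does not state one), and the dictionary simply copies the printed outputs.

```python
# P-values copied from the outputs of Questions 1-9
p_values = {
    "Q1": 6.914913279310525e-06,
    "Q2": 0.843347207721,
    "Q3": 0.4907871540801906,
    "Q4": 0.9942873142017763,
    "Q5": 0.0011012550185620991,
    "Q6": 0.07203791043005359,
    "Q7": 0.10591545867098323,
    "Q8": 9.055583808766023e-12,
    "Q9": 0.1315494286414895,
}

alpha = 0.05  # assumed significance level

for question, p_value in p_values.items():
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    print(f"{question}: p = {p_value:.4g} -> {decision}")

# Questions where the null hypothesis is rejected at the assumed level
rejected = [q for q, p in p_values.items() if p < alpha]
```

Failing to reject the null hypothesis means the data are consistent with it, not that it has been shown to be true.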
