## Question 1
A researcher gathers information about the physical activity patterns of fifth-grade children at a public primary school. He defines three categories of physical activity (Low, Medium, High). He also asks about regular consumption of sugary drinks at school and defines two categories (Yes = consumed, No = not consumed). We would like to test, at the 5% significance level, whether there is an association between physical activity pattern and sugary drink consumption among the children of this school. The results are in the following table:

![](table4.png)

In [2]:
import numpy as np
import scipy.stats as stats

# Observed contingency table
observed = np.array([
    [32, 12],
    [14, 22],
    [6, 9]
])

# Chi-square test of independence
chi2, p, dof, expected = stats.chi2_contingency(observed)

# Results
print("Chi² =", chi2)
print("p-value =", p)
print("Degrees of freedom =", dof)

# Conclusion
if p < 0.05:
    print("Conclusion: Reject H0. There is a significant association between physical activity level and sugary drink consumption.")
else:
    print("Conclusion: Fail to reject H0. There is no significant association between the variables.")


Chi² = 10.712198008709638
p-value = 0.004719280137040844
Degrees of freedom = 2
Conclusion: Reject H0. There is a significant association between physical activity level and sugary drink consumption.
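
To see where these numbers come from, the statistic can be rebuilt by hand. The sketch below assumes the same counts as the cell above; the variable names (`row_totals`, `chi2_stat`, `critical_value`, and so on) are illustrative. It computes the expected counts under independence (row total × column total / n), sums the Pearson terms, and compares the result with the 5% critical value. It also prints the smallest expected count, since the chi-square approximation is usually considered reliable only when expected counts are not too small (a common rule of thumb is at least 5).

```python
import numpy as np
from scipy.stats import chi2 as chi2_dist

# Same observed counts as in the cell above
observed = np.array([
    [32, 12],
    [14, 22],
    [6, 9],
])

row_totals = observed.sum(axis=1, keepdims=True)  # shape (3, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 2)
n = observed.sum()

# Expected count under independence: row total * column total / n
expected = row_totals * col_totals / n

# Pearson chi-square statistic and its degrees of freedom
chi2_stat = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# Upper-tail p-value and the critical value at the 5% level
p_value = chi2_dist.sf(chi2_stat, dof)
critical_value = chi2_dist.ppf(0.95, dof)

print("Expected counts:\n", expected.round(2))
print("Smallest expected count =", expected.min().round(2))
print("Chi² =", chi2_stat, "| critical value =", critical_value)
print("p-value =", p_value)
```

Rejecting H0 when the statistic exceeds the critical value is equivalent to rejecting when the p-value is below 0.05, so both routes should lead to the same conclusion as `chi2_contingency`.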


## [OPTIONAL] Question 2
The following table indicates the number of 6-point scores in an American rugby match in the 1979 season.

![](table1.png)

Based on these results, we fit a Poisson distribution whose parameter is the sample mean, λ = 2.435. At the 0.05 significance level, is there any reason to believe that the number of scores is a Poisson variable?

Check [here](https://www.geeksforgeeks.org/how-to-create-a-poisson-probability-mass-function-plot-in-python/) how to create a Poisson distribution and how to calculate the expected observations using the probability mass function (pmf).
A Poisson distribution is a discrete probability distribution. It gives the probability of an event happening a certain number of times (k) within a given interval of time or space. The Poisson distribution has only one parameter, λ (lambda), which is the mean number of events.
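
As a small illustration of the pmf, the sketch below evaluates P(X = k) = λ^k e^(−λ) / k! by hand and compares it with `scipy.stats.poisson.pmf`. The helper `poisson_pmf` and the sample size `n_hypothetical` are assumptions made for this example only; the real sample size comes from the table. An expected count is the pmf multiplied by the number of observations.

```python
import math
from scipy.stats import poisson


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) = lam**k * exp(-lam) / k! for a Poisson variable X."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


lam = 2.435           # sample mean given in the question
n_hypothetical = 100  # illustrative sample size, NOT taken from the table

for k in range(4):
    by_hand = poisson_pmf(k, lam)
    from_scipy = poisson.pmf(k, lam)
    expected_count = n_hypothetical * by_hand
    print(f"k={k}: pmf={by_hand:.4f} (scipy: {from_scipy:.4f}), "
          f"expected count={expected_count:.2f}")
```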

In [3]:
import numpy as np
from scipy.stats import poisson, chisquare

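# Observed frequencies for 0, 1, ..., 5 scores, with 6 and 7+ combined into one final category
observed = np.array([35, 99, 104, 110, 62, 25, 13])
n = observed.sum()  # total number of observations
lambda_ = 2.435     # sample mean, used as the Poisson parameter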

# Theoretical Poisson probabilities
poisson_probs = [poisson.pmf(k, lambda_) for k in range(6)]
poisson_probs.append(1 - sum(poisson_probs))  # for 6 or more

# Expected frequencies
expected = np.array(poisson_probs) * n

# Chi-square goodness-of-fit test (default ddof=0, so df = 7 - 1 = 6)
chi2, p = chisquare(f_obs=observed, f_exp=expected)

# Results
print("Chi² =", chi2)
print("p-value =", p)

# Conclusion
if p < 0.05:
    print("Conclusion: Reject H0. The number of scores does not follow a Poisson distribution.")
else:
    print("Conclusion: Fail to reject H0. The number of scores is consistent with a Poisson distribution.")


Chi² = 6.051730109245756
p-value = 0.4174202786412414
Conclusion: Fail to reject H0. The number of scores is consistent with a Poisson distribution.
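
One refinement is worth considering. The question states that λ = 2.435 is the sample mean, so the parameter was estimated from the same data being tested. A common textbook adjustment in that case is to subtract one more degree of freedom, giving 7 − 1 − 1 = 5 instead of 6. The sketch below assumes the same observed counts and λ as the cell above and uses the `ddof` argument of `scipy.stats.chisquare` for this; the names `probs`, `chi2_adj`, and `p_adj` are illustrative.

```python
import numpy as np
from scipy.stats import poisson, chisquare

# Same inputs as in the cell above
observed = np.array([35, 99, 104, 110, 62, 25, 13])
n = observed.sum()
lambda_ = 2.435

# Probabilities for k = 0..5, plus the tail P(X >= 6)
probs = poisson.pmf(np.arange(6), lambda_)
probs = np.append(probs, 1 - probs.sum())
expected = probs * n

# ddof=1 removes one extra degree of freedom for the estimated lambda,
# so the reference distribution has 7 - 1 - 1 = 5 degrees of freedom
chi2_adj, p_adj = chisquare(f_obs=observed, f_exp=expected, ddof=1)

print("Chi² =", chi2_adj)
print("p-value (df = 5) =", p_adj)
```

The statistic itself is unchanged; only the reference distribution differs. The adjusted p-value is smaller than the one obtained with 6 degrees of freedom but stays above 0.05, so the conclusion remains the same.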
