## Question 1
The following table indicates the number of 6-point scores in an American rugby match in the 1979 season.

![](table1.png)

Based on these results, we fit a Poisson distribution whose parameter λ is set to the sample mean, λ = 2.435. At the 0.05 significance level, is there any reason to believe that the number of scores is a Poisson variable?
See [this guide](https://www.geeksforgeeks.org/how-to-create-a-poisson-probability-mass-function-plot-in-python/) for how to create a Poisson distribution and how to calculate the expected observations using the probability mass function (pmf).
A Poisson distribution is a discrete probability distribution. It gives the probability of an event happening a certain number of times (k) within a given interval of time or space. The Poisson distribution has only one parameter, λ (lambda), which is the mean number of events.
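
For reference, the pmf and the expected counts it yields can be written out as follows. The symbols $N$ (total number of observations) and $E_k$ (expected count for $k$ scores) are notation introduced here for illustration; they do not appear in the original exercise.

$$
P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \dots
$$

$$
E_k = N \cdot P(X = k), \qquad \lambda = 2.435
$$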

In [5]:
import numpy as np
from scipy.stats import poisson, chisquare

sample_mean = 2.435

#Observed data
scores = [0, 1, 2, 3, 4, 5, 6, 7]
frequencies = [35, 99, 104, 110, 62, 25, 10, 3]

#Expected frequencies with Poisson distribution
expected_frequencies = [poisson.pmf(k, sample_mean) * np.sum(frequencies) for k in scores]

#Normalize frequencies
observed_freq_sum = np.sum(frequencies)
expected_freq_sum = np.sum(expected_frequencies)

observed_freq_normalized = np.array(frequencies) / observed_freq_sum
expected_freq_normalized = np.array(expected_frequencies) / expected_freq_sum

#Chi-square goodness-of-fit test
chi_square_stat, p_value = chisquare(f_obs=observed_freq_normalized, f_exp=expected_freq_normalized, ddof=1)

print(f"Chi-square statistic: {chi_square_stat}")
print(f"P-value: {p_value}")

#Check significance at the 0.05 level
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: The data does not follow a Poisson distribution.")
else:
    print("Fail to reject the null hypothesis: The data may follow a Poisson distribution.")


Chi-square statistic: 0.012278033332975774
P-value: 0.9999999616163838
Fail to reject the null hypothesis: The data may follow a Poisson distribution.
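
`scipy.stats.chisquare` is defined on counts, so passing proportions shrinks the statistic by roughly a factor of the sample size (448 here). The sketch below runs the same test on the raw counts. It assumes the last category can be treated as "7 or more", so that the expected counts also sum to 448; the original table may not define it that way.

```python
import numpy as np
from scipy.stats import poisson, chisquare

lam = 2.435  # sample mean given in the question

# Observed counts for 0..6 scores; the last entry is treated as "7 or more"
# (pooling the upper tail is an assumption, not stated in the exercise)
observed = np.array([35, 99, 104, 110, 62, 25, 10, 3])
n = observed.sum()

# Poisson probabilities for k = 0..6, plus the upper tail P(X >= 7)
probs = np.append(poisson.pmf(np.arange(7), lam), poisson.sf(6, lam))
expected = n * probs

# ddof=1 because lambda was estimated from the data: df = 8 - 1 - 1 = 6
stat, p_value = chisquare(f_obs=observed, f_exp=expected, ddof=1)

print(f"Chi-square statistic: {stat:.3f}")
print(f"P-value: {p_value:.3f}")
```

Under these assumptions the statistic is much larger than the one computed from proportions, but the p-value should still be well above 0.05, so the conclusion (fail to reject the Poisson hypothesis) does not change.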


## Question 2
A researcher gathers information about the physical activity patterns of fifth-grade children at a public primary school. He defines three categories of physical activity (Low, Medium, High). He also asks about the regular consumption of sugary drinks at school and defines two categories (Yes = consumed, No = not consumed). We would like to evaluate, at the 5% significance level, whether there is an association between physical activity patterns and the consumption of sugary drinks among the children of this school. The results are in the following table:

![](table4.png)

In [4]:
from scipy.stats import chi2_contingency
import numpy as np

# Let's create the contingency table with our data
data = np.array([[32, 12],
                 [14, 22],
                 [6, 9]])

# Chi-square test
chi2, p, dof, expected = chi2_contingency(data)

chi2, p, dof, expected


(10.712198008709638,
 0.004719280137040844,
 2,
 array([[24.08421053, 19.91578947],
        [19.70526316, 16.29473684],
        [ 8.21052632,  6.78947368]]))
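
The values returned by `chi2_contingency` can be reproduced by hand, which makes the test easier to follow: each expected count is (row total × column total) / grand total, and the statistic sums the squared deviations scaled by the expected counts. This is a minimal sketch; the variable names are illustrative. No continuity correction is involved here, since SciPy applies Yates' correction only when there is one degree of freedom.

```python
import numpy as np
from scipy.stats import chi2

observed = np.array([[32, 12],
                     [14, 22],
                     [6, 9]])

# Marginal totals, kept 2-D so the matrix product has the table's shape
row_totals = observed.sum(axis=1, keepdims=True)  # shape (3, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 2)
n = observed.sum()

# Expected counts under independence
expected = row_totals @ col_totals / n

# Pearson chi-square statistic and its degrees of freedom
stat = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# Upper-tail probability of the chi-square distribution
p_value = chi2.sf(stat, dof)

print(stat, p_value, dof)
```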

In [None]:
# Given that the p-value (0.0047) is less than the significance level of 0.05, we reject the null hypothesis
# of independence (chi-square = 10.71, 2 degrees of freedom) and conclude that there is a statistically
# significant association between physical activity patterns and the consumption of sugary drinks
# among the children of this school.