## Question 1
The following table indicates the number of 6-point scores in an American rugby match in the 1979 season.

![](table1.png)

Based on these results, we create a Poisson distribution with parameter λ = 2.435, the sample mean. Is there any reason to believe, at the 0.05 significance level, that the number of scores is a Poisson variable?
See [this guide](https://www.geeksforgeeks.org/how-to-create-a-poisson-probability-mass-function-plot-in-python/) for how to create a Poisson distribution and how to calculate the expected frequencies using the probability mass function (PMF).
A Poisson distribution is a discrete probability distribution. It gives the probability of an event happening a certain number of times (k) within a given interval of time or space. The Poisson distribution has only one parameter, λ (lambda), which is the mean number of events.
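To make the definition concrete, the sketch below evaluates the Poisson PMF, P(X = k) = λ^k e^(−λ) / k!, using only the standard library and turns the probabilities into expected match counts. The helper name `poisson_pmf` is hypothetical and introduced only for illustration; the total of 448 is simply the sum of the observed counts in the table.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Return P(X = k) for a Poisson variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

lam = 2.435          # sample mean given in the question
total_matches = 448  # sum of the observed counts (35 + 99 + ... + 3)

# Expected number of matches with exactly k scores, for k = 0..6
expected = [poisson_pmf(k, lam) * total_matches for k in range(7)]

# '7 or more' receives the remaining probability mass,
# so the expected counts add up to the total number of matches
expected.append(total_matches - sum(expected))

for label, value in zip(list(range(7)) + ["7 or more"], expected):
    print(label, round(value, 3))
```

These values should agree with what `scipy.stats.poisson.pmf` produces for the same λ.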

In [4]:
import pandas as pd
import numpy as np
from scipy.stats import poisson, chisquare

# Create the dataframe with provided data
df = pd.DataFrame()
df['Number of Scores'] = ['0','1','2','3','4','5','6','7 or more']
df['Number of times'] = [35,99,104,110,25,62,10,3]

# Given lambda value
lambda_val = 2.435
total_matches = df['Number of times'].sum()

# Calculate expected frequencies using Poisson PMF
expected_frequencies = []
for k in range(7):  # For scores from 0 to 6
    expected_frequencies.append(poisson.pmf(k, lambda_val) * total_matches)

# For '7 or more', take the remaining mass so the expected counts sum to the total
expected_frequencies.append(total_matches - sum(expected_frequencies))

df['Expected Frequencies'] = expected_frequencies

# Calculate the chi-squared statistic and p-value
chi2_stat, p_val = chisquare(df['Number of times'], df['Expected Frequencies'])

print(df)
print("\nCalculated chi-squared statistic:", chi2_stat)
print("P-value:", p_val)

  Number of Scores  Number of times  Expected Frequencies
0                0               35             39.243791
1                1               99             95.558630
2                2              104            116.342632
3                3              110             94.431437
4                4               25             57.485137
5                5               62             27.995262
6                6               10             11.361410
7        7 or more                3              5.581701

Calculated chi-squared statistic: 65.47797118460656
P-value: 1.205354426517408e-11


Next, we recompute the chi-squared statistic and p-value on their own, then compare the p-value with the significance level to determine whether the observed frequencies differ significantly from what we'd expect under a Poisson distribution.

In [5]:
from scipy.stats import chisquare

# Calculate the chi-squared statistic and p-value
chi2_stat, p_val = chisquare(df['Number of times'], df['Expected Frequencies'])

chi2_stat, p_val

(65.47797118460656, 1.205354426517408e-11)

The calculated chi-squared statistic is $\chi^2 = 65.48$. The corresponding p-value for this statistic is approximately $1.21 \times 10^{-11}$.

Given that the p-value is significantly less than the significance level of $\alpha = 0.05$, we reject the null hypothesis ($H_0$). This suggests that the observed frequencies significantly differ from what we would expect under a Poisson distribution with $\lambda = 2.435$.
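One refinement worth considering: because λ was taken from the sample mean, a common convention is to subtract one extra degree of freedom (8 categories − 1 − 1 estimated parameter = 6). The sketch below is an assumption about how that adjustment could be applied, not part of the original solution. It uses the `ddof` argument of `scipy.stats.chisquare` and compares the statistic with the critical value from `scipy.stats.chi2.ppf`. Variable names such as `p_adjusted` and `critical` are illustrative.

```python
import numpy as np
from scipy.stats import poisson, chisquare, chi2

observed = np.array([35, 99, 104, 110, 25, 62, 10, 3])
lambda_val = 2.435
n = observed.sum()

# Probabilities for 0..6 scores, plus the upper tail P(X >= 7)
probs = poisson.pmf(np.arange(7), lambda_val)
probs = np.append(probs, poisson.sf(6, lambda_val))
expected = probs * n

# Default test: degrees of freedom = number of categories - 1
stat, p_default = chisquare(observed, expected)

# Adjusted test: ddof=1 removes one more degree of freedom
# to account for lambda being estimated from the data (assumption)
_, p_adjusted = chisquare(observed, expected, ddof=1)

# Critical value at the 0.05 level for the adjusted degrees of freedom
dof = len(observed) - 1 - 1
critical = chi2.ppf(0.95, dof)

print("statistic:", stat)
print("critical value (dof = %d):" % dof, critical)
print("reject H0:", stat > critical)
```

With 6 degrees of freedom the 0.05 critical value is about 12.59, so the statistic exceeds it either way and the conclusion does not change.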

## Question 2
A researcher gathers information about the physical activity patterns of fifth-grade children at a public primary school. He defines three categories of physical activity (Low, Medium, High). He also asks about regular consumption of sugary drinks at school, and defines two categories (Yes = consumed, No = not consumed). We would like to evaluate, at the 5% significance level, whether there is an association between patterns of physical activity and the consumption of sugary drinks for the children of this school. The results are in the following table:

![](table4.png)

In [6]:

from scipy.stats import chi2_contingency

# Observed frequencies
observed = np.array([[32, 14, 6],
                     [12, 22, 9]])

# Chi-squared test of independence
chi2, p_val, _, expected = chi2_contingency(observed)

chi2, p_val, expected

(10.712198008709638,
 0.004719280137040844,
 array([[24.08421053, 19.70526316,  8.21052632],
        [19.91578947, 16.29473684,  6.78947368]]))

The calculated $\chi^2$ statistic is approximately 10.71. The associated p-value is 0.00472.

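Given that the p-value (0.00472) is less than the significance level ($\alpha = 0.05$), we reject the null hypothesis ($H_0$) of independence. At the 5% level, the data indicate an association between physical activity pattern and consumption of sugary drinks among these children.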
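To show where the expected counts and the statistic come from, the sketch below rebuilds them by hand from the row and column totals, using expected = (row total × column total) / grand total. It is an illustrative cross-check rather than a replacement for `chi2_contingency`. The closed-form p-value `exp(-x/2)` used here holds only because this 2×3 table has exactly 2 degrees of freedom.

```python
import math
import numpy as np

# Same observed table as in the cell above
observed = np.array([[32, 14, 6],
                     [12, 22, 9]])

row_totals = observed.sum(axis=1)   # one total per row
col_totals = observed.sum(axis=0)   # one total per column
grand_total = observed.sum()

# Expected counts under independence
expected = np.outer(row_totals, col_totals) / grand_total

# Pearson chi-squared statistic: sum of (O - E)^2 / E over all cells
chi2_stat = ((observed - expected) ** 2 / expected).sum()

# Degrees of freedom: (rows - 1) * (columns - 1)
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# For exactly 2 degrees of freedom, the chi-squared survival
# function reduces to exp(-x / 2)
p_val = math.exp(-chi2_stat / 2)

print(expected)
print(chi2_stat, dof, p_val)
```

The results should match the `chi2_contingency` output above, since no continuity correction is applied when the table has more than one degree of freedom.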