A chi-squared test (also chi-square or χ² test) is a statistical hypothesis test that is valid to perform when the test statistic is chi-squared distributed under the null hypothesis, specifically Pearson's chi-squared test and variants thereof. Pearson's chi-squared test is used to determine whether there is a statistically significant difference between the expected frequencies and the observed frequencies in one or more categories of a contingency table.

In the standard applications of this test, the observations are classified into mutually exclusive classes. If the null hypothesis that there are no differences between the classes in the population is true, the test statistic computed from the observations follows a χ² frequency distribution. The purpose of the test is to evaluate how likely the observed frequencies would be assuming the null hypothesis is true.

Test statistics that follow a χ² distribution occur when the observations are independent. There are also χ² tests for testing the null hypothesis of independence of a pair of random variables based on observations of the pairs.

The term "chi-squared test" often refers to tests for which the distribution of the test statistic approaches the χ² distribution asymptotically, meaning that the sampling distribution of the test statistic (if the null hypothesis is true) approximates a chi-squared distribution more and more closely as the sample size increases.

There are two types of chi-square tests. Both use the chi-square statistic and distribution, but for different purposes:

- A chi-square **goodness of fit test** determines whether sample data match a hypothesized population distribution. For more details on this type, see: Goodness of Fit Test.
- A chi-square **test for independence** compares two variables in a contingency table to see whether they are related. In a more general sense, it tests whether the distributions of categorical variables differ from one another.
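The two test types above can be sketched with base R's `chisq.test()`. The counts, category probabilities, and the `group`/`outcome` labels below are hypothetical values chosen only for illustration; they are not taken from the exercises that follow.

```r
# Hypothetical counts, chosen only for illustration (not from the text).

# Goodness of fit: do three observed category counts match 25% / 50% / 25%?
observed <- c(30, 50, 20)
gof <- chisq.test(observed, p = c(0.25, 0.50, 0.25))
gof$statistic   # Pearson chi-squared statistic
gof$parameter   # degrees of freedom: categories - 1
gof$p.value

# Independence: are the row and column variables of a 2x2 table related?
# correct = FALSE turns off the Yates continuity correction that
# chisq.test() applies to 2x2 tables by default.
counts <- matrix(c(20, 30,
                   30, 20), nrow = 2, byrow = TRUE,
                 dimnames = list(group = c("A", "B"),
                                 outcome = c("yes", "no")))
ind <- chisq.test(counts, correct = FALSE)
ind$statistic
ind$parameter   # (rows - 1) * (columns - 1)
ind$p.value
```

Passing a vector together with `p` requests a goodness of fit test, while passing a matrix requests a test for independence.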

## a)

In [10]:
tab <- matrix(nrow=3, ncol=3, byrow=TRUE)
colnames(tab) <- c('Win', 'Lose', 'Total')
rownames(tab) <- c('% in population', 'Expected #', 'Observed #')
tab[1,] <- c('50%', '50%', '100%')
tab[2,] <- c(50, 50, 100)
tab[3,] <- c(38, 64, 100)
tab

| | Win | Lose | Total |
|---|---|---|---|
| % in population | 50% | 50% | 100% |
| Expected # | 50 | 50 | 100 |
| Observed # | 38 | 64 | 100 |


In [12]:
# Pearson statistic: sum over cells of (observed - expected)^2 / expected
chi_sqr <- (38 - 50)^2 / 50 + (64 - 50)^2 / 50
chi_sqr

In [13]:
df <- 2 - 1  # degrees of freedom: number of categories minus one
df

In [14]:
p_value <- pchisq(chi_sqr, df, lower.tail = FALSE)  # upper-tail probability
p_value


**If the decision maker chooses a significance level $\alpha > 0.00911$, the null hypothesis $H_0$ is rejected.**
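The decision rule can be sketched by comparing the p-value with a few candidate significance levels. The three levels in `alphas` are an assumption made for illustration; the exercise itself does not fix any particular level.

```r
# Statistic and degrees of freedom as computed in the cells above.
chi_sqr <- 6.8
df <- 1
p_value <- pchisq(chi_sqr, df, lower.tail = FALSE)

# Illustrative significance levels (an assumption; the text fixes none).
alphas <- c(0.05, 0.01, 0.005)

# Reject H_0 whenever the p-value falls below the chosen level.
decision <- ifelse(p_value < alphas, "reject H_0", "do not reject H_0")
data.frame(alpha = alphas, p_value = round(p_value, 5), decision = decision)
```

With these levels, $H_0$ is rejected at 0.05 and 0.01 but not at 0.005.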

## b)

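For a binomial count $X$ with $n$ trials and success probability $p$, the mean is $np$ and the standard deviation is $\sqrt{np(1-p)}$. Equivalently, the sample proportion $X/n$ has mean $p$ and standard deviation $\sqrt{p(1-p)/n}$.

The binomial distribution is exactly symmetric when $p = 0.5$ and approximately symmetric when $n$ is large.

When $n$ is large and $p$ is close to 0.5, the binomial distribution can be approximated by a normal distribution, so the standardized count approximately follows the **standard normal distribution**.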

![bionomial0.5](img/equation119.svg)
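As a rough check of the normal approximation, the sketch below compares an exact binomial tail probability with its normal counterpart. It reuses $n = 100$, $p = 0.5$ and the observed 38 wins from part a). The continuity correction of 0.5 is an added assumption and a common convention, not something stated in the exercise.

```r
n <- 100   # number of trials, as in part a)
p <- 0.5   # success probability under the null hypothesis
x <- 38    # observed number of wins from part a)

mean_x <- n * p                  # mean of the binomial count
sd_x   <- sqrt(n * p * (1 - p))  # standard deviation of the binomial count

# Exact P(X <= x) under the binomial distribution.
exact_p <- pbinom(x, size = n, prob = p)

# Normal approximation with a continuity correction of 0.5.
approx_p <- pnorm((x + 0.5 - mean_x) / sd_x)

c(exact = exact_p, approx = approx_p)
```

The two probabilities should agree closely, which is what justifies using normal-based (and hence χ²-based) reference distributions here.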

## c)

![chi3](img/chi3.png)


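The observed statistic exceeds the 0.99 quantile of the χ² distribution with 1 degree of freedom (6.63 in the table), so the p-value is below $1 - 0.99$:

$ \chi_1^{2} = 6.8 > q_{0.99} = 6.63 $

$ \Rightarrow p_{value} < 1 - 0.99 $

$ \Rightarrow p_{value} < 0.01 $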
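The table lookup can also be reproduced in R, as a minimal sketch using `qchisq()` for the 0.99 quantile and `pchisq()` for the tail probability. The variable name `crit` is introduced here for illustration only.

```r
chi_sqr <- 6.8                 # statistic from part a)
crit <- qchisq(0.99, df = 1)   # 0.99 quantile of chi-squared with 1 df, about 6.63

chi_sqr > crit                                       # statistic exceeds the quantile
pchisq(chi_sqr, df = 1, lower.tail = FALSE) < 0.01   # equivalently, p-value below 0.01
```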


## d)

In [18]:
tab <- matrix(nrow=3, ncol=3, byrow=TRUE)
colnames(tab) <- c('Win', 'Lose', 'Total')
rownames(tab) <- c('% in population', 'Expected #', 'Observed #')
tab[1,] <- c('50%', '50%', '100%')
tab[2,] <- c(50, 50, 100)
tab[3,] <- c('x', '100 - x', 100)
tab

| | Win | Lose | Total |
|---|---|---|---|
| % in population | 50% | 50% | 100% |
| Expected # | 50 | 50 | 100 |
| Observed # | x | 100 - x | 100 |


$ \chi^{2} = \frac{(x-50)^2}{50} + \frac{(100 - x - 50)^2}{50} $

$ = \frac{(x-50)^2}{50} + \frac{(50 - x)^2}{50} $

$ = \frac{(x-50)^2 + (-(x - 50))^2}{50} $

Writing $n = 100$ for the total number of observations, so that $\frac{n}{2} = 50$:

$ = \frac{(x-\frac{n}{2})^2 + (-(x - \frac{n}{2}))^2}{\frac{n}{2}} $

$ = \frac{2 (x-\frac{n}{2})^2}{\frac{n}{2}} $

Since a square is never negative, $\chi^{2}$ is always non-negative, and it increases as $|x-\frac{n}{2}|$ increases.
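The closed form can be checked numerically against the two-term Pearson sum. This is a minimal sketch; the function names `chi_sq_closed` and `chi_sq_sum` are hypothetical helpers introduced for illustration.

```r
# Closed form 2 * (x - n/2)^2 / (n/2); n = 100 as in the table.
chi_sq_closed <- function(x, n = 100) 2 * (x - n / 2)^2 / (n / 2)

# Direct two-term Pearson sum with expected count n / 2 in each cell.
chi_sq_sum <- function(x, n = 100) {
  expected <- n / 2
  (x - expected)^2 / expected + ((n - x) - expected)^2 / expected
}

# Both forms agree for every possible number of wins.
x <- 0:100
stopifnot(isTRUE(all.equal(chi_sq_closed(x), chi_sq_sum(x))))

# The statistic depends on x only through |x - n/2| and grows with it.
chi_sq_closed(c(50, 45, 55, 40, 60))
# 0 1 1 4 4
```

Values of $x$ that are equally far from $\frac{n}{2}$ on either side, such as 45 and 55, give the same statistic.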

## e)

???