# **Types of Hypothesis Testing**

# Chi-Square Test #

Pearson’s chi-square (Χ²) tests, often referred to simply as chi-square tests, are among the most common nonparametric tests. Nonparametric tests are used for data that don’t follow the assumptions of parametric tests, especially the assumption of a normal distribution.

If you want to test a hypothesis about the distribution of a categorical variable, you’ll need to use a chi-square test or another nonparametric test. Categorical variables can be nominal or ordinal and represent groupings such as species or nationalities. Because they can take only a few specific values, they can’t follow a normal distribution.

**The chi-square formula**

Both of Pearson’s chi-square tests use the same formula to calculate the test statistic, chi-square (Χ²):

\begin{equation*} X^2=\sum{\frac{(O-E)^2}{E}} \end{equation*}

Where:

- Χ² is the chi-square test statistic
- Σ is the summation operator (it means “take the sum of”)
- O is the observed frequency
- E is the expected frequency

The larger the difference between the observations and the expectations (O − E in the equation), the bigger the chi-square will be. To decide whether the difference is big enough to be statistically significant, you compare the chi-square value to a critical value, which depends on the chosen significance level and the degrees of freedom.
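As a minimal sketch of the formula above, the statistic can be computed directly from lists of observed and expected frequencies. The counts below are made-up illustrative data, the function name `chi_square_statistic` is hypothetical, and the critical value 5.991 is the standard table value for a 0.05 significance level with 2 degrees of freedom (number of categories minus one).

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (illustrative data only).
observed = [30, 50, 40]
expected = [40, 40, 40]

chi2 = chi_square_statistic(observed, expected)

# Table value for alpha = 0.05 and 2 degrees of freedom (3 categories - 1).
CRITICAL_VALUE_DF2 = 5.991

# Reject the null hypothesis only if the statistic exceeds the critical value.
reject_null = chi2 > CRITICAL_VALUE_DF2
print(f"chi-square = {chi2:.2f}, reject H0: {reject_null}")
```

Here the statistic is 2.5 + 2.5 + 0 = 5.0, which is below 5.991, so these example counts would not lead to rejecting the null hypothesis.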

A Pearson’s chi-square test can be used for your data if all of the following are true:

- You want to test a hypothesis about one or more categorical variables. If one or more of your variables is quantitative, you should use a different statistical test. Alternatively, you could convert the quantitative variable into a categorical variable by separating the observations into intervals.
- The sample was randomly selected from the population.
- The expected frequency is at least five in each group or combination of groups.


`Types of chi-square tests`

The two types of Pearson’s chi-square tests are:

- Chi-square goodness of fit test
- Chi-square test of independence

Mathematically, both use the same test statistic; they differ in purpose and in how the expected frequencies and degrees of freedom are determined.

`Chi-square goodness of fit test`

You can use a chi-square goodness of fit test when you have one categorical variable. It allows you to test whether the frequency distribution of the categorical variable is significantly different from your expectations. Often, but not always, the expectation is that the categories will have equal proportions.

`Expectation of equal proportions`

- Null hypothesis (H0): The bird species visit the bird feeder in equal proportions.

- Alternative hypothesis (HA): The bird species visit the bird feeder in different proportions.

`Expectation of different proportions`

- Null hypothesis (H0): The bird species visit the bird feeder in the same proportions as the average over the past five years.

- Alternative hypothesis (HA): The bird species visit the bird feeder in different proportions from the average over the past five years.
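Both sets of hypotheses can be tested with the same calculation; only the expected proportions change. The sketch below uses invented visit counts for three hypothetical species and invented five-year average proportions. The helper names are illustrative, and the p-value shortcut `exp(-x / 2)` is valid only for 2 degrees of freedom, which is the case with three categories.

```python
import math


def goodness_of_fit(observed, expected_proportions):
    """Chi-square goodness of fit: return (statistic, degrees of freedom)."""
    total = sum(observed)
    expected = [total * p for p in expected_proportions]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1


def p_value_df2(chi2):
    """Upper-tail probability of a chi-square distribution with exactly 2 df."""
    return math.exp(-chi2 / 2)


# Hypothetical visit counts for three bird species (illustrative data only).
visits = [60, 45, 45]

# Expectation of equal proportions.
chi2_equal, df_equal = goodness_of_fit(visits, [1 / 3, 1 / 3, 1 / 3])
p_equal = p_value_df2(chi2_equal)

# Expectation of different proportions (assumed five-year averages).
chi2_history, df_history = goodness_of_fit(visits, [0.5, 0.3, 0.2])
p_history = p_value_df2(chi2_history)

print(f"equal proportions:   chi2 = {chi2_equal:.2f}, p = {p_equal:.4f}")
print(f"historical averages: chi2 = {chi2_history:.2f}, p = {p_history:.4f}")
```

With these made-up numbers, the counts are compatible with equal proportions (p above 0.05) but not with the assumed historical proportions (p below 0.05). For other numbers of categories, a statistics library or a chi-square table would be needed for the p-value.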

`Chi-square test of independence`

You can use a chi-square test of independence when you have two categorical variables. It allows you to test whether the two variables are related to each other. If two variables are independent (unrelated), the probability of belonging to a certain group of one variable isn’t affected by the other variable.

`Example: Chi-square test of independence`

- Null hypothesis (H0): The proportion of people who are left-handed is the same for Americans and Canadians.

- Alternative hypothesis (HA): The proportion of people who are left-handed differs between Americans and Canadians.
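For a test of independence, the expected count in each cell comes from the row and column totals: (row total × column total) / grand total. The following sketch applies this to an invented 2×2 handedness table. The counts and function names are hypothetical, no continuity correction is applied, and the p-value formula `erfc(sqrt(x / 2))` holds only for 1 degree of freedom, which is the case for a 2×2 table.

```python
import math


def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square distribution with exactly 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))


# Rows: Americans, Canadians. Columns: left-handed, right-handed.
# Invented counts for illustration only.
handedness = [
    [20, 180],
    [30, 170],
]

chi2, df = chi_square_independence(handedness)
p_value = p_value_df1(chi2)
print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

For these example counts every expected cell is at least five, and the p-value is above 0.05, so the null hypothesis of equal left-handedness proportions would not be rejected.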

# ANOVA

- ANOVA, which stands for Analysis of Variance, is a statistical test used to analyze the difference between the means of more than two groups.

- A one-way ANOVA uses one independent variable, while a two-way ANOVA uses two independent variables.

- ANOVA determines whether the groups created by the levels of the independent variable are statistically different by calculating whether the means of the treatment levels are different from the overall mean of the dependent variable.

- The null hypothesis is that all group means are equal. If any of the group means is significantly different from the overall mean, the null hypothesis is rejected.

- ANOVA uses the F test for statistical significance. This allows for comparison of multiple means at once, because the error is calculated for the whole set of comparisons rather than for each individual pairwise comparison (which would happen with a t test).

- The F statistic is the ratio of the variance between the group means to the variance within the groups. If the variance within groups is small relative to the variance between groups, the F value is high, and it is less likely that the observed differences among the means are due to chance alone.
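The F statistic for a one-way ANOVA can be sketched as the between-group mean square divided by the within-group mean square. The data below are invented, the function name `one_way_anova_f` is hypothetical, and the critical value of about 3.89 is the usual table value for a 0.05 significance level with 2 and 12 degrees of freedom.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)

    # Variation of the group means around the overall mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = len(all_values) - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Three hypothetical treatment levels with five observations each.
groups = [
    [4, 5, 6, 5, 5],
    [6, 7, 8, 7, 7],
    [8, 9, 10, 9, 9],
]

f_value, df_between, df_within = one_way_anova_f(groups)

# Approximate table value for alpha = 0.05 with (2, 12) degrees of freedom.
CRITICAL_F_2_12 = 3.89
reject_null = f_value > CRITICAL_F_2_12
print(f"F = {f_value:.2f} on ({df_between}, {df_within}) df, reject H0: {reject_null}")
```

In this example the group means (5, 7 and 9) are far apart compared with the spread inside each group, so F is 40 and the null hypothesis of equal means would be rejected.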

# Two-Tailed Test

A two-tailed test is a statistical method that checks whether a sample mean is significantly different from a hypothesized population mean in either direction. It is called two-tailed (or two-sided) because it considers both ends (tails) of the probability distribution, allowing for the possibility that the sample mean is either greater than or less than the population mean. A one-tailed test, by contrast, looks in only one direction.

A two-sided t-test is used when you have no specific prediction about the direction of the difference and want to know whether there is any significant difference at all. Because it checks both tails of the distribution, it detects significant differences in either direction and helps you judge whether the differences you observe are real or just due to chance.

`How it works`

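- Set a significance level (α): Usually 0.05, meaning a 5% chance of rejecting the null hypothesis when it is actually true (a false positive). In a two-tailed test, α is split evenly between the two tails (0.025 in each).

- Calculate the t-statistic: Divide the difference between the sample mean and the hypothesized population mean by the standard error of the sample mean.

- Compare to critical values: If your t-value falls beyond the critical values on either side, you reject the null hypothesis (which says there's no difference) and conclude that your sample mean differs significantly from the population mean.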
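The steps above can be sketched for a one-sample, two-tailed t-test as follows. The sample values and the hypothesized mean of 50 are invented, the function name `one_sample_t` is hypothetical, and 2.262 is the standard table value for a two-tailed test at α = 0.05 with 9 degrees of freedom.

```python
import math
from statistics import mean, stdev


def one_sample_t(sample, hypothesized_mean):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    # Standard error uses the sample standard deviation (n - 1 denominator).
    standard_error = stdev(sample) / math.sqrt(n)
    t_value = (mean(sample) - hypothesized_mean) / standard_error
    return t_value, n - 1


# Step 1: significance level.
ALPHA = 0.05

# Step 2: t-statistic from (hypothetical) sample data.
sample = [52, 55, 49, 58, 54, 51, 56, 53, 57, 55]
t_value, df = one_sample_t(sample, hypothesized_mean=50)

# Step 3: compare with the two-tailed critical value for df = 9, alpha = 0.05.
CRITICAL_T_DF9 = 2.262
# Taking the absolute value checks both tails at once.
reject_null = abs(t_value) > CRITICAL_T_DF9

print(f"t = {t_value:.3f}, df = {df}, reject H0: {reject_null}")
```

Using the absolute value of t is what makes the test two-tailed: a large negative t would lead to the same decision as a large positive one.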



# Proportion Tests

A proportion test compares a sample proportion either to a hypothesized population proportion (one-sample test) or to the proportion in another sample (two-sample test).
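One common way to carry out both comparisons is a z-test based on the normal approximation; this is an assumption here, since the text above does not name a specific method. The counts are invented, the function names are hypothetical, and the approximation is only reasonable for fairly large samples.

```python
import math
from statistics import NormalDist


def two_tailed_p(z):
    """Two-tailed p-value for a standard normal z statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def one_sample_proportion_z(successes, n, p0):
    """Compare a sample proportion with a hypothesized population proportion p0."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    return z, two_tailed_p(z)


def two_sample_proportion_z(x1, n1, x2, n2):
    """Compare the proportions of two independent samples (pooled estimate)."""
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / standard_error
    return z, two_tailed_p(z)


# One sample: 60 successes out of 100, tested against p0 = 0.5 (made-up data).
z_one, p_one = one_sample_proportion_z(60, 100, 0.5)

# Two samples: 20 of 200 versus 30 of 200 (made-up data).
z_two, p_two = two_sample_proportion_z(20, 200, 30, 200)

print(f"one-sample: z = {z_one:.3f}, p = {p_one:.4f}")
print(f"two-sample: z = {z_two:.3f}, p = {p_two:.4f}")
```

For a 2×2 table, the square of the two-sample z statistic equals the chi-square statistic of the test of independence without continuity correction, so the two approaches give the same p-value.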