# The scipy.stats Python Library 

[Official Documentation](https://docs.scipy.org/doc/scipy/reference/reference/stats.html)
***

# 10% A clear and concise overview of the scipy.stats Python library:

https://scipy-lectures.org/packages/statistics/index.html

https://docs.scipy.org/doc/scipy/reference/reference/stats.html

[t-test vs. ANOVA](https://www.raybiotech.com/learning-center/t-test-anova/#:~:text=The%20t%2Dtest%20is%20a,statistically%20different%20from%20each%20other)

# 20% An example hypothesis test using ANOVA. You should find a data set on which it is appropriate to use ANOVA, ensure the assumptions underlying ANOVA are met, and then perform and display the results of your ANOVA using scipy.stats

# 10% Appropriate plots and other visualisations to enhance your notebook for viewers


## Example Hypothesis Test using ANOVA

https://www.itl.nist.gov/div898/education/datasets.htm#anova

One-way ANOVA and two-way ANOVA

https://www.statisticssolutions.com/free-resources/directory-of-statistical-analyses/anova/

[One-way ANOVA in SPSS Statistics](https://statistics.laerd.com/spss-tutorials/one-way-anova-using-spss-statistics.php)

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.f_oneway.html

### Data Set - ensure the assumptions underlying ANOVA are met

p-values??

Assumption #1: The dependent variable should be measured at the interval or ratio level (i.e., it is continuous). Examples of variables that meet this criterion include revision time (measured in hours), intelligence (measured using IQ score), exam performance (measured from 0 to 100), and weight (measured in kg).

Assumption #2: The independent variable should consist of two or more categorical, independent groups. A one-way ANOVA is typically used with three or more groups; it also works with just two groups, although an independent-samples t-test is more common in that case. Example independent variables that meet this criterion include ethnicity (e.g., 3 groups: Caucasian, African American and Hispanic), physical activity level (e.g., 4 groups: sedentary, low, moderate and high), and profession (e.g., 5 groups: surgeon, doctor, nurse, dentist, therapist).
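
To illustrate the two-group case mentioned above, the following minimal sketch runs `scipy.stats.ttest_ind` and `scipy.stats.f_oneway` on the same pair of groups. The measurements and group names are made up for illustration only. With the default equal-variance t-test, the F statistic should equal the square of the t statistic and the two p-values should agree.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for two independent groups (made-up values).
group_a = np.array([4.1, 5.0, 4.7, 5.3, 4.9, 5.1])
group_b = np.array([5.8, 6.1, 5.5, 6.4, 5.9, 6.0])

# Independent-samples t-test (equal variances assumed, which is the default).
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# One-way ANOVA on the same two groups.
f_stat, f_p = stats.f_oneway(group_a, group_b)

print(f"t-test: t = {t_stat:.4f}, t^2 = {t_stat**2:.4f}, p = {t_p:.6f}")
print(f"ANOVA:  F = {f_stat:.4f}, p = {f_p:.6f}")
```

This is why the t-test is the usual choice for two groups, while ANOVA is needed once there are three or more.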

Assumption #3: Observations should be independent, which means that there is no relationship between the observations in each group or between the groups themselves. For example, there must be different participants in each group, with no participant being in more than one group. This is a study design issue rather than something you can test for, but it is an important assumption of the one-way ANOVA. If the study fails this assumption, use another statistical test instead of the one-way ANOVA (e.g., one suited to a repeated measures design).

Assumption #4: There should be no significant outliers. Outliers are single data points that do not follow the usual pattern (e.g., in a study of 100 students' IQ scores where the mean is 108 with only small variation between students, one student scores 156, which is very unusual and may even put her in the top 1% of IQ scores globally). Outliers can distort the one-way ANOVA and reduce the validity of the results, so check each group for them before running the test and decide how to handle any that turn up.
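
One way to screen each group for outliers in Python is the 1.5 × IQR boxplot rule. This rule is a common convention and an assumption here, not something the source prescribes. The scores below are made up, with 156 planted in group `A` to echo the IQ example above, and `iqr_outliers` is a hypothetical helper name.

```python
import numpy as np

# Hypothetical scores per group (made-up values); 156 is a planted outlier.
groups = {
    "A": np.array([104, 108, 110, 107, 109, 106, 156]),
    "B": np.array([101, 103, 99, 105, 102, 100, 104]),
}

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return values[(values < lower) | (values > upper)]

# Screen every group separately, since ANOVA compares group distributions.
outliers = {name: iqr_outliers(vals) for name, vals in groups.items()}
for name, found in outliers.items():
    print(f"group {name}: outliers = {found.tolist()}")
```

A flagged value is only a candidate: whether to keep, correct, or drop it depends on how the data were collected.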

Assumption #5: The dependent variable should be approximately normally distributed for each category of the independent variable. Only approximate normality is required because the one-way ANOVA is quite "robust" to violations of normality, meaning the assumption can be violated a little and the test will still give valid results. You can test for normality using the Shapiro-Wilk test, which is available in scipy.stats as `shapiro`.
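
A minimal sketch of the Shapiro-Wilk check with `scipy.stats.shapiro`, applied to each group separately. The three groups are synthetic draws from a seeded random generator, so the group names, means, and sample sizes are assumptions for illustration, and the 0.05 threshold is a conventional choice rather than a rule from the source.

```python
import numpy as np
from scipy import stats

# Synthetic data: three groups drawn from normal distributions (illustration only).
rng = np.random.default_rng(0)
groups = {
    "low": rng.normal(loc=50, scale=5, size=30),
    "moderate": rng.normal(loc=55, scale=5, size=30),
    "high": rng.normal(loc=60, scale=5, size=30),
}

alpha = 0.05  # conventional significance level (an assumption)
shapiro_results = {}
for name, values in groups.items():
    # Null hypothesis: the sample was drawn from a normal distribution.
    stat, p = stats.shapiro(values)
    shapiro_results[name] = (stat, p)
    verdict = "no evidence against normality" if p > alpha else "normality is doubtful"
    print(f"{name}: W = {stat:.4f}, p = {p:.4f} ({verdict})")
```

A p-value above the threshold does not prove normality; it only means the test found no evidence against it.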

Assumption #6: There needs to be homogeneity of variances. You can test this assumption using Levene's test for homogeneity of variances, which is available in scipy.stats as `levene`. If the data fails this assumption, carry out a Welch ANOVA instead of a one-way ANOVA, and use a different post hoc test (e.g., a Games-Howell test instead of a Tukey post hoc test).
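
A minimal sketch of Levene's test with `scipy.stats.levene`. The samples are synthetic and seeded, so their means, spreads, and sizes are assumptions for illustration. `center="median"` is the function's default and is spelled out here only for clarity.

```python
import numpy as np
from scipy import stats

# Synthetic data: three groups with the same spread (illustration only).
rng = np.random.default_rng(1)
samples = [
    rng.normal(loc=50, scale=5, size=30),
    rng.normal(loc=55, scale=5, size=30),
    rng.normal(loc=60, scale=5, size=30),
]

# Null hypothesis: all groups have equal variances.
levene_stat, levene_p = stats.levene(*samples, center="median")

alpha = 0.05  # conventional significance level (an assumption)
if levene_p > alpha:
    print(f"W = {levene_stat:.4f}, p = {levene_p:.4f}: no evidence of unequal variances")
else:
    print(f"W = {levene_stat:.4f}, p = {levene_p:.4f}: variances appear to differ")
```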

### Results - perform and display the results of your ANOVA using scipy.stats
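
A minimal sketch of running a one-way ANOVA with `scipy.stats.f_oneway` and displaying the result. The measurements and group names are made up and stand in for a real data set that has already passed the assumption checks. The F statistic is also recomputed by hand from the between-group and within-group sums of squares, to show what the function reports.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for three independent groups (made-up values).
groups = {
    "control": np.array([4.2, 4.8, 5.1, 4.6, 4.9]),
    "treatment_1": np.array([5.4, 5.9, 5.6, 6.1, 5.7]),
    "treatment_2": np.array([4.9, 5.2, 5.0, 5.5, 5.3]),
}

# One-way ANOVA; null hypothesis: all group means are equal.
f_stat, p_value = stats.f_oneway(*groups.values())

# Recompute F by hand: mean square between groups / mean square within groups.
all_values = np.concatenate(list(groups.values()))
grand_mean = all_values.mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups.values())
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups.values())
df_between = len(groups) - 1
df_within = len(all_values) - len(groups)
f_manual = (ss_between / df_between) / (ss_within / df_within)

print(f"f_oneway: F({df_between}, {df_within}) = {f_stat:.4f}, p = {p_value:.6f}")
print(f"by hand:  F = {f_manual:.4f}")
```

A small p-value indicates that at least one group mean differs from the others; it does not say which one, which is what a post hoc test is for.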

![Tests](https://www.reneshbedre.com/assets/posts/anova/main.webp)

https://www.reneshbedre.com/blog/anova.html

https://www.pythonfordatascience.org/anova-python/

https://www.analyticsvidhya.com/blog/2020/06/introduction-anova-statistics-data-science-covid-python/

https://www.marsja.se/four-ways-to-conduct-one-way-anovas-using-python/

https://analyticsindiamag.com/a-complete-python-guide-to-anova/ - missing data images

https://analyticsindiamag.com/maximum-likelihood-estimation-python-guide/ - Maximum Likelihood Estimation 

***
## End