ENH: stats: Add new ANOVA functions (WIP) #13783

WarrenWeckesser · 2021-04-01T02:40:37Z

Expand the capabilities for analysis of variance in SciPy.

This is a very rough draft; issues include a rough (i.e. inconsistent and incomplete) API, no parameter validation tests (but lots of basic tests), incomplete docstrings, and more. I want to get what I have so far into a PR and run it through CI and the docs build.

mdhaber · 2021-04-01T03:57:58Z

Please let me know what you do to fix test_warning_calls_filters!

mdhaber · 2021-04-05T19:40:38Z

Also, when you are ready for this to start being reviewed, can we split this up into smaller PRs (ideally one per function)?

josef-pkt · 2023-03-10T19:00:52Z

scipy/stats/_anova_oneway.py

+                                                  high=means[i] + delta)))
+        return cis
+
+    def deltas(self, confidence_level=0.95):


I wouldn't add those.
They will be misleading. Users should call tukey-hsd instead.
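
As an illustration of the suggestion above, here is a minimal sketch of how pairwise comparisons could be obtained from `scipy.stats.tukey_hsd` (available in SciPy 1.8 and later) rather than from per-group intervals. The three samples are made-up data for illustration only, and this is not code from the PR.

```python
import numpy as np
from scipy import stats

# Hypothetical data: three groups, invented purely for illustration.
rng = np.random.default_rng(0)
a = rng.normal(loc=10.0, scale=2.0, size=12)
b = rng.normal(loc=10.5, scale=2.0, size=12)
c = rng.normal(loc=14.0, scale=2.0, size=12)

# Omnibus one-way ANOVA: tests only whether all group means are equal.
f_res = stats.f_oneway(a, b, c)
print("f_oneway:", f_res.statistic, f_res.pvalue)

# Tukey's HSD: all pairwise comparisons, with p-values and confidence
# intervals adjusted for the fact that several comparisons are made.
hsd = stats.tukey_hsd(a, b, c)
ci = hsd.confidence_interval(confidence_level=0.95)

# Entry [i, j] of each matrix refers to the comparison of group i with group j.
print("pairwise p-values:\n", hsd.pvalue)
print("CI lower bounds:\n", ci.low)
print("CI upper bounds:\n", ci.high)
```

The result holds one matrix per quantity, so a reader sees every pairwise difference together with its adjusted interval instead of unadjusted per-group deltas.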

josef-pkt · 2023-03-10T19:22:20Z

oneway looks fine, but anova users will look at pairwise comparisons without thinking about multiple testing problems.

I never looked at two-way anova. It sounds too painful.
(Statsmodels supports multiway anova only for regression models)
(I just looked again at two- and three-way MANOVA, and the interpretation of main and interaction effects is tricky. There were two long issues, one on why statsmodels' results differ from SPSS and one on why they differ from R, which each took me several days to figure out. :)

WarrenWeckesser · 2023-03-10T20:00:23Z

Thanks @josef-pkt. Obviously this PR has stagnated for too long.

I've been tempted to close this, and instead work with statsmodels so that it is the place to go for ANOVA and its variants. It would be good to avoid duplication if we can, and statsmodels already has quite a bit more than SciPy's f_oneway. @josef-pkt, what do you think?

tupui · 2023-03-13T09:26:18Z

I would support moving this work to statsmodels. (If we do, we should also update the roadmap.)

josef-pkt · 2023-03-13T12:39:54Z

I'm not really interested in two-way anova since we support general anova hypothesis testing through OLS; similarly, MANOVA is only a special case of multivariate tests in a multivariate linear model.
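
To make the "anova through OLS" point concrete, here is a small sketch showing that the one-way ANOVA F statistic is the same as the F-test comparing an intercept-only least-squares fit with a fit that adds group indicator columns. It uses only NumPy and SciPy rather than the statsmodels API, and the data and variable names are made up for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical unbalanced data for three groups.
rng = np.random.default_rng(1)
groups = [rng.normal(loc=m, scale=1.5, size=n)
          for m, n in [(5.0, 8), (6.0, 10), (7.5, 9)]]

y = np.concatenate(groups)
labels = np.repeat(np.arange(len(groups)), [len(g) for g in groups])

# Design matrix: intercept plus indicator columns for groups 1..k-1
# (group 0 is the reference level).
X = np.column_stack(
    [np.ones_like(y)]
    + [(labels == j).astype(float) for j in range(1, len(groups))]
)

# Full model: least-squares fit with group effects.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
rss_full = np.sum((y - X @ beta) ** 2)

# Restricted model: intercept only, i.e. all group means equal.
rss_null = np.sum((y - y.mean()) ** 2)

# F-test of the restriction that all indicator coefficients are zero.
df_num = X.shape[1] - 1
df_den = len(y) - X.shape[1]
f_ols = ((rss_null - rss_full) / df_num) / (rss_full / df_den)
p_ols = stats.f.sf(f_ols, df_num, df_den)

# Reference: SciPy's classical one-way ANOVA.
f_ref, p_ref = stats.f_oneway(*groups)
print(f_ols, f_ref)
print(p_ols, p_ref)
```

The same nested-model comparison extends to two-way designs by adding indicator columns for the second factor and for the interaction, which is why a regression framework covers these cases without dedicated functions.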

I had added a lot of support for oneway anovas, but I wouldn't want to do the same for two-way as a standalone function, unless there are some extras for two-way that I'm not (yet) aware of.
https://www.statsmodels.org/dev/stats.html#oneway-anova

Two-way still fits in scipy.stats. It's requested every once in a while.

The main reason for me to add standalone hypothesis tests that can also be done with models is better small sample statistics.
e.g. Satterthwaite degrees of freedom for t-test and oneway. Models almost exclusively rely on the generic and asymptotic p-values, which are not very good in many cases for small or very small samples.
e.g. statsmodels/statsmodels#8727
That's a reason to support special design matrices, but my background is not good enough yet to know what to do.
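
As a sketch of the small-sample point for the two-sample case, the following computes Welch's t statistic with the Satterthwaite approximation to the degrees of freedom by hand, checks it against `scipy.stats.ttest_ind(..., equal_var=False)`, and compares the resulting p-value with the one from a normal (asymptotic) reference distribution. The samples are made up for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical small samples with unequal variances and unequal sizes.
rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, size=6)
y = rng.normal(0.5, 3.0, size=9)

# Estimated variance of each sample mean.
vx = x.var(ddof=1) / len(x)
vy = y.var(ddof=1) / len(y)

# Welch's t statistic.
t_stat = (x.mean() - y.mean()) / np.sqrt(vx + vy)

# Satterthwaite approximation to the degrees of freedom.
df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))

# Two-sided p-values: t distribution with Satterthwaite df versus
# the normal approximation that ignores the small sample size.
p_welch = 2 * stats.t.sf(abs(t_stat), df)
p_normal = 2 * stats.norm.sf(abs(t_stat))

# Reference: SciPy's Welch t-test.
ref = stats.ttest_ind(x, y, equal_var=False)
print("df:", df)
print("p (Satterthwaite):", p_welch, "p (normal):", p_normal)
print("scipy:", ref.statistic, ref.pvalue)
```

Because the t distribution has heavier tails than the normal, the normal-based p-value is never larger than the Satterthwaite-based one, and the gap widens as the samples shrink.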

A related request is within and between effects in repeated measures anova. statsmodels has univariate repeated measures anova but it's restrictive, either because of the theory or because of our understanding. I never understood much in this area. (The literature works a lot with sum of squares, but I never really understood or found the theoretical background behind it. Some statisticians argue to drop anova completely and just use the corresponding "modern" models directly.)

WarrenWeckesser · 2023-04-08T04:10:30Z

I closed the pull request. For now, the code resides in a separate repo, https://github.com/WarrenWeckesser/yanova

WarrenWeckesser marked this pull request as draft April 1, 2021 02:40

WarrenWeckesser added the enhancement and scipy.stats labels Apr 1, 2021

ENH: stats: Add new ANOVA functions (WIP)

7732ba5

WarrenWeckesser force-pushed the anova branch from 1303fc5 to 7732ba5 April 28, 2021 18:01

Merge branch 'main' into anova

227a634

mdhaber requested a review from josef-pkt March 10, 2023 18:47

josef-pkt reviewed Mar 10, 2023


WarrenWeckesser closed this Apr 8, 2023
