ENH: stats: Add new ANOVA functions (WIP) #13783
Please let me know what you do to fix
Also, when you are ready for this to start being reviewed, can we split this up into smaller PRs (ideally one per function)?
high=means[i] + delta)))
return cis

def deltas(self, confidence_level=0.95):
I wouldn't add those.
They will be misleading. Users should call tukey-hsd instead.
oneway looks fine, but anova users will look at pairwise comparisons and not think about multiple testing problems. I never looked at two-way anova. It sounds too painful.
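To illustrate the multiple testing concern raised above, here is a minimal sketch of how the chance of at least one false positive grows with the number of unadjusted pairwise comparisons. It assumes the tests are independent, which pairwise comparisons sharing groups are not, so treat the numbers as a rough guide only. The helper name `familywise_error_rate` is hypothetical and not part of this PR or of SciPy.

```python
from math import comb


def familywise_error_rate(n_groups, alpha=0.05):
    """Return (number of pairwise tests, approximate family-wise error rate).

    Assumption: the pairwise tests are treated as independent, which is a
    simplification; real pairwise comparisons share data between tests.
    """
    n_pairs = comb(n_groups, 2)          # every unordered pair of groups
    fwer = 1 - (1 - alpha) ** n_pairs    # P(at least one false positive)
    return n_pairs, fwer


for k in (3, 5, 10):
    n_pairs, fwer = familywise_error_rate(k)
    print(f"{k} groups: {n_pairs} pairwise tests, FWER ~ {fwer:.3f}")
```

Procedures such as Tukey's HSD are designed to control this family-wise rate, which is the reason for pointing users there rather than exposing unadjusted per-pair intervals.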
Thanks @josef-pkt. Obviously this PR has stagnated for too long. I've been tempted to close this, and instead work with
I would support moving this work to statsmodels. (If we do, we should also update the roadmap.)
I'm not really interested in two-way anova since we support general anova hypothesis testing through OLS; similarly, MANOVA is only a special case of the multivariate tests in a multivariate linear model. I had added a lot of support for oneway anovas, but I wouldn't want to do the same for two-way as a standalone function, unless there are some extras for two-way that I'm not (yet) aware of.
Two-way still fits in scipy.stats. It's requested every once in a while. The main reason for me to add standalone hypothesis tests that can also be done with models is better small sample statistics.
A related request is within and between effects in repeated measures anova. statsmodels has univariate repeated measures anova but it's restrictive, either because of the theory or because of our understanding. I never understood much in this area. (The literature works a lot with sum of squares, but I never really understood or found the theoretical background behind it. Some statisticians argue to drop anova completely and just use the corresponding "modern" models directly.)
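For readers unfamiliar with the sum-of-squares formulation mentioned above, here is a minimal sketch of the textbook one-way decomposition into between-group and within-group sums of squares, using only the standard library. The function name `oneway_sums_of_squares` is hypothetical; it is not the code from this PR and it omits the p-value and any input validation.

```python
def oneway_sums_of_squares(*groups):
    """Return (ss_between, ss_within, f_statistic) for a one-way layout.

    Textbook decomposition: total variation = between-group + within-group.
    Assumes each group is a non-empty sequence of numbers and that the
    within-group sum of squares is nonzero.
    """
    n_total = sum(len(g) for g in groups)
    n_groups = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean, weighted by size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = n_groups - 1
    df_within = n_total - n_groups
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_statistic


ss_b, ss_w, f = oneway_sums_of_squares([1, 2, 3], [2, 3, 4], [3, 4, 5])
print(ss_b, ss_w, f)
```

The same F statistic falls out of fitting a linear model with one categorical factor, which is the model-based route through OLS described in the comment.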
I closed the pull request. For now, the code resides in a separate repo, https://github.com/WarrenWeckesser/yanova
Expand the capabilities for analysis of variance in SciPy.
This is a very rough draft; issues include a rough (i.e. inconsistent and incomplete) API, no parameter validation tests (but lots of basic tests), incomplete docstrings, and more. I want to get what I have so far into a PR and run it through CI and the docs build.