Power documentation and Example #4091

mikekaminsky · 2017-10-29T19:48:45Z

Improvements to power documentation. Comments and suggestions welcome!

Working on suggestions noted in #4085

Future Work:

Examples for plot_power() function
Potentially remove references to user-facing power() function and point everyone to the solve_power() versions
Figure out how to document the relationship between the classes and the helper functions

mikekaminsky · 2017-10-29T19:54:23Z

I could use some help figuring out how the documentation system is supposed to work. A few issues:

I'm getting an error in my use of smp.prop_ind_solve_power in the document power.rst and I'm not sure why
Is there way to use make html on only one file for faster iteration?
How do you control what shows up in the documentation pages (e.g., how do I make the signature of the function show up as referenced in this comment: DOC: Confusion in power functions and documentation #4085 (comment))?

coveralls · 2017-10-29T21:00:27Z

Coverage increased (+0.0002%) to 81.883% when pulling be986f1 on mikekaminsky:power_documentation into 0c3f6e9 on statsmodels:master.

coveralls · 2017-10-29T21:26:19Z

Coverage increased (+0.0002%) to 82.863% when pulling 5b72ff8 on mikekaminsky:power_documentation into 40e4f56 on statsmodels:master.

codecov-io · 2017-10-29T22:06:28Z

Codecov Report

Merging #4091 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #4091      +/-   ##
==========================================
+ Coverage   80.26%   80.26%   +<.01%     
==========================================
  Files         564      564              
  Lines       85555    85556       +1     
  Branches     9675     9675              
==========================================
+ Hits        68669    68670       +1     
- Misses      14656    14659       +3     
+ Partials     2230     2227       -3

Impacted Files	Coverage Δ
statsmodels/stats/api.py	`100% <ø> (ø)`	⬆️
statsmodels/stats/power.py	`80.16% <100%> (+0.08%)`	⬆️
statsmodels/iolib/summary.py	`54.16% <0%> (ø)`	⬆️
statsmodels/tsa/statespace/mlemodel.py	`85.61% <0%> (ø)`	⬆️
statsmodels/stats/libqsturng/make_tbls.py	`0% <0%> (ø)`	⬆️
statsmodels/stats/descriptivestats.py	`24.13% <0%> (ø)`	⬆️
statsmodels/sandbox/regression/treewalkerclass.py	`0% <0%> (ø)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 40e4f56...5b72ff8. Read the comment docs.

josef-pkt · 2017-10-30T18:23:24Z

Is there way to use make html on only one file for faster iteration?

sphinx only rebuild the parts that have been changed, so it should be faster on repeated runs. AFAIR, I haven't done it in a while

How do you control what shows up in the documentation pages

That is done automatically by sphinx, I don't know if we can somehow force the missing signature.

I don't see a prop_ind_solve_power function, I need to check. I guess this should be using the Normal Power function.

mikekaminsky · 2017-10-31T11:59:39Z

@josef-pkt I added the prop_ind_solve_power function here

What do you think of the rest of the content?

josef-pkt · 2017-10-31T12:12:32Z

I will try to go over it later today.

We will most likely have to rename prop_ind_solve_power to prop_ztest_solve_power, because we need a different version also #1197

josef-pkt

Sorry this got lost,

I just found two pending review comments, but it looks like I didn't finish with the review.
I'm postponing this until after 0.9rc

josef-pkt · 2017-11-10T17:17:11Z

docs/source/power.rst

+
+Similar to the difference in proportions, we can use the :code:`tt_ind_solve_power` function to analyze the power of an experiment where we'll use a t-test (e.g., if we want to compare the difference in means between two groups in our study).
+
+Imagine we're working on a study to compare the spending habits between two groups of students where the outcome of interest is the average amount spent on books.


"we are" I prefer not to use the shortened forms in written text

josef-pkt · 2017-11-10T17:17:57Z

docs/source/power.rst

+
+Imagine we're working on a study to compare the spending habits between two groups of students where the outcome of interest is the average amount spent on books.
+
+In this example, let's assume that we have are conducting a study where we know what our sample-size will be in advance, and we want to calculate the smallest effect-size we'll be able to detect with 80% power.


"we have are conducting" ?

mikekaminsky · 2018-04-27T13:42:33Z

I also forgot about this. I'll take another pass over the weekend.

josef-pkt · 2018-04-27T14:17:47Z

I think this would fit now better in a examples/notebook, which makes it also easier for users to run the example interactively.

Those notebooks are a mixed bag, some with extensive explanations and some just with code examples.

mikekaminsky · 2018-04-27T14:23:45Z

Nice. I didn't even know about those. I'll at least convert what I have.

…

On Fri, Apr 27, 2018 at 10:17 AM, Josef Perktold ***@***.***> wrote: I think this would fit now better in a examples/notebook, which makes it also easier for users to run the example interactively. Those notebooks are a mixed bag, some with extensive explanations and some just with code examples. — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#4091 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AFJQYc6nJ_s_ZmFJyTRZfzWm5txB5DZOks5tsyiQgaJpZM4QKcp2> .

Add documentation for power module

mikekaminsky · 2018-04-30T02:58:23Z

@josef-pkt I moved what I had into a notebook in examples/.

I still think we could improve the docs for this functionality (e.g., for tt_ind_solve_power) which is the first statsmodels result when you search python power analysis .

In particular, something on that page that points either to the example notebook or a more info on the suite of power tests would be helpful for end users.

LMK if I can help with that or if you just want to leave it.

josef-pkt · 2018-04-30T03:13:22Z

@mikekaminsky
The docstring can have an Examples section, which is very useful for users to quickly see how to use it.
The main problem is that it is a lot of work to add those to many functions and methods.

In our old sourceforge doc pages it was possible to directly link to the notebooks. I don't think it's currently possible in the new github site, but those links should (eventually) be available again.
(I didn't really pay a lot of attention to how the current doc setup works.)

mikekaminsky · 2018-04-30T13:31:54Z

@josef-pkt works for me. I think this is an improvement that's good to go out as-is unless you have additional comments / suggestions.

josef-pkt · 2018-04-30T13:46:48Z

I put milestone 0.9 back.
I think we can still add it to 0.9, except for the prop_ind_solve_power function. We could define it in the notebook itself.

I'm not sure yet whether prop_ind_solve_power doesn't need to have different options, i.e. add a method argument or use different names for different version of normal based prop_ind tests
The current version is not what most users expect coming from other packages, unless they come from the effect size references.

mikekaminsky · 2018-04-30T14:07:08Z

I think I was trying to both follow your patterns and show how to generate what you get in R

R> power.prop.test(p1=0.10, p2=0.1*(1+.05), power=.80)

     Two-sample comparison of proportions power calculation 

              n = 57763
             p1 = 0.1
             p2 = 0.105
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

NOTE: n is number in *each* group

I can make the interface the same as the R one so that you just give two proportions (and the ES gets calculated under the hood) -- LMK what you think would be most useful to users

bashtage · 2019-05-10T08:03:40Z

The notebook is still incomplete, and is missing the second section
@josef-pkt comment about ading the alias needs to be addressed.

mikekaminsky · 2019-05-10T12:56:53Z

@bashtage I don't really have time to pick this back up right now. Happy to simply close the PR.

EmanuelGoncalves · 2021-07-13T16:32:43Z

Thanks for this great functionality.

I found the documentation of the class and methods to be a bit confusing to me particularly about the number of observations, i.e. if it was about the whole set or for each group.

Meanwhile I managed to clarify this through a similar implementation in R.

As I couldn't find a reply to this question I created a stackoverflow Q&A: https://stackoverflow.com/questions/68365921/statsmodels-power-analysis-number-of-observations/68366180#68366180

Hope this is of any use.

Thanks,

josef-pkt · 2021-07-13T17:11:17Z

@EmanuelGoncalves
saying it's nobs "per group" is a bit tricky because group sizes are not the same if ratio is not 1 in two sample case.

So, the general pattern is that nobs1, i.e. the nobs of the first sample, is the main sample size variable, the sample size of the second sample, nobs2 and total nobs, then depend on the ratio.

EmanuelGoncalves · 2021-07-13T19:53:49Z

That makes sense, thank you.

But isn't it still ambiguous in the plot_power function of the TTestIndPower class, since the argument is nobs?

josef-pkt · 2021-07-13T20:05:19Z

The name nobs is ambiguous or "wrong", but it's inherited and we would need to add a method to subclasses to adjust the signature of the method.

I guess we should do this, but when I'm writing new things, I'm more focused on getting the code to work instead of thinking about every part of the api. That's were feedback becomes helpful.

A bit similar: some functions work for both one and two sample cases, and we cannot have nobs and nobs1 at the same time. even though that would be appropriate.

mikekaminsky mentioned this pull request Oct 29, 2017

DOC: Confusion in power functions and documentation #4085

Open

josef-pkt added type-enh comp-docs comp-stats labels Apr 27, 2018

josef-pkt added this to the 0.10 milestone Apr 27, 2018

josef-pkt reviewed Apr 27, 2018

View reviewed changes

josef-pkt added the prio-elev label Apr 27, 2018

mikekaminsky force-pushed the power_documentation branch from be986f1 to 3d9fdad Compare April 30, 2018 00:43

mikekaminsky added 3 commits April 29, 2018 22:48

Exposes a new helper function prop_ind_solve_power

dddb79a

Add documentation for power module

Independent ttest example

e6ebbe6

Move documentation into an example notebook

411df78

mikekaminsky force-pushed the power_documentation branch from ac1fa86 to 411df78 Compare April 30, 2018 02:51

mikekaminsky changed the title ~~WIP Power documentation~~ Power documentation and Example Apr 30, 2018

josef-pkt modified the milestones: 0.10, 0.9 Apr 30, 2018

Oops remove errors from notebook

5b72ff8

josef-pkt mentioned this pull request Apr 30, 2018

MAINT: release 0.9rc1 to 0.9 final #4573

Closed

bashtage self-assigned this May 2, 2019

bashtage removed their assignment May 10, 2019

bashtage added the Documentation label Jun 8, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Power documentation and Example #4091

Power documentation and Example #4091

mikekaminsky commented Oct 29, 2017 •

edited

mikekaminsky commented Oct 29, 2017 •

edited

coveralls commented Oct 29, 2017 •

edited

coveralls commented Oct 29, 2017 •

edited

codecov-io commented Oct 29, 2017 •

edited

josef-pkt commented Oct 30, 2017 •

edited

mikekaminsky commented Oct 31, 2017

josef-pkt commented Oct 31, 2017

josef-pkt left a comment

josef-pkt Nov 10, 2017

josef-pkt Nov 10, 2017

mikekaminsky commented Apr 27, 2018

josef-pkt commented Apr 27, 2018

mikekaminsky commented Apr 27, 2018 via email

mikekaminsky commented Apr 30, 2018

josef-pkt commented Apr 30, 2018

mikekaminsky commented Apr 30, 2018

josef-pkt commented Apr 30, 2018

mikekaminsky commented Apr 30, 2018

bashtage commented May 10, 2019

mikekaminsky commented May 10, 2019

EmanuelGoncalves commented Jul 13, 2021

josef-pkt commented Jul 13, 2021

EmanuelGoncalves commented Jul 13, 2021

josef-pkt commented Jul 13, 2021


		Similar to the difference in proportions, we can use the :code:`tt_ind_solve_power` function to analyze the power of an experiment where we'll use a t-test (e.g., if we want to compare the difference in means between two groups in our study).

		Imagine we're working on a study to compare the spending habits between two groups of students where the outcome of interest is the average amount spent on books.


		Imagine we're working on a study to compare the spending habits between two groups of students where the outcome of interest is the average amount spent on books.

		In this example, let's assume that we have are conducting a study where we know what our sample-size will be in advance, and we want to calculate the smallest effect-size we'll be able to detect with 80% power.

Power documentation and Example #4091

Are you sure you want to change the base?

Power documentation and Example #4091

Conversation

mikekaminsky commented Oct 29, 2017 • edited

mikekaminsky commented Oct 29, 2017 • edited

coveralls commented Oct 29, 2017 • edited

coveralls commented Oct 29, 2017 • edited

codecov-io commented Oct 29, 2017 • edited

Codecov Report

josef-pkt commented Oct 30, 2017 • edited

mikekaminsky commented Oct 31, 2017

josef-pkt commented Oct 31, 2017

josef-pkt left a comment

Choose a reason for hiding this comment

josef-pkt Nov 10, 2017

Choose a reason for hiding this comment

josef-pkt Nov 10, 2017

Choose a reason for hiding this comment

mikekaminsky commented Apr 27, 2018

josef-pkt commented Apr 27, 2018

mikekaminsky commented Apr 27, 2018 via email

mikekaminsky commented Apr 30, 2018

josef-pkt commented Apr 30, 2018

mikekaminsky commented Apr 30, 2018

josef-pkt commented Apr 30, 2018

mikekaminsky commented Apr 30, 2018

bashtage commented May 10, 2019

mikekaminsky commented May 10, 2019

EmanuelGoncalves commented Jul 13, 2021

josef-pkt commented Jul 13, 2021

EmanuelGoncalves commented Jul 13, 2021

josef-pkt commented Jul 13, 2021

mikekaminsky commented Oct 29, 2017 •

edited

mikekaminsky commented Oct 29, 2017 •

edited

coveralls commented Oct 29, 2017 •

edited

coveralls commented Oct 29, 2017 •

edited

codecov-io commented Oct 29, 2017 •

edited

josef-pkt commented Oct 30, 2017 •

edited