Choose standardizer for effect size in paired t-test #156
Comments
Do you have a reference perhaps? If the test is on the score difference, I am not sure how we can define the test-relevant parameter as one that involves only the pretest... |
Cumming (2013): https://link.springer.com/article/10.3758%2Fs13428-013-0392-4, section "choice of standardizer". Wolfgang Viechtbauer (metafor developer) has also raised it here and there, e.g.: https://stats.stackexchange.com/a/256205/17459. The pretest would be just one of the variables, so if you could select one of [var2-var1, var1, var2], that could work. It would complicate the widget a bit, though. |
Thanks; I am afraid that, unless I am missing something, I disagree with Cumming here. We want to test the difference, that is, the treatment effect. Consequently, the effect size we want to learn about also concerns the difference. If every participant shows similar improvement, SDdiff is small and effect size is therefore large; this is how it should be. On the other hand, if you use SD1 then effect size for the treatment effect is determined by pretest homogeneity. But that says nothing about the effect of the treatment. |
I think Cumming's point was that using SDdiff as standardizer is indeed appropriate for inference about the difference (for which we want to know the sampling distribution of the difference rather than of the mean in any one group), but that when viewed purely descriptively it can be misleading. Is that right? If the treatment effect is extremely consistent across all participants but tiny relative to the pre-treatment variance then the effect will appear very 'large' with SDdiff as standardizer, where most people would nonetheless think of it intuitively as only a 'small effect'. Using SD1 as standardizer gives a more intuitive descriptive effect size because it tells us what the average change is in terms of the initial differences among people (i.e. does the treatment tend to 'move someone up' within the pre-treatment population a lot or only a little?) So if the choice is being offered for descriptive results I would also favour having both SDdiff and SD1 as options for standardizer. It might even be instructive for people to see when the two are very different (e.g. 'small but consistent effect' or 'large but variable effect'). Or maybe that was obvious already and this discussion is only about inference, I'm not sure. |
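To make the "small but consistent effect" case concrete, here is a minimal sketch (not JASP code) that computes both standardized effect sizes for one made-up data set. The function name `paired_effect_sizes`, the key names, and the numbers are all hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev

def paired_effect_sizes(pre, post):
    """Return the mean change standardized two ways.

    d_diff: mean difference / SD of the pairwise differences (SDdiff)
    d_pre:  mean difference / SD of the pretest scores (SD1)
    """
    diffs = [b - a for a, b in zip(pre, post)]
    mean_diff = mean(diffs)
    return {
        "d_diff": mean_diff / stdev(diffs),
        "d_pre": mean_diff / stdev(pre),
    }

# Made-up data: large pretest spread, tiny but very consistent improvement.
pre = [10, 20, 30, 40, 50]
post = [11.0, 21.2, 30.8, 41.1, 50.9]

sizes = paired_effect_sizes(pre, post)
print(sizes)
```

With these numbers the mean change is 1.0, SDdiff is about 0.16 and SD1 is about 15.8, so `d_diff` comes out near 6.3 while `d_pre` is near 0.06: the same data look like a huge or a negligible effect depending on the standardizer.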
Exactly as Luke explained, for treatments, you often want to know how much
you "moved" the group relative to the population from which they were
sampled at baseline. This also allows for conversions to Number Needed to
Treat etc.
It's not like it's difficult to do by hand :-) But I've seen several
publications messing them up, reducing comparability between studies. So it
would be nice if JASP could do both.
Best,
Jonas
Jonas Kristoffer Lindeløv, M.Sc., Ph.D.
Assistant Professor in Cognitive Neuroscience and Neuropsychology
Profile at Aalborg University <http://personprofil.aau.dk/117060>, Scientific
blog <http://lindeloev.net/>
|
This discussion has a long tradition and it seems to me that there is no "general" approach. However, it might be interesting to have several options for the ES depending on your research question (see also Morris & DeShon, 2002).
Best,
Friedrich
Morris, S. B., & DeShon, R. P. (2002). Combining effect size estimates in meta-analysis with repeated measures and independent-groups designs. Psychological Methods, 7(1), 105-125. |
OK, I can see how the SD1 approach is informative, descriptively. For inference I would still want to stick to SDdiff though. But it would be possible to add the SD1 method somewhere in the GUI. Any suggestions? |
To me, it would be most intuitive to put it under "effect size" in the paired t-test widget: Standardizer:
In the Bayesian widget, it would just be part of the basic widget, since everything is standardized. Further down the line, it would be great to have the choice between reporting effects in standardized (current) or original units. Then the choice of standardizer could be subsumed under the former. Just mentioning this since other issues point to a demand for more options concerning effect sizes (Hedges' g: #2270 and #2094, corrections for correlations: #1576). @EJWagenmakers, if you had a particular reason in mind why you would not provide a CI when using SD1 as standardizer, I would be interested to learn! |
Well, I guess I am willing to provide a confidence interval on effect size when SD1 is used as a standardizer (or a credible interval, possibly with a vague prior). But the test is on the treatment effect, and therefore involves SDdiff. And this produces the problem. If we just let users define effect size by means of SD1, and then conduct a test for the treatment effect using SD1, then this does not strike me as meaningful. So the challenge is to produce a point estimate and CI for the SD1 case, but without using it to do the test. This means it ought to be presented in the descriptives table, for instance. I'll discuss possible ways to do this with the team. |
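One possible way to get a point estimate and interval for the SD1 case without touching the test is a percentile bootstrap over participants. This is only a sketch under that assumption: nothing in this thread says JASP would use a bootstrap, and the function names (`d_pre`, `bootstrap_ci_d_pre`), defaults, and data below are hypothetical.

```python
import random
from statistics import mean, stdev

def d_pre(pre, post):
    """Mean change divided by the pretest SD (SD1 as standardizer)."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / stdev(pre)

def bootstrap_ci_d_pre(pre, post, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap interval for d_pre, resampling whole pairs."""
    if len(set(pre)) < 2:
        raise ValueError("pretest scores must vary to be used as standardizer")
    rng = random.Random(seed)
    pairs = list(zip(pre, post))
    estimates = []
    while len(estimates) < n_boot:
        sample = [rng.choice(pairs) for _ in pairs]
        pre_s = [p[0] for p in sample]
        if len(set(pre_s)) < 2:
            continue  # skip degenerate resamples where SD1 would be zero
        estimates.append(d_pre(pre_s, [p[1] for p in sample]))
    estimates.sort()
    alpha = (1 - level) / 2
    lower = estimates[int(alpha * n_boot)]
    upper = estimates[int((1 - alpha) * n_boot) - 1]
    return lower, upper

# Made-up pretest/posttest scores for 12 participants.
pre = [48, 52, 55, 60, 61, 63, 67, 70, 72, 75, 80, 85]
gains = [2, 3, 1, 2, 4, 2, 3, 1, 2, 3, 2, 3]
post = [a + g for a, g in zip(pre, gains)]

point = d_pre(pre, post)
lower, upper = bootstrap_ci_d_pre(pre, post)
print(point, lower, upper)
```

Pairs are resampled together so the pre/post dependence is preserved. The resulting estimate and interval are purely descriptive, which fits the suggestion to show them in the descriptives table rather than next to the test statistic.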
Still valid with 0.19 beta. The challenge: "Produce a point estimate and CI for the SD1 case [pre-test], but without using it to do the test. This means it ought to be presented in the descriptives table." |
If you do a Paired samples t-test on pretest and posttest scores, you would use the pretest SD as standardizer, not the SD of the pairwise differences as is currently used. This is because the pretest SD represents the population variance that you want to "move" the subjects relative to.
So I suggest that you add an option to choose standardizer [var2-var1, var1, var2]. Both for frequentist t-test and the Bayesian one. The ES plays a central role in the Bayesian t-test since it is currently used for setting priors, etc. The ability to use the non-standardized effect size for priors and plots would also be wonderful, but that's another issue!