Choose standardizer for effect size in paired t-test #156

lindeloev · 2018-04-29T21:43:48Z

If you run a paired-samples t-test on pretest and posttest scores, you would use the pretest SD as the standardizer, not the SD of the pairwise differences that is currently used. This is because the pretest SD represents the population variability relative to which you want to "move" the subjects.

So I suggest adding an option to choose the standardizer [var2-var1, var1, var2], both for the frequentist t-test and the Bayesian one. The ES plays a central role in the Bayesian t-test since it is currently used for setting priors, etc. The ability to use the non-standardized effect size for priors and plots would also be wonderful, but that's another issue!

EJWagenmakers · 2018-04-30T22:35:15Z

Do you have a reference perhaps? If the test is on the score difference, I am not sure how we can define the test-relevant parameter as one that involves only the pretest...
E.J.

lindeloev · 2018-04-30T22:58:38Z

Cumming (2013): https://link.springer.com/article/10.3758%2Fs13428-013-0392-4, section "choice of standardizer". Wolfgang Viechtbauer (metafor developer) has also raised it here and there, e.g.: https://stats.stackexchange.com/a/256205/17459.

The pretest would be just one of the variables, so if you could select one of [var2-var1, var1, var2], that could work. It would complicate the widget a bit, though.

EJWagenmakers · 2018-04-30T23:17:23Z

Thanks; I am afraid that, unless I am missing something, I disagree with Cumming here. We want to test the difference, that is, the treatment effect. Consequently, the effect size we want to learn about also concerns the difference. If every participant shows similar improvement, SDdiff is small and effect size is therefore large; this is how it should be. On the other hand, if you use SD1 then effect size for the treatment effect is determined by pretest homogeneity. But that says nothing about the effect of the treatment.
E.J.

luketudge · 2018-05-01T06:48:36Z

I think Cumming's point was that using SDdiff as standardizer is indeed appropriate for inference about the difference (for which we want to know the sampling distribution of the difference rather than of the mean in any one group), but that when viewed purely descriptively it can be misleading. Is that right?

If the treatment effect is extremely consistent across all participants but tiny relative to the pre-treatment variance, then the effect will appear very 'large' with SDdiff as standardizer, whereas most people would intuitively think of it as only a 'small effect'. Using SD1 as standardizer gives a more intuitive descriptive effect size because it tells us what the average change is in terms of the initial differences among people (i.e. does the treatment tend to 'move someone up' within the pre-treatment population a lot or only a little?).

So if the choice is being offered for descriptive results I would also favour having both SDdiff and SD1 as options for standardizer. It might even be instructive for people to see when the two are very different (e.g. 'small but consistent effect' or 'large but variable effect').
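To make the 'small but consistent effect' case concrete, here is a minimal sketch with made-up scores (not data from any study). The function name `paired_effect_sizes` and the dictionary keys are hypothetical, and this is not JASP code; it only divides the same mean change by two different standardizers.

```python
from statistics import mean, stdev

def paired_effect_sizes(pre, post):
    """Return the mean change standardized by SDdiff and by SD1 (pretest SD)."""
    diffs = [b - a for a, b in zip(pre, post)]
    m_diff = mean(diffs)
    return {
        "d_diff": m_diff / stdev(diffs),  # standardizer: SD of pairwise differences
        "d_pre": m_diff / stdev(pre),     # standardizer: SD of pretest scores
    }

# Made-up illustration: everyone improves by about 1 point,
# while pretest scores differ by tens of points between people.
pre = [80, 90, 100, 110, 120]
post = [81.0, 90.8, 101.2, 111.0, 121.0]

es = paired_effect_sizes(pre, post)
print(es)
```

With these numbers the SDdiff-standardized value is roughly 7, while the SD1-standardized value is roughly 0.06: the same one-point change looks huge or negligible depending on the standardizer.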

Or maybe that was obvious already and this discussion is only about inference, I'm not sure.

lindeloev · 2018-05-01T10:13:52Z

Exactly as Luke explained, for treatments, you often want to know how much you "moved" the group relative to the population from which they were sampled at baseline. This also allows for conversions to Number Needed to Treat etc. It's not like it's difficult to do by hand :-) But I've seen several publications messing them up, reducing comparability between studies. So it would be nice if JASP could do both. Best, Jonas

fplatz · 2018-05-04T08:24:27Z

This discussion has a long tradition and it seems to me that there is no "general" approach. However, it might be interesting to have several options for the ES depending on your research question (see also Morris & DeShon, 2002).

Best, Friedrich

Morris, S. B., & DeShon, R. P. (2002). Combining effect size estimates in meta-analysis with repeated measures and independent-groups design. Psychological Methods, 7(1), 105-125.

EJWagenmakers · 2018-05-04T08:42:26Z

OK, I can see how the SD1 approach is informative, descriptively. For inference I would still want to stick to SDdiff though. But it would be possible to add the SD1 method somewhere in the GUI. Any suggestions?
E.J.

lindeloev · 2018-05-04T12:22:06Z

To me, it would be most intuitive to put it under "effect size" in the paired t-test widget:

Standardizer:

SD of difference [default!]
SD of column 1
SD of column 2

In the Bayesian widget, it would just be part of the basic widget since everything is standardized. Further down the line, it would be great to have the choice between reporting effects in standardized (current) or original units. Then the choice of standardizer could be subsumed under the former. Just mentioning this since other issues point to a wish for more options concerning effect sizes (Hedges' g: #2270 and #2094, corrections for correlations: #1576).

@EJWagenmakers, If you had a particular reason in mind why you would not provide a CI when using SD1 as standardizer, I would be interested to learn!

EJWagenmakers · 2018-05-07T22:14:54Z

Well, I guess I am willing to provide a confidence interval on effect size when SD1 is used as a standardizer (or a credible interval, possibly with a vague prior). But the test just seems to be on the treatment effect, and therefore involves SDdiff. And this produces the problem. If we just let users define effect size by means of SD1, and then conduct a test for the treatment effect using SD1, then this does not strike me as meaningful. So the challenge is to produce a point estimate and CI for the SD1 case, but without using it to do the test. This means it ought to be presented in the descriptives table, for instance. I'll discuss possible ways to do this with the team.
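One possible way to get a descriptive point estimate and interval for the SD1 case, kept entirely separate from the test, is a percentile bootstrap over participants. This is only a sketch under that assumption: the bootstrap is a swapped-in technique, not something proposed in this thread or implemented in JASP, and the names `d_pre` and `bootstrap_ci` as well as the data are made up for illustration.

```python
import random
from statistics import mean, stdev

def d_pre(pre, post):
    """Mean paired change divided by the pretest SD (SD1)."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / stdev(pre)

def bootstrap_ci(pre, post, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap CI, resampling participants (pairs) with replacement."""
    rng = random.Random(seed)
    pairs = list(zip(pre, post))
    estimates = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]
        s_pre, s_post = zip(*sample)
        if len(set(s_pre)) > 1:  # skip resamples whose pretest SD would be zero
            estimates.append(d_pre(s_pre, s_post))
    estimates.sort()
    lo_idx = int((1 - level) / 2 * len(estimates))
    hi_idx = int((1 + level) / 2 * len(estimates)) - 1
    return estimates[lo_idx], estimates[hi_idx]

# Made-up illustration data
pre = [80, 90, 100, 110, 120, 95, 105, 85]
post = [85, 97, 103, 116, 124, 103, 110, 91]

point = d_pre(pre, post)
lo, hi = bootstrap_ci(pre, post)
print(f"d (SD1 standardizer) = {point:.2f}, 95% bootstrap CI [{lo:.2f}, {hi:.2f}]")
```

The paired test itself would still be computed from SDdiff; the values above would only be reported alongside the descriptives.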

tomtomme · 2024-02-18T16:41:13Z

Still valid with 0.19 beta. The challenge: "Produce a point estimate and CI for the SD1 case [pre-test], but without using it to do the test. This means it ought to be presented in the descriptives table."

JohnnyDoorn self-assigned this Oct 22, 2018

JorisGoosen transferred this issue from jasp-stats/jasp-desktop Nov 14, 2018

koenderks added the Module: jaspTTests label Jun 11, 2021

juliuspfadt added the Feature Request label Mar 21, 2023

tomtomme mentioned this issue Dec 28, 2023

Enable posterior distributions with raw units. #437

Closed
