
Choose standardizer for effect size in paired t-test #156

Open
lindeloev opened this issue Apr 29, 2018 · 10 comments

@lindeloev

If you do a Paired samples t-test on pretest and posttest scores, you would use the pretest SD as standardizer, not the SD of the pairwise differences as is currently used. This is because the pretest SD represents the population variance that you want to "move" the subjects relative to.

So I suggest adding an option to choose the standardizer [var2-var1, var1, var2], for both the frequentist t-test and the Bayesian one. The ES plays a central role in the Bayesian t-test since it is currently used for setting priors, etc. The ability to use the non-standardized effect size for priors and plots would also be wonderful, but that's another issue!
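Concretely, the three options all divide the same mean paired difference by a different SD. A minimal sketch (not JASP code; the function name and return keys are mine, chosen to mirror the option labels above):

```python
import statistics

def paired_effect_sizes(var1, var2):
    """Mean paired difference divided by each candidate standardizer.

    Returns a dict with the three options discussed above:
    the SD of the pairwise differences, the SD of var1 (e.g. pretest),
    and the SD of var2 (e.g. posttest).
    """
    diffs = [b - a for a, b in zip(var1, var2)]
    mean_diff = statistics.mean(diffs)
    return {
        "var2-var1": mean_diff / statistics.stdev(diffs),  # Cohen's d_z
        "var1": mean_diff / statistics.stdev(var1),
        "var2": mean_diff / statistics.stdev(var2),
    }
```

Only the denominator changes between the three variants; the numerator (the mean change) is the same in each case.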

@EJWagenmakers
Collaborator

Do you have a reference perhaps? If the test is on the score difference, I am not sure how we can define the test-relevant parameter as one that involves only the pretest...
E.J.

@lindeloev
Author

Cumming (2013): https://link.springer.com/article/10.3758%2Fs13428-013-0392-4, section "choice of standardizer". Wolfgang Viechtbauer (metafor developer) has also raised it here and there, e.g.: https://stats.stackexchange.com/a/256205/17459.

The pretest would be just one of the variables, so if you could select one of [var2-var1, var1, var2], that could work. It would complicate the widget a bit, though.

@EJWagenmakers
Collaborator

Thanks; I am afraid that, unless I am missing something, I disagree with Cumming here. We want to test the difference, that is, the treatment effect. Consequently, the effect size we want to learn about also concerns the difference. If every participant shows similar improvement, SDdiff is small and effect size is therefore large; this is how it should be. On the other hand, if you use SD1 then effect size for the treatment effect is determined by pretest homogeneity. But that says nothing about the effect of the treatment.
E.J.

@luketudge

I think Cumming's point was that using SDdiff as standardizer is indeed appropriate for inference about the difference (for which we want to know the sampling distribution of the difference rather than of the mean in any one group), but that when viewed purely descriptively it can be misleading. Is that right?

If the treatment effect is extremely consistent across all participants but tiny relative to the pre-treatment variance, then the effect will appear very 'large' with SDdiff as standardizer, even though most people would intuitively think of it as only a 'small effect'. Using SD1 as standardizer gives a more intuitive descriptive effect size because it tells us what the average change is in terms of the initial differences among people (i.e. does the treatment tend to 'move someone up' within the pre-treatment population a lot or only a little?).

So if the choice is being offered for descriptive results I would also favour having both SDdiff and SD1 as options for standardizer. It might even be instructive for people to see when the two are very different (e.g. 'small but consistent effect' or 'large but variable effect').

Or maybe that was obvious already and this discussion is only about inference, I'm not sure.
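The "small but consistent effect" scenario above is easy to reproduce numerically. A sketch with made-up data (not JASP output): everyone gains about one point, but pre-test scores spread over a much wider range.

```python
import statistics

# Hypothetical data: a near-uniform ~1-point gain per participant,
# against a pre-test SD of roughly 15.
pre = [100, 105, 110, 115, 120, 125, 130, 135, 140, 145]
post = [x + 1.0 + 0.1 * (i % 2) for i, x in enumerate(pre)]

diffs = [b - a for a, b in zip(pre, post)]
mean_diff = statistics.mean(diffs)

d_diff = mean_diff / statistics.stdev(diffs)  # SDdiff standardizer
d_1 = mean_diff / statistics.stdev(pre)       # SD1 (pre-test) standardizer

# d_diff comes out very large (the change is highly consistent),
# while d_1 is tiny (the change barely moves anyone within the
# pre-test distribution) -- the divergence described above.
```

The same mean change thus yields an "enormous" effect under one standardizer and a "negligible" one under the other, which is exactly why seeing both could be instructive.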

@lindeloev
Author

lindeloev commented May 1, 2018 via email

@fplatz

fplatz commented May 4, 2018

This discussion has a long tradition and it seems to me that there is no "general" approach. However, it might be interesting to have several options for the ES depending on your research question (see also Morris & DeShon, 2002).

Best, Friedrich

Morris, S. B., & DeShon, R. P. (2002). Combining effect size estimates in meta-analysis with repeated measures and independent-groups designs. Psychological Methods, 7(1), 105-125.

@EJWagenmakers
Collaborator

OK, I can see how the SD1 approach is informative, descriptively. For inference I would still want to stick to SDdiff though. But it would be possible to add the SD1 method somewhere in the GUI. Any suggestions?
E.J.

@lindeloev
Author

To me, it would be most intuitive to put it under "effect size" in the paired t-test widget:

Standardizer:

  • SD of difference [default!]
  • SD of column 1
  • SD of column 2

In the Bayesian widget, it would just be part of the basic widget since everything is standardized. Further down the line, it would be great to have the choice between reporting effects in standardized (current) or original units; the choice of standardizer could then be subsumed under the former. Just mentioning this since other issues point to a desire for more options concerning effect sizes (Hedges' g: #2270 and #2094, corrections for correlations: #1576).

@EJWagenmakers, If you had a particular reason in mind why you would not provide a CI when using SD1 as standardizer, I would be interested to learn!

@EJWagenmakers
Collaborator

Well, I guess I am willing to provide a confidence interval on the effect size when SD1 is used as standardizer (or a credible interval, possibly with a vague prior). But the test itself concerns the treatment effect and therefore involves SDdiff, and this produces the problem: if we let users define effect size by means of SD1 and then conduct the test for the treatment effect using SD1, that does not strike me as meaningful. So the challenge is to produce a point estimate and CI for the SD1 case without using it to do the test. This means it ought to be presented in the descriptives table, for instance. I'll discuss possible ways to do this with the team.

@JohnnyDoorn JohnnyDoorn self-assigned this Oct 22, 2018
@JorisGoosen JorisGoosen transferred this issue from jasp-stats/jasp-desktop Nov 14, 2018
@tomtomme
Member

Still valid with 0.19 beta. The challenge: "Produce a point estimate and CI for the SD1 case [pre-test], but without using it to do the test. This means it ought to be presented in the descriptives table."
