Total SD, take 2 #133

Open
aaronrudkin opened this issue Jun 12, 2018 · 1 comment

@aaronrudkin
Contributor

In #111, @chadhazlett proposed allowing users to specify a total standard deviation (or variance) for draw_normal_icc. I implemented this in early May. In that implementation, the user supplies an ICC and a total_sd. We generate the ICC variable stochastically, fixing one of the sds at 1 and deriving the other from the ICC; the result is then rescaled at the end so that its standard deviation equals total_sd.

An advantage of this approach, which I think is the one Chad suggested, is that the realized total standard deviation equals total_sd exactly on every draw.
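The rescaling approach described above can be sketched roughly as follows. This is only an illustration, not the code in the package: `draw_icc_rescaled` is a hypothetical helper, and the choice to fix the within-cluster sd at 1 (rather than the between-cluster sd) is an assumption, since the description only says that one of the two is fixed.

```r
# Hypothetical sketch of the "rescale at the end" approach.
# Not the package's actual draw_normal_icc implementation.
draw_icc_rescaled <- function(clusters, ICC, total_sd) {
  stopifnot(ICC >= 0, ICC < 1, total_sd > 0)
  clusters <- as.factor(clusters)

  # Assumption: fix the within-cluster sd at 1 and derive the between-cluster
  # sd from ICC = var_between / (var_between + var_within).
  sd_within <- 1
  sd_between <- sd_within * sqrt(ICC / (1 - ICC))

  # One random mean per cluster, plus individual-level noise.
  cluster_means <- rnorm(nlevels(clusters), mean = 0, sd = sd_between)
  x <- cluster_means[as.integer(clusters)] +
    rnorm(length(clusters), mean = 0, sd = sd_within)

  # Post-hoc rescaling: the sample sd now equals total_sd mechanically.
  x * total_sd / sd(x)
}

set.seed(1)
clusters <- rep(1:20, each = 10)
x <- draw_icc_rescaled(clusters, ICC = 0.3, total_sd = 5)
isTRUE(all.equal(sd(x), 5))  # the sample sd matches total_sd up to floating point
```

Because the last step divides by the realized `sd(x)`, the total sd is exact on every draw, while the realized ICC and the within and between sds still vary from sample to sample.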

A disadvantage of this approach, as Neal mentioned, is that the rescaling may distort the between-group differences. Neal instead proposed using the identity total = within + between (which holds for variances, not standard deviations). Rather than a post-hoc scaling, you would specify any two of the four quantities (ICC, within, between, total) and get the other two. I'd have to work out the math, but this leaves us with two constraints: the ICC and one of within/between determine the other of within/between and the total; the total and one of within/between determine the other of within/between and the ICC. I would have to think a bit about which combinations of arguments we would allow.
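As a rough sketch of that math, assuming the usual definitions total variance = within variance + between variance and ICC = between variance / total variance, any two of the four quantities pin down the other two. The function name `solve_icc_params` and its argument names are hypothetical and only serve the illustration.

```r
# Hypothetical helper: given exactly two of ICC, sd_within, sd_between and
# total_sd, derive the other two. Assumes
#   total_sd^2 = sd_within^2 + sd_between^2   and   ICC = sd_between^2 / total_sd^2.
solve_icc_params <- function(ICC = NA, sd_within = NA,
                             sd_between = NA, total_sd = NA) {
  if (sum(!is.na(c(ICC, sd_within, sd_between, total_sd))) != 2) {
    stop("Supply exactly two of ICC, sd_within, sd_between, total_sd.")
  }
  v_w <- sd_within^2   # within-cluster variance (NA if not supplied)
  v_b <- sd_between^2  # between-cluster variance (NA if not supplied)
  v_t <- total_sd^2    # total variance (NA if not supplied)

  if (is.na(v_w) && is.na(v_b)) {
    # ICC and total supplied
    v_b <- ICC * v_t
    v_w <- v_t - v_b
  } else if (is.na(v_b)) {
    # within supplied, together with either ICC or total
    v_b <- if (!is.na(ICC)) v_w * ICC / (1 - ICC) else v_t - v_w
  } else if (is.na(v_w)) {
    # between supplied, together with either ICC or total
    v_w <- if (!is.na(ICC)) v_b * (1 - ICC) / ICC else v_t - v_b
  }
  # If both within and between were supplied, nothing needs solving.

  ok <- is.finite(v_w) && is.finite(v_b) &&
    v_w >= 0 && v_b >= 0 && (v_w + v_b) > 0
  if (!ok) stop("The supplied values are inconsistent.")

  list(ICC = v_b / (v_w + v_b),
       sd_within = sqrt(v_w),
       sd_between = sqrt(v_b),
       total_sd = sqrt(v_w + v_b))
}

p1 <- solve_icc_params(ICC = 0.2, total_sd = 5)
p2 <- solve_icc_params(sd_within = 3, sd_between = 4)
```

Here `p1` has a between variance of 5 and a within variance of 20, and `p2` has a total sd of 5 and an ICC of 0.64. Under this design the draw itself is never rescaled, so all four quantities are population targets and none is matched exactly in the sample.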

I agree the solution I came up with is imperfect: the other three quantities are targets that the stochastic draw only hits approximately, while total_sd is hit exactly as a mechanical consequence of the scaling.

Opening this issue so that there can be some discussion.

@nfultz
Contributor

nfultz commented Jun 12, 2018

I think rescaling is generally not what people would expect, e.g., people don't usually expect

sd(rnorm(100)) == 1

exactly; there's sampling variability there, and I think that when ICC = 0 we should behave the same as rnorm.
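A small illustration of the point about sampling variability, using only base R. The seed and sample size are arbitrary choices for the example.

```r
set.seed(42)
x <- rnorm(100)

# The population sd is 1, but the sample sd only approximates it.
sd(x)
sd(x) == 1  # not expected to be TRUE for a random draw

# Rescaling forces the sample sd to 1, which is not how rnorm() behaves.
x_rescaled <- x / sd(x)
isTRUE(all.equal(sd(x_rescaled), 1))
```

A draw that is rescaled after the fact therefore has a different sampling distribution from a plain `rnorm` draw, even when ICC = 0.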


3 participants