Effective sample size or total N? #95
Comments
Hi Adam,
The case/control imbalance should be mostly handled by the conversion to the liability scale, which is applied when you supply the sample prevalence (--samp-prev) and the population prevalence (--pop-prev) while running ldsc. See this post <https://groups.google.com/d/msg/ldsc_users/yJT-_qSh_44/MmKKJYsBAwAJ> for details on how the effective N calculation is a component of the observed-to-liability transform.
Cheers,
Raymond
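For readers following along, here is a minimal sketch of the observed-to-liability conversion referred to above. It assumes the standard liability-threshold formula, h2_liab = h2_obs * K^2 (1-K)^2 / (P (1-P) z^2), where K is the population prevalence, P is the sample prevalence, and z is the standard normal density at the liability threshold. The function name is hypothetical and this is an illustration, not ldsc's own code.

```python
from statistics import NormalDist


def observed_to_liability_factor(samp_prev: float, pop_prev: float) -> float:
    """Multiplier that rescales observed-scale h2 to the liability scale.

    samp_prev: proportion of cases in the study sample (P).
    pop_prev:  prevalence of the trait in the population (K).
    """
    K, P = pop_prev, samp_prev
    std_normal = NormalDist()
    # Liability threshold above which an individual is a case.
    threshold = std_normal.inv_cdf(1.0 - K)
    # Height of the standard normal density at that threshold.
    z = std_normal.pdf(threshold)
    return (K * (1.0 - K)) ** 2 / (P * (1.0 - P) * z ** 2)


# Same population prevalence, balanced versus 4:1 control:case sample.
balanced = observed_to_liability_factor(samp_prev=0.5, pop_prev=0.1)
imbalanced = observed_to_liability_factor(samp_prev=0.2, pop_prev=0.1)
print(f"balanced factor:   {balanced:.3f}")
print(f"imbalanced factor: {imbalanced:.3f}")
print(f"ratio:             {imbalanced / balanced:.4f}")
```

The factor grows as the sample becomes more imbalanced, because P(1-P) in the denominator shrinks. Under this formula, that is how the case/control ratio enters the liability-scale estimate when the total N is left in the N column.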
On Dec 1, 2017, at 3:11 PM, Adam X. Maihofer wrote:
Hi,
I am working with a lot of cohort study data, which has a high ratio of controls to cases (e.g. 4:1). LDSC h2 estimates are highly dependent upon the value put in for N, which is calculated as N cases + N controls. I am fairly sure that because of the high number of controls, calculating N this way is causing my h2 to be under-estimated.
To illustrate the extent of the problem, a response to an earlier issue comment states, "For example, if the entries in you[r] N column are half what they should be, then you will over-estimate h2 by a factor of two."
Naturally, then, if the entries in the N column are twice what they should be, you will under-estimate h2 by a factor of two. With cohort data, the N will definitely be overstated, as the excess controls do very little to improve SNP odds ratio estimates or their variances (recall that beyond a 3:1 control-to-case ratio, excess controls add almost nothing to estimate precision).
Should I use effective sample size instead of overall N? I.e., from https://www.nature.com/articles/nprot.2014.071, Neff = 2 / (1/Ncases + 1/Ncontrols). Does the choice of N also bias rg estimates?
Thanks!
Adam
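To make the size of the gap concrete, here is a small sketch of the Neff formula quoted in the question, applied to a hypothetical 4:1 cohort. The function name and the case/control counts are illustrative assumptions, not values from the thread.

```python
def effective_n(n_cases: float, n_controls: float) -> float:
    """Effective sample size as quoted above: 2 / (1/Ncases + 1/Ncontrols)."""
    return 2.0 / (1.0 / n_cases + 1.0 / n_controls)


# Hypothetical cohort with a 4:1 control-to-case ratio.
n_cases, n_controls = 1000, 4000
total_n = n_cases + n_controls
neff = effective_n(n_cases, n_controls)

# The same quantity written in terms of the sample prevalence P:
# Neff = 2 * N * P * (1 - P).
samp_prev = n_cases / total_n
neff_from_prev = 2.0 * total_n * samp_prev * (1.0 - samp_prev)

print(f"total N: {total_n}")
print(f"Neff:    {neff:.0f}")
print(f"Neff via sample prevalence: {neff_from_prev:.0f}")
```

With 1,000 cases and 4,000 controls, this gives an Neff of about 1,600 against a total N of 5,000. The second form shows that Neff is just the total N scaled by 2P(1-P), so the imbalance is fully described by the sample prevalence.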
I see, great, thanks!
Hi Raymond and users,
First of all, many thanks for your excellent program and previous answers. I'm using LDSC with the following command:
~/project/tools/ldsc/ldsc.py
In my input file I'm using the total sample size for case-control studies in the N column. I understand from the previous comments that using the flag --samp-prev will account for the proportion of cases and controls and the effective sample size. My question is:
Many thanks!
Paloma