Distortion test #25

Open
YUYU-TING opened this issue Aug 1, 2023 · 1 comment

@YUYU-TING

Dear authors,

The numerator of the empirical p-value is written differently in the article and in the GitHub code. Given that, are the following H0 and Ha correct?
H0: causal effect estimates are not distorted by horizontal pleiotropy
Ha: causal effect estimates are distorted by horizontal pleiotropy

[Screenshot: empirical p-value as written in the paper]
[Screenshot: empirical p-value as written in the code]

In the distortion test, how are the nE − 2nO variants drawn with replacement from the entire set of non-outlier variants?

In the empirical p-value, how is the expected distortion calculated?

Thank you.

Best,
Tina

@CypRiv

CypRiv commented Aug 31, 2023

I noticed this thread while I was about to post similar questions regarding the Distortion test. Apart from the questions that have already been mentioned, I am adding another one related to the calculation of the expected bias.

In the paper, it is stated "The null distribution is generated by substituting nO variants detected as outliers by the MR-PRESSO outlier test with nE−2nO non-outliers, which are drawn with replacement from the entire set of non-outlier variants."

However, I believe that the behavior of the getRandomBias function is different:

  • The code starts by placing the outlier indices at the beginning of the indices array:
    indices <- c(refOutlier, replicate(nrow(data)-length(refOutlier), sample(setdiff(1:nrow(data), refOutlier))[1]))

  • Then it fits the linear regression on the rows selected by the first length(indices) - length(refOutlier) entries of indices:
    data = data[indices[1:(length(indices) - length(refOutlier))], ]

  • This implies that the indices corresponding to the outliers are always included in the model. I think this contradicts the description in the paper, which specifies that the outlier variants should be replaced by randomly sampled non-outlier variants.

  • I believe that for the code to behave similarly to the paper's description, the indices array should be obtained by placing the outlier indices at the end:
    indices <- c(replicate(nrow(data)-length(refOutlier), sample(setdiff(1:nrow(data), refOutlier))[1]), refOutlier)
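
To make the difference concrete, here is a minimal sketch on toy indices. The values of n_variants and refOutlier are made up for illustration, and only the two index expressions are taken from the snippets quoted above. Everything else is an assumption of the sketch rather than the package's code.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

n_variants <- 10        # hypothetical number of variants (nE)
refOutlier <- c(2, 7)   # hypothetical outlier row indices (nO = 2)
nonOutlier <- setdiff(seq_len(n_variants), refOutlier)
n_draw     <- n_variants - length(refOutlier)   # nE - nO

# Each replicate() call keeps the first element of a fresh permutation,
# i.e. one uniform draw, so the draws are effectively made with replacement.
draws <- replicate(n_draw, sample(nonOutlier)[1])

# Ordering quoted from getRandomBias: outliers first.
idx_current  <- c(refOutlier, draws)
used_current <- idx_current[1:(length(idx_current) - length(refOutlier))]

# Ordering suggested in this comment: outliers last.
idx_proposed  <- c(draws, refOutlier)
used_proposed <- idx_proposed[1:(length(idx_proposed) - length(refOutlier))]

print(used_current)   # rows that would enter the regression, outliers first
print(used_proposed)  # rows that would enter the regression, outliers last
```

With these toy values, the first ordering keeps both outliers plus nE − 2nO = 6 sampled non-outliers, while the second keeps nE − nO = 8 sampled non-outliers and no outliers.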

Thanks a lot for providing clarification on this!

Best,
Cyprien
