susie_rss gives different results using different R #132

Zepeng-Mu · 2021-06-30T18:56:22Z

Hello,
I've been trying to run susie_rss using different R matrices.
The first version I used is the correlation matrix of genotype count from a reference panel. Part of which looks like:

            [,1]        [,2]        [,3]        [,4]        [,5]
[1,]  1.00000000  1.00000000 -0.09664246  1.00000000  1.00000000
[2,]  1.00000000  1.00000000 -0.09664246  1.00000000  1.00000000
[3,] -0.09664246 -0.09664246  1.00000000 -0.09664246 -0.09664246
[4,]  1.00000000  1.00000000 -0.09664246  1.00000000  1.00000000
[5,]  1.00000000  1.00000000 -0.09664246  1.00000000  1.00000000

In this way I did not need to scale or center the count matrix, because cor gives the same output. The result makes much sense to me:

I also saw in package documentation that the R option can be:

A p by p symmetric, positive semidefinite matrix.
It can be X'X, the covariance matrix X'X/(n-1), or a correlation matrix.
It should be estimated from the same samples used to compute bhat and shat.
Using an out-of-sample matrix may produce unreliable results.

So the second matrix I used is t(geno) %*% geno, where geno is the nxp matrix of counts that is centered and scaled. The matrix looks like this:

           1:37312926 1:37312963 1:37313050 1:37313060 1:37313086
1:37312926  3201.0000  3201.0000  -309.3525  3201.0000  3201.0000
1:37312963  3201.0000  3201.0000  -309.3525  3201.0000  3201.0000
1:37313050  -309.3525  -309.3525  3201.0000  -309.3525  -309.3525
1:37313060  3201.0000  3201.0000  -309.3525  3201.0000  3201.0000
1:37313086  3201.0000  3201.0000  -309.3525  3201.0000  3201.0000

This matrix only differs from the first one by a factor of 3201, as I have 3202 individuals in the 1KG reference panel. So I expected the result from susie_rss would be the same as before. However, using this matrix I got no CS and PIP is flat.

I'm wondering what can lead to such huge difference in PIP when the R is very similar. Should I just use the correlation matrix that is directly calculated from genotype matrix?

Thanks so much!

The text was updated successfully, but these errors were encountered:

zouyuxin · 2021-06-30T19:24:59Z

In susie_rss documentation, R is a p by p correlation matrix, so you should use the correlation matrix. If you multiply R by 3201, the z scores should multiply by sqrt(3201) to get the same result.

stephens999 · 2021-06-30T19:32:33Z

@zeping where is this documentation that you quoted (which seems to be out of date) A p by p symmetric, positive semidefinite matrix. It can be X'X, the covariance matrix X'X/(n-1), or a correlation matrix. It should be estimated from the same samples used to compute bhat and shat. Using an out-of-sample matrix may produce unreliable results.

…

On Wed, Jun 30, 2021 at 2:25 PM Yuxin Zou ***@***.***> wrote: In susie_rss documentation, R is a p by p correlation matrix, so you should use the correlation matrix. If you multiply R by 3201, the z scores should multiply by sqrt(3201) to get the same result. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#132 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AANXRROWESZLMQOPW4W33JDTVNVRNANCNFSM47S5WWXQ> .

pcarbo · 2021-06-30T19:40:10Z

@stephens999 It is in susie_ss.R.

Zepeng-Mu · 2021-06-30T19:42:31Z

Hi, I'm using version "0.11.43" of susieR and found that documentation in ?susie_suff_stat

zouyuxin · 2021-06-30T19:57:15Z

The documentation you quoted is for susie_suff_stat, which is different from susie_rss. If you are using susie_rss, please check ?susie_rss.

stephens999 · 2021-07-01T13:40:58Z

@zouyuxin so in susie_suff_stat I find the documentation of the R matrix confusing.
In the details it says R = (1/(n-1))X'X
But in the documentation for R it says it can also be X'X.
why is that?

zouyuxin · 2021-07-01T15:35:41Z

In susie_suff_stat, the R is converted to a correlation matrix internally. I agree it is clearer to require R to be a correlation matrix in the documentation. I will update the documentation.

gaow closed this as completed Nov 9, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

susie_rss gives different results using different R #132

susie_rss gives different results using different R #132

Zepeng-Mu commented Jun 30, 2021

zouyuxin commented Jun 30, 2021

stephens999 commented Jun 30, 2021 via email

pcarbo commented Jun 30, 2021

Zepeng-Mu commented Jun 30, 2021

zouyuxin commented Jun 30, 2021

stephens999 commented Jul 1, 2021

zouyuxin commented Jul 1, 2021

susie_rss gives different results using different R #132

susie_rss gives different results using different R #132

Comments

Zepeng-Mu commented Jun 30, 2021

zouyuxin commented Jun 30, 2021

stephens999 commented Jun 30, 2021 via email

pcarbo commented Jun 30, 2021

Zepeng-Mu commented Jun 30, 2021

zouyuxin commented Jun 30, 2021

stephens999 commented Jul 1, 2021

zouyuxin commented Jul 1, 2021