
LMM.getPosteriorWeights() returns inconsistent results #12

Closed

remomomo opened this issue Jan 20, 2021 · 3 comments

@remomomo
remomomo commented Jan 20, 2021

Hi,

I'm using fastlmm.inference.lmm.LMM to fit mixed models with no background kernel.

After fitting the model, I want to get the posterior weights of the SNPs in the kernel. I do this with LMM.getPosteriorWeights. I've noticed that getPosteriorWeights returns inconsistent results depending on how it is parameterized.

Here is a full reproducible example:

import numpy as np
from numpy.linalg import cholesky
from sklearn.metrics.pairwise import rbf_kernel
from fastlmm.inference.lmm import LMM
from matplotlib import pyplot as plt

n_snp = 10

np.random.seed(1)

# fixed effects
X = np.random.randn(1000, 5)

# 10 independent standard-normal SNPs
snps = np.random.randn(1000, n_snp)

# RBF correlation matrix over SNP positions; gamma is chosen so the
# correlation decays to 0.5 at a distance of 2
pos = np.linspace(0, n_snp, num=n_snp)
R = rbf_kernel(pos[:, np.newaxis], gamma=-1 * np.log(0.5) / (2 ** 2))

plt.imshow(R)  # target correlation matrix of the SNPs

R /= np.max(R)

# induce the correlation structure via the Cholesky factor
L = cholesky(R + np.eye(len(R)) * 1e-7)
snps = L.dot(snps.T).T

plt.scatter(x=snps[:, 0], y=snps[:, 1])
plt.imshow(np.corrcoef(snps.T))  # empirical correlation of the SNPs

# single-variant weights: 7 noise SNPs (zero weight) and 3 causal SNPs
w = np.random.randn(n_snp)
w[0:7] = 0.

b = np.array([0.1, 0.01, 0.5, -0.15, 0.05])  # true fixed-effect betas

# phenotype is a function of the fixed effects X, the random effects (snps), and noise
y = X.dot(b[:, np.newaxis])
y += snps.dot(w[:, np.newaxis])
y[:, 0] += np.random.randn(1000)  # noise

lmm = LMM()
lmm.setG(snps)
lmm.setX(X)
lmm.sety(y[:, 0])

lik = lmm.findH2()

plt.scatter(b, lik['beta'])  # fixed effects are accurately estimated

# first parameterization: posterior weights of the SNPs in the kernel, via h2
beta_random = lmm.getPosteriorWeights(lik['beta'], h2=lik['h2'])

plt.scatter(w, beta_random)  # far away from the true weights

# second parameterization: refit via log-delta and pass logdelta instead
logdelta = lmm.find_log_delta(n_snp)

beta_random_2 = lmm.getPosteriorWeights(logdelta['beta'], logdelta=logdelta['log_delta'])

plt.scatter(w, beta_random_2)  # much closer to the true weights

I now use the second approach with logdelta instead of h2, because these tests show that the estimated weights are much closer to the true weights.

My question is now: am I doing something wrong, or is this a bug?
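
In case it helps, here is the equivalence I would naively expect to hold, assuming the usual parameterization where delta = sigma_e^2 / sigma_g^2 and h2 = 1 / (1 + delta), so logdelta = log((1 - h2) / h2). This is my assumption about the parameterization, not something I've verified against the implementation; it reuses lmm and lik from the example above:

# assuming delta = sigma_e^2 / sigma_g^2 and h2 = 1 / (1 + delta),
# i.e. logdelta = log((1 - h2) / h2)
h2 = lik['h2']
logdelta_from_h2 = np.log((1.0 - h2) / h2)

beta_from_h2 = lmm.getPosteriorWeights(lik['beta'], h2=h2)
beta_from_logdelta = lmm.getPosteriorWeights(lik['beta'], logdelta=logdelta_from_h2)

# if these disagree, the two code paths are inconsistent
print(np.allclose(beta_from_h2, beta_from_logdelta))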

Also, I would have liked to get the posterior variances as well. The documentation states that this can be requested by setting the flag returnVar = True; however, that option is not actually an argument of the function.

The documentation of the return value is wrong as well. It states that the function returns a dictionary, "weights : [k0+k1] 1-dimensional array of predicted phenotype values", but it simply returns a NumPy array of weights.
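
In the meantime, here is a rough sketch of how I compute the posterior mean and variance of the weights myself, directly from the standard mixed-model (BLUP) algebra. This is my own derivation under the model y = Xb + Gu + e with u ~ N(0, sg2 I), e ~ N(0, se2 I) and delta = se2 / sg2, not necessarily what getPosteriorWeights implements:

# posterior of u given y (standard BLUP algebra, my assumption, not
# FaST-LMM's actual code path):
#   mean = G^T (G G^T + delta I)^{-1} (y - X b)
#   cov  = sg2 * (I - G^T (G G^T + delta I)^{-1} G)
delta = np.exp(logdelta['log_delta'])
resid = y[:, 0] - X.dot(logdelta['beta'])

K = snps.dot(snps.T) + delta * np.eye(snps.shape[0])

post_mean = snps.T.dot(np.linalg.solve(K, resid))
post_cov_unscaled = np.eye(n_snp) - snps.T.dot(np.linalg.solve(K, snps))
post_var = np.diag(post_cov_unscaled)  # multiply by the estimated sigma_g^2 for actual variances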

@CarlKCarlK

CarlKCarlK commented Jan 20, 2021 via email

@CarlKCarlK

CarlKCarlK commented Jan 20, 2021 via email

@remomomo (Author)
Hi Carl,

Thanks for looking into it, and thanks for your suggestions!

I use the LMM class to compute log-likelihoods for a custom implementation of likelihood-ratio tests (roughly the sketch below). I also stumbled across lmm_cov yesterday... but unfortunately it doesn't have the function that returns posterior weights. I'm actually working with Christoph! I'll see if he remembers anything about this.
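
For context, the likelihood-ratio computation looks roughly like this. It's a placeholder sketch: I'm assuming the dict returned by findH2() exposes the negative log-likelihood under the key 'nLL', and I approximate the null model (h2 = 0, fixed effects only) by ordinary least squares:

from scipy.stats import chi2

# alternative model: SNPs in the kernel (lik = lmm.findH2() from above);
# 'nLL' is assumed to be the negative log-likelihood key
nll_alt = lik['nLL']

# null model: fixed effects only (h2 = 0), i.e. ordinary least squares
n = len(y)
_, rss, _, _ = np.linalg.lstsq(X, y[:, 0], rcond=None)
nll_null = 0.5 * n * (np.log(2 * np.pi * rss[0] / n) + 1)

# the variance component is tested on the boundary of its parameter space,
# so the LRT statistic follows a 0.5*chi2_0 + 0.5*chi2_1 mixture
lrt = max(2.0 * (nll_null - nll_alt), 0.0)
p_value = 0.5 * chi2.sf(lrt, df=1)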

cheers
