Cholesky factorization failed #11
Comments
Hi Eugene,
Hello Xihao, Thank you for the quick reply. Unfortunately, when I implemented the null model via GENESIS as suggested:
I get a very similar error:
Any suggestions? I would hope this is very easy to reproduce. I simply acquired the sparse matrix (.rel) from the UK Biobank website and converted it into a symmetric dsTMatrix from the Matrix package. I then limited it to individuals included in the covariate table (given to 'x' above).
Hi Eugene, Thank you very much for letting me know. In this case, I think you could try some different thresholds of the sparse GRM by gradually increasing the threshold from 0.0442 to something slightly higher and see if the null model would fit. For this route, it would be ideal to keep all pairwise (estimated) relatedness within a cluster, even if they are below the threshold, rather than setting all entries below the threshold to 0 in the sparse GRM. To do this, the
Hope this helps! Best,
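The idea of keeping every pairwise value inside a cluster, rather than zeroing each entry below the threshold independently, can be illustrated with a small sketch. This is not the GENESIS or STAAR API: `cluster_threshold`, the tuple-keyed `kinship` dictionary, and the toy values are all assumptions made for illustration. Samples are linked whenever a pair reaches the threshold, and every pair whose two members end up in the same cluster is kept.

```python
def cluster_threshold(kinship, ids, thresh):
    """Keep all pairwise values inside clusters defined by `thresh`.

    kinship: dict mapping (id_a, id_b) -> estimated relatedness (toy input)
    ids:     iterable of all sample identifiers
    thresh:  pairs at or above this value link two samples into one cluster
    """
    parent = {i: i for i in ids}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Step 1: link samples whose relatedness reaches the threshold.
    for (a, b), value in kinship.items():
        if value >= thresh:
            parent[find(a)] = find(b)

    # Step 2: keep every pair inside the same cluster, even when the
    # individual value is below the threshold.
    return {
        (a, b): value
        for (a, b), value in kinship.items()
        if find(a) == find(b)
    }


# Toy example: A-B and B-C are related, so A-C (0.03) is kept as well.
toy = {
    ("A", "B"): 0.25, ("B", "C"): 0.25, ("A", "C"): 0.03,
    ("A", "D"): 0.00, ("B", "D"): 0.02, ("C", "D"): 0.01,
}
kept = cluster_threshold(toy, ["A", "B", "C", "D"], thresh=0.0442)
print(kept)
```

A plain element-wise threshold would drop the A-C entry here, which is the kind of zeroed-out value that can make a block fail to be positive definite.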
Hi Xihao, I don't think there is a 'thresh' parameter? Did you mean kins_cutoff? I will try the other method as well.
Hi Xihao, I tried both approaches and still got the same error. Is makeSparseMatrix() really the solution given that the matrix is already sparse? I used a slightly different (external) approach to threshold and it seemed to work:
Where 0.05 is the threshold. This seems a bit like a suboptimal solution since it misrepresents the relatedness between individuals. Did I misunderstand your comment of:
?
Hi Eugene,
Hope this is more clear. Best,
Hello Xihao, I've recently noticed via testing additional phenotypes that this seems very dependent on the case/control imbalance of the phenotype being studied. In a set of ~200,000 individuals, phenotypes with a prevalence of < ~0.2% seem to always run into the Cholesky issue I mentioned in my original post. Just wondering if you have any advice on how to handle this, as the thresholding workaround now seems suboptimal when it is not possible to know a priori what the threshold should be in the first place.
Hi Eugene,
Did you always see the Cholesky issue in this subset (regardless of which phenotypes), or just sometimes? The kinship coefficients provided by the UK Biobank do not give a positive definite kinship matrix (due to the cut-off at 0.0442), but this may or may not be the issue for you, depending on which subset you were using. If the kinship matrix for your subset is not positive definite, I think there was really just one block of a few hundred individuals that caused the problem. You could drop these individuals to avoid the Cholesky issue. A better solution would be for the UK Biobank to fix this block of individuals by filling in values that have been zeroed out (e.g., 0.0441) to make it positive definite, and to release a new kinship matrix. If you only saw this issue sometimes, and got a counter-example that successfully converged in the same subset (with a different or a simulated phenotype), then it was likely a convergence problem related to your case/control phenotypes. Best,
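To see why zeroing entries below a cut-off can break positive definiteness, a minimal pure-Python sketch is given here. It is not the routine used by STAAR or GENESIS; `cholesky_ok` is a hypothetical helper and the 3x3 matrices use exaggerated toy values rather than real kinship coefficients. It attempts a Cholesky factorization and reports whether every pivot stays positive.

```python
import math


def cholesky_ok(a):
    """Return True if the symmetric matrix `a` (list of lists) admits a
    Cholesky factorization, i.e. is numerically positive definite."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                pivot = a[i][i] - s
                if pivot <= 0.0:
                    return False  # factorization fails at row i
                low[i][i] = math.sqrt(pivot)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return True


# Toy block: samples 1-2 and 2-3 are strongly related, but the 1-3 entry
# has been zeroed out, which is inconsistent with the other two entries.
zeroed = [[1.0, 0.9, 0.0],
          [0.9, 1.0, 0.9],
          [0.0, 0.9, 1.0]]

# Same block with the 1-3 entry filled in with a consistent value.
filled = [[1.0, 0.9, 0.8],
          [0.9, 1.0, 0.9],
          [0.8, 0.9, 1.0]]

print(cholesky_ok(zeroed))  # False
print(cholesky_ok(filled))  # True
```

Running the same check on each connected block of a sparse matrix separately is one way to locate the block that causes the failure, so that those individuals can be dropped or their entries revisited.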
Hello,
I am having some issues with running STAAR using a pre-computed GRM. When running fit_null_glmmkin using the following:
Where data_for_STAAR is a simple data.table of covariates listed in the formula for ~410k European-ancestry individuals, I get the following error:
The GRM being used is identical to the one provided in Bycroft et al. for the main UK Biobank genetic publication from 2018 (https://doi.org/10.1038/s41586-018-0579-z) that ranges from 0.0442 - 0.5. I have also tried simply multiplying the kinship factor by 2 to bring it in line with that generated by the kinship2 package, but that failed as well.
Do you have any suggestions/advice to get this working? Oddly I was able to run STAAR perfectly fine with a previous GRM that was calculated using a (poorly optimised) GRM from SAIGE, but I wanted to use a more accurate representation.
Thanks!