How to deal with zero or near-zero mixture weights? #5
Comments
I agree there is no need to update covariance if it has weight 0. I was thinking update
Related to this, there is the question of whether we should prune weights that are smaller than some prespecified threshold, e.g., 1e-8.
From my understanding, what we want is that we can specify many mixture components when fitting the model and use the data to learn weights. Therefore, there should be some components with weight zero because they don't match the data.
Specifying a threshold sounds like a good solution to me. If the weight is less than some threshold, say 1e-8, we directly set it to zero.
Best,
Yunqi
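A minimal sketch of the thresholding idea discussed above, assuming the mixture weights are held in a plain numeric vector. The function name `prune_weights`, the argument `minweight`, and the renormalization step are illustrative assumptions; none of them are part of udr.

```r
# Hypothetical helper (not a udr function): set mixture weights below
# a threshold to exactly zero, then rescale the rest to sum to one.
prune_weights <- function (w, minweight = 1e-8) {
  w[w < minweight] <- 0
  if (sum(w) == 0)
    stop("All weights are below minweight")
  return(w / sum(w))
}

# Illustrative weights only; these are not taken from a real fit.
w <- c(k1 = 0.7, k2 = 3e-12, k3 = 0.3 - 3e-12)
w_pruned <- prune_weights(w)
print(w_pruned)
```

Whether to renormalize after pruning is a design choice; with a threshold as small as 1e-8 the rescaling changes the remaining weights only negligibly.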
…________________________________
From: Peter Carbonetto <notifications@github.com>
Sent: Friday, January 1, 2021 4:01 PM
To: stephenslab/udr <udr@noreply.github.com>
Cc: Yunqi Yang <yunqiyang@uchicago.edu>; Mention <mention@noreply.github.com>
Subject: [stephenslab/udr] How to deal with zero or near-zero mixture weights? (#5)
It doesn't make much sense to update prior covariance matrices with weights that are zero, or near zero. @stephens999 @yunqiyang0215 @zouyuxin Ideas are welcome.
Here's an example (thanks to Yuxin):
set.seed(1)
dat <- readRDS("dat.rds")
f0 <- ud_init(X = as.matrix(dat$data),V = dat$S,U_scaled = list(),
              U_unconstrained = dat$Ulist,n_rank1 = 0)
res <- ud_fit(f0,control = list(unconstrained.update = "teem",
                                resid.update = "none",
                                version = "R"))
# Performing Ultimate Deconvolution on 600 x 20 matrix (udr 0.3-30, "R"):
# data points are i.i.d. (same V)
# prior covariances: 0 scaled, 0 rank-1, 10 unconstrained
# prior covariance updates: none (scaled), none (rank-1), teem (unconstrained)
# mixture weights update: em
# residual covariance update: none
# max 20 updates, conv tol 1.0e-06
# iter          log-likelihood |w - w'| |U - U'| |V - V'|
#    1 -3.0699325059934326e+04 4.65e-01 1.22e+02 0.00e+00
#    2 -3.0330311442511782e+04 9.41e-02 1.11e+02 0.00e+00
# ...
#   19 -2.9720957242869852e+04 2.79e-05 2.27e-01 0.00e+00
#   20 -2.9720891074856361e+04 7.07e-05 4.21e-01 0.00e+00
print(round(res$w,digits = 6))
#  FLASH_1  FLASH_2  FLASH_3  FLASH_4   tFLASH    PCA_1    PCA_2    PCA_3
# 0.000000 0.000000 0.001667 0.004884 0.229977 0.000000 0.000000 0.000116
#     tPCA       XX
# 0.265002 0.498354
dat.rds.gz: https://github.com/stephenslab/udr/files/5759490/dat.rds.gz
I added two checks in the code. The idea is that if weight[i] < minval (the default is 1e-15 for now), we set weight[i] to 0 and skip updating U[[i]].
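The two checks described above could look roughly like the following. This is only a sketch under stated assumptions: `update_covs` and `update_fn` are hypothetical names, not udr functions, and the actual checks in the package may be placed and written differently.

```r
# Hypothetical sketch (names are not from udr): update each prior
# covariance only when its mixture weight is at least minval.
update_covs <- function (U, w, update_fn, minval = 1e-15) {
  for (i in seq_along(U)) {
    if (w[i] < minval) {
      w[i] <- 0  # Treat the component as inactive.
      next       # Leave U[[i]] unchanged.
    }
    U[[i]] <- update_fn(U[[i]])
  }
  return(list(U = U, w = w))
}

# Toy example: two 2 x 2 covariances; the "update" just doubles them.
U <- list(diag(2), diag(2))
w <- c(1, 1e-20)
out <- update_covs(U, w, function (u) 2 * u)
```

Here the second component keeps its original covariance and its weight becomes exactly zero. Renormalizing the remaining weights is left out of this sketch.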
@yunqiyang0215 I wrote this in Slack but I'll post my comments here as well. I would suggest defining a new control parameter, e.g., Also there is another subtlety in your check because you are checking the prior weights So maybe if