Thanks for sharing. I found that in the 2-D simulation experiment the learning rate (injected Gaussian noise level) is kept constant, which doesn't satisfy Assumption 1 in your AAAI '16 paper. In previous work, e.g. Welling 2011, a polynomial decay scheme is applied instead. Will this be a problem?
In theory, the learning rate decay guarantees the asymptotic convergence. In practice, its implementation varies case by case.
Depending on your purpose, the answer can be different.
(1) To validate the theory, or pursue better performance, the decay scheme may help.
(2) To see the performance difference between SGLD and pSGLD, I don't think the decay scheme matters much: pSGLD gives better samples by adaptively adjusting the learning rate. That said, a more rigorous comparison would run both algorithms with the same learning rate decay; I wish I had done that in the first place.
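For concreteness, here is a minimal sketch (not the repository's code) of what such a comparison could look like: SGLD and pSGLD run under the *same* polynomial schedule eps_t = a * (b + t)^(-gamma) from Welling 2011, on a toy 2-D Gaussian target where the gradient of the log-density is available in closed form. The pSGLD update uses the RMSprop-style diagonal preconditioner from the AAAI '16 paper, but omits the Gamma(theta) correction term for brevity; all hyperparameter values below are illustrative assumptions.

```python
import numpy as np

def poly_decay(a, b, gamma, t):
    """Polynomial step-size schedule eps_t = a * (b + t)^(-gamma)."""
    return a * (b + t) ** (-gamma)

def grad_log_p(theta, cov_inv):
    """Gradient of log N(0, cov) at theta."""
    return -cov_inv @ theta

def sgld(theta0, cov_inv, n_steps, a=0.1, b=10.0, gamma=0.55, seed=0):
    rng = np.random.default_rng(seed)
    theta = theta0.copy()
    samples = []
    for t in range(n_steps):
        eps = poly_decay(a, b, gamma, t)
        noise = rng.normal(size=theta.shape) * np.sqrt(eps)
        theta = theta + 0.5 * eps * grad_log_p(theta, cov_inv) + noise
        samples.append(theta.copy())
    return np.array(samples)

def psgld(theta0, cov_inv, n_steps, a=0.1, b=10.0, gamma=0.55,
          alpha=0.99, lam=1e-5, seed=0):
    """pSGLD with an RMSprop-style diagonal preconditioner,
    omitting the Gamma(theta) correction term for simplicity."""
    rng = np.random.default_rng(seed)
    theta = theta0.copy()
    v = np.zeros_like(theta)  # running average of squared gradients
    samples = []
    for t in range(n_steps):
        eps = poly_decay(a, b, gamma, t)
        g = grad_log_p(theta, cov_inv)
        v = alpha * v + (1 - alpha) * g * g
        G = 1.0 / (lam + np.sqrt(v))  # diagonal preconditioner
        noise = rng.normal(size=theta.shape) * np.sqrt(eps * G)
        theta = theta + 0.5 * eps * G * g + noise
        samples.append(theta.copy())
    return np.array(samples)

cov = np.array([[1.0, 0.9], [0.9, 1.0]])  # strongly correlated target
cov_inv = np.linalg.inv(cov)
theta0 = np.array([3.0, -3.0])
s1 = sgld(theta0, cov_inv, 5000)
s2 = psgld(theta0, cov_inv, 5000)
```

Since both chains share the identical eps_t schedule, any difference in mixing along the correlated direction is attributable to the preconditioner rather than the step-size policy.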
I tried pSGLD on the first experiment in [Welling 2011], which has 2 modes in the posterior and two strongly negatively correlated variables. I get proper posterior samples with a similar learning rate annealing scheme.
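For anyone who wants to reproduce that target, here is a minimal sketch of the setup from [Welling 2011] with plain SGLD and polynomial annealing: priors theta1 ~ N(0, 10), theta2 ~ N(0, 1), and data x_i ~ 0.5*N(theta1, 2) + 0.5*N(theta1 + theta2, 2) with true theta = (0, 1) and N = 100. The minibatch size and schedule constants below are my own illustrative choices, not the values from either paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, sx2 = 100, 2.0
true_theta = np.array([0.0, 1.0])
comp = rng.random(N) < 0.5  # mixture component of each data point
x = np.where(comp, rng.normal(true_theta[0], np.sqrt(sx2), N),
                   rng.normal(true_theta.sum(), np.sqrt(sx2), N))

def grad_log_post(theta, xb, scale):
    """Gradient of log prior + (N / batch)-scaled minibatch log-likelihood."""
    t1, t2 = theta
    d1, d2 = xb - t1, xb - (t1 + t2)
    # Unnormalized component densities; normalizers cancel in the ratio.
    p1 = 0.5 * np.exp(-0.5 * d1**2 / sx2)
    p2 = 0.5 * np.exp(-0.5 * d2**2 / sx2)
    w2 = p2 / (p1 + p2)  # responsibility of the second component
    g1 = ((1 - w2) * d1 + w2 * d2) / sx2  # d log-lik / d theta1
    g2 = (w2 * d2) / sx2                  # d log-lik / d theta2
    grad_prior = np.array([-t1 / 10.0, -t2 / 1.0])
    return grad_prior + scale * np.array([g1.sum(), g2.sum()])

theta = np.zeros(2)
batch = 10
samples = []
for t in range(10000):
    eps = 0.05 * (10.0 + t) ** (-0.55)  # polynomial annealing
    xb = x[rng.choice(N, batch, replace=False)]
    g = grad_log_post(theta, xb, N / batch)
    theta = theta + 0.5 * eps * g + rng.normal(size=2) * np.sqrt(eps)
    samples.append(theta.copy())
samples = np.array(samples)
```

The posterior has modes near (0, 1) and (1, -1), which is what makes it a useful check: a sampler with a poorly tuned step size tends to get stuck in one mode.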
Thanks.