Added a parameter called "stepsize", which adjusts the weight of each data point. This parameter is set to dataset size / batch size in our ICML paper.
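To make the role of this weight concrete, here is a minimal sketch (an assumed illustration, not the repo's actual code) of a conjugate Gaussian-mean update in which every mini-batch point counts `stepsize` times, i.e. the mini-batch likelihood is raised to the power `stepsize`:

```python
import numpy as np

def weighted_posterior_update(prior_mean, prior_var, batch, noise_var, stepsize):
    """Conjugate update for a Gaussian mean with known noise variance,
    where each point in `batch` is counted `stepsize` times (the
    mini-batch likelihood is raised to the power `stepsize`)."""
    n_eff = stepsize * len(batch)                  # effective number of observations
    post_var = 1.0 / (1.0 / prior_var + n_eff / noise_var)
    post_mean = post_var * (prior_mean / prior_var
                            + stepsize * np.sum(batch) / noise_var)
    return post_mean, post_var
```

With stepsize = dataset size / batch size, a single mini-batch update contributes as much evidence as one pass over the full dataset would.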
A rough explanation for this phenomenon: assume that K is large (say 80). After the first few mini-batches, the Bayesian posterior is going to be multi-modal due to uncertainty (too many parameters compared to the data). Therefore, for the latent samples of the initial mini-batches, as well as the variational approximation, to be accurate, we need to set J (the number of latent samples per data point) to be large.
So any of the following three solutions can mitigate the accuracy degradation (a sketch of where each knob enters follows this list):
1. Increase J.
2. Do more sweeps over the dataset (more than one).
3. Set stepsize to be large.
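Below is a minimal streaming-training skeleton (purely illustrative; the loop structure and names are assumptions, not the repo's actual code) showing where each of the three knobs enters:

```python
import numpy as np

def stream_train(data, batch_size, J=1, passes=1, stepsize=1.0):
    """Toy streaming update: `J` is the number of latent samples drawn
    per data point, `passes` is the number of sweeps over the dataset,
    and `stepsize` scales each mini-batch's contribution."""
    rng = np.random.default_rng(0)
    state = 0.0                                   # stand-in for the model state
    for _ in range(passes):                       # solution 2: extra sweeps
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # solution 1: average over J noisy latent-sample estimates
            estimate = np.mean([np.mean(batch) + 0.1 * rng.standard_normal()
                                for _ in range(J)])
            # solution 3: stepsize upweights this mini-batch's update
            state += stepsize * estimate
    return state
```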
Revisiting the binary classification experiment, these solutions lead to the following results:
- set J = 10, test accuracy = 0.76.
- set pass = 3, test accuracy = 0.81.
- set stepsize = 25, test accuracy = 0.81.
The third solution seems to be the most computationally efficient one: increasing J or the number of passes multiplies the sampling work, whereas reweighting the mini-batch adds no extra sampling at all.
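A back-of-envelope cost comparison (my own rough accounting, not a measurement of the repo) makes the efficiency argument explicit:

```python
def relative_cost(J=1, passes=1):
    """Sampling cost relative to the J = 1, single-pass baseline;
    stepsize does not appear because reweighting is free."""
    return J * passes

print(relative_cost(J=10))       # solution 1: ~10x the baseline cost
print(relative_cost(passes=3))   # solution 2: ~3x the baseline cost
print(relative_cost())           # solution 3 (stepsize = 25): ~1x
```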
With the following setting, the results seem to differ from the original paMedLDAgibbs implementation: the accuracy drops dramatically as the number of topics K increases.