how to perform online inference #2

Closed
jengelman opened this issue Jun 26, 2017 · 2 comments

Comments

@jengelman

Looking at the project website, bnpy supports SVI, but I can't figure out how to actually perform online updates of a single model, since bnpy.Run seems to be meant for batch mode. Does anyone have an example I could take a look at?

@michaelchughes
Contributor

Example of SVI for the same toy mixture of Gaussians problem as here: http://bnpy.readthedocs.io/en/latest/examples/01_asterisk_K8/plot-02-demo=vb_single_run-model=dp_mix+gauss.html

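import bnpy

# `dataset` is the toy dataset object loaded as in the linked example.
trained_model, info_dict = bnpy.run(
    dataset, 'DPMixtureModel', 'Gauss', 'soVB',  # 'soVB' selects the SVI algorithm
    output_path='/tmp/AsteriskK8/trysvi-K=10/',
    nLap=100, nTask=1, nBatch=10,  # nBatch: number of minibatches
    sF=0.1, ECovMat='eye',
    K=10, initname='randexamples')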

Note that we simply need to specify "soVB" as the algorithm and set the number of minibatches (nBatch above).

You can set the decay schedule of the learning rate "rho" via the "--rhoexp" and "--rhodelay" kwargs, so that at iteration t we have:

    rho_t = (t + rhodelay)**(-rhoexp)

Generally, rhodelay is a positive real, and rhoexp is in [0.5, 0.9999...]
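
To get a feel for how the two settings shape the schedule, here is a minimal standalone sketch of the formula above. It does not call bnpy; the function name `learning_rate` and the particular values of rhodelay and rhoexp are just illustrative assumptions, as is the choice to start counting iterations at t = 1.

```python
def learning_rate(t, rhodelay, rhoexp):
    """Evaluate rho_t = (t + rhodelay) ** (-rhoexp) for iteration t."""
    return (t + rhodelay) ** (-rhoexp)


# Illustrative values only (assumptions, not bnpy defaults).
rhodelay = 1.0
rhoexp = 0.55

# Learning rates for the first 50 iterations, counting from t = 1.
schedule = [learning_rate(t, rhodelay, rhoexp) for t in range(1, 51)]

for t in (1, 10, 50):
    print(t, schedule[t - 1])
```

A larger rhodelay damps the first few steps, while a larger rhoexp makes the rate shrink faster over time.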

One important note: SVI (like all the algorithms we currently support) is more properly called a "minibatch" algorithm than an "online/streaming" algorithm. The classic SVI algorithm requires that the full size of the dataset be known in advance, so you can't really keep adding data as time goes on. What the code above does is load the entire dataset at once, subdivide it into batches, and then learn from one batch at a time. If you need to load data one batch at a time, we support that too, but it is a little bit trickier (please raise another issue for details).
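
To make the "minibatch, not streaming" point concrete, here is a generic NumPy sketch of the pattern described above. It is not bnpy's internal code: the helper names `split_into_batches` and `svi_step`, the toy data, and the use of a plain sum as the "summary statistic" are all assumptions for illustration. The thing to notice is that each batch's statistic is rescaled by n_total / batch_size, which is why the total dataset size has to be known up front.

```python
import numpy as np


def split_into_batches(n_examples, n_batch, seed=0):
    """Shuffle example indices once and divide them into n_batch groups."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_examples)
    return np.array_split(perm, n_batch)


def svi_step(global_stat, batch_stat, batch_size, n_total, rho):
    """Blend the old global statistic with a batch estimate of the full-data statistic.

    The batch statistic is scaled up by n_total / batch_size so that it
    estimates what the whole dataset would contribute.
    """
    scaled = (n_total / batch_size) * batch_stat
    return (1.0 - rho) * global_stat + rho * scaled


# Toy data: the "statistic" we track is simply the sum over all examples.
X = np.arange(100, dtype=float)
batches = split_into_batches(len(X), n_batch=10)

global_stat = 0.0
t = 0
for lap in range(5):              # each lap is one pass over all batches
    for idx in batches:
        t += 1
        rho = (t + 1.0) ** (-0.55)    # schedule with rhodelay=1, rhoexp=0.55
        global_stat = svi_step(global_stat, X[idx].sum(), len(idx), len(X), rho)

print(global_stat, X.sum())
```

If new data arrived later, n_total would change and every earlier rescaled update would have been computed against the wrong dataset size.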

@jengelman
Author

Great, that's a good start. I'll open the other issue.
