
Display optimization progress for find_MAP #1813

Closed · wants to merge 10 commits

Conversation

bwengals (Contributor) commented Feb 23, 2017

If disp=True is passed to find_MAP, optimization progress is printed out.

  • If an IPython notebook is being used, an updating table showing parameter progress is displayed. Colors are taken from the PyMC3 logo.
  • Otherwise, more basic info (number of iterations, log posterior value, ||grad||, runtime) is printed every 2 seconds.

Also in this PR is a small change to the appearance of tqdm: if the user is working in a notebook, the better-looking tqdm notebook progress bar is displayed.
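As a sketch of what that switch can look like (the helper names and the check against IPython's shell class are illustrative assumptions, not the PR's actual code):

```python
def in_notebook():
    """Best-effort check for a Jupyter/IPython notebook kernel."""
    try:
        # get_ipython() exists only when running under IPython
        shell = get_ipython().__class__.__name__
    except NameError:
        return False  # plain Python interpreter
    return shell == 'ZMQInteractiveShell'  # notebook kernel


def pick_progressbar():
    """Return the widget-based bar in notebooks, the text bar elsewhere."""
    if in_notebook():
        from tqdm import tqdm_notebook as bar
    else:
        from tqdm import tqdm as bar
    return bar
```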

Let me know if there are any changes, cosmetic or otherwise, that you would like.

(screenshot: terminal version)

(screenshot: notebook version)

with Model():
    mu = Normal('mu', 0, 1, testval=100)
    # test with verbosity turned off (it is on by default)
    map_est1 = starting.find_MAP(fmin=starting.optimize.fmin_bfgs, disp=True, callback = lambda x: 1.0)
Member: no space for kwargs

bwengals (author): what do you mean?

Member: should be callback=lambda x: 1.0

twiecki (Member), Feb 23, 2017: and fmin=starting.optimize.fmin_bfgs, disp=True

# should not error
with Model():
    mu = Normal('mu', 0, 1, testval=100)
    map_est1 = starting.find_MAP(fmin=starting.optimize.fmin_bfgs, disp=False)
Member: no extra spacing

""".format(parameter, values["size"], values["valstr"])
html += "</table>"
self.param_table.value = html

Member: empty-lines
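For context, the pattern in the fragment above is to rebuild the whole table as one HTML string each refresh and assign it to the widget's `value`, so the notebook shows a single live table rather than accumulating output. A simplified, hypothetical version of the table builder (function and column names are illustrative, not the PR's API):

```python
def build_param_table(params):
    """params: dict mapping parameter name -> (size, formatted value string)."""
    html = "<table><tr><th>Parameter</th><th>Size</th><th>Current value</th></tr>"
    for name, (size, valstr) in params.items():
        html += "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(name, size, valstr)
    return html + "</table>"

# In a notebook, roughly:
#   import ipywidgets as widgets
#   from IPython.display import display
#   table = widgets.HTML(); display(table)
#   table.value = build_param_table({"mu": (1, "99.21")})
```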

twiecki (Member) commented Feb 23, 2017

Do you have a screenshot of what the Monitor in find_MAP() looks like?

bwengals (author) commented Feb 23, 2017

@twiecki added screenshots to the original comment. One thing you'll notice is that the values of the transformed variables are printed. Is there a clean way to transform them back? I had a hard time finding that while poking around.

twiecki (Member) commented Feb 23, 2017

I'm not sure about Monitor. find_MAP() is not really a strong focus of this package. We could use it in other parts too (MCMC), but I'm not sure how useful that really is.

There is also the problem of creating a lot of HTML output in a cell. If you look at the raw notebook after an optimization run, does it contain a huge amount of HTML code?

bwengals (author) commented Feb 23, 2017

It looks like the monitor does not persist across saves. Since it uses the interactive widgets, one needs to do this to save it.

And additionally, I imagine a version for MCMC chains could get messy pretty fast. My argument in favor would be that finding the MAP point is a good first step in many (though not all!) situations, like GPs. It helps (me at least) to catch subtle problems in model specification by watching the parameters veer off somewhere, without waiting around until the optimizer terminates. But very true, it's not an optimization package!

I could dial back the CSS so it seems less "branded", since it's not the focus.

fonnesbeck (Member):

I'm not necessarily opposed, as long as it stays relatively lightweight. I do think more sophisticated monitoring would be nice for NUTS and ADVI. One could make it optional, with a monitor=True flag, or similar.

I've thought about using Bokeh for some graphical monitoring, similar to what dask does for its dashboard. But you wouldn't necessarily want it all the time.

bwengals (author):

Could do something similar for the samplers and ADVI. For something lightweight, maybe it would look similar but show marginal mean and standard deviation in the "Current Value" column?

Yeah, default should be monitor=False. Like this?

@@ -187,7 +187,11 @@ def _sample(draws, step=None, start=None, trace=None, chain=0, tune=None,
    sampling = _iter_sample(draws, step, start, trace, chain,
                            tune, model, random_seed)
    if progressbar:
        sampling = tqdm(sampling, total=draws)
    try:
Member: We should refactor this into our own tqdm function that does this.

bwengals (author): That sounds good. Where would you recommend it go?
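One possible shape for such a shared helper — a sketch only; the name, location, and signature are assumptions, not anything decided in this thread:

```python
def progress(iterable, progressbar=True, **kwargs):
    """Wrap an iterable in a tqdm bar in one shared place, and make sure
    close() runs even if the consumer stops early (exception, interrupt)."""
    if not progressbar:
        yield from iterable
        return
    from tqdm import tqdm  # imported lazily, only when a bar is wanted
    pbar = tqdm(iterable, **kwargs)
    try:
        yield from pbar
    finally:
        pbar.close()  # a generator's finally also fires on early termination
```

Call sites like `_sample` could then iterate `progress(sampling, progressbar, total=draws)` instead of wrapping tqdm (and handling close) themselves.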

with Model():
    mu = Normal('mu', 0, 1, testval=100)
    # test with verbosity turned off (it is on by default)
    map_est1 = starting.find_MAP(fmin=starting.optimize.fmin_bfgs, disp=True, callback = lambda x: 1.0)
Member: no white-space around the callback kwarg =.




class Monitor(object):
twiecki (Member), Mar 4, 2017: Does there exist nothing in the jupyter (or bokeh) universe that does this already?

bwengals (author), Mar 5, 2017: Gosh, not that I've seen. I couldn't find anything similar after a quick look around, but I'll check further. I took the idea of monitoring optimization progress from GPy; theirs was made from scratch as well.

aseyboldt (Member) commented Mar 5, 2017

I think a couple of calls to the progress bar's close() are missing. Otherwise tqdm sometimes prints weird things into the next output in the notebook if the iterator is not exhausted (because of an exception or a keyboard interrupt). Something like this should do it:

vals = tqdm.tqdm(some_iter)
try:
    for val in vals:
        pass
finally:
    vals.close()

Or alternatively using contextlib.closing.
Most of them were missing before this patch...
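The contextlib.closing variant looks like this; `FakeBar` is a stand-in for a tqdm bar so the pattern can be shown self-contained, without assuming anything about tqdm itself:

```python
import contextlib

class FakeBar:
    """Minimal stand-in for tqdm: an iterable with a close() method."""
    def __init__(self, iterable):
        self._it = iterable
        self.closed = False
    def __iter__(self):
        return iter(self._it)
    def close(self):
        self.closed = True

bar = FakeBar(range(10))
try:
    with contextlib.closing(bar) as vals:
        for i in vals:
            if i == 3:
                raise KeyboardInterrupt  # simulate the user interrupting
except KeyboardInterrupt:
    pass

assert bar.closed  # close() still ran, so no stray output later
```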

@@ -401,7 +405,11 @@ def sample_ppc(trace, samples=None, model=None, vars=None, size=None, random_see
    seed(random_seed)

    if progressbar:
        indices = tqdm(randint(0, len(trace), samples), total=samples)
    try:
Member: close missing

uw_i, elbo_current = f()
if np.isnan(elbo_current):
    raise FloatingPointError('NaN occurred in ADVI optimization.')
# set up progress bar
try:
Member: missing close

@@ -400,7 +404,14 @@ def rvs(x):
    trace = pm.sampling.NDArray(model=model, vars=vars_sampled)
    trace.setup(draws=draws, chain=0)

    range_ = trange(draws) if progressbar else range(draws)
    if progressbar:
Member: missing close

@@ -518,9 +518,14 @@ def is_shared(t):
    updates = OrderedDict(optimizer(loss=-1 * elbo, param=params))
    f = theano.function(tensors, elbo, updates=updates, mode=mode)

    # set up progress bar
    try:
Member: missing close

bwengals (author) commented Mar 6, 2017

Thanks @aseyboldt, nice catch. Will include this.

bwengals (author) commented Mar 7, 2017

A heads-up that, in addition to the suggestions, I'm working on having it print untransformed variables, i.e. the value of sigma instead of sigma_log_.

@bwengals bwengals deleted the find_MAP-monitor branch June 9, 2017 20:28
4 participants