WIP Samplers should keep the same dtype as provided by theano. #1253
Conversation
```diff
 else:
-    q = q0 + delta
+    q0 = q0.astype(theano.config.floatX)
+    q = (q0 + delta).astype(theano.config.floatX)
```
Does this affect numerical precision when theano is not actually needed? Can we use `model.dtype` instead of `theano.config.floatX`? This way, if the model does not actually need theano or need to run on the GPU, the sampling can still run in `float64` even if `theanorc` specifies `float32`.
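A sketch of the suggested alternative, assuming the model object exposes a `dtype` attribute (the `Model` class, `q0`, and `delta` below are hypothetical stand-ins; in the real sampler they come from PyMC3 and the proposal step):

```python
import numpy as np

# Hypothetical stand-in for the PyMC3 model object; only its dtype matters here.
class Model:
    dtype = 'float64'

model = Model()
q0 = np.zeros(3)
delta = np.random.standard_normal(3)

# Cast to the model's dtype instead of the global theano.config.floatX,
# so a CPU-only model can stay in float64 even if .theanorc says float32.
q0 = q0.astype(model.dtype)
q = (q0 + delta).astype(model.dtype)
```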
How do you mean? Theano is always needed, and I would expect that `model.dtype` must match `theano.config.floatX`.
Okay. Also, the downcast only needs to happen at the very last step, before the theano function is called. This is why I prefer the `allow_input_downcast=True` setting of `theano.function`, as the downcast then happens quietly and only when needed.
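For illustration, a minimal example of that setting (toy graph and names; `allow_input_downcast` is the relevant `theano.function` keyword):

```python
import numpy as np
import theano
import theano.tensor as tt

# A float32 graph, as you would get with floatX = float32.
x = tt.fvector('x')
f = theano.function([x], x * 2, allow_input_downcast=True)

# A float64 NumPy input is quietly downcast to float32 at call time,
# instead of raising a TypeError about mismatched dtypes.
print(f(np.array([1.0, 2.0], dtype='float64')))
```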
I tried this over in #1265, but it doesn't seem to work.
Force-pushed from e4614b6 to c1a1d0e.
```diff
 else:
-    q = q0 + delta
+    q0 = q0.astype(theano.config.floatX)
+    q = (q0 + delta).astype(theano.config.floatX)

 q_new = metrop_select(self.delta_logp(q, q0), q, q0)
```
Hey, is a theano function involving `logp` recompiled every time a Metropolis-Hastings accept-reject happens? That would be painfully slow (although theano does cache some of the optimisations, you still only want to compile a function once).
Actually, this is just a Python function (contrary to what I said earlier; I misremembered, because we made `delta_logp` a theano function, and that function should only be compiled once).
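For context, a minimal sketch of the compile-once pattern being described (illustrative, not the exact PyMC3 implementation; `make_delta_logp` and the toy `logp` are hypothetical names):

```python
import theano
import theano.tensor as tt

def make_delta_logp(logp):
    # Build and compile the acceptance-ratio function once, up front;
    # each accept/reject step then only calls the compiled function.
    q = tt.dvector('q')
    q0 = tt.dvector('q0')
    return theano.function([q, q0], logp(q) - logp(q0))

# Toy log-density: standard normal.
delta_logp = make_delta_logp(lambda v: -0.5 * tt.sum(v ** 2))
```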
We could probably turn `metrop_select` into a theano function quite easily using `ifelse`, and then switch the proposal generation over to theano's random streams.
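A rough sketch of that idea (hypothetical; this is not PyMC3's actual `metrop_select`, and `metrop_select_fn` is a made-up name):

```python
import theano
import theano.tensor as tt
from theano.ifelse import ifelse
from theano.tensor.shared_randomstreams import RandomStreams

srng = RandomStreams(seed=42)

mr = tt.dscalar('mr')   # log acceptance ratio, e.g. delta_logp(q, q0)
q = tt.dvector('q')     # proposed point
q0 = tt.dvector('q0')   # current point

# Accept when log(u) < mr, with u drawn from theano's symbolic
# random stream, so the whole step stays inside the graph.
accept = tt.log(srng.uniform()) < mr
q_new = ifelse(accept, q, q0)

metrop_select_fn = theano.function([mr, q, q0], q_new)
```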
Force-pushed from c1a1d0e to beb720e.
Merged in #1265 in case it helps. It now seems this is the only way to enforce correct updating of shared variables.
Can someone check if we can sample on the GPU now? @fhuszar
It seems to work on my laptop (Python 3.5.1, Theano 0.8.2, PyMC3 from the branch).
@cynddl Awesome, thanks for checking!
@cynddl A really good test would be https://github.com/twiecki/WhileMyMCMCGentlySamples/blob/master/content/downloads/notebooks/bayesian_neural_network_lasagne.ipynb (specifically the convolutional net at the bottom). Want to test with that?
I've been trying with the branch. Same behavior on CPU or GPU. Maybe the problem is too large for my computer. Any idea?
Yeah, possibly; it's definitely slow for such a big model. You could also try the smaller https://github.com/pymc-devs/pymc3/blob/master/docs/source/notebooks/bayesian_neural_network_advi.ipynb
What's the status on this? Are we working on GPU support? I need to get a GPU for home so I can check this out. It looks like some cool work, and it looks like a lot of the errors are fixed. I've not had the time (or the interest) to look into this :)
Yeah, I'm not quite sure what the best way to go about this is. If we enforce the same dtype everywhere, model creation suffers. Maybe there is a way to only switch it on explicitly, but that seems a bit cumbersome.
@twiecki I'm somewhat confused. I think I'm running into this same error. Were these commits (inadvertently) reverted? Apologies if I'm missing something obvious. (PS: I'm on CPU, not GPU, but with …)
I did: 1c9adc6. I thought the PR fixed the problem even with …
Like I wrote, I think the commits in this PR (at least beb720e) have somehow disappeared from master. I'm not enough of a git wizard to determine how, but this seems to be the case. Because of that, here … Commenting the …
Hm, I reverted that merge because it caused some other unanticipated problems. Ideally we'd have dtypes consistent with …

Just to make sure, you're saying that the model works with that commit in place?
Interesting. I suppose we should revisit why other people were seeing problems with that in place.
Model works, both Metropolis and NUTS.

Edit: just for completeness, I'm running on CPU with …

Edit 2: that branch gives me massive dtype-related errors with ADVI; just the sampling works well now.
Related: #1246