WIP: fix bound evaluation in dist_math #1591
does the original code not work? |
Original code? |
Actually build on miniconda with python2.7
ret = 1
for c in vals:
    ret = ret * (1 * c)
return ret
|
I assumed it would reintroduce #1449 |
Oh that was the original issue. I think that bears taking another look. Either we have broadcasting in |
Perhaps we should have a kwarg |
How does this handle the #1449 case? |
It does fix it, at least based on the example @AustinRochford used in the issue. Perhaps I will include that as a test, in case of a regression. |
Please have a peek at this @ferrine and/or @AustinRochford |
I'm a bit sick and have a deadline for tomorrow:( I'll review the PR in some days |
@ferrine no big deal, thanks. Get well. |
This looks like a reasonable compromise to me. 👍 |
try:
    return tt.all(tt.stack([1*val for val in vals]), axis=0)
except (TypeError, IndexError):
    return tt.all([tt.all(1 * val) for val in vals])
When is it right to do this second version?
Sometimes tt.stack will fail due to different lengths of the elements of vals (the TypeError; happens with the Poisson mixture) and when there is no axis to iterate over (resulting in the IndexError).
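The stack-then-fall-back pattern described here can be sketched with NumPy as a stand-in for Theano. This is only an analogue: NumPy raises `ValueError` for mismatched shapes rather than the `TypeError`/`IndexError` reported for `tt.stack`, and the function name `alltrue_stack_or_collapse` is hypothetical.

```python
import numpy as np

def alltrue_stack_or_collapse(vals):
    """NumPy stand-in for the patched alltrue: stack if possible, else collapse."""
    try:
        # Stacking only works when every condition has exactly the same shape.
        return np.all(np.stack([np.asarray(v) for v in vals]), axis=0)
    except ValueError:
        # Mismatched shapes: fall back to reducing everything to one scalar.
        return np.all([np.all(v) for v in vals])

# Same-shaped conditions keep their element-wise structure.
same = alltrue_stack_or_collapse(
    [np.array([True, False, True]), np.array([True, True, True])]
)

# Conditions of different lengths silently collapse to a single value.
mixed = alltrue_stack_or_collapse(
    [np.array([True, False, True]), np.array([True, True])]
)
```

Note that the caller cannot tell from the call site which branch ran, so the shape of the result depends on the inputs.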
You can try to ravel all variables after safe converting them in theano, then concatenate and finally use tt.all
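A rough NumPy sketch of the ravel-and-concatenate suggestion, assuming the NumPy calls behave like their Theano counterparts for this purpose. The name `alltrue_raveled` is hypothetical and not part of the PR.

```python
import numpy as np

def alltrue_raveled(vals):
    """Flatten every condition, join them, and reduce to one boolean."""
    # np.ravel turns scalars and arrays of any shape into 1-d arrays,
    # so heterogeneous shapes can be concatenated safely.
    flat = [np.ravel(np.asarray(v)) for v in vals]
    return np.all(np.concatenate(flat))

all_ok = alltrue_raveled([np.ones((2, 2)), np.array([1, 1, 1]), True])
one_bad = alltrue_raveled([np.ones((2, 2)), np.array([1, 0, 1]), True])
```

This handles any mix of shapes, but it always returns a single value, so element-wise information is lost.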
I thought stack did most of that (including the conversion) for me automatically?
You run into situations where vals is not only of different types, but also of different lengths, so we need to deal with those potential combinations. tt.stack doesn't like heterogeneous dimensions. Haven't seen a case that tricks both yet, however.
Different dims can be excluded by ravel
hmm, they should always be broadcastable together though (since this is
elementwise). It seems to me that it should work like this.
Actually, I think I'm not clear why we're not sticking with my original
version. What exactly is the extra case that we want to handle?
This version will sometimes collapse down the conditions into a single
value, which is not what we want for elementwise. We do want that (or are
ok with it) for non-elementwise distributions, but we should just have a
different function for that. Or make a more explicit switch of some kind.
Right now, you might accidentally get the collapsing behavior, and that
will be confusing.
Stacking doesn't handle arrays that can be broadcast together but
aren't the same shape (some of the inputs to conditions may not be the full
shape), which we need for this.
On Mon, Dec 12, 2016 at 12:57 PM Chris Fonnesbeck <notifications@github.com> wrote:
*@fonnesbeck* commented on this pull request.
------------------------------
In pymc3/distributions/dist_math.py
<#1591>:
@@ -29,8 +29,16 @@ def bound(logp, *conditions):
def alltrue(vals):
- return tt.all([tt.all(1 * val) for val in vals])
+ """
+ Asserts truth of all elements in vals, across the lowest axis. This maintains
+ element-wise evaluations for multivariate inputs.
+ """
+ try:
+ return tt.all(tt.stack([1*val for val in vals]), axis=0)
+ except (TypeError, IndexError):
+ return tt.all([tt.all(1 * val) for val in vals])
|
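The broadcasting point in the comment above can be illustrated with NumPy, used here as a stand-in for Theano. The function `alltrue_elemwise` mirrors the product-based loop quoted earlier in the thread; its name and the example values are assumptions made for illustration.

```python
import numpy as np

def alltrue_elemwise(vals):
    """Multiply the conditions together so that broadcasting is preserved."""
    ret = 1
    for c in vals:
        # 1 * c converts booleans to integers; the product is 1 only
        # where every condition holds.
        ret = ret * (1 * np.asarray(c))
    return ret

value = np.array([0.5, -1.0, 2.0])  # vector of observations
alpha = 2.0                         # scalar parameter

# One condition has shape (3,), the other is a scalar.
conds = [value > 0, alpha > 0]
result = alltrue_elemwise(conds)

# Stacking the same conditions fails because the shapes differ,
# even though they broadcast together.
try:
    np.stack([np.asarray(c) for c in conds])
    stack_failed = False
except ValueError:
    stack_failed = True
```

The product keeps one entry per element of `value`, which is the behaviour wanted for element-wise distributions.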
The original problem is #1449, which had some unexpected broadcasting behavior. |
Right, okay. Then suggest we make |
Perhaps it's just simpler (or at least clearer, if not simpler) to call |
For multivariate distributions? That would be okay. Though I think we should still rename We should keep elemwise bound to be completely elemwise so that the logps are elemwise (and individual elements can be nans). It's useful for debugging. |
No, I mean for all the distributions (multivariates already do this). Unless I am missing something, if any condition for any element of a call to So, if you have a vector-valued Gamma distribution, for example, you would return |
Gotcha, I think that seems bad to me. Elementwise distributions should be as elementwise as possible. This change would make them partially a single distribution. It's more conceptually coherent. What do you mean you can still debug inside the logp? The problems I'm thinking of are things like: if you pass bad initial values to an elemwise dist, it's better if you can look at the logp to tell which ones are bad. Or if find_MAP messes up. Or if you write a bad sampler or distribution. You can also imagine custom samplers using the -inf information to adjust the scale for particular elements. I also don't really see the benefit. We can just have two functions with clearer names. |
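The debugging argument can be made concrete with a small NumPy sketch. The two functions below are hypothetical illustrations of "two functions with clearer names"; they are not the PyMC3 API, and the real `bound` operates on Theano tensors.

```python
import numpy as np

def bound_elemwise(logp, *conditions):
    """Return logp where every condition holds, -inf per failing element."""
    ok = np.ones_like(logp, dtype=bool)
    for c in conditions:
        ok = ok & np.asarray(c, dtype=bool)  # broadcasts scalar conditions
    return np.where(ok, logp, -np.inf)

def bound_collapsed(logp, *conditions):
    """Return logp only if every element of every condition holds."""
    ok = all(np.all(c) for c in conditions)
    return np.where(ok, logp, -np.inf)

value = np.array([0.5, -1.0, 2.0])
logp = -value  # log-density of a unit-rate exponential on its support

elem = bound_elemwise(logp, value >= 0)
collapsed = bound_collapsed(logp, value >= 0)

# With the element-wise version the offending index is easy to find.
bad = np.flatnonzero(np.isinf(elem))
```

With the collapsed version every entry becomes `-inf`, so a single bad value hides which element caused it.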
I think I've lost the plot on this one. I will close this and let someone else take a shot at it. I'm sure the answer is easy and I'm just missing it. |
This is a first pass at fixing the bug in bound that manifests itself in #1579. I've used stack to get around the heterogeneous arguments, but this results in dimension-matching problems for some models that I don't yet understand.
Also added the tanks example from #1579 to test_examples.
Closes #1579