Implement `at.diff` using basic slicing and subtraction `Op`s #901
Conversation
Remove it altogether. We don't need a dispatch for `at.diff`: we already have dispatches for the slicing and subtraction `Op`s, so the new graph will also work fine with those backends. That's yet another reason why we prefer to keep the number of `Op`s small: less dispatch code to maintain.
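For concreteness, here's a minimal sketch of the approach (the helper name `diff_via_slices` is hypothetical, not the PR's exact code): expressing `diff` with basic slicing and subtraction means the existing backend dispatches for `*Subtensor*` and `sub` already cover it.

```python
import aesara.tensor as at

def diff_via_slices(x, n=1, axis=-1):
    # The n-th difference is n repeated first differences:
    # x[..., 1:, ...] - x[..., :-1, ...] along `axis`.
    axis = axis % x.ndim
    head = [slice(None)] * x.ndim
    tail = [slice(None)] * x.ndim
    head[axis] = slice(1, None)
    tail[axis] = slice(None, -1)
    for _ in range(n):
        x = x[tuple(head)] - x[tuple(tail)]
    return x

x = at.vector("x")
out = diff_via_slices(x, n=2)  # a graph of Subtensor and Elemwise sub nodes
```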
I don't think so. Aesara `scan` will introduce some overhead, and I don't think there's a big need for large or symbolic `n`.
The current error in `TestDiffOp.test_output_type` looks like a static shape inference problem: i.e. the test assumes that all the `Op.make_node` and/or `Op.infer_shape` methods involved can exactly infer the output `Type`'s shape using only static information.

Previously, `at.diff` would return a single `DiffOp` that performed the static shape inference in `DiffOp.make_node` and returned an output `Type` with the expected `Type.shape` values. Now, `at.diff` returns a graph containing a few different `Op`s that are expected to produce the same end result. We need to find out how/where those `Op`s are losing the static shape information along the way to the graph's output `Type`.
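To make that first point concrete, here is a toy, hypothetical `Op` (not aesara's actual `DiffOp`) whose `make_node` computes a static output shape from a static input shape; it assumes a vector input:

```python
import numpy as np
import aesara.tensor as at
from aesara.graph.basic import Apply
from aesara.graph.op import Op
from aesara.tensor.type import TensorType

class ToyDiff(Op):
    """Toy first-difference Op; only the shape logic matters here."""

    def make_node(self, x):
        x = at.as_tensor_variable(x)
        # Static inference: if the input length is known at graph-construction
        # time, the output length is known too (one less than the input's).
        (in_len,) = x.type.shape
        out_shape = (None if in_len is None else in_len - 1,)
        out_type = TensorType(dtype=x.type.dtype, shape=out_shape)
        return Apply(self, [x], [out_type()])

    def perform(self, node, inputs, output_storage):
        (x,) = inputs
        output_storage[0][0] = np.diff(x)
```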
Since all the new `Op`s that make up `at.diff` are differences and `*Subtensor*`s, my guess is that one of the `*Subtensor*`s is losing that information in its `Op.make_node`. I don't think those `Op`s have been updated to use the newly available `Type.shape` information, so that could easily be the issue. In other words, the error may have nothing to do with the changes you made, and everything to do with some shortcomings in a `*Subtensor*.make_node` implementation.

These are good things to come across, because an improvement to a `*Subtensor*` `Op` could affect a lot more than just `at.diff`.
My advice is to run `TestDiffOp.test_output_type` locally, stop in the debugger, and call `aesara.dprint(out, print_type=True)`. That will show you the entire graph and the type of each term in it. From there you can see which nodes/`Op`s produced `Type`s that lost shape information. You may need to reason about that yourself, though (e.g. given a vector `y`, `z = y[slice(None)]` implies `z.shape == y.shape`, so if `y.type.shape == (s,)` then `z.type.shape == (s,)`).
Actually, take a look at this:

```python
import aesara
import aesara.tensor as at

y = at.tensor(dtype="float32", shape=(2,))
y.type.shape
# (2,)

# We know that the shape of this should be equal to the shape of `y`
z = y[slice(None)]

aesara.dprint(z, print_type=True)
# Subtensor{::} [id A] <TensorType(float32, (None,))> ''
#  |<TensorType(float32, (2,))> [id B] <TensorType(float32, (2,))>

z.type.shape
# (None,)
```
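A side note on why that `None` is avoidable: for a concrete slice and a statically known input length, plain Python can already compute the output length, which is exactly the kind of computation a `*Subtensor*.make_node` could perform. A standalone sketch:

```python
# Statically known input length, as in the example above.
length = 2
# slice.indices normalizes a slice against a given length...
start, stop, step = slice(None).indices(length)
# ...and the resulting range's length is the output length.
out_len = len(range(start, stop, step))
print(out_len)  # 2, so the output Type could be (2,) rather than (None,)
```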
The `Subtensor` node is returning an output variable, `z`, with a `Type` of `TensorType(float32, (None,))`. Since the shape value `None` corresponds to "not known at compile time", it's this `Op` that's breaking the test.

At this point we should at least create an issue for the `Subtensor` problem. As far as this PR/issue is concerned, you can address the problem here, if it's not too complicated, or we can set that particular test case to `xfail` and address the problem in a separate PR. (A separate PR can also be worked on simultaneously and this one put on hold.) Either way, I'll leave that up to you.
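If we go the `xfail` route, a minimal sketch of what that could look like (the `reason` string here is mine, not from the PR):

```python
import pytest

class TestDiffOp:
    @pytest.mark.xfail(
        reason="Subtensor loses static shape information in make_node"
    )
    def test_output_type(self):
        ...
```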
Implement `at.diff` using basic slicing and subtraction `Op`s
The failing test was introduced in #855, because there used to be an actual error in the output type of the old `DiffOp`. I think it is fine to remove that test (or `xfail` it), and open a separate issue to make the output type of `Subtensor` more precise.
To understand the conversation about the output type, it's useful to read the documentation here: https://aesara.readthedocs.io/en/latest/extending/type.html
Force-pushed from e154bb1 to 93d7c21
Hi all, revisiting this PR after some hiatus. I just opened an issue to address what @brandonwillard explained above (see issue #922). Thank you for the clear instructions. As for this PR, I currently removed the `DiffOp` class.
Codecov Report

```diff
@@            Coverage Diff             @@
##             main     #901      +/-   ##
==========================================
- Coverage   78.94%   78.91%   -0.04%
==========================================
  Files         152      152
  Lines       47712    47654      -58
  Branches    10858    10851       -7
==========================================
- Hits        37667    37606      -61
- Misses       7546     7548       +2
- Partials     2499     2500       +1
```
This looks good; thanks!
I forgot to remove those "fix pre-commit" commits before merging; regardless, such commits shouldn't exist. If possible, @larryshamalama, make sure such fixes are squashed into the commits they amend. In general, we would like it to be possible for all tests (e.g. the pre-commit checks) to pass at every commit.
Thanks for the suggestion. I do have a habit of stacking my commits without much thought... Whenever I fix pre-commit errors, it's because I forgot to check before pushing. Looking at the repository's commit history, I see many of my "fix pre-commit" commits. Generally, is it good practice to separate progress into multiple commits? I just learned about rebasing commits into one, so I'll be sure to remember to do this next time. Also, for this PR, I initially left it as a draft because I was not sure if I needed to look into issue #922 first.
Take a look at this page for some descriptions and examples of good commit structuring.
That's a more important issue, but we don't need it to hold up this PR.
Closes #860.
I'm happy to receive any pointers or suggestions on this PR. Hopefully it can spark some discussion, since there are gaps in my understanding.
Two questions:

1. I kept `DiffOp` because the JAX and Numba backends seem to use it. Should I delete `DiffOp` altogether or just modify `at.diff` as in the current (first) commit?
2. Should large or symbolic `n` values be handled with `aesara.scan`?