Make tf.transpose emit simpler graph when possible #21945
Conversation
If not given an explicit 'perm' parameter, tf.transpose currently emits a graph that dynamically calculates it from the rank of the input tensor. This is completely unnecessary when the rank of the input can be statically determined at graph construction time. Modify tf.transpose to emit 'perm' as a single Const node whenever possible.
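As a rough illustration (plain NumPy only; the actual change lives inside TensorFlow's array_ops), the statically-known-rank case reduces to computing a constant reversed permutation up front, rather than emitting Rank/Range ops that compute it at run time:

```python
import numpy as np

# When the input's rank is known at graph construction time, the default
# permutation (reverse all axes) can be computed as a constant.
rank = 3
perm = (rank - 1) - np.arange(rank, dtype=np.int32)
print(perm)  # [2 1 0]

# Transposing with this perm matches the default full-axis reversal.
x = np.arange(24).reshape(2, 3, 4)
assert np.array_equal(np.transpose(x, perm), np.transpose(x))
```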
EDIT: It looks like //tensorflow/contrib/learn:dnn_test fails with the patch, in addition to the other pre-existing failures under contrib. I need to debug this further, since I can't figure out the root cause.
It looks like there are a few tests that this patch causes to fail. They all raise an exception in the same place: tensorflow/contrib/learn/python/learn/estimators/head.py, line 1924, in _centered_bias_step. What's strange is that the graph-building code doesn't fail when tf.transpose is called, i.e. the graph node is created just as expected, so its input parameters seem valid. Having gone through every such call with some good old print debugging, the parameters don't look like anything strange. The exception in the test is raised when the optimizer is creating the backprop graph for the bias computation, and at that point it looks like some variable and its grad have different shapes.
I don't quite understand how it would be possible to have a valid forward graph and then have the optimizer end up with different shapes for variables during backprop. I'll look closer into this later this week.
It looks like I've triggered a bug in TensorFlow (probably the Grappler optimizer). The following code fails with the error in the message above:

```python
a = ops.convert_to_tensor(a, name="a")
if not a.get_shape().ndims:
    rank = gen_array_ops.rank(a)
    perm = (rank - 1) - gen_math_ops._range(0, rank, 1)
else:
    rank = a.get_shape().ndims
    perm = (rank - 1) - np.arange(rank, dtype=np.int32)
```

However, if I simply add a tf.Print on the perm parameter it works, i.e. the following code passes the unit tests:

```python
a = ops.convert_to_tensor(a, name="a")
if not a.get_shape().ndims:
    rank = gen_array_ops.rank(a)
    perm = (rank - 1) - gen_math_ops._range(0, rank, 1)
else:
    rank = a.get_shape().ndims
    perm = (rank - 1) - np.arange(rank, dtype=np.int32)
from tensorflow.python.ops import logging_ops
perm = logging_ops.Print(perm, [perm], "sdfsdf")
```

Can you fix this by doing `perm = ops.convert_to_tensor((rank - 1) - np.arange(rank, dtype=np.int32))`?
I suggest this because I think it's not a Grappler-related bug, but rather an issue where some piece of code downstream behaves differently depending on whether perm is a tensor or not, and tf.Print makes it a tensor.
I should have mentioned that I already tried this and it doesn't help. In fact, I tried the following things:
Unit tests only pass with (1); the others all fail.
Since tf.Print() is basically tf.identity(), I wonder whether there could be some strange device placement issue going on here. However, I'm running the tests with --config=opt, so there's really only the CPU to choose from, and I can't see how that could factor in either.
@rmlarsen is there someone on the Grappler side who can help investigate this failure?
@efagerho thanks for the PR and sorry for the delay. Let me take a look.
This does appear to be a Grappler bug. The tests pass when I disable all Grappler optimizations. I will hunt down and squash the bug now.
I believe this was caused by a bug in the shape function of Transpose. I will submit a fix shortly. Then we should be able to proceed with this PR.
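For context, a hypothetical sketch (not the actual C++ shape function in TensorFlow's codebase): shape inference for Transpose just permutes the input dimensions, so a bug there can make the inferred shape of a gradient disagree with the shape of the variable it flows back to, which matches the backprop failure seen earlier in this thread.

```python
import numpy as np

def transpose_output_shape(input_shape, perm):
    """Dimension i of the output comes from input axis perm[i]."""
    return [input_shape[axis] for axis in perm]

x = np.zeros((2, 3, 5))
perm = [2, 0, 1]
inferred = transpose_output_shape(x.shape, perm)
assert inferred == [5, 2, 3]
# The inferred shape agrees with the shape NumPy actually produces.
assert list(np.transpose(x, perm).shape) == inferred
```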
That's quite unexpected. I would have assumed that code was fairly well exercised. Thanks for figuring it out!
@efagerho indeed!
PiperOrigin-RevId: 215824410
@efagerho your PR has now been merged. Thanks for the contribution!
Seems like the patch got rolled back. Were the Grappler fixes checked in before the CI ran?
We're working on resubmitting it; there were some obscure test failures triggered by this.
Automated rollback of PR tensorflow#21945 END_PUBLIC Automated rollback of commit 863f614. Revert tensorflow#21945. PiperOrigin-RevId: 215913175