incorrect gradient of reduce_prod(tf.complex*) #12514
Comments
It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.
Nagging Assignee: It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.
Nagging Assignee @rmlarsen: It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.
@rmlarsen Any timeline on when this might be looked at? For others that run into this issue, you can use:

    import tensorflow as tf
    from tensorflow.python.ops import array_ops
    from tensorflow.python.ops import math_ops
    from tensorflow.python.framework import ops
    from tensorflow.python.framework import dtypes
    from tensorflow.python.ops.math_grad import _safe_shape_div

    @tf.RegisterGradient("ModifiedProdGrad")
    def _ModifiedProdGrad(op, grad):
        """Gradient for Prod."""
        # The gradient can be expressed by dividing the product by each entry of the
        # input tensor, but this approach can't deal with zeros in the input.
        # Here, we avoid this problem by composing the output as a product of two
        # cumprod operations.

        input_shape = array_ops.shape(op.inputs[0])
        # Reshape reduction indices for the case where the parameter is a scalar
        reduction_indices = array_ops.reshape(op.inputs[1], [-1])

        # Expand grad to full input shape
        output_shape_kept_dims = math_ops.reduced_shape(input_shape, op.inputs[1])
        tile_scaling = _safe_shape_div(input_shape, output_shape_kept_dims)
        grad = array_ops.reshape(grad, output_shape_kept_dims)
        grad = array_ops.tile(grad, tile_scaling)

        # Pack all reduced dimensions into a single one, so we can perform the
        # cumprod ops. If the reduction dims list is empty, it defaults to float32,
        # so we need to cast here. We put all the shape-related ops on CPU to avoid
        # copying back and forth, and since listdiff is CPU only.
        with ops.device("/cpu:0"):
            rank = array_ops.rank(op.inputs[0])
            reduction_indices = (reduction_indices + rank) % rank
            reduced = math_ops.cast(reduction_indices, dtypes.int32)
            idx = math_ops.range(0, rank)
            other, _ = array_ops.setdiff1d(idx, reduced)
            perm = array_ops.concat([reduced, other], 0)
            reduced_num = math_ops.reduce_prod(array_ops.gather(input_shape, reduced))
            other_num = math_ops.reduce_prod(array_ops.gather(input_shape, other))
        permuted = array_ops.transpose(op.inputs[0], perm)
        permuted_shape = array_ops.shape(permuted)
        reshaped = array_ops.reshape(permuted, (reduced_num, other_num))

        # Calculate product, leaving out the current entry
        left = math_ops.cumprod(reshaped, axis=0, exclusive=True)
        right = math_ops.cumprod(reshaped, axis=0, exclusive=True, reverse=True)
        y = array_ops.reshape(tf.conj(left) * tf.conj(right), permuted_shape)

        # Invert the transpose and reshape operations.
        # Make sure to set the statically known shape information through a reshape.
        out = grad * array_ops.transpose(y, array_ops.invert_permutation(perm))
        return array_ops.reshape(out, input_shape), None

With TF gradient:

    with tf.Graph().as_default() as g:
        x = tf.Variable(1.0)
        E = tf.real(tf.reduce_prod(tf.complex([x, x], [2*x, 2*x])))
        with tf.Session() as sess:
            sess.run(tf.variables_initializer([x]))
            print(sess.run(tf.gradients(E, x)))
    >>> [10.0]

With modified gradient:

    with tf.Graph().as_default() as g:
        with g.gradient_override_map({"Prod": "ModifiedProdGrad"}):
            x = tf.Variable(1.0)
            E = tf.real(tf.reduce_prod(tf.complex([x, x], [2*x, 2*x])))
        with tf.Session() as sess:
            sess.run(tf.variables_initializer([x]))
            print(sess.run(tf.gradients(E, x)))
    >>> [-6.0]
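To see why the conjugation matters without running TensorFlow, here is a minimal plain-Python sketch of the same idea. It is not the TensorFlow implementation: `exclusive_cumprod`, `prod_grad` and `grad_wrt_x` are hypothetical helper names, and the sketch assumes the convention that the gradient arriving at complex(re, im) is split into its real part (for `re`) and imaginary part (for `im`).

```python
def exclusive_cumprod(values, reverse=False):
    """Running product that leaves out the current entry."""
    seq = list(reversed(values)) if reverse else list(values)
    out, acc = [], 1
    for v in seq:
        out.append(acc)   # product of everything seen before v
        acc = acc * v
    return list(reversed(out)) if reverse else out


def prod_grad(values, upstream=1, conjugate=True):
    """Gradient of prod(values) w.r.t. each entry, without any division."""
    left = exclusive_cumprod(values)
    right = exclusive_cumprod(values, reverse=True)
    grads = []
    for l, r in zip(left, right):
        partial = l * r   # product of all the other entries
        if conjugate:
            partial = partial.conjugate()
        grads.append(upstream * partial)
    return grads


def grad_wrt_x(grads_z):
    """Chain rule through z_k = complex(x, 2*x) (assumed convention, see above)."""
    return sum(g.real + 2.0 * g.imag for g in grads_z)


z = [complex(1.0, 2.0), complex(1.0, 2.0)]   # x = 1.0
print(grad_wrt_x(prod_grad(z, conjugate=False)))  # 10.0, matches the stock gradient
print(grad_wrt_x(prod_grad(z, conjugate=True)))   # -6.0, matches the modified gradient

# The two-cumprod form also copes with zeros, where dividing by each entry would fail.
print(prod_grad([0.0, 3.0, 2.0]))  # [6.0, 0.0, 0.0]
```

The only difference between the two printed results is the `conjugate` flag, which mirrors the `tf.conj(left) * tf.conj(right)` change in the workaround.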
Also note that the title of this issue should be "incorrect gradient of reduce_prod"
Nagging Assignee @rmlarsen: It has been 15 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.
Describe the problem
TensorFlow computes the wrong result for the gradient of E = tf.real(tf.reduce_prod(tf.complex([x, x], [2*x, 2*x]))) with respect to x at x = 1.0.
TensorFlow returns 10.0
The correct result is -6 since:
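The derivation can be written out as follows, taking x to be real and both factors of the product to be x + 2ix, as in the expression for E:

$$
\begin{aligned}
E(x) &= \operatorname{Re}\big[(x + 2ix)^2\big]
      = \operatorname{Re}\big[x^2 (1 + 2i)^2\big]
      = \operatorname{Re}\big[x^2 (-3 + 4i)\big]
      = -3x^2, \\
\frac{dE}{dx} &= -6x, \qquad \left.\frac{dE}{dx}\right|_{x=1} = -6 .
\end{aligned}
$$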
Below is mathematically equivalent code for E, for which TensorFlow returns the correct result of -6.0:
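The original snippet is not reproduced here. As a rough illustration of what "mathematically equivalent" can mean, the sketch below (plain Python, not TensorFlow, with hypothetical function names) expands the real part of the product by hand and checks both forms with a central finite difference.

```python
def energy_complex(x):
    """E computed through a complex product, as in the failing expression."""
    z = complex(x, 2.0 * x)
    return (z * z).real


def energy_real(x):
    """Same E with the real part expanded by hand: Re(z*z) = re*re - im*im."""
    re, im = x, 2.0 * x
    return re * re - im * im


def central_difference(f, x, h=1e-6):
    """Numerical derivative, used here only as an independent check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


print(central_difference(energy_complex, 1.0))
print(central_difference(energy_real, 1.0))
```

Both printed values should be close to -6, which agrees with the analytic result and not with the 10.0 returned by the stock gradient.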
System information
Linux distribution = Arch Linux (up to date)
TensorFlow was installed from the Arch Linux package python-tensorflow
I'm using an x86_64 CPU. I'm not using my GPU.
numpy (1.13.1)
protobuf (3.3.2)
tensorflow (1.3.0)
python (3.6.2)