
incorrect gradient of reduce_prod(tf.complex*) #12514

Closed
kjslag opened this issue Aug 23, 2017 · 12 comments
@kjslag
Contributor

kjslag commented Aug 23, 2017

Describe the problem

TensorFlow computes the wrong result for the following gradient:

import tensorflow as tf
x = tf.Variable(1.0)
E = tf.real(tf.reduce_prod(tf.complex( [x,x], [2*x,2*x] )))
sess = tf.Session()
sess.run(tf.variables_initializer([x]))
sess.run(tf.gradients(E,x))

TensorFlow returns 10.0.
The correct result is -6, since:

E = real((x+2i*x)^2) = real((1+2i)^2) * x^2 = real(1+4i-4) * x^2 = -3*x^2
dE/dx = -6*x = -6 for x=1
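The derivation above can be checked numerically without TensorFlow at all. This is a minimal standalone sketch (my own illustration, not part of the report) that evaluates E(x) = Re((x + 2i*x)^2) with plain Python complex numbers and estimates dE/dx at x = 1 by central finite differences:

```python
# Standalone check of dE/dx = -6 at x = 1, using plain Python complex
# arithmetic and a central finite difference (no TensorFlow needed).

def E(x):
    z = complex(x, 2 * x)      # z = x + 2i*x
    return (z * z).real        # Re(z^2) = -3*x^2

h = 1e-6
grad = (E(1.0 + h) - E(1.0 - h)) / (2 * h)
print(round(grad, 4))  # -6.0
```

This confirms that -6, not 10, is the correct gradient value.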

Below are two mathematically equivalent expressions for E, for both of which TensorFlow returns the correct result of -6.0:

E = tf.real( tf.complex(x,2*x) * tf.complex(x,2*x) )
E = tf.real(tf.exp(tf.reduce_sum(tf.log(tf.complex( [x,x], [2*x,2*x] )))))
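The second rewrite relies on the identity prod(z_k) = exp(sum(log(z_k))) for complex factors. A quick plain-Python sanity check (my own illustration, using the standard-library cmath module rather than TensorFlow) that the two formulations agree at x = 1:

```python
# Sanity check that the direct product and the exp(sum(log(...))) rewrite
# give the same complex value for the factors used in the bug report.
import cmath

x = 1.0
zs = [complex(x, 2 * x), complex(x, 2 * x)]  # [x + 2i*x, x + 2i*x]

direct = zs[0] * zs[1]
via_log = cmath.exp(sum(cmath.log(z) for z in zs))

print(direct.real)    # -3.0
print(via_log.real)   # -3.0 (up to floating-point rounding)
```

Since both routes compute the same E, the discrepancy in the gradients points at the Prod gradient itself, not at the forward computation.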

System information

Linux distribution = Arch Linux (up to date)
TensorFlow was installed from the Arch Linux package python-tensorflow
I'm using an x86_64 CPU. I'm not using my GPU.
numpy (1.13.1)
protobuf (3.3.2)
tensorflow (1.3.0)
python (3.6.2)

@tensorflowbutler
Member

It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.

@woodshop

woodshop commented Apr 24, 2018

@rmlarsen Any timeline on when this might be looked at?
@brianwa84 - you might be interested in this

For others who run into this issue, you can work around it with gradient_override_map. Example (note that this has not been unit tested):

import tensorflow as tf
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.framework import ops
from tensorflow.python.framework import dtypes
from tensorflow.python.ops.math_grad import _safe_shape_div

@tf.RegisterGradient("ModifiedProdGrad")
def _ModifiedProdGrad(op, grad):
    """Gradient for Prod."""
    # The gradient can be expressed by dividing the product by each entry of the
    # input tensor, but this approach can't deal with zeros in the input.
    # Here, we avoid this problem by composing the output as a product of two
    # cumprod operations.

    input_shape = array_ops.shape(op.inputs[0])
    # Reshape reduction indices for the case where the parameter is a scalar
    reduction_indices = array_ops.reshape(op.inputs[1], [-1])

    # Expand grad to full input shape
    output_shape_kept_dims = math_ops.reduced_shape(input_shape, op.inputs[1])
    tile_scaling = _safe_shape_div(input_shape, output_shape_kept_dims)
    grad = array_ops.reshape(grad, output_shape_kept_dims)
    grad = array_ops.tile(grad, tile_scaling)

    # Pack all reduced dimensions into a single one, so we can perform the
    # cumprod ops. If the reduction dims list is empty, it defaults to float32,
    # so we need to cast here.  We put all the shape-related ops on CPU to avoid
    # copying back and forth, and since listdiff is CPU only.
    with ops.device("/cpu:0"):
        rank = array_ops.rank(op.inputs[0])
        reduction_indices = (reduction_indices + rank) % rank
        reduced = math_ops.cast(reduction_indices, dtypes.int32)
        idx = math_ops.range(0, rank)
        other, _ = array_ops.setdiff1d(idx, reduced)
        perm = array_ops.concat([reduced, other], 0)
        reduced_num = math_ops.reduce_prod(array_ops.gather(input_shape, reduced))
        other_num = math_ops.reduce_prod(array_ops.gather(input_shape, other))
    permuted = array_ops.transpose(op.inputs[0], perm)
    permuted_shape = array_ops.shape(permuted)
    reshaped = array_ops.reshape(permuted, (reduced_num, other_num))

    # Calculate product, leaving out the current entry
    left = math_ops.cumprod(reshaped, axis=0, exclusive=True)
    right = math_ops.cumprod(reshaped, axis=0, exclusive=True, reverse=True)
    y = array_ops.reshape(tf.conj(left) * tf.conj(right), permuted_shape)

    # Invert the transpose and reshape operations.
    # Make sure to set the statically known shape information through a reshape.
    out = grad * array_ops.transpose(y, array_ops.invert_permutation(perm))
    return array_ops.reshape(out, input_shape), None
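The core of the fix is the `tf.conj` call: the gradient is the leave-one-out product of the inputs (built from two exclusive cumprods so that zeros in the input are handled), and for complex inputs that product must be conjugated. A minimal NumPy sketch (my own illustration, not the TensorFlow op) of that trick on the values from the bug report:

```python
# Leave-one-out product via exclusive cumprods, with conjugation.
# For each entry z[k], the product of all *other* entries equals
# (exclusive cumprod from the left) * (exclusive cumprod from the right).
import numpy as np

z = np.array([1 + 2j, 1 + 2j])  # the factors from the bug report at x = 1

# left[k] = prod(z[:k]), right[k] = prod(z[k+1:])  (exclusive cumprods)
left = np.concatenate(([1.0], np.cumprod(z[:-1])))
right = np.concatenate((np.cumprod(z[::-1][:-1])[::-1], [1.0]))

leave_one_out = left * right          # product of all entries except z[k]
grad_factor = np.conj(leave_one_out)  # conjugated, as in the patched gradient

print(leave_one_out)  # [1.+2.j 1.+2.j]
print(grad_factor)    # [1.-2.j 1.-2.j]
```

Without the conjugation, the gradient of a real-valued loss with respect to complex inputs comes out wrong, which is exactly the behavior reported above.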

With TF gradient:

with tf.Graph().as_default() as g:
    x = tf.Variable(1.0)
    E = tf.real(tf.reduce_prod(tf.complex( [x,x], [2*x,2*x] )))
    with tf.Session() as sess:
        sess.run(tf.variables_initializer([x]))
        print(sess.run(tf.gradients(E,x)))

>>> [10.0]

With modified gradient:

with tf.Graph().as_default() as g:
    with g.gradient_override_map({"Prod": "ModifiedProdGrad"}):
        x = tf.Variable(1.0)
        E = tf.real(tf.reduce_prod(tf.complex( [x,x], [2*x,2*x] )))
        with tf.Session() as sess:
            sess.run(tf.variables_initializer([x]))
            print(sess.run(tf.gradients(E,x)))

>>> [-6.0]

@woodshop

Also note that the title of this issue should be "incorrect gradient of reduce_prod"


@brianwa84 brianwa84 changed the title incorrect gradient of real(reduce_prod(complex(...))) incorrect gradient of reduce_prod(tf.complex*) May 16, 2018
@brianwa84 brianwa84 assigned brianwa84 and unassigned rmlarsen May 16, 2018