ValueError: An operation has None for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval. #12521

Closed
Utsav-Patel opened this issue Mar 20, 2019 · 10 comments

Comments

@Utsav-Patel

I asked my question on StackOverflow. Link.

I tried to make a custom layer using Keras. I only want to implement the following two lines of code in the call function, and the layer should be trainable.

AV = K.dot(A, Vin)
Vout = K.dot(AV, W)

The dimensions of A, Vin, and W are (n, n), (?, n, c), and (c, f) respectively.
I would like to train my network on the MNIST or CIFAR-10 dataset.
Sharky said in her answer that it depends on the dataset and data shapes.
I don't understand exactly what the problem is here.
Could someone please help me overcome it?
Thank you.

@cottrell

cottrell commented Apr 8, 2019

Which version are you on? I am hitting this on some generic math manipulations in TF 2.0. I think it is a weird error message.

@luozhouyang

@cottrell same issue

@abaxi

abaxi commented May 29, 2019

@Utsav-Patel, @luozhouyang, @cottrell

Does your code have any weights that were defined but left unused? That may be the reason for the error. My guess is that since such a weight is not being used, its gradient cannot be computed w.r.t. the loss, so the gradient is None.

This is more difficult to identify if your layer inherits from another layer: calling the super constructor will add weights that you probably don't use, in which case don't call super().

I've coded out an example to show this in action (tf version 1.13.1, keras 2.2.4). Comment out the line

 v = v+K.dot(x, self.kernelB)       ### comment out this line to get NONE gradient error

inside call() to get the error. If that line is commented out, self.kernelB is never used, and Keras raises the error.

from keras import backend as K
from keras.layers import Layer, Activation
from keras.engine.base_layer import InputSpec
import numpy as np
from keras.models import Sequential

class CustomDense(Layer):

    def __init__(self, units, bias_constraint=None, **kwargs):
        
        if 'input_shape' not in kwargs and 'input_dim' in kwargs:
            kwargs['input_shape'] =  (kwargs.pop('input_dim'),)
        
        super(CustomDense, self).__init__(**kwargs)
        self.num_outputs = units
        self.input_spec = InputSpec(min_ndim=2)

    def build(self, input_shape):
        # Create a trainable weight variable for this layer.
        self.kernelA = self.add_weight(name='kernelA',
                                       shape=(input_shape[1], self.num_outputs),
                                       initializer='uniform')

        ##This weight is defined here, but its usage can
        ##be controlled by commenting out a line in call
        self.kernelB = self.add_weight(name='kernelB',
                                       shape=(input_shape[1], self.num_outputs),
                                       initializer='uniform')
        
        self.built = True
        super(CustomDense, self).build(input_shape)  # Be sure to call this at the end

    def call(self, x):
        v = K.dot(x, self.kernelA)
        v = v + K.dot(x, self.kernelB)     ### comment out this line to get the None-gradient error
        return v
        
    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.num_outputs)

if __name__ == '__main__':
    n_units = in_dim = 10
    test = np.random.random((100,in_dim))
    model = Sequential()
    layer = CustomDense(units=n_units, input_dim=in_dim)
    model.add(layer)
    model.add(Activation("elu"))
    model.compile("adam", "mae")
    model.fit(test, test)

@nbro

nbro commented Oct 24, 2019

@abaxi I could reproduce this error with another example similar to yours. You can find the example here: https://stackoverflow.com/a/58533503/3924118. Just remove the usage of shared_variable in the call method.
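For reference, here is a minimal sketch along the same lines with a subclassed tf.keras model (the model and variable names are made up, not taken from the linked answer): a trainable variable that call() never touches has no path to the loss, and depending on the TF/Keras version this surfaces as the ValueError in the title or as a "Gradients do not exist for variables" warning.

import numpy as np
import tensorflow as tf

class ModelWithUnusedVariable(tf.keras.Model):  # hypothetical name
    def __init__(self):
        super(ModelWithUnusedVariable, self).__init__()
        self.dense = tf.keras.layers.Dense(4)
        # Trainable variable that call() never uses, so it is disconnected from the loss.
        self.unused_variable = tf.Variable(1.0, trainable=True)

    def call(self, inputs):
        return self.dense(inputs)

model = ModelWithUnusedVariable()
model.compile(optimizer='adam', loss='mse')
model.fit(np.random.random((8, 3)), np.random.random((8, 4)))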

@giridhar-pamisetty

@Utsav-Patel This error arises when some of the weights in your model are not used, so their gradient w.r.t. the loss is undefined. Make sure you use all of the model's weights to overcome this error.

@chizala

chizala commented Mar 31, 2020

@Utsav-Patel This error arises when some of the weights in your model are not used, so their gradient w.r.t. the loss is undefined. Make sure you use all of the model's weights to overcome this error.

How do you ensure that all weights in the model are used?

@giridhar-pamisetty

In my case, I just used the leftover weights by multiplying them by 0, so that all weights are covered. This solved the issue.
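A minimal sketch of that workaround in the same standalone-Keras style as the example above (the PartiallyUsedDense layer and its weight names are made up): the otherwise-unused kernel is folded into the output with a zero coefficient, so it stays connected to the loss and gets a zero gradient instead of None.

from keras import backend as K
from keras.layers import Layer

class PartiallyUsedDense(Layer):
    def __init__(self, units, **kwargs):
        super(PartiallyUsedDense, self).__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.kernelA = self.add_weight(name='kernelA',
                                       shape=(input_shape[1], self.units),
                                       initializer='uniform')
        # Weight that the output does not really depend on.
        self.kernelB = self.add_weight(name='kernelB',
                                       shape=(input_shape[1], self.units),
                                       initializer='uniform')
        super(PartiallyUsedDense, self).build(input_shape)

    def call(self, x):
        # The zero coefficient keeps kernelB connected to the loss,
        # so its gradient is a zero tensor instead of None.
        return K.dot(x, self.kernelA) + 0.0 * K.dot(x, self.kernelB)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.units)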

@chizala

chizala commented Mar 31, 2020

In my case, I just used the leftover weights by multiplying them by 0, so that all weights are covered. This solved the issue.

Thank you, sir.

@jaswanthbjk

@giridhar-pamisetty

Can you please suggest a way to check for unused weights?

@giridhar-pamisetty

I was using only a part of the hidden-node weights to calculate the output. After getting this error, I multiplied the remaining hidden-node weights by zero, so all the weights are covered.
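A minimal sketch of one way to check for unused weights (an assumed helper for Keras 2.x on the TF 1.x backend; find_unused_weights is not a Keras API): ask the backend for the gradient of the compiled loss with respect to every trainable weight and report the ones that come back as None.

from keras import backend as K

def find_unused_weights(model):
    # Trainable weights whose symbolic gradient w.r.t. the loss is None are unused.
    grads = K.gradients(model.total_loss, model.trainable_weights)
    return [w.name for w, g in zip(model.trainable_weights, grads) if g is None]

# Usage, after model.compile(...):
#   print(find_unused_weights(model))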
