
How to deep control gradient back propagation with Keras #956

Closed
jerryli1981 opened this issue Nov 6, 2015 · 21 comments

Comments

@jerryli1981

Hi All,
I would like to know how to write code that runs gradient backpropagation manually, like the Lua (Torch) code below does:

local sim_grad = self.criterion:backward(output, targets[j])
local rep_grad = self.MLP:backward(rep, sim_grad)


Keras's examples show me how to construct a sequential model like the one below:
model = Sequential()
model.add(Dense(128, input_shape=(784,)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(10))
model.add(Activation('softmax'))


However, that is not enough for me: I need to generate the gradients for this model myself. How can I write code to control the backward pass of a sequential model?
Thanks

@EderSantana
Contributor

Do you want to train the model, or do you need the gradients to do something else? If you want to train the model, just keep reading the docs and look at the fit method; it will calculate the gradients and train everything for you.

If you need the gradients for other things, you have to use Theano. Get the output of your model, define a cost function, and calculate the gradients with respect to each parameter. For example:

import theano.tensor as T

D = T.matrix()               # desired targets
Y = model.get_output()       # symbolic output of the model
Cost = ((D - Y) ** 2).mean()
gradients = [T.grad(Cost, p) for p in model.get_params()]
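To get numbers out of those symbolic gradients, you would still compile a Theano function. A minimal sketch continuing the snippet above (the model.get_input() call and the x_batch/d_batch arrays are assumptions for illustration):

import theano

X = model.get_input()   # symbolic model input
# Compile a function that returns the numeric gradients for a batch of data
get_gradients = theano.function([X, D], gradients, allow_input_downcast=True)
gradient_values = get_gradients(x_batch, d_batch)   # list of numpy arrays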

@jerryli1981
Author

My model is a Recursive Neural Network (RNN) + MLP. Based on your suggestion, I have two choices.
One is to focus on training the MLP and generate gradients to train the RNN.
The other is to build a sequential model that contains the RNN + MLP and train them together.
The second choice would look like this:

model.add(MyRNN)
model.add(Dense(128, input_shape=(784,)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(10))
model.add(Activation('softmax'))

Is that possible?

@EderSantana
Contributor

Run through and understand this example; it does something similar to what you are doing: https://github.com/fchollet/keras/blob/master/examples/lstm_text_generation.py

@NightFury13

@EderSantana : I don't think that's what Jerry asked. Is there some way to compute the backpropagated gradients w.r.t. each hidden layer (or the input layer)? The equivalent of this in Caffe, for example, would be something like:

net.blobs[last_layer].diff[0][target_class] = 1  # set the diff of the last layer to 1 (i.e. seed the gradient for the target class)
back_pass = net.backward()
jacobian = back_pass[desired_layer].copy()       # gives the gradient w.r.t. desired_layer

@jerryli1981 : Were you able to find a way to do this?
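For reference, a rough Keras analogue of the Caffe snippet above would take the gradient of the target-class score with respect to the input via the backend's gradients function. A hedged sketch, where target_class and x_batch are illustrative and the model is assumed to be already built:

from keras import backend as K

target_class = 3                                    # illustrative class index
score = model.output[:, target_class]               # score of the target class
grads = K.gradients(score, model.input)[0]          # d(score) / d(input)
get_input_grads = K.function([model.input, K.learning_phase()], [grads])
jacobian = get_input_grads([x_batch, 0])[0]         # 0 = test mode; returns a numpy array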

@johnny5550822

@jerryli1981 Were you able to find a way to calculate the gradient at a layer? (I am also originally a Torch7 user, and it is straightforward to do there. I am not sure how to do it in Keras...)

@jemshit

jemshit commented Apr 15, 2017

I'm trying to do backpropagation with an MLP. Is there a way to run the backward pass manually in Keras (using TensorFlow)?

@hamzamerzic

@jemshit TensorFlow allows that via the opt.apply_gradients method, as shown here: https://www.tensorflow.org/api_docs/python/tf/train/Optimizer or here: https://github.com/fchollet/keras/blob/master/keras/optimizers.py#L592
Is there a backend-agnostic way of doing this, though? @fchollet
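In the meantime, with the TensorFlow backend the pattern looks roughly like the sketch below (graph-mode TF 1.x assumed; the loss and the x_batch/y_batch arrays are illustrative):

import tensorflow as tf
from keras import backend as K

y_true = tf.placeholder(tf.float32, shape=(None, 10))     # illustrative target placeholder
loss = tf.reduce_mean(tf.square(y_true - model.output))   # example loss on the Keras model's output

opt = tf.train.GradientDescentOptimizer(learning_rate=0.01)
grads_and_vars = opt.compute_gradients(loss, model.trainable_weights)
# ...inspect or modify the gradients here if needed...
train_step = opt.apply_gradients(grads_and_vars)

sess = K.get_session()
sess.run(train_step, feed_dict={model.input: x_batch,
                                y_true: y_batch,
                                K.learning_phase(): 1})    # 1 = training mode (needed if the model uses Dropout)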

@ROZBEH

ROZBEH commented Jul 4, 2017

Were you guys able to resolve this issue? I have to backpropagate the error, but at each time step the derivative is different and I have to manipulate it. How is that possible in Keras/TensorFlow?

@stale

stale bot commented Oct 2, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

@stale stale bot added the stale label Oct 2, 2017
@stale stale bot closed this as completed Nov 1, 2017
@mongoose54

More or less the same question here: How can I backpropagate a specific error value in a Keras model? Thanks

@ROZBEH

ROZBEH commented Jan 16, 2018

I couldn't figure this out, so I ended up using PyTorch, which gives you this capability.

@jnhelen

jnhelen commented Oct 3, 2018

@jemshit Hi! Have you solved this problem?

@jemshit

jemshit commented Oct 3, 2018 via email

@eliethesaiyan

@jemshit, I think what @jerryli1981 meant is being able to apply a function to the gradient at each stage of the backward (or forward) pass. For example, what if you want to binarize (quantize) the gradient on each backprop step, as is widely done in quantized models?
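With the TensorFlow backend, that kind of per-step transformation can be done by rewriting the (gradient, variable) pairs before they are applied. A hedged sketch using tf.sign as a crude "binarize" function; the Keras model and a scalar loss tensor are assumed to be defined as in the earlier comments:

import tensorflow as tf

opt = tf.train.GradientDescentOptimizer(learning_rate=0.01)
grads_and_vars = opt.compute_gradients(loss, model.trainable_weights)

# Replace every gradient with its sign before applying the update
binarized = [(tf.sign(g), v) for g, v in grads_and_vars if g is not None]
train_step = opt.apply_gradients(binarized)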

@birdmw

birdmw commented Jan 30, 2019

from keras.models import Sequential
from keras.layers import Dense, Activation
from keras import backend as k
from keras import losses
import numpy as np
import tensorflow as tf
from sklearn.metrics import mean_squared_error
from math import sqrt

model = Sequential()
model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(8, kernel_initializer='uniform', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

inputs = np.random.random((1, 8))
outputs = model.predict(inputs)
targets = np.random.random((1, 8))
rmse = sqrt(mean_squared_error(targets, outputs))
loss = losses.mean_squared_error(targets, model.output)

#  ===== Symbolic Gradient =====
gradients = k.gradients(loss, model.trainable_weights)

print("===BEFORE WALKING DOWN GRADIENT===")
print("outputs:\n", outputs)
print("targets:\n", targets)

# Begin TensorFlow
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())

steps = 100  # steps of gradient descent
for s in range(steps):

    # ===== Numerical gradient =====
    evaluated_gradients = sess.run(gradients, feed_dict={model.input: inputs})

    # Step down the gradient for each layer (subtracting the raw gradient, i.e. an implicit learning rate of 1)
    for i in range(len(model.trainable_weights)):
        sess.run(tf.assign_sub(model.trainable_weights[i], evaluated_gradients[i]))

    # Every 10 steps print the RMSE
    if s % 10 == 0:
        outputs = model.predict(inputs)
        rmse = sqrt(mean_squared_error(targets, outputs))
        print("step " + str(s) + " rmse:", rmse)

final_outputs = model.predict(inputs)
final_rmse = sqrt(mean_squared_error(targets, final_outputs))

print("===AFTER STEPPING DOWN GRADIENT===")
print("outputs:\n", outputs)
print("targets:\n", targets)

@theceday

theceday commented Apr 4, 2019

Is there any way to do this? or with tf.keras?

@maulberto3

Hi @theceday, I also need to manually backprop gradients in Keras. Did you manage?

@birdmw

birdmw commented Apr 24, 2019 via email

@maulberto3

@theceday I am in the process of doing so. I have just calculated the gradients outside the computation graph (I can see them in my terminal). Now I need to update each weight accordingly; I guess I'm doing what an optimizer normally does for you. However, as you know, RL models differ a bit from Keras's internals, so that's why I am "on foot" here. That's also why model.train_on_batch() does not fit my needs either.
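One way to do that last step with the TensorFlow backend is to feed the externally computed gradients through placeholders into apply_gradients. A hedged sketch, where numpy_grads (a list of numpy arrays matching model.trainable_weights) is illustrative:

import tensorflow as tf
from keras import backend as K

grad_placeholders = [tf.placeholder(tf.float32, shape=w.get_shape())
                     for w in model.trainable_weights]
opt = tf.train.GradientDescentOptimizer(learning_rate=0.01)
apply_op = opt.apply_gradients(list(zip(grad_placeholders, model.trainable_weights)))

sess = K.get_session()
sess.run(apply_op, feed_dict=dict(zip(grad_placeholders, numpy_grads)))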

@birdmw

birdmw commented Apr 25, 2019 via email

@theceday

theceday commented Apr 25, 2019

I am not sure everyone has the same use case, but I was trying to backprop a custom loss value (a numpy array / input tensor), and I have tried K.switch/tf.cond with no luck. As far as I understand so far, TF doesn't backprop through those separate branches.
For that to work, the loss would need to be "explicitly" defined as a loss function, so that the right operators can be used for it.

Maybe instead of using K.switch, returning a loss expression containing both tensors (the actual and the custom one) might work, but I am not sure if such an expression is allowed.

I might give this another try if I have time.

Edit: There is a change listed in the TensorFlow 2.0 alpha release notes:
"Adding clear_losses API to be able to clear losses at the end of forward pass in a custom training loop in eager."

That hints that there could be some changes in TF and tf.keras that help with this issue, but I am not sure at the moment.
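For what it's worth, in TF 2.x / tf.keras a custom training loop with tf.GradientTape gives this kind of control directly. A minimal sketch, where model, x_batch, and y_batch are illustrative:

import tensorflow as tf

optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
loss_fn = tf.keras.losses.MeanSquaredError()

with tf.GradientTape() as tape:
    predictions = model(x_batch, training=True)
    loss = loss_fn(y_batch, predictions)

grads = tape.gradient(loss, model.trainable_variables)
# grads is a plain list of tensors, so it can be inspected or modified before the update
optimizer.apply_gradients(zip(grads, model.trainable_variables))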
