How to obtain the gradient of each parameter in the last epoch of training #2226

I want to obtain the gradient of each parameter in the last epoch of training. Is there a way to do so in Keras?

Thanks,
Ming

Comments
You can get the outputs of a particular layer directly from your model object, and the parameters (weights and so on) are just as easily retrieved. To compute the gradient you can use code along the lines of the sketch below.
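A minimal sketch of what such code could look like, assuming model is an already-compiled Keras model (a reconstruction, not the original snippet):

```python
# Sketch only: build symbolic gradient tensors for one layer's weights and wrap
# them in a callable backend function.
import keras.backend as K

layer = model.layers[-1]                       # pick a particular layer
layer_output = layer.output                    # symbolic output of that layer
layer_weights = layer.trainable_weights        # its parameters (weights, biases)

grads = K.gradients(model.total_loss, layer_weights)   # symbolic gradient tensors
get_grads = K.function([model.inputs[0],          # input data
                        model.sample_weights[0],  # per-sample weights
                        model.targets[0],         # labels
                        K.learning_phase()],      # 0 = test, 1 = train
                       grads)
```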
That's a partial answer. I hope this helps.
How do you call the resulting function?
Here is an example where you can call the function on a 2x2 matrix. I hope this helps.
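Continuing the sketch above, calling the function on a 2x2 matrix could look like this (hypothetical data; the model is assumed to take 2 input features):

```python
# Sketch only: evaluate the gradient function on a 2x2 input matrix.
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])               # the 2x2 input matrix
y = np.array([[1.0], [0.0]])             # matching labels

print(get_grads([X, np.ones(2), y, 0]))  # learning_phase=0 -> test mode
```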
@gideonite If you can make it work on a toy example, please let me know.
@philipperemy I wound up printing out the weights of my model during training to debug and have since moved on. Thank you, I appreciate the help.
@philipperemy and @gideonite
I need the gradient of the loss w.r.t. each parameter in the neural network for a targeted reinforcement learning application, so the model.fit style functions are not useful for me. Roughly, what I want to accomplish is to build the gradient symbolically myself. The problem is that I am unable to get the symbolic model_params: if I do model.get_weights() or something similar, it gets me the numeric weights and not symbolic ones. Would appreciate some help.
From what I know, it's very hard to do in Keras. In your case, I strongly advise you to use TensorFlow directly, WITHOUT Keras; it's much easier. If you're interested in the final layer and you use the MSE loss, you can always reverse-engineer the backpropagation step to find the gradient by hand, but that's a very specific case.
For a linear output layer with incoming activations x, weights W, and predictions y_pred = x·W + b, your gradient would be dL/dW = (2/N)·xᵀ·(y_pred − y_true).
When you do it by hand, a worked sketch of this computation is below.
Hope this helps. Let me know if you can make it in Keras!
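A worked sketch of that hand computation in plain NumPy (my reconstruction, assuming a linear final layer and MSE loss):

```python
# Manually compute the MSE gradient for a final linear layer y_pred = x.W + b.
import numpy as np

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])          # activations feeding the final layer, shape (N, d)
W = np.array([[0.5], [-0.25]])      # final-layer weights, shape (d, 1)
b = np.array([0.1])                 # final-layer bias
y_true = np.array([[1.0], [0.0]])

y_pred = x.dot(W) + b               # forward pass through the final layer
N = x.shape[0]

# Gradients of L = mean((y_pred - y_true)**2)
dL_dW = (2.0 / N) * x.T.dot(y_pred - y_true)
dL_db = (2.0 / N) * np.sum(y_pred - y_true, axis=0)
print(dL_dW, dL_db)
```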
Hi @philipperemy.
Here's how I did it:
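A reconstruction of the approach described in this comment and the follow-ups (not the commenter's original snippet verbatim; assumes model has already been compiled):

```python
import keras.backend as K

weights = model.trainable_weights  # symbolic weight tensors
# (Some versions of this snippet also filtered the weights down to trainable
# layers; a later comment notes that filtering is no longer needed.)
gradients = model.optimizer.get_gradients(model.total_loss, weights)  # gradient tensors

input_tensors = [model.inputs[0],          # input data
                 model.sample_weights[0],  # how much to weight each sample by
                 model.targets[0],         # labels
                 K.learning_phase()]       # train (1) or test (0) mode

get_gradients = K.function(inputs=input_tensors, outputs=gradients)
```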
@ebanner What is model.total_loss? My model (Theano 0.9.0dev2) object has no such attribute - it only seems to have a .loss attribute and that is just the string name (e.g. "mse"). |
@davidljung model.total_loss is only created when you compile the model, so call model.compile(...) first and it should show up.
I am also using Theano 0.9.0dev2, for the record.
@ebanner I have tried your method, but I do not obtain the values; I get things like: [Elemwise{add,no_inplace}.0, GpuFromHost.0, GpuFromHost.0, GpuFromHost.0].
@ebanner I have tried your method too but got the same result as @jf003320018. What can we do with this?
OK, here's a full working example from start to finish. Hopefully this will clear things up. What you do with the gradient tensors is define a Keras function to evaluate those tensors for a particular setting of the model's inputs, and then call that function on a particular setting of the inputs. The steps are: define the model, get the gradient tensors, define a Keras function that returns the gradients, and finally get the gradients of the weights for a particular input; all four steps are combined in the sketch below.
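A reconstruction of such an end-to-end example (written against the Keras 1.x / early 2.x functional API with the Theano or TensorFlow backend; treat it as a sketch rather than the original code):

```python
import numpy as np
import keras.backend as K
from keras.layers import Input, Dense
from keras.models import Model

# 1. Define and compile a tiny model: 2 input features -> 1 linear output.
inp = Input(shape=(2,))
out = Dense(1, activation='linear')(inp)
model = Model(inp, out)
model.compile(optimizer='sgd', loss='mse')

# 2. Get the symbolic gradient tensors of the loss w.r.t. the trainable weights.
weights = model.trainable_weights
gradients = K.gradients(model.total_loss, weights)

# 3. Define a Keras function that evaluates those tensors for concrete inputs.
input_tensors = [model.inputs[0],          # input data
                 model.sample_weights[0],  # per-sample weights
                 model.targets[0],         # labels
                 K.learning_phase()]       # 0 = test, 1 = train
get_gradients = K.function(inputs=input_tensors, outputs=gradients)

# 4. Evaluate the gradients for one particular (input, label) pair.
inputs = [np.array([[1.0, 2.0]]),  # X
          np.ones(1),              # weight the single sample by 1
          np.array([[1.0]]),       # y
          0]                       # test mode
print(get_gradients(inputs))
```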
@ebanner Thank you for your answer. It really works. But could you tell me how to calculate the gradients for two or more samples simultaneously? Because I do not know the meaning of model.sample_weights, I cannot modify the code. Thank you very much.
Passing a vector of ones for the sample weights just weights every sample equally in the loss, so you can safely do that. As for scaling up the example to an arbitrary number of samples, see the sketch below (using the same function defined in my previous post) to get the gradients of the weights for a particular batch of samples.
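A sketch of the batched version, reusing the get_gradients function from the full example above (hypothetical data):

```python
import numpy as np

nb_samples = 3
inputs = [np.random.rand(nb_samples, 2),  # a batch of input rows
          np.ones(nb_samples),            # weight every sample equally
          np.random.rand(nb_samples, 1),  # labels
          0]                              # test mode
print(get_gradients(inputs))              # gradients of the mean loss over the batch
```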
@ebanner The problem is solved. Thank you very much.
How would I apply the gradients that I retrieve using this to a separate model with the same parameters?
In case anyone is trying to do this with the …, you can use …
I sent a PR to visualize grads via TensorBoard, see #6313.
@ebanner I have a query regarding the learning phase (TEST/TRAIN mode) in the inputs. What is its effect? I get the same set of gradients in both modes.
@shaifugpt Here's the documentation for learning phase. https://keras.io/backend/
For instance, when learning_phase=1 a dropout layer will actually perform dropout (i.e. zero out each input activation with some probability, scaling the survivors accordingly since Keras uses inverted dropout), whereas with learning_phase=0 a dropout layer just passes its input through unchanged. You're getting the same gradients in both cases because none of the layers you are using depend on the learning phase.
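A small sketch illustrating the effect (my example, not from the thread): a lone Dropout layer evaluated in both phases.

```python
import numpy as np
import keras.backend as K
from keras.layers import Input, Dropout
from keras.models import Model

inp = Input(shape=(4,))
out = Dropout(0.5)(inp)
model = Model(inp, out)

f = K.function([inp, K.learning_phase()], [out])
x = np.ones((1, 4))

print(f([x, 1])[0])  # learning_phase=1: dropout active, roughly half the values zeroed
print(f([x, 0])[0])  # learning_phase=0: the input passes through unchanged
```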
@ebanner If the first layer in the model is a merge layer, we need to pass two sets of inputs, one for each branch. I am passing them as a single nested list of the two input arrays.
Calling the gradient function then gives the error TypeError: unhashable type: 'list'. What is the correct way of passing them?
I got an error when running the code (Keras 2.0.6 with Theano 0.9.0). How can I solve this? Thanks. @ebanner
I just ran into the same error for the same reason (multiple inputs), and passing the inputs sequentially seems to fix it; see the sketch below.
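Something along these lines (a sketch of the fix, assuming model, gradients, x1, x2 and y are defined as in the earlier examples):

```python
# List each model input as its own entry instead of nesting them in one list.
import numpy as np
import keras.backend as K

input_tensors = [model.inputs[0],          # first input
                 model.inputs[1],          # second input, as a separate entry
                 model.sample_weights[0],
                 model.targets[0],
                 K.learning_phase()]
get_gradients = K.function(inputs=input_tensors, outputs=gradients)

# Then pass the two input arrays one after the other when calling it:
grads = get_gradients([x1, x2, np.ones(len(x1)), y, 0])
```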
For anyone looking for this with Keras 2.0 onwards, the syntax is sketched below. The weight filtering done in the earlier posts is no longer necessary, and actually gave me an error. Also note that my model had two …
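A reconstruction of that Keras 2.x version (not the commenter's exact code; assumes model has been compiled):

```python
import keras.backend as K

weights = model.trainable_weights       # no filtering of trainable weights needed
gradients = model.optimizer.get_gradients(model.total_loss, weights)

input_tensors = [model.inputs[0],
                 model.sample_weights[0],
                 model.targets[0],
                 K.learning_phase()]
get_gradients = K.function(inputs=input_tensors, outputs=gradients)
```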
@sachinruk Absolutely perfect! Thank you.
@sachinruk Is it possible to calculate the gradients from the sub-losses, too? By sub-losses I mean having two or more outputs (with losses such that total_loss = loss_1 + loss_2) and then taking the gradients of, say, loss_1 alone.
I am getting …
What if there are multiple outputs, so model.total_loss consists of multiple losses and model.targets holds multiple sets of labels, but we are only interested in one target? Specifying model.targets[0][0] to select the first target does not work.
@shaifugpt @mathieumb Did you figure out how to work with multiple inputs? I also got the same error.
@michelleowen Try something like the sketch below.
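A guess at the shape of the suggestion (not the original snippet; assumes gradients is defined as in the earlier examples): include every input tensor rather than hard-coding index 0.

```python
# Concatenate all of the model's input, sample-weight and target tensors.
import keras.backend as K

input_tensors = model.inputs + model.sample_weights + model.targets + [K.learning_phase()]
get_gradients = K.function(inputs=input_tensors, outputs=gradients)
```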
@ebanner Is it possible to compute the gradient with respect to a specific weight connection of a layer rather than all the weights?
@shaifugpt Hi, may I know whether you found a way to compute the gradient with respect to a specific weight?
I found this old bug via Google. Here's the "modern" version:
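A sketch of the usual TensorFlow 2 / tf.keras pattern with tf.GradientTape (a reconstruction, not the commenter's exact code):

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(2,))])
loss_fn = tf.keras.losses.MeanSquaredError()

x = tf.constant([[1.0, 2.0]])
y = tf.constant([[1.0]])

with tf.GradientTape() as tape:
    y_pred = model(x, training=True)
    loss = loss_fn(y, y_pred)

grads = tape.gradient(loss, model.trainable_variables)  # one gradient per parameter tensor
for var, grad in zip(model.trainable_variables, grads):
    print(var.name, grad.numpy())
```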