backend argmax has none for gradients. Can you even define one? #11157
Hello! The argmax function has no gradient, or at least its gradient is zero everywhere it is defined. This is not specific to Keras; it's the same in all deep learning frameworks, because it follows from the mathematical definition of argmax: the output is a discrete index, so it is piecewise constant with respect to the input. If you wish to create your own operation with a custom gradient, you need to access the backend directly and create a new op, but most of the time it's not a walk in the park. See https://www.tensorflow.org/extend/adding_an_op
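To see this concretely, here is a small sketch (assuming TensorFlow 2.x) showing that the gradient through argmax comes back as `None`:

```python
import tensorflow as tf

x = tf.Variable([0.1, 2.0, 0.5])
with tf.GradientTape() as tape:
    idx = tf.argmax(x)      # integer-valued output, not differentiable
grad = tape.gradient(idx, x)
print(grad)                 # None: no gradient flows through argmax
```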
Yes, I know argmax has no gradient. But the error is clearly asking me to define one for argmax. How do I fix this error then?
The error message is maybe not clear. It's saying that you should only use backend functions which have a gradient, so something other than argmax. The message is not saying that you should define argmax's gradient. Maybe this message is not explicit enough.
Okay. So is there any alternative to argmax (as my model cannot work without one) that I can use? Btw, why does the backend have an argmax function if we can't use it in a model?
I don't know of any alternative to argmax; I've never worked with a model requiring one. Argmax is there for operations where the gradient is not needed, for example when computing a metric. I suppose you can try to use the argmax from TensorFlow directly and see if you get the error. But you must know what you are doing, because if there is no error, the gradient is implicitly null (as with tf.round).
Okay, thanks for helping me out. I will give TensorFlow a go. I will leave this issue open for a day and wait for someone who knows an alternative to argmax. I hope no one has a problem with this (otherwise they can close it).
Did you find any solution to this? @lcukerd
@MansiAgarwal11 Yes, I did. You will have to use Keras inside a TensorFlow model. For training, you will have to define a loss function like in this article. In the model shown in the article, if you include argmax, it will still work. You should be able to do this using only Keras, but I haven't tried yet.
But if there is no gradient for the argmax function, how does the model backpropagate?
I am not sure myself, but I think the TensorFlow code was written to bypass it in a clever way. Perhaps someone from the TensorFlow team can clear this up? Btw, did your model converge?
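One known way to "bypass" argmax (not confirmed to be what the article's code did, just a common trick called the straight-through estimator) is to use the hard argmax result in the forward pass while letting gradients flow through a soft approximation in the backward pass. A sketch, assuming TensorFlow 2.x:

```python
import tensorflow as tf

def straight_through_argmax(logits):
    """Forward value: one_hot(argmax(logits)). Backward: gradients of softmax."""
    y_soft = tf.nn.softmax(logits, axis=-1)
    y_hard = tf.one_hot(tf.argmax(logits, axis=-1),
                        depth=tf.shape(logits)[-1],
                        dtype=y_soft.dtype)
    # stop_gradient hides the (hard - soft) correction from backprop,
    # so the returned value is y_hard but its gradient is that of y_soft
    return tf.stop_gradient(y_hard - y_soft) + y_soft
```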
I didn't make use of argmax and came up with a different loss function for my problem. |
FYI, in my experience with a different tensorflow function that didn't have a gradient, I found that I could run and train the model without any errors, but because there was no gradient, there was no actual learning taking place. It's something to look out for if you try to use argmax.
I have the same problem. Training and evaluation work fine, and saving the model in H5 is OK. However, when loading the saved model, the error pops up: ValueError: An operation has `None` for gradient.
^ You're saying that you can train a model successfully with argmax? That surprises me. What I was trying to say in my earlier comment is that you can sometimes run the training with operations that don't have a gradient and no errors will be thrown, but your model won't actually get better. How confident are you that the model you're training is actually improving as you train it?
I monitored the precision, recall, and accuracy while training, and the model was getting better. If the model was saved with model.save, then the error above appears with keras.models.load_model. However, if the model was saved with model.to_json and model.save_weights, then everything is fine when loading the saved model.
Well, thanks for the update, but you've stumped me. I don't understand why the two saving methods would behave differently. Sorry I couldn't be of more help.
Gumbel-softmax may solve the problem of argmax.
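For reference, here is a minimal Gumbel-softmax sketch (TensorFlow 2.x assumed; the function name and temperature value are illustrative). It produces a differentiable, "almost one-hot" relaxation of `one_hot(argmax(logits))` that sharpens as the temperature approaches zero:

```python
import tensorflow as tf

def gumbel_softmax(logits, temperature=0.5):
    # sample Gumbel(0, 1) noise: -log(-log(U)), U ~ Uniform(0, 1)
    u = tf.random.uniform(tf.shape(logits), minval=1e-8, maxval=1.0)
    g = -tf.math.log(-tf.math.log(u))
    # softmax over the perturbed logits is differentiable, unlike argmax
    return tf.nn.softmax((logits + g) / temperature, axis=-1)
```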
Yeah, or the SeqGAN-based idea of policy-gradient updates: https://arxiv.org/abs/1609.05473
I faced the same problem with GPU. With the runtime set to None, it seems the problem no longer persists.
I'm facing the same issue. I defined a new layer in a Lambda and get: ValueError: An operation has `None` for gradient.
I implemented this solution and it worked for me: save the model architecture to JSON with model.to_json(), serialize the weights to HDF5 with model.save_weights("model.h5"), then load the JSON model back with model_from_json and load the weights into the new model with loaded_model.load_weights("model.h5").
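A runnable sketch of that save/load workaround (the stand-in model and file names are illustrative; newer Keras versions require the weights file to end in `.weights.h5`, while the thread used plain `model.h5`):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras.models import model_from_json

# tiny stand-in model; the actual model from the thread is not shown
inp = keras.Input(shape=(3,))
model = keras.Model(inp, keras.layers.Dense(2)(inp))

# save architecture to JSON and weights to HDF5 separately
with open("model.json", "w") as f:
    f.write(model.to_json())
model.save_weights("model.weights.h5")

# rebuild from JSON, then load the weights into the fresh model
with open("model.json") as f:
    loaded_model = model_from_json(f.read())
loaded_model.load_weights("model.weights.h5")
```

Because the architecture is rebuilt from its JSON description rather than deserialized from a full saved model, Keras never needs to reconstruct the training graph, which is why the missing argmax gradient doesn't trip the loader.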
I am using keras.backend.argmax() in a Lambda layer. The model compiles fine but throws an error during fit(). My model:
Model summary for easy visualization:
I googled for a solution, but almost all the results were about a faulty model. Some recommended not using the functions that are causing issues. However, as you can see, I cannot create this model without K.argmax (if you know any other way, do tell me).
Also, how can you even define the gradient of argmax? I am guessing it's an issue in Keras; if not, please tell me how to define its gradient.
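A minimal sketch of the failure mode described above (the actual architecture isn't shown here, so this is a hypothetical reconstruction; `tf.argmax` stands in for `K.argmax` to keep it version-agnostic). The gradients of the argmax output with respect to the weights come back as `None`, which is what triggers the "An operation has `None` for gradient" error when fit() tries to backpropagate:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# hypothetical minimal model ending in an argmax Lambda layer
inp = keras.Input(shape=(10,))
probs = keras.layers.Dense(4, activation="softmax")(inp)
out = keras.layers.Lambda(lambda t: tf.argmax(t, axis=-1))(probs)
model = keras.Model(inp, out)

# the model runs forward fine, but every weight gradient is None
x = tf.constant(np.random.rand(2, 10), dtype=tf.float32)
with tf.GradientTape() as tape:
    y = model(x)
grads = tape.gradient(y, model.trainable_variables)
print(grads)  # [None, None]: kernel and bias both get no gradient
```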