How can i use keras optimizer for backprop-ing on my own loss functions #4746
You should check out the optimizer API as defined in `keras/optimizers.py`.
I did. My backprop would be: get_gradients (https://github.com/fchollet/keras/blob/master/keras/optimizers.py#L61) seems to be called by get_updates() in Adam. Do I just call get_updates() once to build the update function? I am not sure how to use that function either. Specifically, I am confused about which parts build a function versus which functions I could pass my numpy arrays to in order to compute updates.
This is the relevant portion: https://github.com/raghakot/keras-vis/blob/master/vis/optimizer.py#L163
You can use Keras optimizers outside of Keras if you really can't do whatever you're doing within Keras. Yes, it is important to call get_updates() once and only once and hang on to the returned updates. For example, the Adam optimizer locally creates momentum variables inside the get_updates() function, so calling get_updates() multiple times for the same set of parameters will cause chaos. If you have a custom loss function and a list of shared variables, the same pattern applies: call get_updates() once to build the updates, then compile them into a function.
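The "create state once, reuse it every step" point can be illustrated outside Keras. Below is a rough numpy sketch of the Adam update rule (not the Keras code itself): the moment buffers play the role of the momentum variables that Adam's get_updates() creates, and they must be created once and carried across steps, or the running averages are lost.

```python
import numpy as np

def make_adam_step(shape, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    # Moment buffers are created ONCE, the way Adam's get_updates()
    # creates its momentum variables once. Re-creating them every
    # step would throw away the running averages.
    m = np.zeros(shape)   # first moment (running mean of gradients)
    v = np.zeros(shape)   # second moment (running mean of squared gradients)
    t = 0                 # timestep, used for bias correction

    def step(params, grads):
        nonlocal m, v, t
        t += 1
        m = beta1 * m + (1.0 - beta1) * grads
        v = beta2 * v + (1.0 - beta2) * grads ** 2
        m_hat = m / (1.0 - beta1 ** t)    # bias-corrected estimates
        v_hat = v / (1.0 - beta2 ** t)
        return params - lr * m_hat / (np.sqrt(v_hat) + eps)

    return step

# Minimize loss = x**2; its gradient is 2*x.
x = np.array([3.0])
adam_step = make_adam_step(x.shape)   # build the state once
for _ in range(1000):
    x = adam_step(x, 2.0 * x)
```

Calling `make_adam_step` again mid-training would reset `m`, `v`, and `t` to zero, which is the "chaos" referred to above.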
You're better off doing backprop on the GPU instead of back-and-forth with numpy. Store your weights as GPU variables and update them with functions. When you need the weights in numpy, use get_value and set_value. Cheers,
Thanks. The call `updates = opt.get_updates([input], [], [loss_fn])` complains about None. Any ideas on how to handle that?
Please always post a stack trace or something if you have specific issues. I put together a Gist showing how to use Keras optimizers. It should teach you the basic style of how everything goes together. https://gist.github.com/bstriner/e1e011652b297d13b3ac3f99fd11b2bc The standard in Keras is that model parameters are variables that live on the GPU and inputs and targets are placeholders that get passed in for each batch. A training function is created with inputs: batch inputs, batch targets; and outputs: loss, accuracy, other metrics. The function also performs updates on the model parameters on the GPU each time it is executed. To train, you just pass batch inputs and batch targets to the training function and print out the current loss. At the end, if you want to get the trained parameters, use K.get_value.
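The training-function pattern described above can be mimicked in plain numpy to make the data flow concrete. This is a sketch by analogy, not Keras code: the closure's variables stand in for GPU-resident weights, the function arguments stand in for the batch placeholders, and the in-place parameter update stands in for the `updates=` list passed to `K.function`.

```python
import numpy as np

def make_train_fn(n_features, lr=0.1):
    # Parameters persist inside the closure between calls, the way
    # Keras weight variables persist on the GPU between batches.
    w = np.zeros(n_features)
    b = 0.0

    def train_fn(x_batch, y_batch):
        nonlocal w, b
        pred = x_batch @ w + b              # forward pass
        err = pred - y_batch
        loss = float(np.mean(err ** 2))
        # Gradient step on the stored parameters, analogous to the
        # updates performed each time the compiled function runs.
        w = w - lr * 2.0 * (x_batch.T @ err) / len(err)
        b = b - lr * 2.0 * float(np.mean(err))
        return loss

    def get_weights():                      # analogous to K.get_value
        return w.copy(), b

    return train_fn, get_weights

# Fit y = 3x + 1 by calling the training function once per "batch".
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 1))
y = 3.0 * X[:, 0] + 1.0
train_fn, get_weights = make_train_fn(1)
for _ in range(200):
    loss = train_fn(X, y)
```

As in the gist, the caller only ever passes batch inputs and targets and reads back the loss; the parameters never leave the "device" until `get_weights()` is called at the end.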
Thanks. The example and gist are awesome. You should perhaps add or reference it somewhere in the Keras docs/examples for others. Here is a minimal example of what's happening in my case:

```python
from keras import backend as K
from keras.optimizers import Adam

x = K.placeholder(shape=(None, 224, 224, 3))
opt = Adam()

# Some contrived example
loss = K.square(x)
updates = opt.get_updates([x], [], [loss])
iterate = K.function([x], [], updates=updates)
```

This will give me the same None complaint. Also, how do I add a placeholder on top of `model.input`?

```python
proxy = K.placeholder(shape=K.int_shape(model.input))
# This was my futile attempt to connect to the existing model graph
proxy = model.input + K.variable(0.)
```
@bstriner I am new to Keras, in your example how I can modify it to get the model's parameters if I have a loaded network (e.g. VGG16) through |
@mongoose54 kind of unrelated to the OP. If you have a model you can inspect That will give you the tensor variable which gives you the variable name. You can get the actual value of the variable with Cheers |
@bstriner Sorry for placing it here, but I have a question related to this topic. Let's say I have the losses explicitly defined in a numpy array:
@bstriner Thanks for such an example, but I have a weird problem. The only meaningful difference from your example is:

```python
self.opt = SGD(lr=1.0, decay=1e-3, momentum=0.5, nesterov=False)
```

The model is actually learning: the loss goes down, validation accuracy increases (up to 100 in some iterations), and I can save and load the model. But the learning rate is not changing (I chose these values just to see the change more easily, but no luck): `K.get_value(m.opt.lr)` always outputs 1.0 in each loop. Any ideas, anyone?

edit: just added opt.lr to the outputs directly; still no change.
edit2: adding "self.lr = lr"
edit3: since I use TF as the backend, it probably works OK as it builds up a graph, but some dependencies might not work as expected since opt.lr is not updated correctly. Is this a bug? What do I miss here?
Hi @bstriner, a small question for you. Suppose I add another output head to your nn above; what would then need further adjustment? I have a very similar nn, but as soon as I add an extra head (output) to it, I get an error. Everything works fine before adding the extra output. Working code:
Not-working code:
Additional info: when debugging in VS Code, I can see the contents of `loss` being (correctly?) constructed/passed in, but I can't see inside `params`.
Good morning @bstriner, re-reading my own question: maybe I should feed the "_" inside the call to K.function, as it now needs two different y_true's.
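That reading seems right: with two output heads, the combined loss needs one target per head, so the compiled training function must accept both targets as inputs. A numpy sketch of the idea (function and parameter names here are hypothetical, not from the thread's code):

```python
import numpy as np

def combined_loss(pred_a, y_a, pred_b, y_b, w_a=1.0, w_b=1.0):
    # One target per head; the per-head losses are combined into a
    # single scalar, which is what Keras does when a model has
    # multiple outputs (optionally with per-output loss weights).
    loss_a = np.mean((pred_a - y_a) ** 2)
    loss_b = np.mean((pred_b - y_b) ** 2)
    return w_a * loss_a + w_b * loss_b

total = combined_loss(
    np.array([1.0, 2.0]), np.array([1.0, 2.0]),   # head A: perfect fit
    np.array([0.0]), np.array([2.0]),             # head B: off by 2
)
```

In the Keras setting this corresponds to giving `K.function` one placeholder per target and summing the per-head losses before calling get_updates().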
I am working on guided backprop for activation maximization. Instead of implementing RMSprop, Adam, etc. myself, I want to reuse the optimizers defined in Keras.