How to apply several SGD steps within the inner loop? #3
Comments
I've also been trying to build off this repo, but have encountered the same issue. It seems that updating the weights manually, as done here, makes them non-trainable. @davidjimenezphd Have you found a workaround? Without multiple inner-loop SGD steps, this repo doesn't actually run the full version of MAML.
Hi Alekxos. Yes, we found a solution based on "watch"-ing some variables in the gradient tape. Give me some time and I'll try to upload the solution.
It's definitely a bug in TensorFlow. We worked around it by doing the following:
This is a bit hacky and needs some extra computation (for copying and forwarding through the net), but TensorFlow has so many open issues that we'll use this as long as the bug exists ;-) (and I think it will be there for a while). See our TensorFlow issue: tensorflow/tensorflow#34335
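[Editor's note] The exact code from tensorflow/tensorflow#34335 is not quoted in this thread, but the "copying and forwarding through the net" workaround can be sketched roughly as follows. The model architecture, learning rate, and the `forward` helper are illustrative assumptions, not the commenter's actual code:

```python
import tensorflow as tf

# a small stand-in model (two Dense layers); built by calling it once
model = tf.keras.Sequential([tf.keras.layers.Dense(8, activation="relu"),
                             tf.keras.layers.Dense(1)])
x = tf.random.normal([16, 4])
y = tf.random.normal([16, 1])
model(x)  # build the model so its variables exist

with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(model(x) - y))
grads = tape.gradient(loss, model.trainable_variables)

# Instead of assigning the updated tensors back into the model (which is
# what breaks trainability), keep them in a separate list ...
fast = [w - 0.01 * g for w, g in zip(model.trainable_variables, grads)]

def forward(x, w):
    # ... and forward through the net manually with the copied, updated
    # weights; w = [kernel1, bias1, kernel2, bias2]
    h = tf.nn.relu(tf.matmul(x, w[0]) + w[1])
    return tf.matmul(h, w[2]) + w[3]

adapted_loss = tf.reduce_mean(tf.square(forward(x, fast) - y))
```

The extra cost the comment mentions is exactly this: the forward pass has to be re-implemented by hand so it can consume plain tensors instead of the model's variables.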
Hi @shufflebyte This is actually not a TensorFlow bug. Once the weights are manually replaced, they are plain tensors rather than `tf.Variable`s, so the tape no longer tracks them automatically. This is not problematic in this repo because we do manual replacement:
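[Editor's note] A minimal sketch of what the "manual replacement" mentioned above might look like. `TinyNet`, its architecture, and the learning rate are illustrative assumptions; the idea is only that the copy's weight attributes are overwritten with updated tensors, which the copy's forward pass then uses:

```python
import tensorflow as tf

class TinyNet(tf.Module):
    # minimal two-layer net whose weights are plain attributes,
    # so they can be overwritten with updated tensors
    def __init__(self):
        self.w1 = tf.Variable(tf.random.normal([4, 8]) * 0.1)
        self.b1 = tf.Variable(tf.zeros([8]))
        self.w2 = tf.Variable(tf.random.normal([8, 1]) * 0.1)
        self.b2 = tf.Variable(tf.zeros([1]))

    def __call__(self, x):
        h = tf.nn.relu(tf.matmul(x, self.w1) + self.b1)
        return tf.matmul(h, self.w2) + self.b2

net = TinyNet()
x = tf.random.normal([16, 4])
y = tf.random.normal([16, 1])

params = [net.w1, net.b1, net.w2, net.b2]
with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(net(x) - y))
grads = tape.gradient(loss, params)

copied = TinyNet()
# manual replacement: assign updated *tensors* onto the copy's attributes;
# the copy's forward pass now uses them, but they are no longer tf.Variables
for name, grad in zip(["w1", "b1", "w2", "b2"], grads):
    setattr(copied, name, getattr(net, name) - 0.01 * grad)

adapted_loss = tf.reduce_mean(tf.square(copied(x) - y))
```

After the replacement, gradients with respect to the copied weights require an explicit `tape.watch(...)`, which is what the rest of the thread is about.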
Hi @llan-ml Actually, I tried to directly apply a tf.keras.optimizers.SGD() for updating the fast weights; this keeps the variables in the copied model trainable.
Have you found out how to add batching and several SGD steps? I have been stuck on this problem for some days; I tried to use two tapes to watch the whole batch.
Hi @HilbertXu In the case of multiple inner gradient steps, you need to manually watch the weight tensors (they are already not `tf.Variable`s after the first update).
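[Editor's note] A minimal sketch of the watching pattern described here, using an assumed linear model and learning rate. After the first update the fast weights are plain tensors, so each inner tape must `watch` them explicitly; running the inner tapes under an outer tape lets the outer gradient differentiate through all the unrolled steps:

```python
import tensorflow as tf

# meta-parameters of a toy linear model (these ARE trainable Variables)
w = [tf.Variable(tf.random.normal([4, 1]) * 0.1), tf.Variable(tf.zeros([1]))]
x = tf.random.normal([16, 4])
y = tf.random.normal([16, 1])
inner_lr = 0.01

def predict(x, w):
    return tf.matmul(x, w[0]) + w[1]

with tf.GradientTape() as outer_tape:
    fast = list(w)  # start the fast weights from the meta-weights
    for _ in range(5):  # several inner SGD steps
        with tf.GradientTape() as inner_tape:
            inner_tape.watch(fast)  # required once `fast` are plain tensors
            inner_loss = tf.reduce_mean(tf.square(predict(x, fast) - y))
        g = inner_tape.gradient(inner_loss, fast)
        fast = [p - inner_lr * gi for p, gi in zip(fast, g)]
    # evaluate the adapted fast weights; the outer tape saw every step
    outer_loss = tf.reduce_mean(tf.square(predict(x, fast) - y))

# gradients w.r.t. the original meta-weights, through all 5 inner steps
meta_grads = outer_tape.gradient(outer_loss, w)
```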
Hi @llan-ml Thanks for your help, I will try it later.
Hi @HilbertXu I wrote a toy MAML-like script, which may be helpful for you. Please let me know if you find that the implementation is correct and works in more practical situations.
Hi @llan-ml It shows that I don't have access to your files. Could you please help me with this? Maybe we can chat on WeChat or by email? My SS server has been blocked, so it's hard for me to
I forgot to enable sharing of that link; it should be accessible now. Also, feel free to reach me by email (the address is in my profile).
But I also have an error: why does model.get_weights() return an empty list?
And when I try your code, model and copied_model are the same object: updating copied_model also updates model.
Hi @mari-linhares , thanks for the repo!
We are building on your code to implement a somewhat more general version of MAML that includes a batch of tasks within the inner loop and several steps of gradient descent w.r.t. the parameters of each task. However, we are stuck on how to add several SGD steps within your code using TensorFlow 2.0. Do you have any idea how to do that?