
MAML makes weights stuck in some bad equilibrium #18

Closed
kapsl opened this issue Nov 8, 2017 · 3 comments

Comments

@kapsl

kapsl commented Nov 8, 2017

Hi Chelsea,
I'm trying to use MAML to train a convolutional autoencoder that should learn to encode robot motor currents. (There are several traces from different robots, so every robot is a task and the traces are the samples of that task.)

I got this working in general, but MAML seems to drive the weights in a direction where the output is a completely straight line at the baseline of the amplitude. (Which in general probably makes sense, since from there it should be relatively easy to train towards new motor currents?) (See the figures below.)

The problem seems to be that when I afterwards try to fine-tune for one robot, the optimization doesn't guide the weights out of this straight-line equilibrium, so it doesn't actually work. If I add some noise to the weights it does escape, but then the MAML pretraining isn't really preserved...

This graph shows the output of the autoencoder after MAML training:
[figure: autoencoder output after MAML training]

This graph (the green one) shows the real motor current. MAML seems to find a straight line at the base of the amplitude, but when trying to fine-tune to this task, it doesn't get out anymore:
[figure: real motor current]

@cbfinn
Owner

cbfinn commented Nov 8, 2017

That's strange. I haven't seen anything like that before. Have you tuned the update_lr? (alpha in the paper) It's possible that it might be much too large or much too small.

Can you visualize the motor current for multiple tasks?

Are you using a different sample for the inner and outer objectives? (You should be.)
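
For concreteness, here is a minimal NumPy sketch of one MAML meta-step that uses separate samples for the inner and outer objectives. The toy linear model, the variable names, and the first-order meta-update are illustrative assumptions, not code from this repo:

import numpy as np

rng = np.random.RandomState(0)
w = np.zeros(3)                                   # meta-parameters
update_lr, meta_lr, num_updates = 1e-2, 1e-3, 3   # alpha, beta, inner steps

def loss_and_grad(w, x, y):
    # Mean squared error of a linear model and its gradient w.r.t. w.
    err = x.dot(w) - y
    return 0.5 * np.mean(err ** 2), x.T.dot(err) / len(y)

# One task: different samples for the inner ("a") and outer ("b") objectives.
x_a, y_a = rng.randn(5, 3), rng.randn(5)
x_b, y_b = rng.randn(5, 3), rng.randn(5)

pre_loss, _ = loss_and_grad(w, x_b, y_b)          # loss before the inner updates
w_task = w.copy()
for _ in range(num_updates):                      # inner gradient steps on (x_a, y_a)
    _, g = loss_and_grad(w_task, x_a, y_a)
    w_task = w_task - update_lr * g
post_loss, g_outer = loss_and_grad(w_task, x_b, y_b)   # loss after the inner updates

# First-order approximation of the meta-update (full MAML would also
# differentiate through the inner loop).
w = w - meta_lr * g_outer
print(pre_loss, post_loss)

(The pre_loss / post_loss values here are the losses on the outer samples before and after the inner updates; the repo's preloss/postloss console output presumably reports the analogous quantities.)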

@kapsl
Author

kapsl commented Nov 9, 2017

Yes, I played around a lot with the learning rates. I found that training is very sensitive to them and doesn't learn at all if they are wrong. Currently I have meta_lr 0.0001 and update_lr 0.00001 with num_updates 3. I noticed that if I make num_updates > 1, I have to make update_lr rather small for it to learn anything. Could this be because the updates for the individual tasks get too different when taking many steps on a single task? (See the toy sketch below.)
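
As a toy illustration of that hypothesis (a made-up quadratic example, nothing from this repo): the adapted weights of two tasks with different optima drift further apart as update_lr and the number of inner steps grow.

def adapt(w0, target, lr, steps):
    # Plain gradient descent on the quadratic loss 0.5 * (w - target)**2.
    w = w0
    for _ in range(steps):
        w = w - lr * (w - target)
    return w

for lr in (0.5, 0.05):                    # stand-ins for update_lr
    for steps in (1, 3, 10):              # stand-ins for num_updates
        adapted = [adapt(0.0, t, lr, steps) for t in (-1.0, 1.0)]  # two tasks
        spread = abs(adapted[0] - adapted[1])
        print(f"lr={lr:<5} steps={steps:<3} spread between tasks = {spread:.3f}")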

For some reason the equilibrium is no longer a problem now; I don't know why. It takes me about 100 iterations of fine-tuning with a higher learning rate to reach good accuracy.

What number of inner gradient steps do you normally use for good results?

The motor currents for the different tasks are not so different; they just have their spikes etc. at different locations.

What exactly do the preloss and postloss values in the console output mean? The loss before taking the inner gradient steps and afterwards?

Are you using a different sample for the inner and outer objectives? (You should be.)
I guess the code below is creating different samples for the inner and outer objectives? The "a" tensors are for the single-task (inner) updates and the "b" tensors for meta-training (outer)?

# The "a" tensors (first num_classes * update_batch_size samples of each task)
# feed the inner-loop updates; the "b" tensors (the rest) feed the outer/meta objective.
inputa = tf.slice(image_tensor, [0, 0, 0], [-1, num_classes*FLAGS.update_batch_size, -1])
inputb = tf.slice(image_tensor, [0, num_classes*FLAGS.update_batch_size, 0], [-1, -1, -1])
labela = tf.slice(label_tensor, [0, 0, 0], [-1, num_classes*FLAGS.update_batch_size, -1])
labelb = tf.slice(label_tensor, [0, num_classes*FLAGS.update_batch_size, 0], [-1, -1, -1])
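
For what it's worth, a NumPy equivalent of those slices (the shape [meta_batch, 2 * num_classes * update_batch_size, dim] and the concrete numbers are made up for illustration; the "b" split simply takes everything after the first num_classes * update_batch_size samples, matching the -1 size in tf.slice):

import numpy as np

meta_batch, num_classes, update_batch_size, dim = 4, 1, 8, 16
image_tensor = np.random.randn(meta_batch, 2 * num_classes * update_batch_size, dim)

split = num_classes * update_batch_size
inputa = image_tensor[:, :split, :]   # inner-loop ("a") samples per task
inputb = image_tensor[:, split:, :]   # outer/meta ("b") samples per task
print(inputa.shape, inputb.shape)     # (4, 8, 16) (4, 8, 16)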

@cbfinn
Owner

cbfinn commented Nov 10, 2017 via email

cbfinn closed this as completed Mar 28, 2018