MAML makes weights stuck in some bad equilibrium #18
Comments
That's strange. I haven't seen anything like that before. Have you tuned the update_lr (alpha in the paper)? It's possible that it might be much too large or much too small. Can you visualize the motor current for multiple tasks? Are you using a different sample for the inner and outer objectives? (You should be.)
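(For reference, update_lr here is the inner step size α from the MAML paper, and meta_lr should correspond to the outer, meta step size β:

$$\theta_i' = \theta - \alpha\,\nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_\theta), \qquad \theta \leftarrow \theta - \beta\,\nabla_\theta \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'})$$

The outer update differentiates through the inner update, so the two step sizes play different roles and can need quite different scales.)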
Yes, I played around a lot with the learning rates. I found that training is very sensitive to them and doesn't learn at all if they are wrong. Currently I have meta_lr 0.0001 and update_lr 0.00001 with num_updates 3. I noticed that if I set num_updates > 1, I have to make update_lr rather small for it to learn anything - is that maybe because the updates for the individual tasks become too different when taking many steps on a single task?

For some reason the equilibrium is no longer a problem now, I don't know why. It takes me about 100 iterations of fine-tuning with a higher learning rate to reach a good accuracy. What number of inner gradient steps do you normally use for good results?

The motor currents for the multiple tasks are not so different, they just have their spikes etc. at different locations.

What exactly does the console output of preloss and postloss mean - the loss before the inner gradient updates and afterwards?

> Are you using a different sample for the inner and outer objectives? (You should be.)
> The motor currents for the multiple tasks are not so different, just having their spikes etc. at different locations.

In that case, I would at least expect the base current to be learnable.

You might consider using gradient clipping or meta-gradient clipping, which we have found can stabilize training in different settings. For example, I typically use a meta_lr of 0.001 (the default for Adam), but if you see spikes in meta-training performance with this learning rate, it would make sense to clip the meta-gradients.

If you have to set the inner learning rate to be very small, it's possible that a larger learning rate with clipping could produce better performance.
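A minimal sketch of what clipping the meta-gradients could look like in a TF1-style graph like this repo's (names such as meta_loss are placeholders, and the clip range is only an example):

```python
import tensorflow as tf

# Stand-in for the post-update (outer/meta) loss; in practice this would be the
# loss of the adapted parameters on the outer-objective samples (inputb/labelb).
w = tf.get_variable("w", shape=[10])
meta_loss = tf.reduce_sum(tf.square(w))

optimizer = tf.train.AdamOptimizer(learning_rate=0.001)    # meta_lr = 0.001 (Adam's default)
grads_and_vars = optimizer.compute_gradients(meta_loss)
clipped = [(tf.clip_by_value(g, -10.0, 10.0), v)           # clamp each meta-gradient element
           for g, v in grads_and_vars if g is not None]
metatrain_op = optimizer.apply_gradients(clipped)
```

Clipping only the meta-gradients leaves the inner (task-specific) updates untouched, so the inner learning rate can stay larger without the outer loop blowing up.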
> I guess this is creating different samples for the inner and outer objective?! a for the single tasks, b for meta-training?!
>
>     inputa = tf.slice(image_tensor, [0,0,0], [-1,num_classes*FLAGS.update_batch_size, -1])
>     inputb = tf.slice(image_tensor, [0,num_classes*FLAGS.update_batch_size, 0], [-1,-1,-1])
>     labela = tf.slice(label_tensor, [0,0,0], [-1,num_classes*FLAGS.update_batch_size, -1])
>     labelb = tf.slice(label_tensor, [0,num_classes*FLAGS.update_batch_size, 0], [-1,-1,-1])

Yes, that code creates different samples for the inner and outer objective: a corresponds to the inner objective and b corresponds to the outer objective. I just wanted to make sure that you didn't modify that code, and that the inner and outer data correspond to the same "task".
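To spell out what that slice does, here is a toy illustration (made-up shapes, not the repo's defaults) of the split into inner-loop (a) and outer-loop (b) samples:

```python
import numpy as np
import tensorflow as tf

# Toy shapes: 2 tasks per meta-batch, 8 samples per task, 5 features per sample.
# num_inner stands in for num_classes * FLAGS.update_batch_size.
num_inner = 4
image_tensor = tf.constant(np.arange(2 * 8 * 5, dtype=np.float32).reshape(2, 8, 5))

inputa = tf.slice(image_tensor, [0, 0, 0], [-1, num_inner, -1])   # first 4 samples of each task -> inner update
inputb = tf.slice(image_tensor, [0, num_inner, 0], [-1, -1, -1])  # remaining 4 samples -> meta-objective
```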
Hi Chelsea,
I'm trying to use MAML to train a convolutional autoencoder that should learn to encode robot motor currents. (Several traces from different robots, so every robot is a task and the traces are the samples of that task.)
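A minimal sketch of that robot-as-task framing (hypothetical names and shapes, just to make the data layout concrete, not the code I am actually running):

```python
import numpy as np

# Hypothetical helper: traces[robot_id] holds that robot's motor-current windows,
# shape [n_traces, trace_len]. Each robot is one task; the inner and outer samples
# of a task are drawn from disjoint traces of the same robot.
def sample_task_batch(traces, robot_ids, k_inner, k_outer, rng=np.random):
    inner, outer = [], []
    for rid in robot_ids:
        idx = rng.permutation(len(traces[rid]))
        inner.append(traces[rid][idx[:k_inner]])                    # adaptation (inner-loop) traces
        outer.append(traces[rid][idx[k_inner:k_inner + k_outer]])   # meta-objective (outer-loop) traces
    return np.stack(inner), np.stack(outer)
```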
I got this working in general, but it seems like MAML drives the weights in a direction where the output is a completely straight line at the baseline of the amplitude. (Which probably makes some sense, because from there it is rather easy to train towards new motor currents?!) (See figure below.)
The problem seems to be that when I try to fine-tune for one robot afterwards, the optimization doesn't guide the weights out of this straight-line equilibrium, so it actually doesn't work. If I add some noise to the weights it somehow gets out, but then the MAML pretraining is not preserved very well...
This graph shows the output of the autoencoder after MAML training
This graph (the green curve) shows the real motor current. MAML seems to find a straight line at the base of the amplitude, but when trying to fine-tune to this task, it doesn't get out anymore.