
Question about the model architecture (the batchnorm) #2

Closed
jihoontack opened this issue May 11, 2022 · 2 comments
Comments

@jihoontack

Hi, again :)

Thank you for the wonderful work.

While reading your paper and the implementation, it seems that this work uses standard batchnorm (instead of the transductive batchnorm that is usually used in other MAML-based papers).

Have you tried transductive batchnorm in your case? I tried using the pre-trained checkpoints when training my MAML, but I cannot reach the performance reported in your paper. I was wondering whether the transductive batchnorm was the cause.

Thank you very much for your time!
Best,
Jihoon

@Han-Jia
Owner

Han-Jia commented May 12, 2022

Hi, Jihoon,

We do not use transductive batchnorm in the paper; instead, we reset the batchnorm statistics for each task during meta-test.
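[Editor's note] For readers unfamiliar with the distinction, here is a minimal pure-Python sketch contrasting the two behaviors. This is not the repository's code, and all names (`transductive_bn`, `RunningBN`) are illustrative: transductive BN normalizes the query batch with its own statistics, while the reset-then-track scheme re-initializes the running statistics at the start of every meta-test task.

```python
import math

def batch_stats(xs):
    # mean and (biased) variance of a list of floats
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, v

def batchnorm(xs, mean, var, eps=1e-5):
    return [(x - mean) / math.sqrt(var + eps) for x in xs]

def transductive_bn(query):
    # transductive BN: the test batch is normalized with its OWN statistics,
    # so information leaks across the query examples of a task
    m, v = batch_stats(query)
    return batchnorm(query, m, v)

class RunningBN:
    # non-transductive alternative: keep running statistics, and reset them
    # to the initial values at the start of each meta-test task
    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum, self.eps = momentum, eps
        self.reset()

    def reset(self):
        # called once per task during meta-test
        self.running_mean, self.running_var = 0.0, 1.0

    def update(self, xs):
        # track statistics from the support (adaptation) data only
        m, v = batch_stats(xs)
        self.running_mean += self.momentum * (m - self.running_mean)
        self.running_var += self.momentum * (v - self.running_var)

    def normalize(self, xs):
        # query examples are normalized independently with the tracked stats
        return batchnorm(xs, self.running_mean, self.running_var, self.eps)
```

In the transductive case each query example's output depends on the other query examples in the batch, which is exactly what the reset-then-track scheme avoids.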

I think the learning rate and step size matter when using the pre-trained weights. A relatively larger learning rate and step size may help.
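[Editor's note] To make the step-size remark concrete, here is a hypothetical sketch of a MAML-style inner-loop update on a toy quadratic loss; `inner_update` and the loss are illustrative, not the repository's update rule. The `step_size` argument is the knob the comment suggests increasing when starting from pre-trained weights.

```python
def inner_update(params, grads, step_size):
    # one SGD step on the support-set loss; step_size is the inner-loop
    # learning rate discussed above
    return [p - step_size * g for p, g in zip(params, grads)]

# toy loss L(p) = (p - 3)^2, with gradient 2 * (p - 3)
params = [0.0]
for _ in range(5):
    grads = [2 * (p - 3.0) for p in params]
    params = inner_update(params, grads, step_size=0.4)
# with this step size, five inner steps bring p close to the optimum at 3
```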

Best,
Han-Jia

@jihoontack
Author

Thank you very much for your reply!

Really appreciate it :)

Best,
Jihoon
