
Question about the model architecture (the batchnorm) #2

Closed
jihoontack opened this issue May 11, 2022 · 2 comments
Comments

@jihoontack

Hi, again :)

Thank you for the wonderful work.

While reading your paper and the implementation, it seems that this work uses standard batchnorm (instead of the transductive batchnorm that is usually used in other MAML-based papers).

Have you tried transductive batchnorm in your case? I tried using the pre-trained checkpoints when training my MAML, but I cannot reach the performance reported in your paper. I was wondering whether the transductive batchnorm was the cause.

Thank you very much for your time!
Best,
Jihoon

@Han-Jia
Owner

Han-Jia commented May 12, 2022

Hi, Jihoon,

We do not use transductive batchnorm in the paper; instead, we reset the batchnorm statistics for each task during meta-test.
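[Editor's note] For readers unfamiliar with the distinction, here is a minimal pure-Python sketch contrasting the two behaviors. This is not the repository's code, and all names (`transductive_bn`, `RunningBN`) are illustrative: transductive BN normalizes the query batch with its own statistics, while the reset-then-track scheme re-initializes the running statistics at the start of every meta-test task.

```python
import math

def batch_stats(xs):
    # mean and (biased) variance of a list of floats
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, v

def batchnorm(xs, mean, var, eps=1e-5):
    return [(x - mean) / math.sqrt(var + eps) for x in xs]

def transductive_bn(query):
    # transductive BN: the test batch is normalized with its OWN statistics,
    # so information leaks across the query examples of a task
    m, v = batch_stats(query)
    return batchnorm(query, m, v)

class RunningBN:
    # non-transductive alternative: keep running statistics, and reset them
    # to the initial values at the start of each meta-test task
    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum, self.eps = momentum, eps
        self.reset()

    def reset(self):
        # called once per task during meta-test
        self.running_mean, self.running_var = 0.0, 1.0

    def update(self, xs):
        # track statistics from the support (adaptation) data only
        m, v = batch_stats(xs)
        self.running_mean += self.momentum * (m - self.running_mean)
        self.running_var += self.momentum * (v - self.running_var)

    def normalize(self, xs):
        # query examples are normalized independently with the tracked stats
        return batchnorm(xs, self.running_mean, self.running_var, self.eps)
```

In the transductive case each query example's output depends on the other query examples in the batch, which is exactly what the reset-then-track scheme avoids.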

I think the learning rate and step size matter when using the pre-trained weights. A relatively larger learning rate and step size may help.
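[Editor's note] To make the step-size remark concrete, here is a hypothetical sketch of a MAML-style inner-loop update on a toy quadratic loss; `inner_update` and the loss are illustrative, not the repository's update rule. The `step_size` argument is the knob the comment suggests increasing when starting from pre-trained weights.

```python
def inner_update(params, grads, step_size):
    # one SGD step on the support-set loss; step_size is the inner-loop
    # learning rate discussed above
    return [p - step_size * g for p, g in zip(params, grads)]

# toy loss L(p) = (p - 3)^2, with gradient 2 * (p - 3)
params = [0.0]
for _ in range(5):
    grads = [2 * (p - 3.0) for p in params]
    params = inner_update(params, grads, step_size=0.4)
# with this step size, five inner steps bring p close to the optimum at 3
```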

Best,
Han-Jia

@jihoontack
Author

Thank you very much for your reply!

Really appreciate it :)

Best,
Jihoon
