
About the experiment setting. #1

Closed

kxgong opened this issue Nov 2, 2020 · 6 comments

Comments

kxgong commented Nov 2, 2020

Hi Muhammad, I recently read your paper. It's easy to follow and interesting, and I am currently trying to reproduce some of its results. However, I have run into a problem with the iNat 2017 experiment. Using the same settings described in your paper, I can only reach 47.5% accuracy at epoch 100 for the 'Cross-Entropy Loss' baseline. (Settings I used: ResNet-50 pretrained on ImageNet, learning rate 0.01, SGD optimizer.)
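For reference, here is roughly the baseline setup I am using (a minimal sketch; the data layout, batch size, momentum, and weight decay below are placeholders, not taken from the paper):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Baseline as described above: ResNet-50 pretrained on ImageNet,
# SGD with lr 0.01, plain cross-entropy loss on iNat 2017.
# Data path, batch size, momentum and weight decay are placeholders.
train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
train_set = datasets.ImageFolder("data/inat2017/train", transform=train_tf)  # placeholder layout
train_loader = DataLoader(train_set, batch_size=128, shuffle=True,
                          num_workers=8, pin_memory=True)

model = models.resnet50(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 5089)  # iNaturalist 2017 has 5089 classes
model = model.cuda()

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=1e-4)

for epoch in range(100):  # ~100 epochs, as in the run above
    for images, targets in train_loader:
        images = images.cuda()
        targets = targets.cuda(non_blocking=True)
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
```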

abdullahjamal (Owner) commented

Hi, for the baselines we mostly follow the Class-Balanced Loss paper (https://github.com/richardaecn/class-balanced-loss) and classifier balancing (https://github.com/facebookresearch/classifier-balancing). You might want to increase the batch size: in our paper we used a distributed setup with multiple workers, each with a batch size of (I believe) 64. You should also train for more epochs.
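Roughly, I mean something like this (just a sketch, not our actual training script; `build_model`, `train_dataset`, `train_one_epoch`, and the epoch count are placeholders):

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler

# Sketch of a distributed run with a per-worker batch size of 64.
# Launch with e.g.: torchrun --nproc_per_node=4 train.py
dist.init_process_group(backend="nccl")
local_rank = dist.get_rank() % torch.cuda.device_count()
torch.cuda.set_device(local_rank)

model = build_model().cuda(local_rank)        # build_model(): placeholder for the ResNet-50 setup
model = DDP(model, device_ids=[local_rank])

sampler = DistributedSampler(train_dataset)   # train_dataset: placeholder iNat 2017 dataset
train_loader = DataLoader(train_dataset, batch_size=64, sampler=sampler,
                          num_workers=8, pin_memory=True)

num_epochs = 200                              # placeholder epoch count
for epoch in range(num_epochs):
    sampler.set_epoch(epoch)                  # reshuffle shards each epoch
    train_one_epoch(model, train_loader)      # train_one_epoch: placeholder helper
```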

kxgong commented Nov 2, 2020

Thanks for replying. I actually also used distributed training, with a total batch size of 512 (on 4 GPUs). But I am still wondering: should I decay the learning rate during training, or keep it at 0.01 until the end of training?

abdullahjamal (Owner) commented

Just for the baseline results, you do have to decay the learning rate during training. You can follow either of the links above. I found that the cosine learning rate scheduler in classifier-balancing works better.

kxgong commented Nov 2, 2020

Thanks! I will use the cosine scheduler with an initial learning rate of 0.01 (as described in your paper) and train for 200 epochs. Are these settings correct?

abdullahjamal (Owner) commented

If you want to use the cosine scheduler, you can use an LR of 0.2 decayed to 0. Let me know if you still face difficulty with it.
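A minimal sketch using PyTorch's built-in CosineAnnealingLR (not our exact script; `model`, `train_loader`, `train_one_epoch`, and the momentum/weight-decay values are placeholders from a setup like the sketches above):

```python
import torch

# Cosine schedule as suggested: start at LR 0.2 and anneal to 0 over training.
epochs = 200  # 200 epochs, following the plan above
optimizer = torch.optim.SGD(model.parameters(), lr=0.2,
                            momentum=0.9, weight_decay=1e-4)  # momentum/weight decay are placeholders
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs, eta_min=0.0)

for epoch in range(epochs):
    train_one_epoch(model, train_loader, optimizer)  # placeholder training loop
    scheduler.step()                                 # anneal LR once per epoch
```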

kxgong commented Nov 2, 2020

Got it, I will try the new settings. 👍 Thanks again!
