Cannot reproduce accuracy 84% (after step2) #11

Closed
asanakoy opened this issue Oct 31, 2018 · 8 comments

Comments

@asanakoy

Hi Hao,

Thank you for a neat implementation.

I wonder whether training with the hyperparameters listed in the README

 --base_lr 1e-2 \
 --batch_size 64 --epochs 25 --weight_decay 1e-5 \
 --model "model.pth" 

gives 84.17% test accuracy?

I used exactly the commands that you provide in the README:

    Step 1.
    $ CUDA_VISIBLE_DEVICES=0,1,2,3 ./src/bilinear_cnn_fc.py --base_lr 1.0 \
          --batch_size 64 --epochs 55 --weight_decay 1e-8 \
          | tee "[fc-] base_lr_1.0-weight_decay_1e-8-epoch_.log"

    Step 2. 
    $ CUDA_VISIBLE_DEVICES=0,1,2,3 ./src/bilinear_cnn_all.py --base_lr 1e-2 \
          --batch_size 64 --epochs 25 --weight_decay 1e-5 \
          --model "model.pth" \
          | tee "[all-] base_lr_1e-2-weight_decay_1e-5-epoch_.log"

I trained the step 1 model and got 76.67% accuracy on the test set. I used it as the initialization for the step 2 model and fine-tuned all the layers further, but the accuracy saturates at 76.61% and doesn't improve.

Are there any extra tricks to get the desired performance?

@rohitgajawada

Have you tried doing step-2 directly?

@asanakoy
Author

@rohitgajawada yes, the accuracy is even lower, ~57%

@rohitgajawada

Oh, that is sad. I also need a reproducible bilinear CNN in PyTorch.
In this code, after the bilinear operation, the output x only undergoes a sqrt operation. In the original paper, they compute sign(x) * sqrt(|x|) instead. Could this be a cause of the reduced accuracy, or am I missing something?

@rohitgajawada

Never mind, saw issue #4

@asanakoy
Author

That's not a problem, since they compute the Gram matrix after ReLU, so its entries are non-negative and sign(x) * sqrt(|x|) reduces to sqrt(x). Do you have any other good repository in mind?
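
The sign question can be checked numerically. Below is a minimal NumPy sketch, not code from the repository: the function names `bilinear_pool` and `signed_sqrt` and the 8-channel, 49-location feature map are assumptions made for illustration. It shows that a Gram matrix built from ReLU outputs has no negative entries, so the plain square root and the signed square root give the same result.

```python
import numpy as np


def bilinear_pool(features):
    """Gram matrix of a (channels, locations) feature map, averaged over locations."""
    _, locations = features.shape
    return features @ features.T / locations


def signed_sqrt(x):
    """Signed square root: sign(x) * sqrt(|x|)."""
    return np.sign(x) * np.sqrt(np.abs(x))


rng = np.random.default_rng(0)
raw = rng.standard_normal((8, 49))  # hypothetical pre-activation feature map
relu = np.maximum(raw, 0.0)         # ReLU makes every feature non-negative

gram = bilinear_pool(relu)

# Every entry is a sum of products of non-negative numbers, hence non-negative.
all_non_negative = bool((gram >= 0).all())

# With no negative entries, the two normalizations coincide.
same_after_relu = bool(np.allclose(np.sqrt(gram), signed_sqrt(gram)))
```

Without the ReLU, off-diagonal Gram entries can be negative, and only then does the sign term matter.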

@rohitgajawada

rohitgajawada commented Oct 31, 2018

Ya my bad, realized it immediately after commenting :P If I find another repo that is able to reach the required accuracy, I'll notify you

@rohitgajawada

Hi @HaoMood, any updates on this problem? Were you able to obtain the 84% test accuracy?

@HaoMood
Owner

HaoMood commented Jan 6, 2019

It is weird, since the random seeds are fixed and I ran it several times before this submission to make sure the result could be reproduced.

Maybe you can give some details about your hardware, such as the GPU used as well as CUDA and cuDNN version.
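
One way to gather those details is a short script like the one below. It is a sketch and not part of the repository; `environment_report` is a hypothetical helper name. It reads the Python and platform versions from the standard library and, if PyTorch is importable, the PyTorch version, the CUDA version PyTorch was built against, the cuDNN version, and the visible GPU names.

```python
import platform


def environment_report():
    """Collect version details that are useful in a reproducibility report."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    try:
        import torch
    except ImportError:
        info["torch"] = None  # PyTorch is not installed in this environment
        return info

    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda               # None for CPU-only builds
    info["cudnn"] = torch.backends.cudnn.version()  # None if cuDNN is unavailable
    if torch.cuda.is_available():
        info["gpus"] = [
            torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
        ]
    else:
        info["gpus"] = []
    return info


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Pasting the printed output into the issue makes it easier to compare setups, since fixed seeds alone do not guarantee identical results across different GPUs or CUDA/cuDNN versions.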

@HaoMood HaoMood closed this as completed Jan 17, 2019