About reproduction #2

Closed
zhangpj opened this issue Sep 3, 2020 · 5 comments

zhangpj commented Sep 3, 2020

Hi, thank you for sharing this project. Good job! I tried to run this project, but there are some questions that confuse me.

1. When running train_sfrs_dist.sh, Loss_hard and Loss_soft look like the following: Loss_hard << soft-weight (0.5) * Loss_soft. So does Loss_hard make only a small, or even negligible, contribution to the total loss? Also, Loss_soft does not seem to converge. Have you ever seen a similar phenomenon when training the network? (See the sketch after this list for how I read the weighted sum.)

Epoch: [4-7][160/320]	Time 0.672 (0.674)	Data 0.069 (0.077)	Loss_hard 0.018 (0.052)	Loss_soft 1.749 (2.275)
Epoch: [4-7][170/320]	Time 0.670 (0.672)	Data 0.065 (0.076)	Loss_hard 0.041 (0.050)	Loss_soft 2.780 (2.272)
Epoch: [4-7][180/320]	Time 0.671 (0.671)	Data 0.063 (0.075)	Loss_hard 0.015 (0.049)	Loss_soft 1.535 (2.251)
Epoch: [4-7][190/320]	Time 0.665 (0.670)	Data 0.063 (0.074)	Loss_hard 0.005 (0.049)	Loss_soft 1.572 (2.239)
Epoch: [4-7][200/320]	Time 0.666 (0.669)	Data 0.060 (0.073)	Loss_hard 0.019 (0.048)	Loss_soft 2.144 (2.230)
Epoch: [4-7][210/320]	Time 0.667 (0.668)	Data 0.063 (0.073)	Loss_hard 0.022 (0.049)	Loss_soft 2.122 (2.247)
Epoch: [4-7][220/320]	Time 0.658 (0.668)	Data 0.055 (0.072)	Loss_hard 0.005 (0.048)	Loss_soft 1.374 (2.239)
Epoch: [4-7][230/320]	Time 0.504 (0.667)	Data 0.047 (0.071)	Loss_hard 0.028 (0.047)	Loss_soft 1.855 (2.239)
Epoch: [4-7][240/320]	Time 0.665 (0.667)	Data 0.061 (0.071)	Loss_hard 0.201 (0.048)	Loss_soft 3.224 (2.247)
Epoch: [4-7][250/320]	Time 0.668 (0.666)	Data 0.063 (0.070)	Loss_hard 0.001 (0.047)	Loss_soft 1.920 (2.239)
Epoch: [4-7][260/320]	Time 0.660 (0.666)	Data 0.068 (0.070)	Loss_hard 0.037 (0.047)	Loss_soft 2.350 (2.240)
Epoch: [4-7][270/320]	Time 0.658 (0.666)	Data 0.062 (0.069)	Loss_hard 0.068 (0.047)	Loss_soft 3.046 (2.240)
Epoch: [4-7][280/320]	Time 0.717 (0.668)	Data 0.060 (0.069)	Loss_hard 0.019 (0.048)	Loss_soft 2.411 (2.233)
Epoch: [4-7][290/320]	Time 0.693 (0.669)	Data 0.060 (0.068)	Loss_hard 0.096 (0.048)	Loss_soft 3.048 (2.247)
Epoch: [4-7][300/320]	Time 0.669 (0.670)	Data 0.059 (0.068)	Loss_hard 0.091 (0.049)	Loss_soft 3.546 (2.255)
Epoch: [4-7][310/320]	Time 0.669 (0.670)	Data 0.064 (0.068)	Loss_hard 0.014 (0.049)	Loss_soft 2.299 (2.247)
Epoch: [4-7][320/320]	Time 0.629 (0.669)	Data 0.026 (0.067)	Loss_hard 0.057 (0.048)	Loss_soft 3.039 (2.261)

2. The results on Pitts250k of the best model in my reproduction are slightly lower than those in your paper: 89.8% | 95.9% | 97.3% vs. 90.7% | 96.4% | 97.6%. The best model in my reproduction is the output of the 5th epoch of the third generation, rather than the converged model of the fourth generation as mentioned in the paper. Is the best model the output of the last iteration of your training?

3. I only use one GPU (a 2080 Ti), with the other parameters at their defaults. I don't know whether the inferior results are due to too few GPUs, or whether there is something else I need to pay attention to.
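For reference, here is how I read the two terms in question 1 being combined. This is only a minimal sketch under my assumption that the total loss is the weighted sum Loss_hard + soft-weight * Loss_soft with the default soft-weight of 0.5; the variable names are mine, not the repo's. Plugging in the running averages from the last log line above, the soft term clearly dominates:

```python
# Minimal sketch (not the repo's code): weighted combination of the two losses,
# assuming total = loss_hard + soft_weight * loss_soft with soft_weight = 0.5.
soft_weight = 0.5

loss_hard = 0.048   # running average from the log above
loss_soft = 2.261   # running average from the log above

total = loss_hard + soft_weight * loss_soft
print(total)               # ~1.18
print(loss_hard / total)   # ~0.04, i.e. the hard term is roughly 4% of the total
```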

yxgeee (Owner) commented Sep 4, 2020

  1. The loss seems normal. The convergence may be slow in later epochs.
  2. The best model selected by validation results may not achieve the optimal performance on the test set. The model reported in the paper was selected from the last epoch of the 4th generation. Since there may exist training randomness, it is recommended to test the five checkpoints in the last generation and choose the best-performing one.
  3. If you use the default settings on one GPU, only one triplet will be adopted for training in each mini-batch. Try modifying --tuple-size in the training scripts to adopt more triplets per GPU. In my experiments, I adopted 4 GPUs with one triplet on each GPU, so a batch of 4 triplets was used. If the GPU memory of a single 2080 Ti is not enough for 4 triplets, you may need to decrease the learning rate to fit your batch size (see the sketch below).
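For illustration of the learning-rate adjustment in point 3: below is a minimal sketch of linear LR scaling with the number of triplets per batch, relative to the reference 4-triplet setup. The helper and the base learning rate are assumptions for the example, not values taken from this repo's scripts.

```python
# Hypothetical helper, not from this repo: scale the learning rate linearly
# with the effective batch size (triplets per iteration). The reference setup
# is 4 GPUs x 1 triplet per GPU = 4 triplets per batch.
def scale_lr(base_lr, my_triplets, ref_triplets=4):
    return base_lr * my_triplets / ref_triplets

base_lr = 1e-3                             # assumed reference LR, not the script's actual default
print(scale_lr(base_lr, my_triplets=1))    # one triplet on one GPU  -> 2.5e-4
print(scale_lr(base_lr, my_triplets=2))    # two triplets on one GPU -> 5e-4
```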

zhangpj commented Sep 5, 2020

@yxgeee All right, thanks for your suggestions. I will give it a try.

zhangpj closed this as completed Sep 5, 2020

zhangpj commented Oct 12, 2020

@yxgeee Hi, in your paper you also evaluated SFRS on the Oxford 5k, Paris 6k and Holidays datasets. Can you share the source code for evaluating SFRS on those datasets, or tell me how you evaluated SFRS on them?

zhangpj reopened this Oct 12, 2020

yxgeee (Owner) commented Oct 12, 2020

My colleague helped me test SFRS on the retrieval datasets, and I may merge that code into this repo after re-organizing it. We strictly follow the same settings (e.g. image size, augmentation, etc.) as SARE and NetVLAD, so you can also refer to their code for the evaluation details.
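Until that code is merged, a minimal sketch of the standard retrieval protocol on these datasets may help as a starting point (this is not the evaluation code we actually used): for each query, rank the database images by descriptor similarity and report mAP. It assumes L2-normalized global descriptors have already been extracted with the same image size and preprocessing as SARE/NetVLAD, and it omits the junk-image handling of the official Oxford/Paris protocol, so the numbers will not exactly match the standard evaluation.

```python
import numpy as np

def average_precision(relevant_ranked):
    """AP for one query, given a boolean array over the ranked database list."""
    hits = np.where(relevant_ranked)[0]          # rank positions of relevant images
    if hits.size == 0:
        return 0.0
    precisions = (np.arange(hits.size) + 1) / (hits + 1)
    return precisions.mean()

def mean_average_precision(q_feats, db_feats, positives):
    """q_feats: (Q, D), db_feats: (N, D) L2-normalized global descriptors.
    positives[i] is the set of database indices relevant to query i."""
    sims = q_feats @ db_feats.T                  # cosine similarity for normalized features
    order = np.argsort(-sims, axis=1)            # descending similarity = ranking
    aps = [average_precision(np.isin(rank, list(positives[i])))
           for i, rank in enumerate(order)]
    return float(np.mean(aps))
```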

zhangpj commented Oct 12, 2020

Ok, thank you.
