

I cannot reproduce the results #19

Closed
sydney0zq opened this issue Nov 27, 2020 · 1 comment

Comments

@sydney0zq

Hi, thanks for releasing the code. I ran your code end to end, but I cannot reproduce the reported results at all.

I use DAVIS 2017 as my training set, and evaluate the checkpoints on DAVIS 2017 validation set.

However the results are:

G/J/F: 67.2/65.2/69.2

But in your paper:

G/J/F: 72.3/69.9/74.7

I notice you use 4 GPUs with 16 GB memory each, while I only have 4 GPUs with 11 GB each. I don't think the hardware difference should cause such a significant gap. Could you please explain this? A few others have reported the same problem below.

Thanks in advance.

What's more, your paper describes the method as transductive, yet your code is totally different from the equations in the paper, and the masks are not semi-supervised: they are fully supervised by a cross-entropy loss.

Please explain this, as I think it is an essential problem with your paper.

@eastOffice
Collaborator

Hello,

We have contacted you via email and provided our recent training log. Our advice is as follows:

  1. Make sure you can reproduce our paper's results using our pretrained model, which is linked in the GitHub repository.
  2. With 4 × 11 GB GPUs, if you cannot train with the default batch size (16), adjust the other hyperparameters accordingly, e.g., reduce the learning rate.
  3. Check your PyTorch and CUDA versions; PyTorch 1.0 and CUDA 10.0 are recommended.
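A common heuristic for point 2 is to scale the learning rate linearly with the batch size. A minimal sketch, assuming illustrative base values (batch size 16, lr 1e-4) that may not match the repository's actual defaults:

```python
# Linear learning-rate scaling when the batch size must be reduced
# to fit on smaller GPUs. The base values below are assumptions for
# illustration, not the repository's actual hyperparameters.

def scaled_lr(base_lr: float, base_batch: int, actual_batch: int) -> float:
    """Scale the learning rate proportionally to the batch size."""
    return base_lr * actual_batch / base_batch

base_lr, base_batch = 1e-4, 16  # hypothetical defaults
actual_batch = 8                # e.g., what fits on 4 x 11 GB GPUs

print(scaled_lr(base_lr, base_batch, actual_batch))  # -> 5e-05
```

Whether linear scaling is the right rule here is an assumption; the key point is that halving the batch size without touching the other hyperparameters can noticeably change the final scores.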

As for your question about our paper, we suggest visiting https://davischallenge.org/challenge2017/index.html and reading the related papers. During training, our method is supervised by the mask annotations of all frames in all videos. The problem is nevertheless called semi-supervised because at inference time only the annotation of the first frame is given. All papers in the semi-supervised VOS literature follow this setting.

Thank you for your interest.
