Not able to reproduce the results listed in the paper with my trained model #7
Comments
Due to offset overflow in the deformable convolution, the training process may collapse. However, we never encountered this problem when training the final model (which takes 500K iterations). For reference, we have provided our loss curves. I hope these help.
@Paper99 I also encountered this problem on my machine. Which GPU did you use for training? I use a V100 32G, and training collapses at about 300K iterations.
Hi, we use 8 V100 (16G) GPUs or 8 1080 Ti GPUs to train our model.
@Paper99 How do you suggest solving the collapse problem? If we clip the DCN module's weights, I am not sure what range to use ... Is there another module we could substitute to avoid the issue?
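One possible way to approach the clipping question is to act on the predicted offsets rather than on the weights: monitor their mean absolute value and clamp them to a bounded range. The following is only a framework-agnostic sketch on plain Python lists, not the authors' method. The function names and the `limit` value are hypothetical, and a suitable range would have to be found by experiment. In a real training loop the same logic would be applied to the offset tensor that feeds the deformable convolution.

```python
from typing import Iterable, List


def mean_abs_offset(offsets: Iterable[float]) -> float:
    """Return the mean absolute value of the predicted offsets (0.0 if empty)."""
    values = list(offsets)
    if not values:
        return 0.0
    return sum(abs(v) for v in values) / len(values)


def is_overflowing(offsets: Iterable[float], limit: float) -> bool:
    """Flag a batch whose mean absolute offset exceeds `limit`.

    `limit` is a hypothetical, tunable threshold, not a value from this thread.
    """
    return mean_abs_offset(offsets) > limit


def clamp_offsets(offsets: Iterable[float], limit: float) -> List[float]:
    """Clamp every offset into the closed range [-limit, limit]."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [max(-limit, min(limit, v)) for v in offsets]
```

Logging `mean_abs_offset` at each validation step would at least show whether the offsets grow steadily before the collapse around 300K iterations, which could help in choosing a range.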
@Paper99 Hello, how do you select the final release checkpoint: the 500K-iteration checkpoint, or the best among several final instances?
Just choose the best.
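"Choosing the best" could be sketched as picking the saved checkpoint with the highest validation score. This is a minimal illustration that assumes each checkpoint has already been evaluated with a higher-is-better metric such as PSNR. The function name is hypothetical, and the thread does not say which metric the authors used.

```python
from typing import Dict


def select_best_checkpoint(scores: Dict[int, float]) -> int:
    """Return the iteration whose validation score is highest.

    `scores` maps a checkpoint's iteration number to its validation score
    (assumed higher-is-better). Ties are broken in favor of the later iteration.
    """
    if not scores:
        raise ValueError("no checkpoints to choose from")
    return max(scores, key=lambda iteration: (scores[iteration], iteration))
```

For example, with made-up scores `{300000: 30.1, 400000: 30.6, 500000: 30.4}`, the function returns `400000` rather than the last checkpoint.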
Same question.
Hi, is the problem solved? I have the same problem. Thank you.
I ran into mode collapse once the step count exceeded 300K, and with the final model I obtained, I am not able to reproduce the results shown in the paper. Could you share your loss curves? @Paper99