Documentation confusing on whether SSD and RetinaNet count background as class object #4106
Comments
@douglasrizzo Thanks for reporting. Blame me for copy-pasting the doc string of RetinaNet which had exactly the same issue. The
@datumbox Thanks for the reply. I actually believe the RetinaNet documentation suffers from a similar problem. The documentation for the
The function
But when a
Since
I just saw your PR fixes both documents.
@datumbox
Sorry to reopen the issue, but the other question was not answered:
@Xonxt The documentation was improved to reflect the situation. To answer your question, num_classes should include the background, which is encoded as 0. During inference the model predicts labels starting from 1.
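To make the convention in this reply concrete, here is a minimal sketch in plain Python. The class names and the label_to_name helper are hypothetical and only illustrate the numbering; they are not part of torchvision.

```python
# Hypothetical foreground classes for a custom dataset (not from this thread).
CLASS_NAMES = ["cat", "dog", "bird"]

BACKGROUND_ID = 0  # label 0 is reserved for the background

# The value passed as num_classes counts the background too: 3 foreground + 1.
num_classes = len(CLASS_NAMES) + 1


def label_to_name(label):
    """Map a predicted label back to a class name (foreground labels start at 1)."""
    if label == BACKGROUND_ID:
        return "background"
    return CLASS_NAMES[label - 1]


if __name__ == "__main__":
    print(num_classes)
    print(label_to_name(1))
```

Under this assumption, num_classes is the value you would pass when building the model, and any label the model predicts is decoded by subtracting 1 before indexing into your own class list.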
The documentation for the SSD class mentions that we should not count the background as an object class when passing the number of classes as a parameter to instantiate an SSD object.
vision/torchvision/models/detection/ssd.py
Line 144 in 9596668
However, further down in the same file, an SSD object is instantiated in a function that explicitly says that the background should be counted as an object class, but this is not taken into account in the code (i.e. I did not see num_classes be decremented by one when creating the SSD object).
vision/torchvision/models/detection/ssd.py
Line 589 in 9596668
Here is the documentation for this function, which says we should include the background in the number of classes.
vision/torchvision/models/detection/ssd.py
Line 563 in 9596668
This is confusing. Should we or should we not count the background as an object class when instantiating the SSD? In either case, how should object classes be ID'd during training?
As an example, with Faster RCNN, the background is counted as an object class (with ID 0 reserved for it) and actual object classes are identified during training starting from ID 1. What should be the procedure for SSD?
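For reference, the Faster RCNN style numbering described above can be sketched as follows. This assumes the dataset stores 0-based class indices; shift_labels is a hypothetical helper, not a torchvision function.

```python
def shift_labels(zero_based_labels):
    """Shift 0-based dataset class indices by one so that ID 0 stays free for the background."""
    return [label + 1 for label in zero_based_labels]


# Hypothetical annotations for one image: three boxes with 0-based class indices.
dataset_labels = [0, 2, 1]

# Labels as they would go into the training target: foreground IDs start at 1.
target_labels = shift_labels(dataset_labels)
```

Whether SSD expects the same numbering is exactly what this issue asks; the sketch only shows the Faster RCNN convention.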
I have also opened a topic in the forums, since this is both a personal question of mine as well as a possible issue in the docs (or the code).