2509 update focalloss to use sigmoid (7/July) #2513
fc195d0 to f29ebce
Thanks for the quick fix.
Hi @wyli, in the paper (https://arxiv.org/pdf/1708.02002.pdf) it uses
Hi @ristoh @ericspod, could you please help to double-check it?
Hi @yiheng-wang-nv, the footnote you cited describes the formulation. In Section 4, under the classification subnet, it's clear: "for each of the A anchors and K object classes... Finally sigmoid activations are attached to output the KA
Replacing the sigmoid with softmax should work as well (for classification with mutually exclusive classes); for example, in their source code they support both:
TensorFlow implements the sigmoid version: https://github.com/tensorflow/addons/blob/e83e71cf07f65773d0f3ba02b6de66ec3b190db7/tensorflow_addons/losses/focal_loss.py
f939af1 to c39edcf
Signed-off-by: Wenqi Li <wenqil@nvidia.com>
c39edcf to db64d08
I would like to add an extra comment. For anchor-based detectors, sigmoid is more widely adopted, so the classification branch actually performs multi-label classification. At the inference stage, each anchor box is assigned to the class with the highest score, even when the scores do not fulfill
Signed-off-by: Wenqi Li <wenqil@nvidia.com>
57a6011 to dfb3318
ericspod left a comment
I think it's good to go, though perhaps some more explanation of the computation would help to clarify why this differs from the "standard" definitions of focal loss most commonly seen, for instance why things are done in log space.
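As one possible illustration of the log-space point raised above, here is a minimal pure-Python sketch. It is not the code from this PR: the function names (`log_sigmoid`, `sigmoid_focal_loss`, `naive_focal_loss`) are hypothetical, the loss is written for a single logit and a binary target, and the alpha weighting from the paper is omitted. It assumes the usual formulation FL = -(1 - p_t)^gamma * log(p_t) with p_t obtained from a sigmoid.

```python
import math


def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z)); never exponentiates a positive number."""
    return min(z, 0.0) - math.log1p(math.exp(-abs(z)))


def sigmoid_focal_loss(logit: float, target: float, gamma: float = 2.0) -> float:
    """Focal loss for one logit and a binary target (0.0 or 1.0), computed in log space."""
    # Flip the sign of the logit for negative targets so that sigmoid(z) == p_t.
    z = logit * (2.0 * target - 1.0)
    log_pt = log_sigmoid(z)             # log(p_t)
    log_one_minus_pt = log_sigmoid(-z)  # log(1 - p_t)
    # (1 - p_t)^gamma is recovered as exp(gamma * log(1 - p_t)).
    return -math.exp(gamma * log_one_minus_pt) * log_pt


def naive_focal_loss(logit: float, target: float, gamma: float = 2.0) -> float:
    """Textbook form: apply the sigmoid first, then take the log of the probability."""
    p = 1.0 / (1.0 + math.exp(-logit))
    pt = p if target == 1.0 else 1.0 - p
    return -((1.0 - pt) ** gamma) * math.log(pt)


# For moderate logits both forms agree closely.
moderate_gap = abs(sigmoid_focal_loss(0.3, 1.0) - naive_focal_loss(0.3, 1.0))

# For a confident wrong prediction the naive form ends up taking log(0),
# while the log-space form stays finite.
stable_value = sigmoid_focal_loss(40.0, 0.0)
try:
    naive_focal_loss(40.0, 0.0)
    naive_failed = False
except ValueError:
    naive_failed = True
```

In this sketch the probability is never materialised before the logarithm is taken, which is why saturated logits do not produce `log(0)`.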
Signed-off-by: Wenqi Li <wenqil@nvidia.com>

Signed-off-by: Wenqi Li wenqil@nvidia.com
Fixes #2509
Description
Use sigmoid activation to align with the literature.
The change is verified:
previously, gamma=0 reduced it to nn.CrossEntropyLoss
now, gamma=0 reduces it to nn.BCEWithLogitsLoss
Status
Ready
Types of changes
./runtests.sh -f -u --net --coverage.
./runtests.sh --quick --unittests.
make html command in the docs/ folder.
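The gamma=0 check mentioned in the description can be sketched in plain Python, without PyTorch. This is only an illustration under assumptions: `sigmoid_focal_loss` and `bce_with_logits` are hypothetical helpers written for this example, not the functions changed in the PR, and `bce_with_logits` uses the standard stable per-element formula for binary cross-entropy on logits rather than calling `nn.BCEWithLogitsLoss` itself.

```python
import math


def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z))."""
    return min(z, 0.0) - math.log1p(math.exp(-abs(z)))


def sigmoid_focal_loss(logit: float, target: float, gamma: float) -> float:
    """Per-element sigmoid focal loss for a binary target (0.0 or 1.0)."""
    z = logit * (2.0 * target - 1.0)
    return -math.exp(gamma * log_sigmoid(-z)) * log_sigmoid(z)


def bce_with_logits(logit: float, target: float) -> float:
    """Per-element binary cross-entropy on a raw logit (stable form)."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


# Illustrative inputs only; they are not taken from the PR's unit tests.
logits = [-3.0, -0.5, 0.0, 1.2, 4.0]
targets = [0.0, 1.0, 1.0, 0.0, 1.0]

focal = [sigmoid_focal_loss(x, t, gamma=0.0) for x, t in zip(logits, targets)]
bce = [bce_with_logits(x, t) for x, t in zip(logits, targets)]

# With gamma=0 the modulating factor (1 - p_t)^gamma equals 1,
# so the two losses should match element by element.
max_diff = max(abs(f - b) for f, b in zip(focal, bce))
```

With PyTorch installed, the same comparison could presumably be made against `torch.nn.BCEWithLogitsLoss` on tensors holding these values.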