
Classification Loss: CE vs BCE #3

Closed · glenn-jocher opened this issue Sep 4, 2018 · 4 comments · Label: help wanted

glenn-jocher (Member) commented Sep 4, 2018

When developing the training code, I found that replacing Binary Cross-Entropy (BCE) loss with Cross-Entropy (CE) loss significantly improves Precision, Recall, and mAP. All three show roughly 2X improvements with CE, even though the YOLOv3 paper specifies BCE for these loss terms in darknet.

The two loss terms are on lines 162 and 163 of models.py. If anyone has insight into this phenomenon, I'd be very interested to hear it. For now, you can swap the two back and forth there. Note that SGD fails to converge with either BCE or CE, so that issue appears to be independent of this one.
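For reference, a minimal sketch of the swap in question (not the exact lines from models.py; `pred_cls` and the target tensors here are hypothetical stand-ins for the matched class predictions and labels):

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()  # independent sigmoid per class (darknet / YOLOv3 paper)
ce = nn.CrossEntropyLoss()    # softmax competition across classes

pred_cls = torch.randn(8, 80)            # e.g. 8 matched anchors, 80 COCO classes (raw logits)
target_idx = torch.randint(0, 80, (8,))  # integer class labels, as CE expects
target_onehot = torch.zeros(8, 80).scatter_(1, target_idx.unsqueeze(1), 1.0)  # one-hot, as BCE expects

loss_bce = bce(pred_cls, target_onehot)  # formulation stated in the paper
loss_ce = ce(pred_cls, target_idx)       # variant reported above to roughly double P/R/mAP
```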

[Figure: ce_vs_bce — plot comparing Precision, Recall, and mAP for CE vs BCE training runs]

xyutao commented Sep 6, 2018

BCE computes sigmoid predictions independently for each class, while CE introduces inter-class competition via the softmax. Under BCE, an instance is allowed to be both class A and class B at the same time, which suits multi-label tasks (e.g. the Open Images dataset). But for single-label instances (e.g. COCO), BCE can produce high-scoring false positives and hurt AP.
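A toy illustration of this point (illustrative values only, not code from this repo): with independent sigmoids, two classes can both score near 1 for the same box, whereas softmax forces them to compete for probability mass:

```python
import torch

logits = torch.tensor([4.0, 3.5, -2.0])  # hypothetical class logits for one box

torch.sigmoid(logits)          # ~[0.98, 0.97, 0.12]: two near-certain classes at once
torch.softmax(logits, dim=0)   # ~[0.62, 0.38, 0.002]: scores sum to 1, runner-up suppressed

# On a single-label dataset like COCO, that second 0.97 sigmoid score would
# surface as a high-confidence false positive and drag down AP.
```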

glenn-jocher added the help wanted label and self-assigned this issue on Sep 9, 2018
glenn-jocher mentioned this issue on Sep 13, 2018
nirbenz commented Oct 10, 2018

While this is true in theory, the YOLOv3 paper also clearly states that BCE is a big part of the model's general success (on COCO and PASCAL VOC).
Looking at models.py, I'm actually not sure the commented-out lines do that. It is supposed to be binary classification (BCE) per class, in a somewhat one-vs-all manner.

dtmoodie commented Mar 14, 2019

Hello,

I'm working on getting BCE loss to work in a multi-label task. The majority of my classes follow a hierarchical one-vs-all classification, but a few leaves of the hierarchical tree can take multiple states. I'm experimenting with using BCE for the entire tree, as in the original darknet paper, but I have yet to get any good results: my loss decreases significantly, yet in the end my classification predictions are completely wrong (calling a car a street sign ^_-).
Has anyone else had any success getting BCE to work?

glenn-jocher (Member, Author) commented

@dtmoodie hello,

Using BCE for hierarchical multi-label classification can be challenging, especially if some leaves in the hierarchical tree have multiple states; this can lead to unexpected results such as misclassified predictions.

One potential approach to consider is adapting the loss function or the model architecture to better handle the hierarchical structure and multiple states. Additionally, experimenting with different loss functions or model configurations tailored to hierarchical multi-label classification tasks might yield improved results.
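For example, one possible adaptation along these lines (a hedged sketch only, not an Ultralytics API; `sibling_groups` and `multilabel_idx` are hypothetical index lists describing your tree) is to apply CE within each group of mutually exclusive siblings and reserve BCE for the leaves that can genuinely co-occur:

```python
import torch
import torch.nn.functional as F

def hierarchical_loss(logits, targets_onehot, sibling_groups, multilabel_idx):
    """logits, targets_onehot: [N, C] tensors. sibling_groups: lists of class
    indices that are mutually exclusive. multilabel_idx: indices of leaves
    allowed to co-occur. Assumes each sample has exactly one positive class
    within every sibling group."""
    loss = logits.new_zeros(())
    for group in sibling_groups:
        # softmax competition only among siblings that exclude each other
        loss = loss + F.cross_entropy(logits[:, group],
                                      targets_onehot[:, group].argmax(dim=1))
    if multilabel_idx:
        # independent sigmoids where multiple states are genuinely allowed
        loss = loss + F.binary_cross_entropy_with_logits(
            logits[:, multilabel_idx], targets_onehot[:, multilabel_idx])
    return loss

# e.g. classes 0-2 are exclusive vehicle types, 3-4 are exclusive sign types,
# and classes 5-6 (say, "lights on" / "towing") may co-occur:
# hierarchical_loss(logits, targets, sibling_groups=[[0, 1, 2], [3, 4]], multilabel_idx=[5, 6])
```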

If you'd like further guidance on this, feel free to consult the Ultralytics Docs for additional insights and considerations while experimenting with BCE loss in your multi-label task.

Keep up the great work, and best of luck with your experimentation!
