Why is BCE used instead of CE with Softmax? #11

JoyHuYY1412 · 2020-07-10T19:30:07Z

Each task seems to be a multi-class classification problem, so why not use nn.CrossEntropyLoss?

brjathu · 2020-07-17T13:48:42Z

To minimize catastrophic forgetting, it's better not to push down the predictions for previously learned classes, even when those classes are not present in the current training data.
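
One way to read this reply, sketched below in plain Python using the closed-form gradients rather than PyTorch autograd. Softmax normalizes over every class, so the cross-entropy gradient for a logit is `p_i - y_i`, which is positive for every non-target class, including old-task classes that never appear in the batch. Sigmoid-based BCE scores each class independently, so its gradient for a logit is `sigmoid(z_i) - y_i` and the loss can be restricted to a subset of classes without touching the rest. The class split, the logit values, and the choice to restrict BCE to the current task's classes are all assumptions made for illustration; the thread does not confirm that the repository masks the loss this way.

```python
import math


def softmax(logits):
    # Numerically stable softmax over all logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def ce_softmax_grad(logits, target):
    # Gradient of softmax cross-entropy w.r.t. each logit: p_i - y_i.
    probs = softmax(logits)
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]


def masked_bce_grad(logits, target, active):
    # Gradient of summed per-class BCE w.r.t. each logit: sigmoid(z_i) - y_i.
    # Classes outside `active` are left out of the loss (an assumption),
    # so their gradient is exactly zero.
    grads = []
    for i, z in enumerate(logits):
        if i in active:
            grads.append(sigmoid(z) - (1.0 if i == target else 0.0))
        else:
            grads.append(0.0)
    return grads


# Hypothetical setup: classes 0-1 come from an old task, classes 2-3 from the current one.
logits = [2.0, 1.0, 0.5, -0.5]
target = 2          # the sample belongs to a current-task class
current = {2, 3}    # classes included in the BCE loss

ce_grad = ce_softmax_grad(logits, target)
bce_grad = masked_bce_grad(logits, target, current)

print("softmax CE gradient:", [round(g, 3) for g in ce_grad])
print("masked BCE gradient:", [round(g, 3) for g in bce_grad])
```

A positive gradient means gradient descent lowers that logit. Under these assumptions the softmax loss lowers both old-class logits on every step, while the masked BCE loss leaves them unchanged. In PyTorch the two losses correspond to `nn.CrossEntropyLoss` and `nn.BCEWithLogitsLoss`.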

salman-h-khan closed this as completed Dec 30, 2020
