Wrong derivation of negative gradient of sigmoid+BCE #5

GeraldHan · 2021-11-19T07:51:55Z

Apologies: the paper's derivation of the negative gradient for the Sigmoid+BCE loss is wrong.
The correct negative gradient is

$$ \nabla \mathcal{H}_i= y_i - \sigma(\mathcal{H}_i) $$
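For readers who want the intermediate steps, here is a short sketch of where this comes from. It assumes the standard per-sample binary cross-entropy on a sigmoid output; the symbol $\mathcal{L}_i$ for that loss is introduced here only for illustration and does not appear in the thread.

$$
\begin{aligned}
\mathcal{L}_i &= -\left[ y_i \log \sigma(\mathcal{H}_i) + (1 - y_i) \log\bigl(1 - \sigma(\mathcal{H}_i)\bigr) \right], \qquad \sigma'(x) = \sigma(x)\bigl(1 - \sigma(x)\bigr) \\
\frac{\partial \mathcal{L}_i}{\partial \mathcal{H}_i} &= -y_i \bigl(1 - \sigma(\mathcal{H}_i)\bigr) + (1 - y_i)\,\sigma(\mathcal{H}_i) = \sigma(\mathcal{H}_i) - y_i \\
-\frac{\partial \mathcal{L}_i}{\partial \mathcal{H}_i} &= y_i - \sigma(\mathcal{H}_i)
\end{aligned}
$$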

In theory, as long as the pseudo label has a negative correlation with the bias model prediction, it is able to mine the hard examples.
The wrong gradient in the paper is actually an approximation of $\nabla \mathcal{H}_i$, which is why it still works well.
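A minimal numerical sketch of the two points above, assuming the standard BCE loss on a sigmoid output. The function names (`bce_with_logit`, `negative_gradient`, `numeric_negative_gradient`) are hypothetical helpers written for this illustration, not code from the repository. It checks the closed form $y_i - \sigma(\mathcal{H}_i)$ against a finite-difference estimate, and shows that for a fixed label the value shrinks as the bias model's prediction grows.

```python
import math


def sigmoid(h):
    """Logistic function applied to a scalar logit."""
    return 1.0 / (1.0 + math.exp(-h))


def bce_with_logit(h, y):
    """Binary cross-entropy of label y against the prediction sigmoid(h)."""
    p = sigmoid(h)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def negative_gradient(h, y):
    """Closed-form negative gradient of the loss w.r.t. the logit: y - sigmoid(h)."""
    return y - sigmoid(h)


def numeric_negative_gradient(h, y, eps=1e-6):
    """Central finite-difference estimate of the same quantity."""
    return -(bce_with_logit(h + eps, y) - bce_with_logit(h - eps, y)) / (2.0 * eps)


# Illustrative logits only; these are not values from the paper.
logits = [-3.0, -1.0, 0.0, 1.0, 3.0]

# The closed form agrees with the numerical estimate for both label values.
for y in (0.0, 1.0):
    for h in logits:
        assert abs(negative_gradient(h, y) - numeric_negative_gradient(h, y)) < 1e-6

# For a fixed label (here y = 1), the value decreases as the prediction increases.
pseudo_labels = [negative_gradient(h, 1.0) for h in logits]
for h, g in zip(logits, pseudo_labels):
    print(f"sigmoid(H)={sigmoid(h):.3f}  y - sigmoid(H)={g:.3f}")
```

With $y_i$ held fixed, $y_i - \sigma(\mathcal{H}_i)$ decreases monotonically in $\sigma(\mathcal{H}_i)$, so examples the bias model already predicts confidently receive a smaller value than the ones it gets wrong.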

Murphyzc · 2023-07-20T01:30:37Z

What is the reasoning behind this statement: "In theory, as long as the pseudo label has a negative correlation with the bias model prediction, it is able to mine the hard examples."?

GeraldHan mentioned this issue Jul 20, 2023

A question about the biased models' negative gradient of its loss #10

Closed
