Add numerically stable cross entropy loss #856
Merged
I was using the parametric UMAP cross entropy loss function for another project and ran into some odd, intermittent numerical stability issues that I could not pinpoint. My solution was to modify the loss to calculate the log probabilities directly, using the reparameterized repellent term for the cross entropy from Section 8.1 of Shi et al. 2022 (https://arxiv.org/abs/2111.08851):

`log(1 - sigmoid(logits)) = log(sigmoid(logits)) - logits`

This seemed to solve the issues I was having. I have not tested this directly with your code base, but thought it might be useful nonetheless.

As a side note, I switched the (0, 1] threshold implemented with `clip_by_value` to use `sigmoid` for (0, 1). You could replace `log_sigmoid(x) = -softplus(-x)` with the equivalent rectifier `log_hard_sigmoid(x) = -relu(-x)` if you prefer to keep the hard threshold.
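For illustration, here is a minimal NumPy sketch of the reparameterized loss described above (function names are hypothetical; the actual PR targets the TensorFlow implementation in parametric UMAP):

```python
import numpy as np

def log_sigmoid(x):
    # log(sigmoid(x)) = -softplus(-x); logaddexp evaluates softplus stably.
    return -np.logaddexp(0.0, -x)

def stable_binary_cross_entropy(logits, targets):
    # Attractive term: targets * log(sigmoid(logits)).
    # Repellent term, reparameterized per Shi et al. 2022, Section 8.1:
    #   log(1 - sigmoid(logits)) = log(sigmoid(logits)) - logits
    # so neither term ever evaluates log(0), and no clipping is needed.
    log_p = log_sigmoid(logits)
    return -(targets * log_p + (1.0 - targets) * (log_p - logits))
```

To keep the hard threshold instead, one could swap `log_sigmoid` for `log_hard_sigmoid(x) = -relu(-x)`, i.e. `-np.maximum(-x, 0.0)`.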