Add soft capping to reversible embedding layer #1718

mattdangerw · 2024-07-30T18:34:06Z

Forgetting the final output soft-cap is an easy mistake to make, and worse, generations without the soft-cap still look plausible; the results are just quietly worse.

Adding the soft-cap to our reversible embedding layer is much more robust. As long as you use the layer to compute logits over the vocab, you can no longer forget it.

Before this fix, we were missing it from our actual CausalLM functional model output, meaning soft-capping was not applied during training!
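
For readers skimming this PR, here is a minimal sketch of the idea, not the code in this change. It assumes the common soft-cap form `cap * tanh(logits / cap)`. The class `TinyReversibleEmbedding`, its `logit_soft_cap` argument as written here, and the cap value `30.0` are all illustrative stand-ins rather than the library's API. The point it shows is that once the cap lives inside the layer's reverse projection, every caller that computes logits through the layer gets capped logits.

```python
import math


def soft_cap(logits, cap):
    """Squash each logit smoothly into (-cap, cap) via cap * tanh(x / cap)."""
    return [cap * math.tanh(x / cap) for x in logits]


class TinyReversibleEmbedding:
    """Hypothetical stand-in for a reversible embedding layer (illustration only)."""

    def __init__(self, table, logit_soft_cap=None):
        self.table = table  # one row of floats per vocabulary token
        self.logit_soft_cap = logit_soft_cap

    def embed(self, token_id):
        # Forward direction: look up the embedding row for a token id.
        return self.table[token_id]

    def logits(self, hidden):
        # Reverse direction: dot the hidden state with every embedding row.
        raw = [sum(h * w for h, w in zip(hidden, row)) for row in self.table]
        if self.logit_soft_cap is None:
            return raw
        # The cap is applied here, so no caller can forget it.
        return soft_cap(raw, self.logit_soft_cap)


# Assumed toy values: a 2-token vocab and an illustrative cap of 30.0.
layer = TinyReversibleEmbedding([[1.0, 0.0], [0.0, 100.0]], logit_soft_cap=30.0)
capped = layer.logits([2.0, 3.0])      # raw logits would be [2.0, 300.0]
uncapped = TinyReversibleEmbedding([[1.0, 0.0], [0.0, 100.0]]).logits([2.0, 3.0])
```

Small logits pass through almost unchanged while large ones saturate just below the cap, which is why uncapped generations can still look plausible.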

SamanehSaadat

Thanks, Matt!

mattdangerw requested review from grasskin and SamanehSaadat July 30, 2024 18:34

mattdangerw force-pushed the logit-soft-cap-fix branch from 1a288cb to 7f5dc3b Compare July 30, 2024 18:43

SamanehSaadat approved these changes Jul 30, 2024

mattdangerw merged commit 7b932cd into keras-team:master Jul 30, 2024
