1) Add support for positional embeddings in attention.
2) Add support for sqrt-scaling in attention for the identity nonlinearity.
3) Add support for various nonlinearities in 1/d attention.
4) Improve numerical accuracy of the attention layer by manually fusing multiplicative constants, i.e. computing (c1 * c2) * (A @ B) instead of (c1 * A) @ (c2 * B).
5) Minor tweaks to docstrings.

Co-authored-by: Jiri Hron <jirihron@google.com>
PiperOrigin-RevId: 321938062
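The constant-fusing change in (4) can be illustrated with a small NumPy sketch. The matrices and constants below are hypothetical stand-ins, not the actual covariance tensors the attention layer manipulates; the point is only that scaling once after the matmul incurs fewer rounded float32 multiplications than scaling each operand first:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)).astype(np.float32)
B = rng.standard_normal((4, 4)).astype(np.float32)

# Hypothetical multiplicative constants of very different magnitudes.
c1, c2 = np.float32(1e-4), np.float32(1e4)

# Fused form: one scalar product, one scaling of the matmul result.
fused = (c1 * c2) * (A @ B)

# Unfused form: every entry of A and B is scaled (and rounded) before
# the matmul, accumulating extra float32 rounding error.
unfused = (c1 * A) @ (c2 * B)

# Both equal c1*c2*(A @ B) in exact arithmetic; in float32 they differ slightly.
print(np.max(np.abs(fused - unfused)))
```

The two results agree up to float32 rounding, but the fused form stays closer to the exact product, which matters when such scalings are applied repeatedly inside a kernel computation.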