
Transformer encoder #97

Merged

merged 4 commits into facebookresearch:main on Nov 2, 2022

Conversation

@yushiyangk (Contributor) commented Oct 26, 2022

TransformerPositionalEncoding

$$ \mathrm{PE}_{i, 2z} = \sin \left( \frac{i}{10000^{2z/d}} \right) $$

$$ \mathrm{PE}_{i, 2z + 1} = \cos \left( \frac{i}{10000^{2z/d}} \right) $$

where $i$ is the sequence position, $2z$ and $2z+1$ index the even and odd dimensions of the input embedding, and $d$ is the dimensionality of the input embedding.

The multiplicative factors $\frac{1}{10000^{2z/d}}$ are precomputed during object creation as they are constant for all $i$.

The full PE is initially precomputed for all $i$ up to 256 (configurable). This is then extended and stored if the module is called with a sequence length larger than the initial value.

Returns a 2D tensor matching the last two dimensions of the input tensor to TransformerEncoder.
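
To make the precomputation concrete, here is a minimal standalone sketch in plain TypeScript (hypothetical names, plain number arrays rather than shumai tensors, and not the actual implementation in this PR): the per-dimension factors $\frac{1}{10000^{2z/d}}$ are computed once in the constructor, and the cached table is extended lazily when a longer sequence is requested.

```ts
// Sketch of sinusoidal positional encoding with precomputed factors and a lazily extended table.
class PositionalEncodingSketch {
  private readonly dim: number;
  private readonly factors: number[]; // 1 / 10000^(2z/d), one per pair of dimensions
  private table: number[][] = [];     // table[i][k] = PE_{i,k}

  constructor(dim: number, initialLength = 256) {
    this.dim = dim;
    this.factors = [];
    for (let z = 0; 2 * z < dim; z++) {
      this.factors.push(1 / Math.pow(10000, (2 * z) / dim));
    }
    this.extend(initialLength);
  }

  // Extend the cached table so it covers sequence positions [0, length).
  private extend(length: number): void {
    for (let i = this.table.length; i < length; i++) {
      const row = new Array<number>(this.dim);
      for (let z = 0; 2 * z < this.dim; z++) {
        row[2 * z] = Math.sin(i * this.factors[z]);
        if (2 * z + 1 < this.dim) row[2 * z + 1] = Math.cos(i * this.factors[z]);
      }
      this.table.push(row);
    }
  }

  // Returns the [seqLength, dim] slice of the (possibly extended) cached table.
  forward(seqLength: number): number[][] {
    if (seqLength > this.table.length) this.extend(seqLength);
    return this.table.slice(0, seqLength);
  }
}
```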

FeedForward

A simple 2-layer fully connected neural network with ReLU activation. This is kept as a private class for now; if we want to export it, it should probably live in a separate file.
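
For reference, the position-wise computation is just $\mathrm{relu}(x W_1 + b_1) W_2 + b_2$. A rough sketch in the same plain-array style (hypothetical names, not the private class in this PR):

```ts
// Sketch of the 2-layer position-wise feed-forward block: relu(x·W1 + b1)·W2 + b2.
function feedForwardSketch(
  x: number[],     // one position's embedding, length dIn
  W1: number[][],  // [dIn, dHidden]
  b1: number[],    // [dHidden]
  W2: number[][],  // [dHidden, dIn]
  b2: number[]     // [dIn]
): number[] {
  const hidden = b1.map((b, j) =>
    Math.max(0, x.reduce((acc, xi, i) => acc + xi * W1[i][j], b)) // ReLU
  );
  return b2.map((b, k) =>
    hidden.reduce((acc, hj, j) => acc + hj * W2[j][k], b)
  );
}
```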

TransformerEncoderLayer

As described in Vaswani et al.

TransformerEncoder

The full encoder half of the Transformer, using a Sequential containing an arbitrary number of TransformerEncoderLayers.

This includes the positional encoding, but does not include the initial embedding of an input sequence into vectors (which would be done separately, e.g. by word2vec).
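
Pieced together with the sketches above, the forward pass is roughly: add the positional encodings to the already-embedded input, then pass the result through each encoder layer in order. A schematic sketch (hypothetical names, reusing PositionalEncodingSketch from above; not the actual module):

```ts
// Sketch of the encoder composition: add positional encodings, then apply each layer in turn.
// `Layer` stands in for a TransformerEncoderLayer; its internals are omitted here.
interface Layer {
  forward(x: number[][]): number[][]; // [seqLength, dim] -> [seqLength, dim]
}

function encodeSketch(
  embedded: number[][],          // already-embedded input, [seqLength, dim]
  pe: PositionalEncodingSketch,  // from the positional-encoding sketch above
  layers: Layer[]
): number[][] {
  const peTable = pe.forward(embedded.length);
  // Element-wise add the positional encodings to the input embeddings.
  let x = embedded.map((row, i) => row.map((v, k) => v + peTable[i][k]));
  for (const layer of layers) {
    x = layer.forward(x);
  }
  return x;
}
```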

@facebook-github-bot added the CLA Signed label on Oct 26, 2022
@yushiyangk force-pushed the transformer branch 6 times, most recently from 3e721ed to 8e8f967 on October 27, 2022 17:29
@yushiyangk (Contributor, Author) commented Oct 28, 2022

The test TransformerEncoder > calculates gradient is causing bun wiptest to fail silently with exit code 139 (segfault) or 138. This occurs in the result.backward() call.

I can reproduce it locally, but @cryptodeal could not. We're both using macOS 12.6 and Bun 0.2.2; the same error occurred for me on Bun 0.1.13.

@bwasti (Contributor) commented Oct 28, 2022

do you have a local file called libflashlight.0.dylib anywhere? if not, maybe try bun install --force to update that dylib locally

oops never mind, I see it repros in the CI

@cryptodeal (Contributor) commented Oct 28, 2022

I know GitHub Actions Darwin runners are x86_64, so I wanted to see if I could repro on my 2015 MBP (since it's AMD64), but damn, all 344 tests pass locally on that device as well (running Bun v0.2.2 and macOS 12.6).

Additional things I've tried in an attempt to reproduce this error (none of which have worked so far):

  • cleared the bun dev cache with rm -rf ~/.bun/install/cache (in case there was some weird caching of the shumai dependency)

@yushiyangk (Contributor, Author) commented Oct 30, 2022

Tests are now passing for me too, after upgrading ArrayFire from 3.8.1 to 3.8.2.

As discussed on discord, the CI is still running version 3.8.1; the issue will probably be resolved once that's upgraded to 3.8.2 as well.

@yushiyangk (Contributor, Author) commented Oct 30, 2022

I expect this to be fixed by #105. Please merge this after #103.

@yushiyangk marked this pull request as ready for review on October 30, 2022 19:52
@bwasti (Contributor) commented Oct 31, 2022

after rebase, would you mind putting the comments in this PR as comments in the transformer module?

we're using typedoc for comment formats https://typedoc.org/example/

@yushiyangk (Contributor, Author) commented Nov 1, 2022

> after rebase, would you mind putting the comments in this PR as comments in the transformer module?
>
> we're using typedoc for comment formats https://typedoc.org/example/

Oh yes, I should definitely do that.

Edit: Added to all the transformer modules

@bwasti (Contributor) commented Nov 2, 2022

awesome work!

@bwasti merged commit 528ca3e into facebookresearch:main on Nov 2, 2022
@yushiyangk mentioned this pull request on Nov 3, 2022