Transformer encoder #97
Force-pushed from 3e721ed to 8e8f967
The test failure reproduced for me locally but could not be reproduced by @cryptodeal locally. We're both using macOS 12.6 and bun 0.2.2. The same error occurred for me when I used bun 0.1.13.
Oops, never mind; I see it repros in the CI.
Additional things I've tried in an attempt to reproduce this error (none of which have worked yet):
Tests are now passing for me too, after upgrading ArrayFire from 3.8.1 to 3.8.2. As discussed on Discord, the CI is still running version 3.8.1; the issue will probably be resolved once that's upgraded to 3.8.2 as well.
Force-pushed from 8e8f967 to 704704a
After the rebase, would you mind moving the comments in this PR into the transformer module as doc comments? We're using TypeDoc for comment formatting: https://typedoc.org/example/
Force-pushed from 704704a to 57f9f6d
Oh yes, I should definitely do that. Edit: Added to all the transformer modules.
Awesome work!
TransformerPositionalEncoding
$$\mathrm{PE}(i, 2z) = \sin\!\left(\frac{i}{10000^{2z/d}}\right), \qquad \mathrm{PE}(i, 2z+1) = \cos\!\left(\frac{i}{10000^{2z/d}}\right)$$

where $i$ is the sequence position, $2z$ and $2z+1$ are the dimensions of the input embedding, and $d$ is the dimensionality of the input embedding.
The multiplicative factors $\frac{1}{10000^{2z/d}}$ are precomputed during object creation, as they are constant for all $i$.
The full PE is initially precomputed for all $i$ up to 256 (configurable). It is then extended and stored if the module is called with a sequence length larger than the initial value.
Returns a 2D tensor matching the last two dimensions of the input tensor to TransformerEncoder.
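As a rough sketch of the encoding described above, here is a hypothetical standalone helper on plain number arrays (the actual module operates on Tensors; the function name and signature are illustrative, not the module's API):

```typescript
// Sinusoidal positional encoding sketch: pe[i][2z] = sin(i / 10000^(2z/d)),
// pe[i][2z+1] = cos(i / 10000^(2z/d)). Returns a seqLen x dim table.
function positionalEncoding(seqLen: number, dim: number): number[][] {
  const pe: number[][] = [];
  for (let i = 0; i < seqLen; i++) {
    const row = new Array<number>(dim);
    for (let z = 0; z < Math.ceil(dim / 2); z++) {
      // This factor is constant in i, so the real module precomputes it once.
      const factor = 1 / Math.pow(10000, (2 * z) / dim);
      row[2 * z] = Math.sin(i * factor);
      if (2 * z + 1 < dim) row[2 * z + 1] = Math.cos(i * factor);
    }
    pe.push(row);
  }
  return pe;
}
```

Row 0 is always `[0, 1, 0, 1, ...]` since $\sin(0)=0$ and $\cos(0)=1$, which is a quick sanity check on any implementation.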
FeedForward
Simple 2-layer fully connected neural network with relu activation. This is kept as a private class for now; if we want to export it, it should probably live in a separate file.
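A minimal sketch of what this private class computes, on plain number arrays rather than the Tensor-based layers the real class uses (all names here are illustrative):

```typescript
type Matrix = number[][];

// Dense layer: y = W x (bias omitted for brevity).
function matVec(w: Matrix, x: number[]): number[] {
  return w.map((row) => row.reduce((acc, wij, j) => acc + wij * x[j], 0));
}

const relu = (v: number[]) => v.map((x) => Math.max(0, x));

// Two dense layers with a relu in between: y = W2 relu(W1 x).
function feedForward(w1: Matrix, w2: Matrix, x: number[]): number[] {
  return matVec(w2, relu(matVec(w1, x)));
}
```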
TransformerEncoderLayer
As described in Vaswani et al.
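A structural sketch of one such layer, following Vaswani et al.: a self-attention sublayer and a feed-forward sublayer, each wrapped in a residual connection. Layer normalization is omitted for brevity, and the sublayer signatures are hypothetical stand-ins for the module's actual components:

```typescript
type Sublayer = (x: number[][]) => number[][];

// One encoder layer: x -> x + Attn(x) -> (...) + FFN(...)
// (layer norm around each sublayer is elided in this sketch).
function encoderLayer(
  selfAttention: Sublayer,
  feedForward: Sublayer,
  x: number[][]
): number[][] {
  const addResidual = (a: number[][], b: number[][]) =>
    a.map((row, i) => row.map((v, j) => v + b[i][j]));
  const attnOut = addResidual(x, selfAttention(x));
  return addResidual(attnOut, feedForward(attnOut));
}
```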
TransformerEncoder
The full encoder half of the Transformer, using a Sequential containing an arbitrary number of TransformerEncoderLayers.
This includes the positional encoding, but does not include any initial embedding of an input sequence into vectors (which would be separately done by e.g. word2vec)
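The forward pass described above can be sketched as: add the precomputed positional encoding to the (already embedded) input, then apply each encoder layer in sequence. This is a hedged illustration on plain arrays, not the module's actual API:

```typescript
type Layer = (x: number[][]) => number[][];

// Encoder forward: embeddings + positional encoding, then a Sequential
// pass over the encoder layers. Input embedding (e.g. word2vec) happens
// before this function is called.
function encode(x: number[][], pe: number[][], layers: Layer[]): number[][] {
  let h = x.map((row, i) => row.map((v, j) => v + pe[i][j]));
  for (const layer of layers) h = layer(h);
  return h;
}
```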