
[WIP] Add "Attention Is All You Need" transformer and translation example #422

Closed
andr-ec wants to merge 35 commits

Conversation

@andr-ec commented Mar 26, 2020

This pull request adds the transformer from "Attention Is All You Need" (as presented in "The Annotated Transformer") to the models folder, and adds a translation example for the WMT'14 English-German data.
I'd love to hear any feedback, and any help would be great!

This would close #148
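
For orientation while reviewing, here is a minimal sketch of the scaled dot-product attention at the core of the paper, written against the Swift for TensorFlow API. The function name and shapes are illustrative rather than this PR's actual code, and it omits the padding and causal masks a real translation model needs:

import TensorFlow

/// Attention(Q, K, V) = softmax(Q K^T / sqrt(dk)) V, from "Attention Is All You Need".
/// Shapes: query [batch, queryLength, dk], key [batch, keyLength, dk],
/// value [batch, keyLength, dv]. Masking is omitted for brevity.
func scaledDotProductAttention(
    query: Tensor<Float>,
    key: Tensor<Float>,
    value: Tensor<Float>
) -> Tensor<Float> {
    let dk = Float(key.shape[key.rank - 1])
    // Scale the raw scores by sqrt(dk) so the softmax gradients stay well behaved.
    let scores = matmul(query, transposed: false, key, transposed: true) / dk.squareRoot()
    // softmax normalizes over the last axis, i.e. over the keys.
    return matmul(softmax(scores), value)
}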

TODO:

  • implement transformer
  • add dataset
  • working training loop (a rough sketch follows this list)
  • clean up comments
  • split dataset into eval and add eval into training loop
  • add testing dataset and test loop
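
As a rough, hypothetical sketch of the "working training loop" item above: the usual Swift for TensorFlow step is a valueWithGradient call around a cross-entropy loss, followed by an optimizer update. TinyClassifier and the random batch below are stand-ins so the snippet is self-contained; the real loop would feed tokenized WMT batches through the transformer instead.

import TensorFlow

// Stand-in model so the training-step sketch compiles; not the PR's transformer.
struct TinyClassifier: Layer {
    var dense = Dense<Float>(inputSize: 16, outputSize: 4)

    @differentiable
    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        dense(input)
    }
}

var model = TinyClassifier()
let optimizer = Adam(for: model, learningRate: 1e-3)

let features = Tensor<Float>(randomNormal: [8, 16])  // stand-in input batch
let labels = Tensor<Int32>(zeros: [8])               // stand-in target labels

// One training step: differentiate the loss with respect to the model, then update.
// For sequence models, logits and labels are flattened to [batch * length, vocab] first.
let (loss, gradient) = valueWithGradient(at: model) { model -> Tensor<Float> in
    softmaxCrossEntropy(logits: model(features), labels: labels)
}
optimizer.update(&model, along: gradient)
print("loss:", loss)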

@googlebot

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.



@andr-ec (Author) commented Mar 26, 2020

@googlebot I signed it!

@googlebot

CLAs look good, thanks!


@brettkoonce (Contributor)

SUGOI DESU NE (Japanese: "This is amazing!")

import TensorFlow

///// Input to an attention layer.
//public struct AttentionInput<Scalar: TensorFlowFloatingPoint>: Differentiable {
Review comment from a Contributor:
might clean this out if it's not in use


struct WMTTranslationTask {
// https://nlp.stanford.edu/projects/nmt/
// WMT'14 English-German data
Review comment from a Contributor:
WMT == 💪!
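
For anyone who wants to poke at the data outside the training loop, a tiny loader for these line-aligned files could look like the sketch below. The helper name and file layout (e.g. train.en / train.de from the Stanford NMT page) are assumptions for illustration, not this PR's actual API:

import Foundation

/// Reads a line-aligned parallel corpus such as the preprocessed WMT'14
/// English-German files from https://nlp.stanford.edu/projects/nmt/.
/// Hypothetical helper for illustration only.
func loadParallelCorpus(
    sourcePath: String,
    targetPath: String
) throws -> [(source: String, target: String)] {
    let sourceLines = try String(contentsOfFile: sourcePath, encoding: .utf8)
        .split(separator: "\n")
    let targetLines = try String(contentsOfFile: targetPath, encoding: .utf8)
        .split(separator: "\n")
    // Sentence pairs are aligned by line number in these files.
    precondition(sourceLines.count == targetLines.count,
                 "Parallel files must have the same number of lines.")
    return zip(sourceLines, targetLines).map {
        (source: String($0), target: String($1))
    }
}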

@texasmichelle (Member)
Thank you for putting this together! A translation example would indeed be a fantastic addition to the repo.

Between GPT-2 and BERT, we're trying to get a handle on the code duplication among transformer-based models. We'd like to reuse code among common concepts, which a translation example could also utilize. If you're up for it, we could use some help with #436 before compounding the problem by adding more transformer code.

Breaking this up into smaller PRs would also be helpful, such as a standalone dataset PR, or at least one that leverages existing components; how many separate implementations of multi-head attention do we really need? Apropos multi-head attention, it appears you could easily use the existing libraries, since the code looks nearly identical, but your name is listed as author in the header, so maybe there are some major changes.
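
To make the reuse argument concrete, multi-head attention is just scaled dot-product attention applied to head-wise slices of the model dimension, which is why the implementations across models end up nearly identical. Below is an illustrative sketch, not the repo's shared API; the single-head helper is repeated so the block stands alone, and the learned input/output projections of a real layer are omitted:

import TensorFlow

/// Single-head building block: softmax(Q K^T / sqrt(dk)) V.
func scaledDotProductAttention(
    _ query: Tensor<Float>, _ key: Tensor<Float>, _ value: Tensor<Float>
) -> Tensor<Float> {
    let dk = Float(key.shape[key.rank - 1])
    let scores = matmul(query, transposed: false, key, transposed: true) / dk.squareRoot()
    return matmul(softmax(scores), value)
}

/// Multi-head attention over [batch, seqLen, modelSize] tensors. Illustrative
/// only: real layers apply learned Q/K/V and output projections around this.
func multiHeadAttention(
    query: Tensor<Float>, key: Tensor<Float>, value: Tensor<Float>,
    headCount: Int
) -> Tensor<Float> {
    let modelSize = query.shape[query.rank - 1]
    precondition(modelSize % headCount == 0, "headCount must divide modelSize")
    let headSize = modelSize / headCount
    let (batch, queryLength, keyLength) = (query.shape[0], query.shape[1], key.shape[1])
    var heads: [Tensor<Float>] = []
    for head in 0..<headCount {
        let lo = head * headSize
        // Each head attends over its own slice of the feature dimension.
        let q = query.slice(lowerBounds: [0, 0, lo], upperBounds: [batch, queryLength, lo + headSize])
        let k = key.slice(lowerBounds: [0, 0, lo], upperBounds: [batch, keyLength, lo + headSize])
        let v = value.slice(lowerBounds: [0, 0, lo], upperBounds: [batch, keyLength, lo + headSize])
        heads.append(scaledDotProductAttention(q, k, v))
    }
    // Concatenate the per-head outputs back along the feature axis.
    return Tensor(concatenating: heads, alongAxis: 2)
}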

@marcrasi (Contributor)

Hi, there hasn't been activity on this PR for a while. We're going to close it to keep our list of open PRs small. Of course, feel free to reopen any time if you have time to come back and work on this!

@marcrasi closed this Apr 15, 2020