"Attention is All You Need" ,(Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, arxiv, 2017). This implementation is done in keras with tensorflow.
The paper presents a novel sequence-to-sequence framework that uses self-attention and feed-forward networks instead of a recurrent network structure, and achieves state-of-the-art performance on the WMT 2014 English-to-German translation task. (2017/06/12)
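The central operation is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. As a minimal illustration of that formula (a NumPy sketch, not this repository's Keras code):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_k)        # (batch, len_q, len_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)           # softmax over the keys
    return weights @ v                                        # (batch, len_q, d_v)

# Toy example with the default dimensions d_k = d_v = 64.
q = np.random.randn(2, 5, 64)
k = np.random.randn(2, 7, 64)
v = np.random.randn(2, 7, 64)
print(scaled_dot_product_attention(q, k, v).shape)            # (2, 5, 64)
```

The implementation builds the model from the following layers: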
- Multi-Head Self-Attention Layer
- Position-wise Feed-Forward Layer
- Encoder-Decoder Attention Layer
- Self-Attention Layer
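These sub-layers are combined into encoder and decoder blocks with residual connections and layer normalization. Below is a minimal sketch of one encoder block using the built-in layers of tf.keras (TensorFlow 2.x); the layer classes in this repository are custom implementations, so names and details may differ:

```python
import tensorflow as tf
from tensorflow.keras import layers

def encoder_block(x, d_model=512, d_inner_hid=1024, n_head=8, d_k=64, dropout=0.1):
    # Multi-head self-attention sub-layer with residual connection + layer norm.
    attn_out = layers.MultiHeadAttention(num_heads=n_head, key_dim=d_k,
                                         dropout=dropout)(x, x)
    x = layers.LayerNormalization(epsilon=1e-6)(x + attn_out)
    # Position-wise feed-forward sub-layer with residual connection + layer norm.
    ffn = layers.Dense(d_inner_hid, activation="relu")(x)
    ffn = layers.Dense(d_model)(ffn)
    ffn = layers.Dropout(dropout)(ffn)
    return layers.LayerNormalization(epsilon=1e-6)(x + ffn)

inp = tf.keras.Input(shape=(None, 512))   # (batch, seq_len, d_model)
out = encoder_block(inp)
model = tf.keras.Model(inp, out)
model.summary()
```

The decoder block follows the same pattern, with an additional encoder-decoder attention sub-layer and masking in its self-attention.

The default hyperparameters match the base model from the paper: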
- batch_size=64
- d_inner_hid=1024
- d_k=64
- d_v=64
- d_model=512
- d_word_vec=512
- dropout=0.1
- embs_share_weight=False
- n_head=8
- n_layers=6
- n_warmup_steps=4000
- proj_share_weight=True
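Here, n_warmup_steps controls the learning-rate schedule from the paper, lr = d_model^-0.5 * min(step^-0.5, step * n_warmup_steps^-1.5): the rate grows linearly for the first 4000 steps and then decays with the inverse square root of the step number. A small sketch (not this repository's optimizer code):

```python
def noam_lr(step, d_model=512, n_warmup_steps=4000):
    # lr = d_model^-0.5 * min(step^-0.5, step * n_warmup_steps^-1.5)
    step = max(step, 1)
    return d_model ** -0.5 * min(step ** -0.5, step * n_warmup_steps ** -1.5)

# Linear warmup, then inverse-square-root decay; peaks around 7e-4 at step 4000.
print([round(noam_lr(s), 6) for s in (1, 1000, 4000, 40000)])
```

Requirements: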
- Python 3
- NumPy
- TensorFlow
- Keras