2022_the_annotated_transformer

Goal

This repository is an up-to-date version of Harvard NLP's legacy code, refactored from the Jupyter notebook version into a shell script version.

Key points

We have refactored Harvard NLP's Annotated Transformer into a shell script version.
The dataset is Multi30K. (It is small, so you can see results quickly even on computers with low specifications.)
We provide a Colab version along with the shell script version, making it easy to modify the model and test the method:

https://colab.research.google.com/drive/1SrRmC_Ti8IepeHFNBZBjNxl_wkTSJReC?usp=sharing
The loss graph can be drawn.
The BLEU score can be measured.
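As a rough illustration of how the loss graph could be drawn from the recorded losses, here is a minimal sketch. It is not the code in graph.py. The file format (plain numbers separated by commas, spaces, or newlines, optionally wrapped in brackets), the function names parse_losses and plot_losses, and the use of matplotlib are all assumptions made for this example.

```python
import re

# Assumption: the loss files hold plain numbers, e.g. "[3.5, 2.25, 1.0]"
# or one value per line. The real format in this repository may differ.
_FLOAT = re.compile(r"[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?")


def parse_losses(text):
    """Extract every number found in the text as a float, in order."""
    return [float(match) for match in _FLOAT.findall(text)]


def plot_losses(train_losses, valid_losses, out_path):
    """Save a train/validation loss curve to out_path (hypothetical helper)."""
    # Imported lazily so parse_losses works without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    plt.figure()
    plt.plot(train_losses, label="train")
    plt.plot(valid_losses, label="validation")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.legend()
    plt.savefig(out_path)
    plt.close()


# Possible usage, assuming the paths from the file structure:
#   with open("result/train_loss.txt") as f:
#       train = parse_losses(f.read())
#   with open("result/valid_loss.txt") as f:
#       valid = parse_losses(f.read())
#   plot_losses(train, valid, "result/loss_graph.png")
```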

File structure

├── models
│   ├── __init__.py
│   ├── blocks
│   │   ├── __init__.py
│   │   ├── decoder_layer.py
│   │   └── encoder_layer.py
│   ├── embedding
│   │   ├── __init__.py
│   │   ├── positional_encoding.py
│   │   └── token_embedding.py
│   ├── layers
│   │   ├── __init__.py
│   │   ├── layer_norm.py
│   │   ├── multi_headed_attention.py
│   │   ├── position_wise_feed_forward.py
│   │   └── sublayer_connection.py
│   ├── model
│   │   ├── __init__.py
│   │   ├── decoder.py
│   │   ├── encoder_decoder.py
│   │   ├── encoder.py
│   │   └── generator.py
│   └── util.py
├── result
│   ├── loss_graph.png
│   ├── train_loss.txt
│   └── valid_loss.txt
├── saved
├── utils
│   ├── __init__.py
│   ├── batch.py
│   ├── batch_size_fn.py
│   ├── bleu.py
│   ├── data_loader.py
│   ├── epoch_time.py
│   ├── greedy_decode.py
│   ├── label_smoothing.py
│   ├── make_model.py
│   ├── NoamOpt.py
│   ├── run_epoch.py
│   ├── simple_loss_compute.py
│   └── tokenizer.py
├── README.md
├── test.py
├── train.py
├── config.py
├── data.py
└── graph.py
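
To give a feel for what one of these modules does, the following is a minimal sketch of the learning-rate schedule that the Annotated Transformer calls "Noam", which utils/NoamOpt.py is presumably built around. It is not copied from this repository. The function name noam_rate and the default values (d_model=512, factor=1.0, warmup=4000) are assumptions for illustration; the values actually used here would be set in config.py.

```python
def noam_rate(step, d_model=512, factor=1.0, warmup=4000):
    """Learning rate at a given optimizer step under the Noam schedule.

    rate = factor * d_model^-0.5 * min(step^-0.5, step * warmup^-1.5)

    The rate grows linearly for the first `warmup` steps, then decays
    proportionally to the inverse square root of the step number.
    Defaults are illustrative assumptions, not this repository's settings.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return factor * d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)


# Print the rate at a few steps to see the warm-up and the decay.
for s in (1, 1000, 4000, 8000, 20000):
    print(s, noam_rate(s))
```

The peak is reached at step == warmup, where both arguments of min are equal.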

Training Result

Train/validation loss graph (result/loss_graph.png)

Test set (unseen data) translation example

Test set (unseen data) average BLEU score: 35.870847920953594
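
As an illustration of how an average BLEU score over test sentences can be computed, here is a minimal, unsmoothed sketch on a 0-100 scale. It is not the implementation in utils/bleu.py, and it is only an assumption that the figure above averages sentence-level scores in this way. The names ngram_counts, sentence_bleu, and average_bleu are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hypothesis, reference, max_n=4):
    """BLEU (0-100) for one tokenized hypothesis against one reference."""
    if not hypothesis:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngram_counts(hypothesis, n)
        ref_counts = ngram_counts(reference, n)
        # Clipped matches: an n-gram counts at most as often as in the reference.
        matches = sum((hyp_counts & ref_counts).values())
        total = sum(hyp_counts.values())
        if matches == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precision_sum += math.log(matches / total)
    # The brevity penalty lowers the score of hypotheses shorter than the reference.
    if len(hypothesis) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(hypothesis))
    # Geometric mean of the n-gram precisions, scaled to 0-100.
    return 100.0 * brevity_penalty * math.exp(log_precision_sum / max_n)


def average_bleu(pairs):
    """Mean sentence-level BLEU over (hypothesis, reference) token-list pairs."""
    scores = [sentence_bleu(hyp, ref) for hyp, ref in pairs]
    return sum(scores) / len(scores) if scores else 0.0
```

Because the sketch uses no smoothing, any sentence with no matching 4-gram scores zero, so its averages can differ from those of other BLEU implementations.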

Reference

https://nlp.seas.harvard.edu/2018/04/03/attention.html

https://jalammar.github.io/illustrated-transformer/

https://www.facebook.com/groups/TensorFlowKR/permalink/1618169785190740/

https://github.com/hyunwoongko/transformer

serotoninpm/2022_the_annotated_transformer
