From-scratch implementation of a 52M-parameter Transformer, following the Vaswani et al. paper. Training will focus on English → Spanish translation, using 142k+ sentence pairs (to be expanded).
Originally, I was going to code it all by hand (no autograd, no nn.Module, etc.) as I have done with other projects, but I realized that, to maximize my learning, I'd be better off just using the built-in PyTorch utils.
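For a sense of what that looks like, here's a minimal sketch of wiring the model up from those built-ins. The vocab size and hyperparameters below are assumptions (roughly the paper's "base" config), not this repo's exact settings, and the parameter count depends heavily on vocab size and weight tying, so the printed number won't match 52M exactly:

```python
import math
import torch
import torch.nn as nn

# Illustrative hyperparameters only -- not necessarily this repo's config.
VOCAB_SIZE = 32_000  # assumed shared English/Spanish subword vocab
D_MODEL = 512        # "base" config from Vaswani et al.

class PositionalEncoding(nn.Module):
    """Fixed sinusoidal positional encodings from the paper."""
    def __init__(self, d_model: int, max_len: int = 512):
        super().__init__()
        pos = torch.arange(max_len).unsqueeze(1)
        div = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        self.register_buffer("pe", pe)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d_model)
        return x + self.pe[: x.size(1)]

class TranslationTransformer(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, D_MODEL)
        self.pos = PositionalEncoding(D_MODEL)
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=8,
            num_encoder_layers=6, num_decoder_layers=6,
            dim_feedforward=2048, dropout=0.1, batch_first=True,
        )
        self.lm_head = nn.Linear(D_MODEL, VOCAB_SIZE)

    def forward(self, src, tgt, tgt_mask=None):
        src_e = self.pos(self.embed(src))
        tgt_e = self.pos(self.embed(tgt))
        return self.lm_head(self.transformer(src_e, tgt_e, tgt_mask=tgt_mask))

model = TranslationTransformer()
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")
```

Tying the input embedding to the output projection, as the paper does, cuts roughly vocab_size × d_model parameters from this count.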
- Task: English → Spanish translation
- Metric: SacreBLEU (BLEU‑4)
- Decoding: greedy (see the sketch below the results)
- Checkpoint: tinyLM_bs16_lr1e-4_layers6.pt
- Test subset: 200 sentences from the held‑out split
| Metric | Score |
|---|---|
| BLEU‑4 | 19.49 |
So far, not too shabby considering that the highest BLEU in Vaswani et al. was 26.4. They may have been using a different BLEU implementation (not SacreBLEU), and they were evaluating different language pairs, so it's not a direct comparison, but still interesting. I'll try to get this as high as possible with the next couple of training runs.
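On the decoding side: greedy search just takes the argmax token at each step and feeds the growing prefix back in, with no beam search. A minimal sketch, assuming a `model(src, tgt, tgt_mask)` forward that returns per-position logits (`bos_id`/`eos_id` are placeholder names for this repo's special tokens):

```python
import torch

@torch.no_grad()
def greedy_decode(model, src_ids, bos_id, eos_id, max_len=128):
    """Greedy decoding: append the single most likely next token each step.

    Assumes model(src, tgt, tgt_mask) returns logits of shape
    (batch, tgt_len, vocab); src_ids is a (1, src_len) tensor of token ids.
    """
    model.eval()
    tgt = torch.tensor([[bos_id]], device=src_ids.device)
    for _ in range(max_len):
        # Additive causal mask: -inf above the diagonal so each target
        # position only attends to earlier positions.
        n = tgt.size(1)
        mask = torch.triu(
            torch.full((n, n), float("-inf"), device=src_ids.device), diagonal=1
        )
        logits = model(src_ids, tgt, tgt_mask=mask)
        next_id = logits[:, -1].argmax(dim=-1, keepdim=True)  # argmax token
        tgt = torch.cat([tgt, next_id], dim=1)
        if next_id.item() == eos_id:  # stop at end-of-sentence
            break
    return tgt.squeeze(0).tolist()
```

Swapping in beam search (the paper decodes with beam size 4 plus a length penalty) is a common way to pick up extra BLEU over greedy.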
Run it yourself:

    python -m evals.eval_bleu --ckpt weights/tinyLM_bs16_lr1e-4_layers6.pt --limit 200

Notes:
- SacreBLEU is the standard BLEU implementation; it pins down tokenization so scores are comparable across runs and papers.
- Scores vary by checkpoint, decoding strategy, and test size.
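For reference, the scoring itself boils down to a single sacrebleu call. The sentence pair below is a made-up example, not from the actual test set:

```python
import sacrebleu

# Made-up hypothesis/reference pair, just to show the call shape.
hyps = ["el gato se sienta en la alfombra"]       # model outputs, one string each
refs = [["el gato está sentado en la alfombra"]]  # one inner list per reference set

bleu = sacrebleu.corpus_bleu(hyps, refs)
print(f"BLEU-4: {bleu.score:.2f}")
```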
