
Thinker

The trained computer

We want to train a model that performs numeric computation such as 1 + 2 = 3. What do we need for computation?

  1. input
  2. a reusable compute unit -> a repeated transformer block
  3. memory -> concatenated embeddings accessed via cross attention
  4. an algorithm that gives the desired output

In our case, the computer and the algorithm are merged into the model; the memory is the concatenation of intermediate latent states (see the sketch below).
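To make this concrete, here is a minimal sketch of the compute loop, assuming a weight-shared callable `block(latent, input_emb, past)` (a hypothetical name, not necessarily the repo's actual API):

```python
import torch

# Minimal sketch: one weight-shared block applied repeatedly, with all past latent
# states concatenated into a memory that the block can cross-attend to.
def think(block, latent, input_emb, n_steps):
    memory = [latent]                    # memory = concatenated intermediate latents
    for _ in range(n_steps):             # reuse the same block -> "trained computer"
        past = torch.cat(memory, dim=1)  # (batch, steps * n_latents, dim)
        latent = block(latent, input_emb, past)
        memory.append(latent)
    return latent
```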

The algorithms we decided to learn are (from easy to difficult):

  1. copy input to output, with some variations
  2. addition
  3. multiplication
  4. number factorisation

Task 3 in particular will help test how this method performs under variable complexity. Since the last task relies heavily on memory to reduce computation, we will observe how the model uses the memory it is given. A hypothetical task generator is sketched below.
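For illustration, a task generator could look like the following; the actual tokenisation and sampling in the repo may differ, and all names here are hypothetical.

```python
import random

# Hypothetical generator: returns an (input string, target string) pair per task.
def make_example(task, max_val=999):
    a, b = random.randint(2, max_val), random.randint(2, max_val)
    if task == "copy":
        return str(a), str(a)                      # copy input to output
    if task == "add":
        return f"{a}+{b}", str(a + b)
    if task == "mul":
        return f"{a}*{b}", str(a * b)
    if task == "factor":
        n = a * b                                  # guaranteed composite number
        return str(n), f"{min(a, b)}*{max(a, b)}"  # one valid factorisation of n
    raise ValueError(f"unknown task: {task}")
```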

Check out:

Based on the observed results, we could reuse the same approach on a language-modeling task, following the original idea.

About the model
The model is a cross-attention latent-based transformer (like Perceiver):

  1. layer weight sharing to allow a reusable compute block
  2. hidden latent vectors for information passing
  3. cross attention on the input
  4. cross attention on past latents (wider information passing)

Here's a visual:

Here's a draft of the initial idea:
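For reference, here is a rough PyTorch sketch of such a model, expanding the loop above into a weight-shared block with the two cross attentions; all names and dimensions are illustrative assumptions, not the repository's actual code.

```python
import torch
import torch.nn as nn

class ThinkerBlock(nn.Module):
    """One compute step: cross-attend to the input, cross-attend to past latents, then MLP."""
    def __init__(self, dim, heads=8):
        super().__init__()
        self.read_input = nn.MultiheadAttention(dim, heads, batch_first=True)  # 3. cross attention on input
        self.read_past = nn.MultiheadAttention(dim, heads, batch_first=True)   # 4. cross attention on past latents
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.norm1, self.norm2, self.norm3 = nn.LayerNorm(dim), nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, latent, input_emb, past_latents):
        latent = latent + self.read_input(self.norm1(latent), input_emb, input_emb)[0]
        latent = latent + self.read_past(self.norm2(latent), past_latents, past_latents)[0]
        return latent + self.mlp(self.norm3(latent))

class Thinker(nn.Module):
    """Applies a single shared block for n_steps (1. layer weight sharing, 2. hidden latents)."""
    def __init__(self, dim=256, n_latents=16, n_steps=6):
        super().__init__()
        self.latent = nn.Parameter(torch.randn(1, n_latents, dim))  # learned hidden latent vectors
        self.block = ThinkerBlock(dim)                              # one block, reused every step
        self.n_steps = n_steps

    def forward(self, input_emb):  # input_emb: (batch, seq_len, dim)
        latent = self.latent.expand(input_emb.size(0), -1, -1)
        memory = [latent]
        for _ in range(self.n_steps):
            latent = self.block(latent, input_emb, torch.cat(memory, dim=1))
            memory.append(latent)
        return latent  # an output head would decode the answer tokens from this
```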

Similar ideas:

  1. Looped Transformers - paper - x_post - code
