We test the performance of Karpathy's minGPT Transformer model on a simple knowledge task.
First we pretrain minGPT on Wikipedia text about notable individuals, then we finetune it on name-birthplace pairs of the form:
Q: Where was [person] born?
A: [place]
We then test its "knowledge" by asking it to predict the birthplaces of individuals in the same question-answer format as above.
Two variants of self-attention are tested: standard masked multi-headed self-attention, and Synthesizer attention, a variant that eschews pairwise dot products. The model using masked multi-headed self-attention achieves ~20% accuracy; the Synthesizer model achieves ~17%.
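For reference, here is a minimal sketch of the masked multi-headed variant. It follows minGPT's general shape (names like n_embd, n_head, and block_size are minGPT conventions) but is an illustration, not the repo's exact code:

    import math
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CausalSelfAttention(nn.Module):
        """Masked multi-headed self-attention: attention weights come from
        pairwise query-key dot products, masked so each position can only
        attend to itself and earlier positions."""
        def __init__(self, n_embd, n_head, block_size):
            super().__init__()
            assert n_embd % n_head == 0
            self.key = nn.Linear(n_embd, n_embd)
            self.query = nn.Linear(n_embd, n_embd)
            self.value = nn.Linear(n_embd, n_embd)
            self.proj = nn.Linear(n_embd, n_embd)
            self.n_head = n_head
            # lower-triangular causal mask, broadcast over batch and heads
            self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size))
                                              .view(1, 1, block_size, block_size))

        def forward(self, x):
            B, T, C = x.size()
            hs = C // self.n_head
            # project and split into heads: (B, n_head, T, head_size)
            q = self.query(x).view(B, T, self.n_head, hs).transpose(1, 2)
            k = self.key(x).view(B, T, self.n_head, hs).transpose(1, 2)
            v = self.value(x).view(B, T, self.n_head, hs).transpose(1, 2)
            # scaled pairwise dot products, with future positions masked out
            att = (q @ k.transpose(-2, -1)) / math.sqrt(hs)
            att = att.masked_fill(self.mask[:, :, :T, :T] == 0, float('-inf'))
            att = F.softmax(att, dim=-1)
            y = (att @ v).transpose(1, 2).contiguous().view(B, T, C)  # merge heads
            return self.proj(y)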
Synthesizer attention:
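The dense Synthesizer (Tay et al., 2020) replaces the query-key dot product: attention weights for each position are predicted from that position's representation alone, via a small MLP whose output dimension is the block size. Below is a minimal sketch under the same conventions as above; parameter names and initializations are illustrative assumptions, not the repo's exact implementation:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SynthesizerAttention(nn.Module):
        """Dense Synthesizer attention (Tay et al., 2020): each position's
        attention weights are predicted from its own features by a
        two-layer MLP -- no pairwise query-key dot products."""
        def __init__(self, n_embd, n_head, block_size):
            super().__init__()
            assert n_embd % n_head == 0
            hs = n_embd // n_head
            self.w1 = nn.Linear(n_embd, n_embd)  # first MLP layer, all heads at once
            # second layer maps each head's features to a score per position
            self.w2 = nn.Parameter(0.02 * torch.randn(hs, block_size))
            self.b2 = nn.Parameter(torch.zeros(block_size))
            self.value = nn.Linear(n_embd, n_embd)
            self.proj = nn.Linear(n_embd, n_embd)
            self.n_head = n_head
            self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size))
                                              .view(1, 1, block_size, block_size))

        def forward(self, x):
            B, T, C = x.size()
            hs = C // self.n_head
            v = self.value(x).view(B, T, self.n_head, hs).transpose(1, 2)       # (B, nh, T, hs)
            h = F.relu(self.w1(x)).view(B, T, self.n_head, hs).transpose(1, 2)  # (B, nh, T, hs)
            # scores come from h alone: one row of position scores per token
            scores = h @ self.w2 + self.b2                                      # (B, nh, T, block_size)
            scores = scores[:, :, :, :T].masked_fill(self.mask[:, :, :T, :T] == 0, float('-inf'))
            att = F.softmax(scores, dim=-1)
            y = (att @ v).transpose(1, 2).contiguous().view(B, T, C)
            return self.proj(y)

The same causal mask is kept, so only the score computation changes relative to standard attention.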
Pretraining data:
To pretrain on the Wikipedia text, each document is randomly truncated and a span within it is masked out; for specifics, see the CharCorruptionDataset class in dataset.py. Every such piece of text yields an input x and a target y (x shifted left by one character), e.g.:
Original: Khatchig Mouradian. Khatchig Mouradian is a journalist, writer and translator born in Lebanon .
x: Khatchig Mouradian. Khatchig Mouradian is a jour⁇and tran⁇nalist, writer ⁇□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□
y: hatchig Mouradian. Khatchig Mouradian is a jour⁇and tran⁇nalist, writer ⁇□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□□
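A minimal sketch of that corruption scheme, mirroring the example above, is shown below. The authoritative version is CharCorruptionDataset in dataset.py; the sampling details here (truncation and span lengths) are assumptions:

    import random

    MASK_CHAR = "\u2047"  # '⁇' separates prefix / suffix / masked span
    PAD_CHAR = "\u25A1"   # '□' pads every example to a fixed length

    def corrupt(doc, block_size=128, rng=random):
        """Illustrative span corruption: truncate a document, hide a span,
        and emit (x, y) for next-character prediction. Sampling details
        are assumptions; see CharCorruptionDataset for the real logic."""
        # 1. randomly truncate the document
        trunc_len = rng.randint(4, min(len(doc), block_size * 7 // 8))
        doc = doc[:trunc_len]
        # 2. pick a span in the middle to mask out
        mask_len = rng.randint(1, max(1, trunc_len // 4))
        start = rng.randint(1, trunc_len - mask_len - 1)
        prefix = doc[:start]
        masked = doc[start:start + mask_len]
        suffix = doc[start + mask_len:]
        # 3. rearrange so the model must regenerate the masked span last
        s = prefix + MASK_CHAR + suffix + MASK_CHAR + masked + MASK_CHAR
        s = s + PAD_CHAR * (block_size - len(s))
        # 4. next-character prediction: y is x shifted left by one
        return s[:-1], s[1:]

Because y is just x shifted, the model is trained with the ordinary language-modeling objective; it learns to reproduce the masked span after the second ⁇.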