ViT: Vision Transformer Networks

Toy implementation of the paper "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale", applied to MNIST.

Chunking Images

tiles = einops.rearrange(images, 'b c (h t1) (w t2) -> b (h w) c t1 t2', t1=tile_size, t2=tile_size)
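
To make the pattern above concrete, here is a minimal NumPy sketch of the same rearrangement, written out as an explicit reshape and transpose. The helper name `chunk_images`, the batch of two 28x28 grayscale images, and the tile size of 7 are assumptions chosen for illustration; they are not taken from the notebook.

```python
import numpy as np


def chunk_images(images: np.ndarray, tile_size: int) -> np.ndarray:
    """NumPy equivalent of 'b c (h t1) (w t2) -> b (h w) c t1 t2'."""
    b, c, height, width = images.shape
    assert height % tile_size == 0 and width % tile_size == 0
    h, w = height // tile_size, width // tile_size
    # Split each spatial axis into (tile index, offset inside the tile).
    tiles = images.reshape(b, c, h, tile_size, w, tile_size)
    # Reorder to (b, h, w, c, t1, t2), then merge h and w into one tile axis.
    tiles = tiles.transpose(0, 2, 4, 1, 3, 5)
    return tiles.reshape(b, h * w, c, tile_size, tile_size)


# Hypothetical input: 2 grayscale MNIST-sized images, cut into 7x7 tiles.
images = np.arange(2 * 1 * 28 * 28, dtype=np.float32).reshape(2, 1, 28, 28)
tiles = chunk_images(images, tile_size=7)
print(tiles.shape)  # (2, 16, 1, 7, 7)
```

Tiles are ordered row by row, so with a 4x4 grid tile 0 is the top-left corner and tile 5 sits at row 1, column 1.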

Attention
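
As a rough illustration of the mechanism, the following is a minimal single-head scaled dot-product self-attention sketch in NumPy. It is not the notebook's implementation: the function names, the random projection matrices, and the sizes (16 tiles, 32 dimensions) are all assumptions.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head self-attention over tokens of shape (batch, n, dim)."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    # Similarity of every token with every other token, scaled by sqrt(dim).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights


# Hypothetical sizes: 2 images, 16 tile tokens each, 32-dimensional embeddings.
rng = np.random.default_rng(0)
dim = 32
tokens = rng.normal(size=(2, 16, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3))
out, weights = self_attention(tokens, w_q, w_k, w_v)
print(out.shape, weights.shape)  # (2, 16, 32) (2, 16, 16)
```

The `weights` array holds one row per query tile, which is the quantity usually visualized when plotting attention.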

Positional Embeddings
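
A minimal sketch of how positional embeddings can be combined with tile embeddings, assuming one learned vector per tile position as in the original paper. The helper name and sizes are hypothetical, and random values stand in for parameters that would normally be trained.

```python
import numpy as np


def add_positional_embeddings(tile_embeddings, pos_embeddings):
    """Add one position vector per tile; broadcasts over the batch axis."""
    return tile_embeddings + pos_embeddings


# Hypothetical sizes: 2 images, 16 tiles each, 32-dimensional embeddings.
rng = np.random.default_rng(0)
batch, num_tiles, dim = 2, 16, 32
tile_embeddings = rng.normal(size=(batch, num_tiles, dim))
# Stand-in for a learned parameter of shape (1, num_tiles, dim).
pos_embeddings = rng.normal(scale=0.02, size=(1, num_tiles, dim))
tokens = add_positional_embeddings(tile_embeddings, pos_embeddings)
print(tokens.shape)  # (2, 16, 32)
```

Because the same position vectors are added to every image in the batch, the model can tell tiles apart by location even though attention itself is order-agnostic.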

Mislabelled Images

Confusion Matrix
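
For reference, a confusion matrix for the ten MNIST digit classes can be computed along these lines. This is a sketch with made-up labels and predictions, not the notebook's code or results; `confusion_matrix` is a hypothetical helper.

```python
import numpy as np


def confusion_matrix(labels, predictions, num_classes=10):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=np.int64)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(matrix, (labels, predictions), 1)
    return matrix


# Made-up example: six digits, two of them misclassified (2 -> 7 and 9 -> 4).
labels = np.array([0, 1, 2, 2, 9, 9])
predictions = np.array([0, 1, 2, 7, 9, 4])
cm = confusion_matrix(labels, predictions)
accuracy = np.trace(cm) / cm.sum()
print(cm[2, 7], cm[9, 4], round(accuracy, 3))  # 1 1 0.667
```

Off-diagonal entries show which digit pairs get confused, and the diagonal divided by the total gives the accuracy.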

Heatmap

About

toy implementation of ViT

swe-to-mle.pages.dev/posts/vit-vision-transformer/