hardaatbaath/vision_transformer-pytorch

Vision Transformer

This is a PyTorch implementation of the paper "An Image is Worth 16x16 Words". Thanks to Brian Pulfer for his Medium article on the topic, which was an extremely helpful and valuable resource.

More comments will be added in the future to make the code easier to understand.

I have used the MNIST dataset for this code.
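
For readers new to the paper, the core idea is to cut an image into non-overlapping patches and treat each flattened patch as one token. The sketch below illustrates that step on an MNIST-sized 28x28 input. It is not taken from this repository: the helper name `patchify` and the patch size of 4 are assumptions chosen for illustration (28 is not divisible by 16, so the paper's 16x16 patches do not fit MNIST directly), and it uses plain Python lists instead of PyTorch tensors to stay dependency-free.

```python
def patchify(image, patch_size):
    """Split a 2D image (a list of rows) into flattened, non-overlapping patches.

    Hypothetical helper for illustration; not part of this repository.
    """
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches = []
    # Walk the image in row-major order, one patch-sized tile at a time.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten the tile row by row into a single vector.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Dummy 28x28 "image" whose pixel value encodes its position (row * 28 + col).
image = [[row * 28 + col for col in range(28)] for row in range(28)]

# Patch size 4 is an assumption; it yields a 7x7 grid of patches.
patches = patchify(image, patch_size=4)
print(len(patches), len(patches[0]))  # 49 patches, 16 values each
```

In the paper, each flattened patch is then linearly projected to the model dimension, a learnable class token is prepended, and position embeddings are added before the sequence enters the Transformer encoder. With the assumed patch size, that would give a sequence of 50 tokens per image.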

Citation:

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., & Houlsby, N. (2020). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv:2010.11929.
