early_convolutions_vit_pytorch

(Unofficial) PyTorch implementation of the paper "Early Convolutions Help Transformers See Better"

Example usage can be found in this notebook.

This model appears to outperform the original ViT for the same amount of training computation (comparable FLOPs, obtained by using 1 fewer transformer block, and the same number of training epochs).
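
To make the idea concrete, here is a minimal sketch of a convolutional stem of the kind the paper describes: a few stride-2 3x3 convolutions followed by a 1x1 projection, replacing the single large-stride "patchify" convolution of the original ViT. The class name `ConvStem`, the channel widths, and the embedding dimension are illustrative assumptions, not values taken from the code in this repo.

```python
import torch
from torch import nn


class ConvStem(nn.Module):
    """Illustrative conv stem (hypothetical name and widths, not this repo's code)."""

    def __init__(self, in_channels=3, widths=(24, 48, 96, 192), embed_dim=192):
        super().__init__()
        layers = []
        prev = in_channels
        for width in widths:
            # Each 3x3, stride-2 convolution halves the spatial resolution.
            layers += [
                nn.Conv2d(prev, width, kernel_size=3, stride=2, padding=1, bias=False),
                nn.BatchNorm2d(width),
                nn.ReLU(inplace=True),
            ]
            prev = width
        # A 1x1 convolution projects the features to the transformer width.
        layers.append(nn.Conv2d(prev, embed_dim, kernel_size=1))
        self.stem = nn.Sequential(*layers)

    def forward(self, images):
        features = self.stem(images)  # (batch, embed_dim, H/16, W/16)
        # Flatten the spatial grid into a token sequence: (batch, tokens, embed_dim).
        return features.flatten(2).transpose(1, 2)


stem = ConvStem()
tokens = stem(torch.randn(2, 3, 224, 224))
print(tokens.shape)
```

With four stride-2 layers, a 224x224 input is downsampled 16x to a 14x14 grid, so the transformer sees the same number of tokens as a ViT with 16x16 patches.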

I used Phil Wang's repo https://github.com/lucidrains/vit-pytorch/ as the starting point for the PyTorch implementation of the original ViT ("An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale").

Both notebooks use the GPU if torch reports that one is available. Training is quite slow on CPU: switching from CPU to an RTX 2070 gave me a speedup of more than 60x (your speedup will, of course, depend on the CPU and GPU).
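
The usual PyTorch pattern for this kind of device selection looks roughly like the sketch below. The `nn.Linear` model and the random batch are stand-ins for illustration only; the notebooks may organize this differently.

```python
import torch
from torch import nn

# Fall back to the CPU when torch does not report a usable CUDA device.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(8, 2).to(device)        # stand-in for the real model
batch = torch.randn(4, 8, device=device)  # inputs must live on the same device
logits = model(batch)
print(logits.device.type)
```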

BibTeX paper citations:

@misc{xiao2021early,
      title={Early Convolutions Help Transformers See Better}, 
      author={Tete Xiao and Mannat Singh and Eric Mintun and Trevor Darrell and Piotr Dollár and Ross Girshick},
      year={2021},
      eprint={2106.14881},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

@misc{dosovitskiy2020image,
    title   = {An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
    author  = {Alexey Dosovitskiy and Lucas Beyer and Alexander Kolesnikov and Dirk Weissenborn and Xiaohua Zhai and Thomas Unterthiner and Mostafa Dehghani and Matthias Minderer and Georg Heigold and Sylvain Gelly and Jakob Uszkoreit and Neil Houlsby},
    year    = {2020},
    eprint  = {2010.11929},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

Planned updates:

Example usage in readme
Script version of the notebook that saves weights and handles input data more flexibly (intelligently deals with the number of classes, etc.)
PyTorch Lightning version
CLI for model training and weight saving
General cleanup and improvements (values from the paper are currently hard-coded into the model, and there's no testing, logging, etc.)

Jack-Etheredge/early_convolutions_vit_pytorch
