The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 36,458 5,127 Updated Mar 9, 2026

xmu-xiaoma666 / FightingCV-Paper-Reading

⭐⭐⭐FightingCV Paper Reading, which helps you understand the most advanced research work in an easier way 🍀 🍀 🍀

Shell 822 89 Updated Apr 20, 2023

lucidrains / vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

Python 24,983 3,485 Updated Feb 11, 2026

microsoft / Cream

This is a collection of our NAS and Vision Transformer work.

Python 1,823 239 Updated Jul 25, 2024

facebookresearch / mae

PyTorch implementation of MAE https://arxiv.org/abs/2111.06377

Python 8,232 1,345 Updated Jul 23, 2024

leoxiaobin / CvT

This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.

Python 229 38 Updated Jul 4, 2022

snap-research / EfficientFormer

EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPS 2022]

Python 1,108 94 Updated Aug 13, 2023

hunto / LightViT

Official implementation for paper "LightViT: Towards Light-Weight Convolution-Free Vision Transformers"

Python 144 10 Updated Jul 26, 2022

google-research / maxvit

[ECCV 2022] Official repository for "MaxViT: Multi-Axis Vision Transformer". SOTA foundation models for classification, detection, segmentation, image quality, and generative modeling...

Jupyter Notebook 489 38 Updated Jun 2, 2023

raoyongming / DynamicViT

[NeurIPS 2021] [T-PAMI] DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification

Jupyter Notebook 649 80 Updated Jul 11, 2023

ggjy / CMT.pytorch

PyTorch implementation of our CVPR 2022 paper "CMT: Convolutional Neural Networks Meet Vision Transformers" (https://arxiv.org/pdf/2107.06263.pdf).

Python 103 16 Updated Jul 1, 2022

huggingface / transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 157,586 32,331 Updated Mar 8, 2026

ChristophReich1996 / MaxViT

PyTorch reimplementation of the paper "MaxViT: Multi-Axis Vision Transformer" [ECCV 2022].

Python 164 18 Updated Jul 12, 2023

sail-sg / metaformer

MetaFormer Baselines for Vision (TPAMI 2024)

Python 495 31 Updated Jun 1, 2024

l-sf

transformer

microsoft / Swin-Transformer

dk-liang / Awesome-Visual-Transformer

open-mmlab / awesome-vit

mli / transformers-benchmarks

xmu-xiaoma666 / External-Attention-pytorch

huggingface / pytorch-image-models