ParFormer: Vision Transformer Baseline with Parallel Local Global Token Mixer and Convolution Attention Patch Embedding

This repository provides the implementation of ParFormer.
| Model Name | Resolution | Params | GFLOPs | Top-1 (%) | Download |
|---|---|---|---|---|---|
| ParFormer-B1 | 224x224 | 11M | 1.5 | 80.5 | model |
| ParFormer-B2 | 224x224 | 23M | 3.4 | 82.1 | model |
| ParFormer-B3 | 224x224 | 34M | 6.5 | 83.1 | model |
A conda virtual environment is recommended:

```
conda install pytorch torchvision cudatoolkit=11.8 -c pytorch
pip install timm==0.6.13
pip install wandb
pip install fvcore
```
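After installation, you may want to confirm that the pinned dependencies are actually importable. A minimal stdlib-only sketch (the helper name `installed_version` is ours, not part of the repository):

```python
# Check that the packages installed above are visible to Python.
# Package names match the conda/pip commands in this README.
from importlib.metadata import version, PackageNotFoundError


def installed_version(package: str):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    for pkg in ("torch", "torchvision", "timm", "wandb", "fvcore"):
        v = installed_version(pkg)
        print(f"{pkg}: {v if v else 'NOT INSTALLED'}")
```

For example, `timm` should report `0.6.13` if the pinned install succeeded.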
Download and extract ImageNet train and val images from http://image-net.org/. The training and validation data are expected to be in the `train` and `val` folders, respectively:

```
|-- /path/to/imagenet/
    |-- train
    |-- val
```
We provide an example training script `train_imnet.sh` using PyTorch distributed data parallel (DDP). To train ParFormer-B1 on a 2-GPU machine:

```
sh train_imnet.sh parformer_b1 2
```
Tip: specify your data path and experiment name in the script before launching.
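For reference, a DDP launch script of this shape typically wraps `torchrun`. The sketch below is hypothetical and not the repository's actual `train_imnet.sh`; the entry point `main.py`, its flags, and the data path are placeholders to adapt, while the two positional arguments match the usage shown above:

```shell
#!/bin/bash
# Hypothetical sketch of a DDP launch script (NOT the repo's train_imnet.sh).
# Usage: sh train_imnet.sh <model_name> <num_gpus>
MODEL=$1        # e.g. parformer_b1
NUM_GPUS=$2     # e.g. 2
DATA_PATH=/path/to/imagenet      # set your data path here
EXPERIMENT=my_experiment         # set your experiment name here

torchrun --nproc_per_node="$NUM_GPUS" main.py \
    --model "$MODEL" \
    --data-path "$DATA_PATH" \
    --experiment "$EXPERIMENT"
```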