PiT

Paper: Rethinking Spatial Dimensions of Vision Transformers
Official project: naver-ai/pit
Model code: pit.py

Validation set preprocessing:

# Image backend: pil
# Input image size: 224x224
transforms = T.Compose([
    T.Resize(248, interpolation='bicubic'),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
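
To make the crop and normalization steps concrete, here is a minimal NumPy sketch of what happens after the resize. It assumes the image has already been resized so that its shorter side is 248 pixels; the bicubic resize itself is left to the image backend. The helper names `center_crop` and `to_normalized_chw` are hypothetical and exist only for illustration, and the rounding of the crop offset may differ slightly from a given library when the margin is odd.

```python
import numpy as np

# ImageNet channel statistics, as used in the Normalize step above.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def center_crop(img, size):
    """Crop the central size x size window from an HWC image array."""
    h, w = img.shape[:2]
    if h < size or w < size:
        raise ValueError("image is smaller than the crop size")
    top = (h - size) // 2
    left = (w - size) // 2
    return img[top:top + size, left:left + size]


def to_normalized_chw(img):
    """Scale uint8 HWC pixels to [0, 1], normalize per channel, return CHW."""
    x = img.astype(np.float32) / 255.0
    x = (x - MEAN) / STD
    return x.transpose(2, 0, 1)


# Stand-in for an image already resized to a 248-pixel short side.
resized = np.zeros((248, 330, 3), dtype=np.uint8)
out = to_normalized_chw(center_crop(resized, 224))
print(out.shape)
```

Resizing to 248 and cropping to 224 keeps roughly the central 90% of the shorter side (224 / 248 ≈ 0.903).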

Model details:

Model	Model Name	Params (M)	FLOPs (G)	Top-1 (%)	Top-5 (%)	Pretrained Model
PiT-Ti	pit_ti	4.9	0.7	72.91	91.40	Download
PiT-XS	pit_xs	10.6	1.4	78.18	94.16	Download
PiT-S	pit_s	23.5	2.9	81.08	95.33	Download
PiT-B	pit_b	73.8	12.5	82.44	95.71	Download
PiT-Ti distilled	pit_ti_distilled	4.9	0.7	74.54	92.10	Download
PiT-XS distilled	pit_xs_distilled	10.6	1.4	79.31	94.36	Download
PiT-S distilled	pit_s_distilled	23.5	2.9	81.99	95.79	Download
PiT-B distilled	pit_b_distilled	73.8	12.5	84.14	96.86	Download
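
For readers unfamiliar with the Top-1 and Top-5 columns, the sketch below shows the usual definition of top-k accuracy: a prediction counts as correct when the true label is among the k highest-scoring classes. This is an illustrative NumPy version under that standard definition, not the evaluation script used to produce the table; `topk_accuracy` is a hypothetical helper, and the scores are made-up toy values.

```python
import numpy as np


def topk_accuracy(logits, labels, k):
    """Percentage of samples whose true label is among the k highest scores."""
    # Indices of the k largest scores per row (order within the k is irrelevant).
    topk = np.argpartition(logits, -k, axis=1)[:, -k:]
    hits = (topk == labels[:, None]).any(axis=1)
    return 100.0 * hits.mean()


# Toy scores for 3 samples over 4 classes (made-up numbers, not model output).
logits = np.array([
    [0.1, 0.7, 0.2, 0.0],
    [0.5, 0.1, 0.3, 0.1],
    [0.3, 0.2, 0.1, 0.4],
])
labels = np.array([1, 2, 2])

top1 = topk_accuracy(logits, labels, k=1)
top2 = topk_accuracy(logits, labels, k=2)
print(f"top-1: {top1:.2f}%  top-2: {top2:.2f}%")
```

On ImageNet the same function would be called with `k=1` and `k=5` over the 1000-class outputs for the validation set.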

Citation:

@article{heo2021pit,
    title={Rethinking Spatial Dimensions of Vision Transformers},
    author={Byeongho Heo and Sangdoo Yun and Dongyoon Han and Sanghyuk Chun and Junsuk Choe and Seong Joon Oh},
    journal={arXiv: 2103.16302},
    year={2021},
}
