
vit-pytorch

ViT (Google Vision Transformer) model ported from TensorFlow to PyTorch.

Weights converted from https://github.com/google-research/vision_transformer

Model code referenced from https://github.com/lucidrains/vit-pytorch

model table

model type | input size | pytorch weights
ViT-B_16   | 224×224    | B_16-224 (extraction code: 9mrd)

usage

import torch
from vit_pytorch import VIT_B16_224

model = VIT_B16_224()
model.load_state_dict(torch.load('imagenet21k+imagenet2012_ViT-B_16-224.pth'))

input_size = 224
img = torch.randn(1, 3, input_size, input_size)

preds = model(img)  # (1, 1000)
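
As a minimal sketch of turning the raw logits into a top-1 prediction (it continues from the example above and assumes the model and img objects defined there):

import torch

model.eval()                               # disable dropout for inference
with torch.no_grad():
    logits = model(img)                    # (1, 1000) class logits
    probs = torch.softmax(logits, dim=-1)  # class probabilities
    top1 = probs.argmax(dim=-1)            # predicted ImageNet class index
print(top1.item())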

caution

This model was trained with inputs normalized using a mean of 127.5 and a standard deviation of 127.5 (TensorFlow-style preprocessing), so the default ImageNet normalization should be replaced by:

normalize_tf = transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])

Using the default ImageNet normalization instead reduces classification accuracy by roughly 5% on the ImageNet-1K classification task.
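
Below is a sketch of a full preprocessing pipeline using this normalization. It assumes torchvision and Pillow are installed; 'cat.jpg' is a placeholder image path, and the Resize/CenterCrop sizes follow the common ImageNet evaluation convention rather than anything specified by this repository.

import torch
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize(256),                      # assumed eval-time resize
    transforms.CenterCrop(224),                  # model expects 224x224 input
    transforms.ToTensor(),                       # scales pixels to [0, 1]
    transforms.Normalize(mean=[0.5, 0.5, 0.5],   # maps [0, 1] to [-1, 1],
                         std=[0.5, 0.5, 0.5]),   # i.e. (x - 127.5) / 127.5
])

# 'cat.jpg' is a placeholder; substitute any RGB image.
img = preprocess(Image.open('cat.jpg').convert('RGB')).unsqueeze(0)  # (1, 3, 224, 224)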
