Visual-Transformers

Unofficial implementation of the paper "Visual Transformers: Token-based Image Representation and Processing for Computer Vision".

Usage:

python main.py task_mode learning_mode data --model --weights, where:

task_mode: classification or semantic_segmentation, selecting the task.
learning_mode: train to train --model from scratch; test to evaluate --model with --weights on the validation data.
data: path to the dataset: ImageNet for classification, COCO for semantic segmentation.
--model:
○ classification: ResNet18 or VT_ResNet18 (default).
○ semantic segmentation: PanopticFPN or VT_FPN (default).
--weights: required when learning_mode is test; not used in train mode.
--from_pretrained: resumes training from a saved point; expects a state_dict containing model_state_dict, optimizer_state_dict and epoch.
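
The argument rules above can be sketched with argparse. This is a minimal illustration, not the actual contents of main.py: the helper names (build_parser, parse_cli, missing_checkpoint_keys) and the exact error messages are assumptions made for the example; only the argument names, the model choices, and the checkpoint keys come from the list above.

```python
import argparse

# Default and allowed models per task, as listed in the usage notes.
DEFAULT_MODELS = {
    "classification": "VT_ResNet18",
    "semantic_segmentation": "VT_FPN",
}
ALLOWED_MODELS = {
    "classification": ("ResNet18", "VT_ResNet18"),
    "semantic_segmentation": ("PanopticFPN", "VT_FPN"),
}
# Keys the --from_pretrained checkpoint is expected to contain.
CHECKPOINT_KEYS = ("model_state_dict", "optimizer_state_dict", "epoch")


def build_parser():
    """Hypothetical parser mirroring the documented interface."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("task_mode", choices=sorted(DEFAULT_MODELS))
    parser.add_argument("learning_mode", choices=["train", "test"])
    parser.add_argument("data", help="path to the dataset")
    parser.add_argument("--model", default=None)
    parser.add_argument("--weights", default=None)
    parser.add_argument("--from_pretrained", default=None)
    return parser


def parse_cli(argv):
    """Parse argv and apply the per-task default and the test-mode rule."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.model is None:
        # No --model given: fall back to the VT variant for the chosen task.
        args.model = DEFAULT_MODELS[args.task_mode]
    elif args.model not in ALLOWED_MODELS[args.task_mode]:
        parser.error(f"--model {args.model} is not valid for {args.task_mode}")
    if args.learning_mode == "test" and args.weights is None:
        parser.error("--weights is required when learning_mode is test")
    return args


def missing_checkpoint_keys(checkpoint):
    """Return the expected checkpoint keys that are absent from a dict."""
    return [key for key in CHECKPOINT_KEYS if key not in checkpoint]
```

For instance, parse_cli(["classification", "train", "path/to/imagenet"]) would select VT_ResNet18 under these assumptions, while omitting --weights in test mode exits with an error. The paths here are placeholders.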

Results:

Final metrics and losses after 15 epochs of classification and 5 epochs of semantic segmentation:

Metric	ResNet18	VT-ResNet18
Training accuracy	0.664675	0.672889
Validation accuracy	0.691541	0.696929
Training loss	1.312150	1.249382
Validation loss	1.173559	1.114401

Metric	Panoptic FPN	VT-FPN
Training mIOU	8.0968	7.0343
Validation mIOU	4.3148	3.2351
Training loss	2.044084	2.068598
Validation loss	2.101253	2.120928
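
For readers unfamiliar with the mIOU metric reported above, here is a minimal sketch of one common definition: the per-class intersection over union, averaged over the classes that appear in either the prediction or the ground truth. The function name mean_iou and the choice to skip absent classes are assumptions for illustration; the repository's own computation may differ (for example in averaging or scaling).

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union for flat sequences of integer labels.

    Classes absent from both pred and target are skipped (an assumption).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1
    ious = []
    for c in range(num_classes):
        # |A ∪ B| = |A| + |B| - |A ∩ B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0
```

With pred = [0, 0, 1, 1] and target = [0, 1, 1, 1], class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.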

Loss and metric curves:

[Figure: classification, cross-entropy loss and accuracy]
[Figure: semantic segmentation, pixel-wise cross-entropy loss and mIOU]

Efficiency and parameters

Model	Params (M)	FLOPs (M)	Forward-backward pass (s)
ResNet18	11.2	822	0.016
VT-ResNet18	12.7	543	0.02
Panoptic FPN	16.4	67412	0.08
VT-FPN	40.3	110019	0.062
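
To make the efficiency trade-offs easier to read, the sketch below computes the relative change of each VT model against its baseline. The numbers are copied from the table above; the EFFICIENCY dictionary and the relative_change helper are hypothetical names introduced only for this example.

```python
# (params in millions, FLOPs in millions, forward-backward pass in seconds),
# copied from the efficiency table.
EFFICIENCY = {
    "ResNet18": (11.2, 822, 0.016),
    "VT-ResNet18": (12.7, 543, 0.02),
    "Panoptic FPN": (16.4, 67412, 0.08),
    "VT-FPN": (40.3, 110019, 0.062),
}
FIELDS = ("params", "flops", "time")


def relative_change(baseline, variant):
    """Percentage change of variant relative to baseline, per column."""
    base = EFFICIENCY[baseline]
    var = EFFICIENCY[variant]
    return {
        name: (v - b) / b * 100
        for name, b, v in zip(FIELDS, base, var)
    }


if __name__ == "__main__":
    for baseline, variant in [("ResNet18", "VT-ResNet18"),
                              ("Panoptic FPN", "VT-FPN")]:
        change = relative_change(baseline, variant)
        summary = ", ".join(f"{k}: {v:+.1f}%" for k, v in change.items())
        print(f"{variant} vs {baseline}: {summary}")
```

By these figures, VT-ResNet18 uses roughly 34% fewer FLOPs than ResNet18 with about 13% more parameters and a 25% slower forward-backward pass, while VT-FPN has about 63% more FLOPs than Panoptic FPN but a 22.5% faster pass.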

Weights:

classification: ResNet18, VT-ResNet18
semantic segmentation: Panoptic FPN, VT-FPN

Repository: AndreyBocharnikov/Visual-Transformers
