[Not Official] Implementation of CvT, Convolutions to Vision Transformers

russellgeum/Convolutions-to-Vision-Transformers

CvT, Convolutions to Vision Transformers

An unofficial implementation of CvT (Convolutions to Vision Transformers).
This repository provides the vision attention and embedding layers.
Reference Paper: CvT: Introducing Convolutions to Vision Transformers (Wu et al., 2021)

Folder

ㄴmodel_layer
    ㄴtokenizer.py
        class ImageTokenizer
        class ImageStacker
    ... ...
    ㄴtransforemr.py
        class LineTokenConvTransformer  
        class ConvTokenConvTransformer  
        class SelfConvTransfomer  
        class CrossConvTransformer

Usage

CvT with Linear Tokenizer

import torch

tensor  = torch.ones([8, 3, 16, 16]) # input:  torch.Size([8, 3, 16, 16])
layer   = LineTokenConvTransformer((16, 16), (4, 4), 3, 3)
outputs = layer(tensor)              # output: torch.Size([8, 3, 4, 4])
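
For intuition about the shapes above, here is a minimal sketch of what a linear tokenizer can look like. It is not the repository's `ImageTokenizer`. `LinearPatchTokenizer` and its parameters are hypothetical names, and the sketch assumes that `(4, 4)` is a patch size, so a 16x16 image yields a 4x4 grid of tokens.

```python
import torch
import torch.nn as nn


class LinearPatchTokenizer(nn.Module):
    """Hypothetical stand-in: cut the image into patches, project each patch linearly."""

    def __init__(self, patch_size=(4, 4), in_channels=3, embed_dim=3):
        super().__init__()
        self.patch_size = patch_size
        patch_dim = in_channels * patch_size[0] * patch_size[1]
        self.proj = nn.Linear(patch_dim, embed_dim)

    def forward(self, x):
        b, c, h, w = x.shape
        ph, pw = self.patch_size
        gh, gw = h // ph, w // pw  # number of patches along each axis
        # [B, C, H, W] -> [B, C, gh, ph, gw, pw]
        x = x.reshape(b, c, gh, ph, gw, pw)
        # -> [B, gh, gw, C, ph, pw] -> [B, gh*gw, C*ph*pw]: one flat vector per patch
        x = x.permute(0, 2, 4, 1, 3, 5).reshape(b, gh * gw, c * ph * pw)
        tokens = self.proj(x)  # [B, gh*gw, embed_dim]
        # Fold the token sequence back into a feature map [B, embed_dim, gh, gw]
        return tokens.transpose(1, 2).reshape(b, -1, gh, gw)


tokenizer = LinearPatchTokenizer()
token_map = tokenizer(torch.ones(8, 3, 16, 16))
print(token_map.shape)  # torch.Size([8, 3, 4, 4])
```

Each 4x4 patch with 3 channels is flattened to 48 values before the projection, which is why the spatial size shrinks by a factor of 4 on each axis.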

CvT with Convolution Tokenizer

tensor = torch.ones([8, 3, 16, 16]) # torch.Size([8, 3, 16, 16])
layer   = ConvTokenConvTransformer((16, 16), (4, 4), 3, 3)
outputs = layer(tensor)             # torch.Size([8, 3, 4, 4])
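
A convolutional tokenizer can be sketched with a single strided `nn.Conv2d`. This is an illustration under assumptions, not the repository's code: the variable names are hypothetical, and the kernel, stride, and padding values are chosen only so that a 16x16 input produces the 4x4 output shown above.

```python
import torch
import torch.nn as nn

image = torch.ones(8, 3, 16, 16)

# Non-overlapping patches: kernel and stride both equal the assumed 4x4 patch size.
patch_conv = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=4, stride=4)
token_map = patch_conv(image)
print(token_map.shape)  # torch.Size([8, 3, 4, 4])

# Overlapping patches: the kernel (7) is larger than the stride (4), so
# neighbouring tokens share pixels. Padding 2 keeps the 4x4 output grid.
overlap_conv = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=7, stride=4, padding=2)
overlap_map = overlap_conv(image)
print(overlap_map.shape)  # torch.Size([8, 3, 4, 4])
```

The output size follows the usual convolution formula, floor((H + 2 * padding - kernel) / stride) + 1, which gives 4 in both cases for H = 16.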

Self-Attention of CvT (SA-CvT)

tensor = torch.ones([8, 3, 16, 16]) # torch.Size([8, 3, 16, 16])
layer   = SelfConvTransfomer((16, 16), 3, 2)
outputs = layer(tensor)             # torch.Size([8, 3, 16, 16])

print(outputs.shape)
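
The self-attention layer keeps the input shape. A simplified sketch of attention with convolutional projections is shown below. `ConvProjectionSelfAttention` is a hypothetical name, it uses a single head, and it replaces the usual linear Q/K/V projections with plain depthwise convolutions. The repository's `SelfConvTransfomer` may differ in all of these details.

```python
import torch
import torch.nn as nn


class ConvProjectionSelfAttention(nn.Module):
    """Hypothetical sketch: single-head attention with depthwise-conv Q/K/V projections."""

    def __init__(self, channels=3, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2  # "same" padding for odd kernel sizes
        # groups=channels makes each convolution depthwise (one filter per channel)
        self.to_q = nn.Conv2d(channels, channels, kernel_size, padding=pad, groups=channels)
        self.to_k = nn.Conv2d(channels, channels, kernel_size, padding=pad, groups=channels)
        self.to_v = nn.Conv2d(channels, channels, kernel_size, padding=pad, groups=channels)
        self.scale = channels ** -0.5

    def forward(self, x):
        b, c, h, w = x.shape
        # Project on the 2D feature map, then flatten to token sequences [B, H*W, C]
        q = self.to_q(x).flatten(2).transpose(1, 2)
        k = self.to_k(x).flatten(2).transpose(1, 2)
        v = self.to_v(x).flatten(2).transpose(1, 2)
        # Scaled dot-product attention over all H*W positions
        attn = torch.softmax(q @ k.transpose(1, 2) * self.scale, dim=-1)  # [B, H*W, H*W]
        out = attn @ v  # [B, H*W, C]
        # Fold the sequence back into a feature map with the input's shape
        return out.transpose(1, 2).reshape(b, c, h, w)


attention = ConvProjectionSelfAttention(channels=3)
attended = attention(torch.ones(8, 3, 16, 16))
print(attended.shape)  # torch.Size([8, 3, 16, 16])
```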

Cross-Attention of CvT (CA-CvT)

tensor1 = torch.ones([8, 3, 16, 16]) # torch.Size([8, 3, 16, 16])
tensor2 = torch.ones([8, 3, 16, 16]) # torch.Size([8, 3, 16, 16])

layer   = CrossConvTransformer((16, 16), 3, 1)
outputs = layer(tensor1, tensor2)    # torch.Size([8, 3, 16, 16])
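
Cross-attention differs from self-attention only in where the queries, keys, and values come from. The sketch below assumes the first input supplies the queries and the second supplies the keys and values. That ordering, the class name `ConvProjectionCrossAttention`, and the single-head depthwise-convolution design are assumptions for illustration, not taken from the repository.

```python
import torch
import torch.nn as nn


class ConvProjectionCrossAttention(nn.Module):
    """Hypothetical sketch: queries from one feature map, keys/values from another."""

    def __init__(self, channels=3, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        self.to_q = nn.Conv2d(channels, channels, kernel_size, padding=pad, groups=channels)
        self.to_k = nn.Conv2d(channels, channels, kernel_size, padding=pad, groups=channels)
        self.to_v = nn.Conv2d(channels, channels, kernel_size, padding=pad, groups=channels)
        self.scale = channels ** -0.5

    def forward(self, query_map, context_map):
        b, c, h, w = query_map.shape
        q = self.to_q(query_map).flatten(2).transpose(1, 2)    # [B, Nq, C]
        k = self.to_k(context_map).flatten(2).transpose(1, 2)  # [B, Nk, C]
        v = self.to_v(context_map).flatten(2).transpose(1, 2)  # [B, Nk, C]
        attn = torch.softmax(q @ k.transpose(1, 2) * self.scale, dim=-1)  # [B, Nq, Nk]
        out = attn @ v  # [B, Nq, C]: one output token per query position
        return out.transpose(1, 2).reshape(b, c, h, w)


cross = ConvProjectionCrossAttention(channels=3)
query_map = torch.ones(8, 3, 16, 16)
context_map = torch.ones(8, 3, 8, 8)  # the context may have a different spatial size
crossed = cross(query_map, context_map)
print(crossed.shape)  # torch.Size([8, 3, 16, 16])
```

Because there is one output token per query position, the output shape follows the first input, which matches the shapes in the usage example above.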

Acknowledgement

Base CvT code is borrowed from @rishikksh20
repo: https://github.com/rishikksh20/convolution-vision-transformers
Base Embedding code is borrowed from @FrancescoSaverioZuppichini
repo: https://github.com/FrancescoSaverioZuppichini/ViT

Related works
