SimMIM and non-ViT backbones #1135

Open
faris-k opened this issue Apr 7, 2023 · 1 comment

faris-k commented Apr 7, 2023

I'm seeing noticeably worse performance from SimMIM than from MAE on my task, and the Imagenette benchmarks seem to show the same gap. I was wondering what might be causing this, and whether the gap would persist with a non-ViT backbone. Is it possible to use backbones such as convnets or Swin transformers with the current implementation of SimMIM? I'm curious how the forward_encoder method would need to change to do so, and whether images_to_tokens could be generalized to other backbones.

def forward_encoder(self, images, batch_size, idx_mask):
    # Convert the images to patch tokens and prepend the class token.
    tokens = self.backbone.images_to_tokens(images, prepend_class_token=True)
    # Replace the tokens at idx_mask with the learnable mask token.
    tokens_masked = utils.mask_at_index(tokens, idx_mask, self.mask_token)
    # Pass all the tokens to the encoder, both masked and non-masked ones.
    return self.backbone.encoder(tokens_masked)
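
One way the masking step could carry over to backbones that work on 2D feature maps (a Swin-style patch embedding or a convnet stem) is to blend a learnable mask token into the patch embeddings before the remaining stages run. The following is only a sketch under assumptions: `MaskedPatchEmbed`, `forward_encoder_2d`, and the `stages` argument are hypothetical names for illustration and are not part of the current implementation.

```python
import torch
import torch.nn as nn


class MaskedPatchEmbed(nn.Module):
    """Hypothetical patch embedding that swaps masked patches for a mask token."""

    def __init__(self, in_chans=3, embed_dim=96, patch_size=4):
        super().__init__()
        # Non-overlapping patches: kernel size equals stride.
        self.proj = nn.Conv2d(
            in_chans, embed_dim, kernel_size=patch_size, stride=patch_size
        )
        # One learnable vector, broadcast over batch and spatial positions.
        self.mask_token = nn.Parameter(torch.zeros(1, embed_dim, 1, 1))

    def forward(self, images, mask):
        # images: (B, C, H, W); mask: (B, H / patch_size, W / patch_size),
        # where True marks a masked patch.
        x = self.proj(images)  # (B, D, h, w)
        w = mask.unsqueeze(1).type_as(x)  # (B, 1, h, w)
        # Keep visible embeddings, substitute the mask token elsewhere.
        return x * (1.0 - w) + self.mask_token * w


def forward_encoder_2d(patch_embed, stages, images, mask):
    # `stages` stands in for the rest of the backbone (assumed to accept
    # a (B, D, h, w) feature map).
    return stages(patch_embed(images, mask))
```

Here `mask` is a boolean grid at patch resolution, so no token indexing or class token is needed. Whether a given convnet trains well with this kind of masking is not something this sketch answers.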

guarin (Contributor) commented Apr 11, 2023

Hi @faris-k! We noticed this as well, but I haven't looked into it yet. From a quick glance at the code, I believe the linear decoder head might be missing. The reference implementation is here: https://github.com/microsoft/SimMIM/blob/d3e29bcac950b83edc34ca33fe4404f38309052c/models/simmim.py#L104

But I guess a simple linear layer might be enough in our setup.
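
A minimal sketch of what such a head could look like for a token-based encoder, assuming the encoder returns one embedding per patch (class token already removed). The linked reference code appears to use a 1x1 convolution followed by a pixel shuffle instead; `LinearPixelDecoder` and `masked_l1_loss` below are hypothetical names, and the default sizes are placeholders.

```python
import torch
import torch.nn as nn


class LinearPixelDecoder(nn.Module):
    """Hypothetical head: one linear layer predicting raw pixels per patch."""

    def __init__(self, embed_dim=768, patch_size=16, in_chans=3):
        super().__init__()
        self.proj = nn.Linear(embed_dim, patch_size * patch_size * in_chans)

    def forward(self, tokens):
        # tokens: (B, N, D) -> predictions: (B, N, patch_size**2 * in_chans)
        return self.proj(tokens)


def masked_l1_loss(pred, target, mask):
    # pred, target: (B, N, P) flattened pixel values per patch.
    # mask: (B, N), nonzero where the patch was masked.
    mask = mask.type_as(pred)
    per_patch = (pred - target).abs().mean(dim=-1)  # (B, N)
    # Average the error over masked patches only; guard against empty masks.
    return (per_patch * mask).sum() / mask.sum().clamp(min=1.0)
```

The loss is computed only on masked patches, so visible patches contribute no gradient through the head.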

Also note that we measure performance with kNN evaluation, which usually gives lower accuracy than linear evaluation or fine-tuning. ViT-based architectures generally require fine-tuning for good performance.
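
For context, kNN evaluation classifies each test embedding from the labels of its nearest training embeddings, with the backbone kept frozen. The following is a generic sketch of a similarity-weighted kNN classifier, not the exact benchmark code: the function name `knn_predict` and the defaults for `k` and `temperature` are assumptions.

```python
import torch
import torch.nn.functional as F


@torch.no_grad()
def knn_predict(query, bank, bank_labels, num_classes, k=20, temperature=0.1):
    # query: (Q, D) test embeddings; bank: (N, D) training embeddings.
    # bank_labels: (N,) integer (int64) class labels for the bank.
    query = F.normalize(query, dim=1)
    bank = F.normalize(bank, dim=1)
    sim = query @ bank.T  # cosine similarities, (Q, N)
    k = min(k, bank.shape[0])
    sim_k, idx_k = sim.topk(k, dim=1)  # k nearest neighbours per query
    weights = (sim_k / temperature).exp()  # closer neighbours vote more
    neighbor_labels = bank_labels[idx_k]  # (Q, k)
    votes = torch.zeros(
        query.shape[0], num_classes, dtype=weights.dtype, device=query.device
    )
    # Accumulate each neighbour's weight into its class column.
    votes.scatter_add_(1, neighbor_labels, weights)
    return votes.argmax(dim=1)
```

Because no weights are trained on top of the features, this only measures how well classes already cluster in the embedding space.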
