
How to use the pretrained model uniformer_base_in1k.pth as my backbone ? #20

Closed

hongsheng-Z opened this issue Feb 25, 2022 · 16 comments

@hongsheng-Z

There are some problems when I use the pre-trained model uniformer_base_in1k.pth as my backbone:
missing keys: ['patch_embed1.norm.weight', 'patch_embed1.norm.bias', 'patch_embed1.proj.weight', 'patch_embed1.proj.bias', 'patch_embed2.norm.weight', .....
unexpected keys: ['model']

@Andy1621
Collaborator

Have you used the latest version? The bug has already been fixed there, as follows:

def get_pretrained_model(self, cfg):
    if cfg.UNIFORMER.PRETRAIN_NAME:
        checkpoint = torch.load(model_path[cfg.UNIFORMER.PRETRAIN_NAME], map_location='cpu')
        if 'model' in checkpoint:
            checkpoint = checkpoint['model']
        elif 'model_state' in checkpoint:
            checkpoint = checkpoint['model_state']
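
In case it helps, here is a minimal sketch (the function name is just a placeholder, not part of this repo) of loading uniformer_base_in1k.pth into a custom backbone outside this codebase; the key point is unwrapping the 'model' entry before calling load_state_dict:

import torch

def load_uniformer_pretrain(backbone, ckpt_path='uniformer_base_in1k.pth'):
    checkpoint = torch.load(ckpt_path, map_location='cpu')
    # The released checkpoints wrap the weights under a 'model' (or 'model_state')
    # key; without unwrapping it, every real key shows up as missing and
    # 'model' as unexpected.
    if 'model' in checkpoint:
        checkpoint = checkpoint['model']
    elif 'model_state' in checkpoint:
        checkpoint = checkpoint['model_state']
    # strict=False tolerates task-specific heads that have no pre-trained weights.
    missing, unexpected = backbone.load_state_dict(checkpoint, strict=False)
    print('missing keys:', missing)
    print('unexpected keys:', unexpected)
    return backbone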

@Andy1621
Collaborator

As there is no more activity, I am closing the issue; don't hesitate to reopen it if necessary.

@hongsheng-Z
Author

OK, thanks. I have applied your model (ImageNet-1K pre-trained with Token Labeling, 224x224: uniformer_base_tl_224.pth) as the backbone for my visual tracking task. But from the current training logs, it seems that your model is not as good as other backbones (such as Swin-T and ResNet-50) on this task.

@Andy1621
Collaborator

Andy1621 commented Mar 1, 2022

@hongsheng-Z Can you try uniformer_small_224? It has the same FLOPs as Swin-T, and I'm not sure whether you used the proper hyperparameters for the larger model. In our experience, UniFormer works well for most downstream tasks.

Moreover, I am not sure whether you have used the new version of the code, since I have updated the model config to head_dim=32. The previous head_dim=64 does not match the pre-trained weights, so the performance will be poor.

Besides, head_dim=32 requires more GPU memory for downstream tasks, so I'm retraining the models with head_dim=64.
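
For illustration (the stage widths below are placeholders only, not taken from this issue; check the repo's model config for the actual values): head_dim only determines how each stage's channels are split into attention heads, so a mismatched value typically loads without errors but degrades accuracy.

# Placeholder stage widths for illustration; see the repo's model config
# for the real values of your variant.
stage_dims = [64, 128, 320, 512]

for head_dim in (32, 64):
    # num_heads per stage is dim // head_dim, so the same pre-trained projection
    # weights get split into a different number of heads depending on the config.
    num_heads = [dim // head_dim for dim in stage_dims]
    print(f'head_dim={head_dim} -> num_heads per stage: {num_heads}')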

@Andy1621
Collaborator

Andy1621 commented Mar 1, 2022

Someone else also ran into similar problems because of a wrong model config, but the performance is normal with the right config.

I suggest you check the model config.

@Andy1621
Collaborator

Andy1621 commented Mar 1, 2022

By the way, for downstream tasks, you'd better freeze BN.
I forgot to freeze BN in my experiments; such a common trick will also improve the performance.

@Andy1621 Andy1621 reopened this Mar 1, 2022
@hongsheng-Z
Author

Thank you very much for your careful reply, but I don't know how to freeze BN. Can you provide the relevant reference code?

@Andy1621
Collaborator

Andy1621 commented Mar 1, 2022

@Andy1621
Collaborator

Andy1621 commented Mar 5, 2022

@hongsheng-Z Hi! Does the new pre-trained model work for your task?

@hongsheng-Z
Author

Yes, it seems to have worked. But I still don't know how to freeze BN, and I'm not sure which BatchNorm layers in UniFormer should be frozen. Thanks for your excellent work.

@Andy1621
Collaborator

Andy1621 commented Mar 5, 2022

@hongsheng-Z Freezing BN is a trick for downstream tasks. BN should be frozen if your batch size is too small, e.g., 2 per GPU for object detection. If your batch size is large enough (>8 per GPU), freezing BN does not help. Besides, you can use SyncBN as well.

To freeze BN, you can simply set eval() for all the BN layers in the backbone. BN is used in CBlock in UniFormer.
You can find the reference code in MMDetection:

  # _BatchNorm here is torch.nn.modules.batchnorm._BatchNorm
  def train(self, mode=True):
      """Convert the model into training mode while keeping the normalization
      layers frozen."""
      super(ResNet, self).train(mode)
      self._freeze_stages()
      if mode and self.norm_eval:
          for m in self.modules():
              # trick: eval() only has an effect on BatchNorm layers
              if isinstance(m, _BatchNorm):
                  m.eval()
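
Adapted to this question, a minimal sketch (the helper name is a placeholder, not from the repo) of freezing every BN layer in a backbone:

import torch.nn as nn

def freeze_bn(backbone):
    # Put every BN layer in eval mode so its running stats stay at the
    # ImageNet pre-trained values; in UniFormer, BN lives in the CBlocks.
    for m in backbone.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d, nn.SyncBatchNorm)):
            m.eval()
            # Optionally also freeze the affine parameters (gamma/beta).
            for p in m.parameters():
                p.requires_grad = False

# Re-apply after every .train() call, since train() flips BN back to training mode:
# backbone.train()
# freeze_bn(backbone)

If the per-GPU batch is large enough, nn.SyncBatchNorm.convert_sync_batchnorm(backbone) can be used instead of freezing.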

@Andy1621
Collaborator

@hongsheng-Z Hi! Does UniFormer work for your task now?

@hongsheng-Z
Author

Yeah! Thank you very much for your patience in replying.

@hongsheng-Z
Author

Thanks for your excellent work; I have used it as the backbone for tracking tasks. To illustrate its role, I would like to reuse the structure diagram from your paper, such as Figure 3 (perhaps slightly changed, in the same way that Swin-based methods reuse the Swin Transformer diagram). I'm not sure whether this is allowed or not.

@hongsheng-Z hongsheng-Z reopened this Mar 31, 2022
@Andy1621
Collaborator

Andy1621 commented Apr 1, 2022

Thanks! Feel free to do it!

@Andy1621
Collaborator

As there is no more activity, I am closing the issue; don't hesitate to reopen it if necessary.
