Adding VideoMAE to HuggingFace Transformers #23

Closed
NielsRogge opened this issue Jun 22, 2022 · 10 comments

Comments

@NielsRogge

Hi VideoMAE team :)

I've implemented VideoMAE as a fork of 🤗 HuggingFace Transformers, and I'm going to add it to the library soon (see huggingface/transformers#17821). Here's a notebook that illustrates inference with it: https://colab.research.google.com/drive/1ZX_XnM0ol81FbcxrFS3nNLkmn-0fzvQk?usp=sharing

I'm adding VideoMAE because I really like its simplicity: it was literally a single-line code change from ViT (nn.Conv2d -> nn.Conv3d).
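
To make that concrete, here's a rough sketch of the patch-embedding difference (simplified; the actual patch and tubelet sizes come from the model config):

import torch
import torch.nn as nn

# ViT: 16x16 2D patches over a single image
vit_proj = nn.Conv2d(3, 768, kernel_size=16, stride=16)

# VideoMAE: 2x16x16 3D "tubelets" over a clip of frames
videomae_proj = nn.Conv3d(3, 768, kernel_size=(2, 16, 16), stride=(2, 16, 16))

image = torch.randn(1, 3, 224, 224)     # (batch, channels, height, width)
clip = torch.randn(1, 3, 16, 224, 224)  # (batch, channels, frames, height, width)

print(vit_proj(image).shape)            # torch.Size([1, 768, 14, 14])
print(videomae_proj(clip).shape)        # torch.Size([1, 768, 8, 14, 14])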

As you may or may not know, any model on the HuggingFace hub lives in its own Git repository. E.g. the VideoMAE-base checkpoint fine-tuned on Kinetics-400 can be found here: https://huggingface.co/nielsr/videomae-base. If you check the "files and versions" tab, it includes the weights. The hub uses Git LFS (Large File Storage) to handle large files such as model weights. This means that any model has its own Git commit history!
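
For example, the weights can also be pulled programmatically via huggingface_hub (a minimal sketch, using the checkpoint linked above):

from huggingface_hub import snapshot_download

# downloads config.json plus the weight files tracked via Git LFS
local_dir = snapshot_download(repo_id="nielsr/videomae-base")
print(local_dir)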

A model card can also be added to the repo, which is just a README.

Are you interested in creating an organization on the hub, such that we can store all model checkpoints there (rather than under my user name)?

Let me know!

Kind regards,

Niels
ML Engineer @ HuggingFace

@yztongzhan
Collaborator

Hi @NielsRogge! Thanks for your suggestions! We have created an org: https://huggingface.co/videomae. Could you let us know how to upload our models correctly?

@yztongzhan
Collaborator

Hi @NielsRogge! Is there any update?

@NielsRogge
Author

NielsRogge commented Jul 7, 2022

Hi @yztongzhan,

I've worked a bit further on it: I've now implemented VideoMAEForPreTraining as well, which includes the decoder and loss computation. The PR is now ready for review and will be reviewed by my colleagues.
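
As a rough sketch of what pre-training usage will look like (random pixel values and a random mask, just to show the shapes; the processor class and checkpoint id here are illustrative and assume the final PR and hub layout):

import numpy as np
import torch
from transformers import VideoMAEImageProcessor, VideoMAEForPreTraining

num_frames = 16
video = list(np.random.randint(0, 256, (num_frames, 3, 224, 224), dtype=np.uint8))

processor = VideoMAEImageProcessor.from_pretrained("MCG-NJU/videomae-base")
model = VideoMAEForPreTraining.from_pretrained("MCG-NJU/videomae-base")

pixel_values = processor(video, return_tensors="pt").pixel_values

# build a random boolean mask over the tubelet tokens
num_patches_per_frame = (model.config.image_size // model.config.patch_size) ** 2
seq_length = (num_frames // model.config.tubelet_size) * num_patches_per_frame
bool_masked_pos = torch.randint(0, 2, (1, seq_length)).bool()

outputs = model(pixel_values, bool_masked_pos=bool_masked_pos)
loss = outputs.loss  # reconstruction loss on the masked tubelets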

Also, would it be possible to create an organization on the hub for the Multimedia Computing Group, Nanjing University, with a short name (rather than just VideoMAE)? Because otherwise people will have to do:

from transformers import VideoMAEForVideoClassification

model = VideoMAEForVideoClassification.from_pretrained("VideoMAE/videomae-base-finetuned-kinetics")

for instance, which means they have to type quite a lot of videomae 😂 Also, if newer models come out that are part of the same group's research (such as AdaMixer), it makes sense to upload them to the same organization on the hub.

Regards,

Niels

@wanglimin
Contributor

Hi @NielsRogge ,

Thanks for your update. We have created an organization account on the hub:

https://huggingface.co/MCG-NJU

You can use this organization for storing our model checkpoints. BTW, you could also include our other repos, such as AdaMixer and MixFormer.

Best,
Limin

@wanglimin
Contributor

@NielsRogge Any update?

@NielsRogge
Author

Hi @wanglimin,

The model will soon be added to the library. I'll transfer the weights to the MCG-NJU organization today.

Are you interested in collaborating on a script for easy fine-tuning?
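
As a starting point, here's a rough sketch of what such a script could look like with the Trainer API (the dataset below is a dummy placeholder, and the checkpoint id assumes the transferred weights; a real script would decode video files and sample 16 frames per clip):

import torch
from torch.utils.data import Dataset
from transformers import VideoMAEForVideoClassification, Trainer, TrainingArguments

class DummyVideoDataset(Dataset):
    """Placeholder: a real dataset would decode videos and sample 16 frames per clip."""
    def __len__(self):
        return 8

    def __getitem__(self, idx):
        return {
            "pixel_values": torch.randn(16, 3, 224, 224),  # (frames, channels, height, width)
            "labels": torch.tensor(idx % 2),
        }

# start from the pre-trained encoder; the classification head is newly initialized
model = VideoMAEForVideoClassification.from_pretrained("MCG-NJU/videomae-base", num_labels=2)

args = TrainingArguments(
    output_dir="videomae-finetuned",
    per_device_train_batch_size=2,
    learning_rate=5e-5,
    num_train_epochs=1,
)

trainer = Trainer(model=model, args=args, train_dataset=DummyVideoDataset())
trainer.train()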

@NielsRogge
Author

So far I've transferred 3 models: https://huggingface.co/models?other=videomae.

To keep the model names from getting too long, I would use the following names:

model_names = [
    # Kinetics-400 checkpoints (short = pretrained only for 800 epochs instead of 1600)
    "videomae-base-short",
    "videomae-base-short-finetuned-kinetics",
    "videomae-base",
    "videomae-base-finetuned-kinetics",
    "videomae-large",
    "videomae-large-finetuned-kinetics",
    # Something-Something-v2 checkpoints (short = pretrained only for 800 epochs instead of 2400)
    "videomae-base-short-ssv2",
    "videomae-base-short-finetuned-ssv2",
    "videomae-base-ssv2",
    "videomae-base-finetuned-ssv2",
]

Is that ok for you? Also, are you interested in adding model cards to the repos on the hub? Each model has its own git repo, and the model card is just a README (Markdown file).

@NielsRogge
Author

Hi @wanglimin,

VideoMAE has been added to the library! https://huggingface.co/docs/transformers/main/en/model_doc/videomae

Checkpoints are on the hub: https://huggingface.co/models?other=videomae
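
For reference, a minimal inference sketch with one of the fine-tuned checkpoints (random frames stand in for a real decoded clip; the processor class name is the current one in Transformers):

import numpy as np
import torch
from transformers import VideoMAEImageProcessor, VideoMAEForVideoClassification

# 16 random frames standing in for a real, decoded video clip
video = list(np.random.randint(0, 256, (16, 3, 224, 224), dtype=np.uint8))

processor = VideoMAEImageProcessor.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics")
model = VideoMAEForVideoClassification.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics")

inputs = processor(video, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

print(model.config.id2label[logits.argmax(-1).item()])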

@yztongzhan
Collaborator

Hi @NielsRogge! Thanks again for your efforts! We will add these links to the README.

@wanglimin
Contributor

@NielsRogge, thanks a lot for your help!
