Adding VideoMAE to HuggingFace Transformers #23

Closed
NielsRogge opened this issue Jun 22, 2022 · 10 comments

Comments

@NielsRogge

Hi VideoMAE team :)

I've implemented VideoMAE as a fork of 🤗 HuggingFace Transformers, and I'm going to add it to the library soon (see huggingface/transformers#17821). Here's a notebook that illustrates inference with it: https://colab.research.google.com/drive/1ZX_XnM0ol81FbcxrFS3nNLkmn-0fzvQk?usp=sharing

I'm adding VideoMAE because I really like its simplicity: it was literally a single-line code change from ViT (nn.Conv2d -> nn.Conv3d).
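
To make that concrete, here's a rough sketch of the patch-embedding difference (simplified; the actual patch and tubelet sizes come from the model config):

import torch
import torch.nn as nn

# ViT: 16x16 2D patches over a single image
vit_proj = nn.Conv2d(3, 768, kernel_size=16, stride=16)

# VideoMAE: 2x16x16 3D "tubelets" over a clip of frames
videomae_proj = nn.Conv3d(3, 768, kernel_size=(2, 16, 16), stride=(2, 16, 16))

image = torch.randn(1, 3, 224, 224)     # (batch, channels, height, width)
clip = torch.randn(1, 3, 16, 224, 224)  # (batch, channels, frames, height, width)

print(vit_proj(image).shape)            # torch.Size([1, 768, 14, 14])
print(videomae_proj(clip).shape)        # torch.Size([1, 768, 8, 14, 14])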

As you may or may not know, any model on the HuggingFace hub lives in its own Git repository. E.g. the VideoMAE-base checkpoint fine-tuned on Kinetics-400 can be found here: https://huggingface.co/nielsr/videomae-base. If you check the "files and versions" tab, it includes the weights. The hub uses Git LFS (Large File Storage) to handle large files such as model weights. This means that any model has its own Git commit history!
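
For example, the weights can also be pulled programmatically via huggingface_hub (a minimal sketch, using the checkpoint linked above):

from huggingface_hub import snapshot_download

# downloads config.json plus the weight files tracked via Git LFS
local_dir = snapshot_download(repo_id="nielsr/videomae-base")
print(local_dir)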

A model card can also be added to the repo, which is just a README.

Are you interested in creating an organization on the hub, such that we can store all model checkpoints there (rather than under my user name)?

Let me know!

Kind regards,

Niels
ML Engineer @ HuggingFace

@yztongzhan
Collaborator

Hi @NielsRogge! Thanks for your suggestions! We have created an org: https://huggingface.co/videomae. Could you let us know how to upload our models correctly?

@yztongzhan
Collaborator

Hi @NielsRogge! Is there any update?

@NielsRogge
Author

NielsRogge commented Jul 7, 2022

Hi @yztongzhan,

I've worked a bit further on it: I've now implemented VideoMAEForPreTraining as well, which includes the decoder and loss computation. The PR is now ready for review and will be reviewed by my colleagues.
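
As a rough sketch of what pre-training usage will look like (random pixel values and a random mask, just to show the shapes; the processor class and checkpoint id here are illustrative and assume the final PR and hub layout):

import numpy as np
import torch
from transformers import VideoMAEImageProcessor, VideoMAEForPreTraining

num_frames = 16
video = list(np.random.randint(0, 256, (num_frames, 3, 224, 224), dtype=np.uint8))

processor = VideoMAEImageProcessor.from_pretrained("MCG-NJU/videomae-base")
model = VideoMAEForPreTraining.from_pretrained("MCG-NJU/videomae-base")

pixel_values = processor(video, return_tensors="pt").pixel_values

# build a random boolean mask over the tubelet tokens
num_patches_per_frame = (model.config.image_size // model.config.patch_size) ** 2
seq_length = (num_frames // model.config.tubelet_size) * num_patches_per_frame
bool_masked_pos = torch.randint(0, 2, (1, seq_length)).bool()

outputs = model(pixel_values, bool_masked_pos=bool_masked_pos)
loss = outputs.loss  # reconstruction loss on the masked tubelets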

Also, would it be possible to create an organization on the hub for the Multimedia Computing Group, Nanjing University, with a short name (rather than just VideoMAE)? Because otherwise people will have to do:

from transformers import VideoMAEForVideoClassification

model = VideoMAEForVideoClassification.from_pretrained("VideoMAE/videomae-base-finetuned-kinetics")

for instance, which means they have to type quite a lot of videomae 😂 Also, if newer models come out that are part of the same group's research (such as AdaMixer), it makes sense to upload them to the same organization on the hub.

Regards,

Niels

@wanglimin
Contributor

Hi @NielsRogge ,

Thanks for your update. We have created an organization account on the hub:

https://huggingface.co/MCG-NJU

You can use this organization for storing our model checkpoints. BTW, you could also include our other repos, such as AdaMixer and MixFormer.

Best,
Limin

@wanglimin
Contributor

@NielsRogge Any update?

@NielsRogge
Author

Hi @wanglimin,

The model will soon be added to the library. I'll transfer the weights to the MCG-NJU organization today.

Are you interested in collaborating on a script for easy fine-tuning?
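
As a starting point, here's a rough sketch of what such a script could look like with the Trainer API (the dataset below is a dummy placeholder, and the checkpoint id assumes the transferred weights; a real script would decode video files and sample 16 frames per clip):

import torch
from torch.utils.data import Dataset
from transformers import VideoMAEForVideoClassification, Trainer, TrainingArguments

class DummyVideoDataset(Dataset):
    """Placeholder: a real dataset would decode videos and sample 16 frames per clip."""
    def __len__(self):
        return 8

    def __getitem__(self, idx):
        return {
            "pixel_values": torch.randn(16, 3, 224, 224),  # (frames, channels, height, width)
            "labels": torch.tensor(idx % 2),
        }

# start from the pre-trained encoder; the classification head is newly initialized
model = VideoMAEForVideoClassification.from_pretrained("MCG-NJU/videomae-base", num_labels=2)

args = TrainingArguments(
    output_dir="videomae-finetuned",
    per_device_train_batch_size=2,
    learning_rate=5e-5,
    num_train_epochs=1,
)

trainer = Trainer(model=model, args=args, train_dataset=DummyVideoDataset())
trainer.train()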

@NielsRogge
Author

So far I've transferred 3 models: https://huggingface.co/models?other=videomae.

To keep the model names from getting too long, I would use the following names:

model_names = [
    # Kinetics-400 checkpoints (short = pretrained only for 800 epochs instead of 1600)
    "videomae-base-short",
    "videomae-base-short-finetuned-kinetics",
    "videomae-base",
    "videomae-base-finetuned-kinetics",
    "videomae-large",
    "videomae-large-finetuned-kinetics",
    # Something-Something-v2 checkpoints (short = pretrained only for 800 epochs instead of 2400)
    "videomae-base-short-ssv2",
    "videomae-base-short-finetuned-ssv2",
    "videomae-base-ssv2",
    "videomae-base-finetuned-ssv2",
]

Is that ok for you? Also, are you interested in adding model cards to the repos on the hub? Each model has its own git repo, and the model card is just a README (Markdown file).

@NielsRogge
Author

Hi @wanglimin,

VideoMAE has been added to the library! https://huggingface.co/docs/transformers/main/en/model_doc/videomae

Checkpoints are on the hub: https://huggingface.co/models?other=videomae
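
For reference, a minimal inference sketch with one of the fine-tuned checkpoints (random frames stand in for a real decoded clip; the processor class name is the current one in Transformers):

import numpy as np
import torch
from transformers import VideoMAEImageProcessor, VideoMAEForVideoClassification

# 16 random frames standing in for a real, decoded video clip
video = list(np.random.randint(0, 256, (16, 3, 224, 224), dtype=np.uint8))

processor = VideoMAEImageProcessor.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics")
model = VideoMAEForVideoClassification.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics")

inputs = processor(video, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

print(model.config.id2label[logits.argmax(-1).item()])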

@yztongzhan
Collaborator

Hi @NielsRogge! Thanks again for your efforts! We will add these links to the README.

@wanglimin
Contributor

@NielsRogge, thanks a lot for your help!
