
Roadmap of MMAction2 #19

Open
hellock opened this issue Jul 13, 2020 · 41 comments
Labels
good first issue Good for newcomers help wanted Extra attention is needed

Comments

@hellock
Member

hellock commented Jul 13, 2020

We keep this issue open to collect feature requests from users and hear your voice. Our monthly release plan is also available here.

You can either:

  1. Suggest a new feature by leaving a comment.
  2. Vote for a feature request with 👍 or against it with 👎. (Remember that developers are busy and cannot respond to every feature request, so vote for the ones you care about most!)
  3. Tell us that you would like to help implement one of the features on the list or review the PRs. (This is the greatest thing we could hear!)
@hellock hellock pinned this issue Jul 13, 2020
@hellock hellock added good first issue Good for newcomers help wanted Extra attention is needed labels Jul 13, 2020
@d-li14

d-li14 commented Jul 14, 2020

I suppose it would be interesting to add CSN and X3D by FAIR into the supported model family.
I also have an interest in helping implement/review them if time permits.

@hellock
Member Author

hellock commented Jul 14, 2020

> I suppose it would be interesting to add CSN and X3D by FAIR into the supported model family.
> I also have an interest in helping implement/review them if time permits.

CSN is in the plan for the next release. It would be great if you would like to help with the implementation of X3D.

@Amazingren

I strongly recommend adding support for the FineGym99 dataset with the video dataset_type; it would make it more convenient for users to validate ideas on fine-grained action recognition or localization tasks. I hope this comes true in the not-so-distant future!

@irvingzhang0512
Contributor

irvingzhang0512 commented Jul 16, 2020

It would be nice if MMAction2 could support the AVA dataset and spatio-temporal action detection models.

@q5390498

It would be nice if MMAction2 could provide some pretrained backbone models for users, such as ResNet3dSlowFast.

@hellock
Member Author

hellock commented Jul 21, 2020

> It would be nice if MMAction2 could support the AVA dataset and spatio-temporal action detection models.

Yes, it is in the plan.

@hellock
Member Author

hellock commented Jul 21, 2020

> It would be nice if MMAction2 could provide some pretrained backbone models for users, such as ResNet3dSlowFast.

There are already lots of pretrained models in the model zoo.

@innerlee innerlee mentioned this issue Jul 27, 2020
@IDayday

IDayday commented Jul 27, 2020

It would be better if the model could output results in a video format such as MP4. I have tried demo.py; it only returns text.

@dreamerlin
Collaborator

demo.py now supports outputting results in video and GIF formats.

@innerlee
Contributor

innerlee commented Aug 3, 2020

@dreamerlin could you please sort all feature requests into one grand post here, so that we can easily track their status? 🏃

@tianyuan168326

Introducing the Multigrid or mixed-precision training strategies would be helpful for faster prototype iteration.

@JJBOY

JJBOY commented Aug 8, 2020

For the action localization task, you provide code to compute the AUC metric for action proposal evaluation.
Could you also provide the classification results needed to compute mAP?

@IDayday

IDayday commented Aug 31, 2020

Can it be used to recognize actions in real time from a webcam or a similar source? Thanks.

@makecent
Contributor

makecent commented Oct 1, 2020

There are many trained models in the Model Zoo, but all of them are only used to test the performance of the proposed works. Do you plan to make them available for backbone pre-training? Say I want to use an I3D pre-trained on Kinetics-400 as the pre-trained backbone of my own model. It seems we don't have many choices of pre-trained backbones besides a ResNet-50 on ImageNet.

@dreamerlin
Collaborator

dreamerlin commented Oct 1, 2020

> There are many trained models in the Model Zoo, but all of them are only used to test the performance of the proposed works. Do you plan to make them available for backbone pre-training? Say I want to use an I3D pre-trained on Kinetics-400 as the pre-trained backbone of my own model. It seems we don't have many choices of pre-trained backbones besides a ResNet-50 on ImageNet.

To use a pre-trained model for the whole network, add the link to the pre-trained model as load_from in the new config. See Tutorial 1: Finetuning Models # Use Pre-Trained Model and the example. To use backbone pre-training only, change the pretrained value in the backbone dict; the unexpected keys will be ignored.
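As a minimal, hypothetical config sketch (the checkpoint URL and exact field values below are placeholders, not real MMAction2 links), the two options could look like:

```python
# Option 1: initialise the *whole* network from a checkpoint via `load_from`.
# Keys that do not match (e.g. a head with a different number of classes)
# are ignored by the checkpoint loader.
load_from = 'https://example.com/checkpoints/i3d_k400.pth'  # placeholder URL

# Option 2: initialise only the backbone by pointing its `pretrained` field
# at a checkpoint; the head is then trained from scratch.
model = dict(
    type='Recognizer3D',
    backbone=dict(
        type='ResNet3d',
        depth=50,
        pretrained='https://example.com/checkpoints/i3d_k400.pth',  # placeholder
    ),
    cls_head=dict(type='I3DHead', num_classes=10, in_channels=2048),
)
```

Either way, the checkpoint only initialises the matching weights; anything your new model adds or changes is trained as usual.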

@makecent
Contributor

makecent commented Oct 2, 2020

> There are many trained models in the Model Zoo, but all of them are only used to test the performance of the proposed works. Do you plan to make them available for backbone pre-training? Say I want to use an I3D pre-trained on Kinetics-400 as the pre-trained backbone of my own model. It seems we don't have many choices of pre-trained backbones besides a ResNet-50 on ImageNet.

> To use a pre-trained model for the whole network, add the link to the pre-trained model as load_from in the new config. See Tutorial 1: Finetuning Models # Use Pre-Trained Model and the example. To use backbone pre-training only, change the pretrained value in the backbone dict; the unexpected keys will be ignored.

Wow! Fantastic! I think you could mention this feature somewhere in the docs, since others like me may not know that they can directly use the pre-trained weights of a whole model for the backbone.

@vikizhao156

Could you please support X3D?

@dreamerlin
Collaborator

> Could you please support X3D?

Here are the X3D config files: https://github.com/open-mmlab/mmaction2/tree/master/configs/recognition/x3d

@ahkarami

Could you please add video action/activity temporal segmentation models?

@ahkarami

Also, could you please add video models for the MovieNet dataset?

@mikeyEcology

Hi, I'm struggling to train a model using a dataset structured like the AVA dataset. Does anyone have a config file that they have used for this type of dataset and would be willing to share? There is code to create an AVA-style dataset, but I haven't been able to find any config files. Otherwise, is there a different framework I can train with when I have bounding boxes in the training data?
Thank you

@innerlee innerlee mentioned this issue Dec 14, 2020
@wwdok
Contributor

wwdok commented Dec 14, 2020

Recently I learned about action localization/detection/segmentation (they seem to be the same thing). It can apparently generate a caption-like file, which I found very interesting and practical. I would really appreciate it if MMAction2 could provide an action localization demo and more docs about it, thanks!

@irvingzhang0512
Contributor

irvingzhang0512 commented Dec 18, 2020

Very happy to have spatio-temporal action detection models today... Two related features would be very helpful:

  1. A spatio-temporal action detection online/video demo.
  2. Training spatio-temporal action detection models with custom categories (e.g. choosing sit/stand/lie and ignoring all other categories).
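A rough sketch of how feature 2 might work, assuming a name-to-id label map (the helper function and category ids below are illustrative, not AVA's real label file or any existing MMAction2 API):

```python
def filter_label_map(label_map, keep):
    """Keep only the wanted categories and remap them to contiguous
    ids starting from 1, as a detection head would expect."""
    kept = sorted(name for name in label_map if name in keep)
    return {name: new_id for new_id, name in enumerate(kept, start=1)}

# Illustrative ids, not the real AVA label file.
ava_like = {'lie': 8, 'sit': 11, 'stand': 12, 'walk': 14}
custom = filter_label_map(ava_like, keep={'sit', 'stand', 'lie'})
# custom == {'lie': 1, 'sit': 2, 'stand': 3}
```

The dataset's annotation loader would then drop boxes whose labels fall outside the kept set and translate the remaining ids through this map.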

@F9393

F9393 commented Dec 22, 2020

Do you have a plan to add optical-flow models for TSN and I3D?

@jin-s13
Contributor

jin-s13 commented Jan 6, 2021

How about adding some models for temporal action segmentation?

@jayleicn

jayleicn commented Jan 15, 2021

Thanks for the great repo! Do you have plans to add S3D and S3D-G from https://arxiv.org/abs/1712.04851? They achieve better performance than the I3D model while running much faster. Here is a reproduced implementation of the S3D model: https://github.com/kylemin/S3D. And for the S3D-G model: https://github.com/antoine77340/S3D_HowTo100M/blob/master/s3dg.py, https://github.com/tensorflow/models/blob/master/research/slim/nets/s3dg.py

@sijun-zhou

sijun-zhou commented Feb 24, 2021

Thanks in advance for this great, steadily progressing repo.

Recently, I saw that in the AVA-Kinetics challenge, the new method 'Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization' achieved very good performance, leading the runner-up by nearly 6 percent in the 2020 competition. I think it is a good candidate to enrich the spatio-temporal action localization area of MMAction2.

Will you consider including this network?
I have also opened a request in #641

@tianxianhao

Could you please add the algorithm proposed in the paper introducing the AVA dataset [1]? It would be a helpful comparison baseline for spatio-temporal action localization experiments on AVA. The model consists of a Faster R-CNN and an I3D.

Reference:
[1] Gu C, Sun C, Ross D A, et al. Ava: A video dataset of spatio-temporally localized atomic visual actions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 6047-6056.

@f-guitart

Is there any plan or ongoing work for multi-modal action classification?


@irvingzhang0512
Contributor

Maybe MMAction2 could support some of the models and datasets from PyTorchVideo.

@SubarnaTripathi

SubarnaTripathi commented May 14, 2021

Do you plan to support the Action Genome dataset and model?

@rlleshi
Contributor

rlleshi commented May 28, 2021

Add output predictions as JSON in long_video_demo.py (currently, only video output is supported). #862

I have implemented this but need to polish it so that it's clean and consistent with the rest of the codebase. Will open a PR in the future.
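A minimal sketch of what such JSON output could look like (the helper and field names below are illustrative, not the actual PR or MMAction2's real output schema):

```python
import json

def dump_predictions(predictions, out_path):
    """predictions: list of (timestamp_sec, label, score) tuples
    collected while stepping through a long video."""
    records = [
        {'timestamp': ts, 'label': label, 'score': round(float(score), 4)}
        for ts, label, score in predictions
    ]
    with open(out_path, 'w') as f:
        json.dump(records, f, indent=2)
    return records

# Example: two recognized segments written as machine-readable JSON.
preds = [(0.0, 'drinking', 0.91), (2.5, 'eating', 0.83)]
records = dump_predictions(preds, 'long_video_preds.json')
```

A JSON file like this is easy to post-process, unlike burned-in text on an output video.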

@Deep-learning999

Hope to have Kinetics-TPS, FineAction, and MultiSports dataset support, pre-trained models, training, and a web video inference demo.

@Deep-learning999

I hope to use PoseC3D to realize skeleton-based spatio-temporal action detection.

@connor-john

Add demo scripts for temporal action detection models.

This was mentioned in #746; is there any progress?

@kennymckormick kennymckormick unpinned this issue Feb 15, 2022
@kennymckormick kennymckormick pinned this issue Feb 15, 2022
@baigwadood

Hope to have a webcam demo for PoseC3D in the near future.

@abdulazizab2

Do you plan to add new models for spatio-temporal action detection?

ACRN (Actor-Centric Relation Network) is great. However, ACAR builds on that previous work and achieves better results.

@MooresS

MooresS commented Nov 17, 2022

I would appreciate it if you could add ViViT, because I feel there are few transformer-based methods for action recognition in MMAction2.

@zsz00

zsz00 commented Dec 27, 2022

Hope to have two-stream dataset support, such as in SlowFast: RGB for the slow pathway and optical flow for the fast pathway. The same could apply to using RGB and optical flow in TSM.

@cir7 cir7 unpinned this issue May 30, 2023
@Wrc0217

Wrc0217 commented May 31, 2023

How can I run the BSN model on my own dataset? Are there concrete steps, and what do I need to do? Thanks!
