Action Recognition [no longer maintained]

This repo studies the problem of action recognition (recognizing actions in videos) on the well-known UCF101 dataset.

Here, I reimplemented the two-stream approach for action recognition using pre-trained Xception networks for both streams (see the references).

Live demo on Colab

Just clone the Live_Demo_Two_steam_net.ipynb notebook to your Drive and run the cells on Google Colab. A video similar to the demo GIF will be generated.

Get started:

A full demo of the code in the repo can be found in the Action_Recognition_Walkthrough.ipynb notebook.

Please clone the Action_Recognition_Walkthrough.ipynb notebook to your Drive account and run it on Google Colab on a Python 3 GPU-enabled instance.

Environment and requirements:

This code requires Python 3.6 and the following packages:

TensorFlow 1.11.0 (GPU-enabled; the code uses the Keras API bundled with TensorFlow)
imgaug 0.2.6
OpenCV 3.4.2.17
NumPy 1.14.1

All of these requirements are satisfied by a Python 3 GPU-enabled Colab instance. Just use one, and the Action_Recognition_Walkthrough.ipynb notebook will install the rest :)

Dataset:

I used the UCF101 dataset, originally found here.

A preprocessed version of the dataset is also published by feichtenhofer/twostreamfusion:

RGB images (single zip file split into three parts)

wget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_jpegs_256.zip.001
wget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_jpegs_256.zip.002
wget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_jpegs_256.zip.003

Optical flow u/v frames (single zip file split into three parts)

wget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_tvl1_flow.zip.001
wget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_tvl1_flow.zip.002
wget http://ftp.tugraz.at/pub/feichtenhofer/tsfusion/data/ucf101_tvl1_flow.zip.003
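
Each archive above arrives as three numbered parts, so the parts have to be joined before unzipping. The following is a minimal sketch of that step, assuming the parts are plain byte-wise splits of one zip file (which the "single zip file split into three parts" description suggests but does not guarantee). The helper name `join_parts` is hypothetical and is not part of this repo.

```python
import shutil
from pathlib import Path


def join_parts(prefix, out_path, directory="."):
    """Concatenate prefix.001, prefix.002, ... (in order) into out_path.

    Assumption: the numbered files are raw byte splits of a single archive.
    Returns the names of the parts that were joined, in order.
    """
    # Match only three-digit suffixes so the joined output is never re-read.
    parts = sorted(Path(directory).glob(prefix + ".[0-9][0-9][0-9]"))
    if not parts:
        raise FileNotFoundError("no parts found for %r in %r" % (prefix, directory))
    with open(str(out_path), "wb") as out:
        for part in parts:
            with open(str(part), "rb") as src:
                shutil.copyfileobj(src, out)  # stream to avoid loading GBs in RAM
    return [p.name for p in parts]


# Possible usage once the downloads finish (paths are illustrative):
#   join_parts("ucf101_jpegs_256.zip", "ucf101_jpegs_256.zip")
#   import zipfile; zipfile.ZipFile("ucf101_jpegs_256.zip").extractall("data")
```

Copying in chunks with `shutil.copyfileobj` keeps memory use flat, which matters because the joined archives are large.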

Code Features:

You can easily switch between a variety of models.
Saves checkpoints at regular intervals and synchronizes them to Google Drive using the Drive API, which means you can resume training anywhere, on any Google Colab instance.
Accesses the public models on my Drive, so you can resume and fine-tune them at different time stamps. Every checkpoint is named EPOCH-BEST_TOP_1_ACC-CURRENT_TOP_1_ACC; for example, for 300-0.84298-0.84166.zip in the folder heavy-mot-xception-adam-1e-05-imnet:
- epoch = 300
- the best top-1 accuracy so far was 0.84298 (obtained at a checkpoint before epoch 300)
- the current top-1 accuracy is 0.84166
- the experiment is heavy-mot-xception-adam-1e-05-imnet
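
To make the checkpoint naming concrete, here is a small sketch that splits a checkpoint file name into its three fields. It follows the hyphen-separated example 300-0.84298-0.84166.zip given above; the `Checkpoint` tuple and `parse_checkpoint_name` are hypothetical names for illustration, not functions from this repo.

```python
from collections import namedtuple

# Hypothetical container for the three fields encoded in a checkpoint name.
Checkpoint = namedtuple("Checkpoint", ["epoch", "best_top1", "current_top1"])


def parse_checkpoint_name(filename):
    """Parse 'EPOCH-BEST_TOP_1_ACC-CURRENT_TOP_1_ACC.zip' into a Checkpoint."""
    stem = filename[:-len(".zip")] if filename.endswith(".zip") else filename
    # Assumption: exactly three hyphen-separated fields, as in the example.
    epoch, best, current = stem.split("-")
    return Checkpoint(int(epoch), float(best), float(current))


# Possible usage: pick the most recent checkpoint from a list of names.
#   latest = max(names, key=lambda n: parse_checkpoint_name(n).epoch)
```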

Models:

I used the ImageNet pre-trained models provided by Keras Applications here.

The best results are obtained with the Xception architecture.

Network	Top1-Acc
Spatial VGG19 stream	~75%
Spatial ResNet50 stream	81.2%
Spatial Xception stream	86.04%
------------------------	-------
Motion ResNet50 stream	~75%
Motion Xception stream	84.4%
------------------------	-------
Average fusion	91.25%
------------------------	-------
Recurrent network fusion	91.7%
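
As a rough illustration of the "Average fusion" row, the sketch below averages the class probabilities of the spatial and motion streams for each video and then measures top-1 accuracy. This is an assumption about how the averaging is done: the table does not say whether probabilities or raw scores are averaged, and the function names are hypothetical. It uses only the standard library for portability; real code would more likely operate on NumPy arrays.

```python
import math


def softmax(logits):
    """Softmax of one score vector, shifted by its maximum for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_fusion(spatial_logits, motion_logits):
    """Average the two streams' class probabilities for one video."""
    spatial = softmax(spatial_logits)
    motion = softmax(motion_logits)
    return [(s + m) / 2.0 for s, m in zip(spatial, motion)]


def top1_accuracy(fused_scores, labels):
    """Fraction of videos whose highest fused score is the true class."""
    hits = sum(
        1
        for scores, label in zip(fused_scores, labels)
        if max(range(len(scores)), key=scores.__getitem__) == label
    )
    return hits / len(labels)
```

Because the two streams see different inputs (RGB frames and optical flow), their errors are only partly correlated, which is one plausible reason the fused accuracy in the table exceeds either stream alone.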

Pre-trained Model

All the pre-trained models can be found here.

It's the same Drive folder that the code accesses when training and when resuming training from a checkpoint.

Reference Papers:

Nice implementations of two-stream approach:

[1] Nice two-stream reimplementation in PyTorch using ResNets. My code is inspired by this repo.
[2] Two-stream-pytorch
[3] Hidden-Two-Stream

Future directions:

[1] Hidden-Two-Stream, which achieves real-time performance by using a deep neural network to generate the optical flow.
[2] Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? Discusses how well 3D convolutions suit video and how Kinetics pre-training could retrace the success of ImageNet pre-training.
[3] Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

Useful links:

[1] awesome-action-recognition

mohammed-elkomy/two-stream-action-recognition