
Group k-Sparse Temporal Convolutional Neural Networks: Pre-training for Video Classification



Python source code for reproducing the experiments described in the paper

Paper (IEEE Xplore)

The code is mostly self-explanatory through its file, variable, and function names; the more complex lines are commented.
It is designed to require minimal setup, relying on Keras and sacred integration and on code reuse wherever possible.

Installing dependencies

Installing Python 3.7.9 on Ubuntu 20.04.2 LTS:

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install python3.7

Installing CUDA 10.0:

sudo bash cuda_10.0.130_410.48_linux --override
echo 'export PATH=/usr/local/cuda-10.0/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
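After sourcing ~/.bashrc, it can be worth confirming that both CUDA directories actually ended up in the environment. The following is a minimal sketch using only the standard library; the helper name missing_entries is hypothetical, and the two directories are the ones exported above.

```python
import os

# Directories exported in the CUDA 10.0 step above.
REQUIRED = {
    "PATH": ["/usr/local/cuda-10.0/bin"],
    "LD_LIBRARY_PATH": ["/usr/local/cuda-10.0/lib64"],
}


def missing_entries(env_value, required, sep=os.pathsep):
    """Return the required directories that do not appear in a PATH-style string."""
    # Ignore empty segments and trailing slashes so "/a/b/" matches "/a/b".
    entries = [entry.rstrip("/") for entry in env_value.split(sep) if entry]
    return [path for path in required if path.rstrip("/") not in entries]


def check_environment(environ=os.environ):
    """Map each variable name to the required directories it is missing."""
    return {
        name: missing_entries(environ.get(name, ""), required)
        for name, required in REQUIRED.items()
    }


if __name__ == "__main__":
    for name, missing in check_environment().items():
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print("{}: {}".format(name, status))
```

An empty list for both variables suggests the exports took effect in the current shell; it does not verify that the CUDA files themselves are installed.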

Installing cuDNN 7.6.5:

# if the link is broken, log in and download from NVIDIA:
tar -xvzf cudnn-10.0-linux-x64-v7.6.5.32.tgz
sudo cp -r cuda/include/* /usr/local/cuda-10.0/include/
sudo cp -r cuda/lib64/* /usr/local/cuda-10.0/lib64/

Installing Python packages with pip:

python3.7 -m pip install h5py==2.10.0 ipython==7.16.1 keras==2.2.4 matplotlib==3.3.2 numpy==1.19.2 pillow==8.1.0 pywavelets==1.1.1 sacred==0.8.2 scikit-learn==0.23.2 scipy==1.5.2 tensorflow-gpu==1.14.0 tqdm==4.56.0
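Because the experiments depend on these exact versions, a quick comparison of the pins against the installed packages can catch a mismatched environment early. The sketch below is illustrative only: the helper names (parse_pins, installed_version, find_mismatches) are hypothetical, and the PINS string is a shortened copy of the pip command above.

```python
# A subset of the pins from the pip command above; extend as needed.
PINS = "h5py==2.10.0 keras==2.2.4 numpy==1.19.2 sacred==0.8.2 tensorflow-gpu==1.14.0"


def parse_pins(pin_string):
    """Turn 'name==version name==version ...' into a {name: version} dict."""
    pins = {}
    for item in pin_string.split():
        name, _, version = item.partition("==")
        pins[name.lower()] = version
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        from importlib import metadata  # standard library on Python 3.8+
    except ImportError:
        metadata = None
    if metadata is not None:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            return None
    import pkg_resources  # fallback for Python 3.7, provided by setuptools
    try:
        return pkg_resources.get_distribution(name).version
    except pkg_resources.DistributionNotFound:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (wanted, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in sorted(find_mismatches(parse_pins(PINS)).items()):
        print("{}: expected {}, found {}".format(name, wanted, found))
```

Run it with the same interpreter used for the experiments (python3.7); no output means every listed pin matches.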

Running the code

Reproduction should be as easy as executing this in the root folder (after installing all dependencies):

python3.7 -m IPython experiments/ with groupwtacrnn nospatial seed=123

In general:

python3.7 -m IPython experiments/ with algorithm optional_config seed=number

where dataset is one of:

  • mnistrotated : the Rotated MNIST video set, artificially generated by rotating and picking the top left corner,
  • cifar10scanned : the Scanned CIFAR-10 video set, artificially generated by sliding a window,
  • coil100 : the COIL-100 natural video set, recorded by placing objects on a turntable,
  • necanimal : the NEC Animal natural video set, recorded by placing animal figures on a turntable;

algorithm is one of:

  • wtacnn : Winner-Take-All (WTA) Time Distributed CNN Autoencoder,
  • wtacrnn : Winner-Take-All (WTA) Recurrent CNN Autoencoder,
  • randominitcnn : Glorot Initialized Time Distributed CNN,
  • randominitcrnn : Glorot Initialized Recurrent CNN,
  • denoisingcnn : Denoising Time Distributed CNN Autoencoder,
  • denoisingcrnn : Denoising Recurrent CNN Autoencoder,
  • vgg19 : ImageNet Pretrained Time Distributed VGG19,
  • groupwtacnn : Group k-Sparse Time Distributed CNN Autoencoder,
  • groupwtacrnn : Group k-Sparse Recurrent CNN Autoencoder;

and optional_config is either nothing (both spatial and lifetime sparsity enabled by default), or:

  • nospatial : disable spatial sparsity,
  • nolifetime : disable lifetime sparsity.

seed : 123 in all of our experiments; using it should yield numbers very similar to those in the table of our paper.
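To run several algorithm and configuration combinations in a row, the command line above can be assembled programmatically. This is a minimal sketch, not part of the repository: build_command and run_experiment are hypothetical helpers, the experiment path is left for the caller to supply, and the sketch assumes that at most one optional config is passed at a time.

```python
import subprocess

# Names taken from the lists above.
ALGORITHMS = (
    "wtacnn", "wtacrnn", "randominitcnn", "randominitcrnn", "denoisingcnn",
    "denoisingcrnn", "vgg19", "groupwtacnn", "groupwtacrnn",
)
OPTIONAL_CONFIGS = ("nospatial", "nolifetime")


def build_command(experiment_path, algorithm, optional_config=None, seed=123):
    """Build the argument list for one run, validating the names first."""
    if algorithm not in ALGORITHMS:
        raise ValueError("unknown algorithm: {!r}".format(algorithm))
    if optional_config is not None and optional_config not in OPTIONAL_CONFIGS:
        raise ValueError("unknown optional config: {!r}".format(optional_config))
    command = ["python3.7", "-m", "IPython", experiment_path, "with", algorithm]
    if optional_config is not None:
        command.append(optional_config)
    command.append("seed={}".format(seed))
    return command


def run_experiment(experiment_path, algorithm, optional_config=None, seed=123):
    """Launch one run from the root folder and raise if it exits with an error."""
    command = build_command(experiment_path, algorithm, optional_config, seed)
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting issues, and validating the names up front fails fast on a typo instead of after the interpreter has started.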

Directory and file structure:

        : base class, the original WTA autoencoder baseline method
        : subclass, WTA with recurrent connections
        : subclass, no pretraining baseline method
        : subclass, no pretraining with recurrent connections
        : subclass, input dropout autoencoder baseline method
        : subclass, input dropout with recurrent connections
        : subclass, imagenet pretraining baseline method
        : subclass, our group k-sparse autoencoder
        : subclass, our group k-sparse autoencoder with recurrent connections
     : base class, loads Rotated MNIST data set and generates given number of labeled samples
     : subclass, same but for Scanned CIFAR-10
     : subclass, same but for COIL-100
     : subclass, same but for NEC Animal
           : config file for hyperparameters, loads Rotated MNIST data set and an algorithm, conducts experiment
           : same, but for Scanned CIFAR-10
           : same, but for COIL-100
           : same, but for NEC Animal
results/ : experimental results will be saved to this directory with sacred package
utils/ : custom Keras layer classes, including
                       ConvMinimalRNN2D : the convolutional minimal recurrent layer
        : custom Keras/Tensorflow operations, including
                    n_p : p-norm computation
                    group_norms : grouped p-norm computation
                    ksparse : top-k masking activation function
                    group_ksparse : our grouped top-k masking activation function
        : functions for backwards compatibility for saving all kinds of figures
        : functions for saving video frame figures
        : functions for ZCA whitening
        : additional things, including
                    VideoSequence : Keras Sequence subclass generating random videos
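The p-norm, grouped p-norm, top-k masking and grouped top-k masking operations listed above can be illustrated on plain Python lists. The sketch below is not the repository's Keras/TensorFlow code: the function names p_norm, grouped_p_norms, k_sparse and group_k_sparse are hypothetical, and it assumes that groups are consecutive, equally sized runs of units and that the grouped variant keeps the k groups with the largest p-norm. The actual operations in utils/ may differ in axes, grouping and tie handling.

```python
def p_norm(values, p=2.0):
    """p-norm of a flat list of numbers."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def grouped_p_norms(values, group_size, p=2.0):
    """p-norm of each consecutive group of `group_size` units (assumed grouping)."""
    if len(values) % group_size != 0:
        raise ValueError("length must be a multiple of group_size")
    return [
        p_norm(values[start:start + group_size], p)
        for start in range(0, len(values), group_size)
    ]


def top_k_indices(scores, k):
    """Indices of the k largest scores; ties go to the earlier index."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def k_sparse(values, k):
    """Keep the k largest activations and set every other unit to zero."""
    keep = top_k_indices(values, k)
    return [v if i in keep else 0.0 for i, v in enumerate(values)]


def group_k_sparse(values, group_size, k, p=2.0):
    """Keep the k groups with the largest p-norm and zero all other groups."""
    keep = top_k_indices(grouped_p_norms(values, group_size, p), k)
    return [v if i // group_size in keep else 0.0 for i, v in enumerate(values)]


activations = [3.0, 4.0, 1.0, 0.0, 0.0, 2.0]
print(k_sparse(activations, k=2))                      # keeps the units 3.0 and 4.0
print(group_k_sparse(activations, group_size=2, k=1))  # keeps the first group only
```

The difference shows up when one group holds several moderately large units: plain top-k masking selects individual units regardless of where they sit, while the grouped variant keeps or drops whole groups together.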


  title={Group k-sparse temporal convolutional neural networks: unsupervised pretraining for video classification},
  author={Milacski, Zolt{\'a}n {\'A} and P{\'o}czos, Barnab{\'a}s and L{\H{o}}rincz, Andr{\'a}s},
  booktitle={2019 International Joint Conference on Neural Networks (IJCNN)},


In case of any questions, feel free to create an issue here on GitHub, or mail me at


