PyTorch implementation for Convolutional Hierarchical Attention Network for Query-Focused Video Summarization paper, accepted by AAAI 2020 conference.

srkds/CHAN-QFVS-PyTorch-Implementation

Convolutional Hierarchical Attention Network for Query-Focused Video Summarization (CHAN): A PyTorch Implementation

This is a PyTorch implementation of the paper "Convolutional Hierarchical Attention Network for Query-Focused Video Summarization", which was accepted at the AAAI 2020 conference.

Note: This project is still a work in progress.

🎥 Model Details

Figures: Parallel Computing Model; Simple Model Diagram

📑 Dataset

📈 Loss Function and Evaluation Method

📊 Results

Here is the resulting video summary for the query FOOD and HANDS. From a roughly 4-hour video containing diverse scenes (library, mall, driving, shop, etc.), the model generated a summary of about 4 minutes 30 seconds, made up of clips that have either food or hands in frame.

food_hands_2times_epoch_nice_result.mp4
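To make the numbers above concrete, here is a minimal sketch of how per-shot relevance scores could be turned into a summary and its duration. It is an illustration only, not necessarily how this repository builds its output: the function names, the 0.5 threshold, the example scores, and the 5-second shot length are all assumptions that do not come from this README.

```python
from typing import List

SHOT_SECONDS = 5  # assumed shot length; not stated in this README


def select_shots(scores: List[float], threshold: float = 0.5) -> List[int]:
    """Return indices of shots whose query-relevance score meets the threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


def summary_duration(selected: List[int], shot_seconds: int = SHOT_SECONDS) -> str:
    """Format the total length of the selected shots as M:SS."""
    total = len(selected) * shot_seconds
    return f"{total // 60}:{total % 60:02d}"


# Hypothetical per-shot scores for a query such as (FOOD, HANDS).
scores = [0.1, 0.8, 0.65, 0.2, 0.9, 0.4]
selected = select_shots(scores)
print(selected)                    # [1, 2, 4]
print(summary_duration(selected))  # 0:15
```

Under these assumptions, a summary of about 4:30 would correspond to 54 selected shots.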

Installation

Step 1: Install dependencies

pip install -r requirements.txt

Step 2: Run the model

python main.py

Model Settings and Experiment Details

Todo

  • Add self-attention and query-focused global attention.

🙏 Acknowledgement

The implementation and study of this paper are part of my research work under the guidance of Prof. Payal Prajapati.

The evaluation code is borrowed from EgoVLPv2.

The code is inspired by the CHAN implementation: https://github.com/ckczzj/CHAN
