Learning Unseen Modality Interaction

Yunhua Zhang, Hazel Doughty, Cees G.M. Snoek


This is the demo code for the video classification task on EPIC-Kitchens, using the RGB and audio modalities.

Demo Code

Environment

  • Python 3.8.5
  • torch 1.12.1+cu113
  • torchaudio 0.12.1+cu113
  • torchvision 0.13.1+cu113
  • mmcv-full 1.7.0
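
To check that an installed environment matches the versions listed above, a minimal sanity check (a sketch only, assuming the packages are importable under these names) is:

```python
# Quick sanity check that the installed packages match the versions listed above.
import torch, torchaudio, torchvision, mmcv

print("torch       :", torch.__version__)        # expect 1.12.1+cu113
print("torchaudio  :", torchaudio.__version__)   # expect 0.12.1+cu113
print("torchvision :", torchvision.__version__)  # expect 0.13.1+cu113
print("mmcv-full   :", mmcv.__version__)         # expect 1.7.0
print("CUDA available:", torch.cuda.is_available())
```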

Dataset

We download the RGB and optical flow frames from the official website of EPIC-Kitchens, and extract the audio files ourselves from the videos with extract_audio.py (a minimal sketch of this step is shown below).
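
The repository's extract_audio.py is the authoritative script; the sketch below only illustrates the idea, assuming ffmpeg is on the PATH and using hypothetical directory names and a hypothetical 24 kHz mono WAV output format.

```python
# Minimal sketch of extracting audio tracks from EPIC-Kitchens videos with ffmpeg.
# extract_audio.py in the repository is the authoritative script; the paths and
# the 24 kHz mono 16-bit PCM settings below are assumptions for illustration.
import subprocess
from pathlib import Path

video_dir = Path("EPIC-KITCHENS/videos")   # hypothetical input directory
audio_dir = Path("EPIC-KITCHENS/audio")    # hypothetical output directory
audio_dir.mkdir(parents=True, exist_ok=True)

for video_path in sorted(video_dir.glob("*.MP4")):
    wav_path = audio_dir / (video_path.stem + ".wav")
    subprocess.run(
        ["ffmpeg", "-i", str(video_path),
         "-vn",                                   # drop the video stream
         "-acodec", "pcm_s16le",                  # 16-bit PCM
         "-ar", "24000", "-ac", "1",              # 24 kHz, mono
         str(wav_path)],
        check=True,
    )
```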

Run Demo

  • We provide the splits for training, validation and testing in the epic-annotations folder.

  • To run the code: python train.py --lr 1e-1 --batch_size 96 --save_name 1e-1 (the command-line flags are sketched after this list).

  • We fine-tuned the model with reduced learning rates, as specified in bash.sh.
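
For reference, the documented flags correspond to a command-line interface along these lines. This is only an illustrative sketch; the actual train.py defines its own (likely larger) argument set.

```python
# Illustrative sketch of the documented command-line flags; the real train.py
# in this repository defines its own argument parser.
import argparse

parser = argparse.ArgumentParser(description="EPIC-Kitchens RGB + audio demo training")
parser.add_argument("--lr", type=float, default=1e-1, help="initial learning rate")
parser.add_argument("--batch_size", type=int, default=96, help="training batch size")
parser.add_argument("--save_name", type=str, default="1e-1", help="tag for saved checkpoints and logs")
args = parser.parse_args()
print(args)
```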

About

This is the official code for the NeurIPS 2023 paper "Learning Unseen Modality Interaction".
