This repo holds the code for the work presented at ICASSP 2022 [Paper].
We provide a PyTorch implementation for ease of use.
Install the requirements by running the following command:

```shell
pip install -r requirements.txt
```
We highly appreciate @YapengTian for sharing the features and code.
Two kinds of features, visual and audio, are required for the experiments.
- Visual Features: You can download the VGG visual features from here.
- Audio Features: You can download the VGG-like audio features from here.
- Additional Features: You can download the features of the background videos here; these are required for the experiments in the weakly-supervised setting.
After downloading the features, place them in the `data` folder. The structure of the `data` folder is as follows:
```
data
├── audio_features.h5
├── audio_feature_noisy.h5
├── labels.h5
├── labels_noisy.h5
├── mil_labels.h5
├── test_order.h5
├── train_order.h5
├── val_order.h5
├── visual_feature.h5
└── visual_feature_noisy.h5
```
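The feature files above are plain HDF5 containers, so they can be inspected with `h5py` before training. A minimal sketch, assuming a single dataset per file; the key name `"avadataset"` used in the demo is an assumption, so list `f.keys()` on your own files to confirm:

```python
import h5py
import numpy as np

def load_h5_features(path, key=None):
    """Load one feature array from an HDF5 file.

    If key is None, fall back to the first dataset in the file.
    """
    with h5py.File(path, "r") as f:
        if key is None:
            key = list(f.keys())[0]  # inspect the available dataset names
        return np.asarray(f[key])

if __name__ == "__main__":
    # Create a small dummy file just to demonstrate the round trip;
    # the shape here is illustrative, not the real feature shape.
    with h5py.File("dummy_visual_feature.h5", "w") as f:
        f.create_dataset("avadataset",
                         data=np.zeros((2, 10, 7, 7, 512), dtype=np.float32))
    feats = load_h5_features("dummy_visual_feature.h5")
    print(feats.shape)  # (2, 10, 7, 7, 512)
```

This is only a convenience for sanity-checking downloads; the training scripts read the files themselves.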
You can download the AVE dataset from the repo here.
Fully-Supervised Setting

Training

```shell
bash supv_train.sh
# The argument "--snapshot_pref" specifies the path for saving checkpoints and code.
```
Evaluating

```shell
bash supv_test.sh
```
After training, a checkpoint file will be saved whose name contains the test-set accuracy and the epoch number.
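Since the accuracy is embedded in the checkpoint filename, the best model can be picked programmatically. A minimal sketch; the exact naming pattern (e.g. `model_epoch_12_acc_0.7734.pth`) is a hypothetical assumption, so adjust the regex to the files your run actually produces:

```python
import re

def best_checkpoint(filenames):
    """Return (filename, accuracy) with the highest accuracy parsed from the name,
    or None if no filename contains an accuracy."""
    # Assumed pattern: an "acc" token followed by a decimal number.
    pattern = re.compile(r"acc[_-]?([0-9]*\.[0-9]+)")
    scored = []
    for name in filenames:
        m = pattern.search(name)
        if m:
            scored.append((name, float(m.group(1))))
    return max(scored, key=lambda x: x[1]) if scored else None

if __name__ == "__main__":
    ckpts = ["model_epoch_10_acc_0.7512.pth", "model_epoch_12_acc_0.7734.pth"]
    print(best_checkpoint(ckpts))  # ('model_epoch_12_acc_0.7734.pth', 0.7734)
```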
Weakly-Supervised Setting

Training

```shell
bash weak_train.sh
```
Evaluating

```shell
bash weak_test.sh
```
Please cite the following paper if you find this repo useful for your research:
```bibtex
@inproceedings{liu2022bidirectional,
  author    = {Liu, Shuo and Quan, Weize and Liu, Yuan and Yan, Dong-Ming},
  title     = {Bi-Directional Modality Fusion Network for Audio-Visual Event Localization},
  booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year      = {2022},
  doi       = {10.1109/ICASSP43922.2022.9746280}
}
```