Audio Question Answering (AQA)

PyTorch code accompanying our Interspeech 2023 paper:

Multi-Scale Attention for Audio Question Answering [arXiv]

Requirements

Python 3.6+
PyTorch 1.6.0
tensorboardX
ffmpeg

Usage

Clone this repo
```
git clone https://github.com/GeWu-Lab/MWAFM.git
```
Download data

Clotho-AQA and AQA-MUSIC-AVQA

Data pre-processing

We follow exactly the same data format as MUSIC-AVQA.

Note: We examined the original annotation files of Clotho-AQA and found that the official open-source annotations were not cleaned, so different annotators sometimes provided different answers to the same question. We therefore applied a simple filter: a question is considered to have a correct answer if at least two of its answers are identical. This filtering yields a new and more accurate annotation file. The files in the 'metadata' folder are described as follows:
- 'single_word_[train/val/test].csv': Does not contain samples with yes/no answers.
- 'single_word_[train/val/test]_clean.csv': Does not contain samples with yes/no answers. (Cleaned data)
- 'clotho_aqa_[train/val/test]_clean.csv': Contains samples with yes/no answers. (Cleaned data)
- 'binary_[train/val/test]_clean.csv': Contains only samples with yes/no answers. (Cleaned data)
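
The agreement filter described above, together with the yes/no split reflected in the file names, can be sketched roughly as follows. This is an illustrative sketch, not the script used to produce the files in 'metadata': the column names (`file_name`, `QuestionText`, `answer`), the lowercase normalization of answers, and the function names `filter_by_agreement` and `split_binary` are all assumptions made for the example.

```python
from collections import Counter, defaultdict

YES_NO = {"yes", "no"}


def filter_by_agreement(rows, min_agree=2):
    """Keep one row per (audio file, question) pair whose most frequent
    answer was given by at least `min_agree` annotators.

    Column names are assumed, not taken from the repository.
    """
    answers = defaultdict(list)
    for row in rows:
        key = (row["file_name"], row["QuestionText"])
        # Assumption: answers are compared case-insensitively.
        answers[key].append(row["answer"].strip().lower())

    cleaned = []
    for (file_name, question), given in answers.items():
        answer, count = Counter(given).most_common(1)[0]
        if count >= min_agree:
            cleaned.append(
                {"file_name": file_name, "QuestionText": question, "answer": answer}
            )
    return cleaned


def split_binary(rows):
    """Split cleaned rows into yes/no samples and single-word samples."""
    binary = [r for r in rows if r["answer"] in YES_NO]
    single_word = [r for r in rows if r["answer"] not in YES_NO]
    return binary, single_word


# Toy annotations: three annotators per question (made-up data).
rows = [
    {"file_name": "a.wav", "QuestionText": "Is it raining?", "answer": "yes"},
    {"file_name": "a.wav", "QuestionText": "Is it raining?", "answer": "yes"},
    {"file_name": "a.wav", "QuestionText": "Is it raining?", "answer": "no"},
    {"file_name": "b.wav", "QuestionText": "What is heard?", "answer": "rain"},
    {"file_name": "b.wav", "QuestionText": "What is heard?", "answer": "water"},
    {"file_name": "b.wav", "QuestionText": "What is heard?", "answer": "wind"},
    {"file_name": "c.wav", "QuestionText": "Which animal?", "answer": "dog"},
    {"file_name": "c.wav", "QuestionText": "Which animal?", "answer": "Dog"},
    {"file_name": "c.wav", "QuestionText": "Which animal?", "answer": "cat"},
]

cleaned = filter_by_agreement(rows)
binary, single_word = split_binary(cleaned)
```

In this toy example the question for `b.wav` is dropped because no two annotators agree, while the other two questions are kept with their agreed answer.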

Train and evaluate

Training

python main_MWAFM.py --mode train

Testing

python main_MWAFM.py --mode test

Citation

If you find this work useful, please consider citing it.


@ARTICLE{Li2023MultiScale,
  title	= {Multi-Scale Attention for Audio Question Answering},
  author	= {Guangyao Li and Yixin Xu and Di Hu},
  journal	= {Proc. INTERSPEECH},
  year	= {2023},
}

Acknowledgement

This research was supported by Public Computing Cloud, Renmin University of China.
