mAQA

This repository contains the data and code for our Interspeech 2023 paper, Towards Multi-Lingual Audio Question Answering.

Data: mClothoAQA

The primary challenge for multi-lingual AQA is obtaining high-quality labeled data. To address it, we built mClothoAQA, a multi-lingual AQA dataset.
We created it by translating the questions and answers of the English ClothoAQA dataset into seven additional languages: French (fr), Hindi (hi), German (de), Spanish (es), Italian (it), Dutch (nl), and Portuguese (pt).
The mClothoAQA dataset contains 1991 audio files, and each language variant consists of 35838 question-answer pairs.

Downloading the Audios

Download the Clotho-AQA dataset from https://zenodo.org/record/6473207.
Extract the zip file and place the audio files in the dataset/audio_files/ directory.
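
The extraction step can also be scripted. The following is a minimal sketch, not part of the repository: the function name extract_audio is hypothetical, and it assumes the audio files in the archive are .wav files that should end up directly inside dataset/audio_files/ without any sub-folders.

```python
import zipfile
from pathlib import Path


def extract_audio(zip_path, out_dir="dataset/audio_files", suffix=".wav"):
    """Copy every audio file in the archive into out_dir, flattening folders.

    The .wav suffix is an assumption; change it if the archive differs.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            name = Path(member.filename).name
            # Skip directory entries and anything that is not an audio file.
            if member.is_dir() or not name.lower().endswith(suffix):
                continue
            target = out / name
            target.write_bytes(archive.read(member))
            written.append(target)
    return written
```

Call it with the path of the downloaded archive, for example extract_audio("path/to/downloaded.zip"), where the path is a placeholder for wherever you saved the file.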

Downloading the Multi-lingual mClothoAQA QA CSV Files

Copy the CSV files for the 8 languages from the mClothoAQA/ directory to the metadata/ directory.
Run split_dataset.py to generate CSV files for binary answers and single-word answers.
The pre-split files are already provided in mClothoAQA/, so you can skip the splitting step.
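
For readers who want to see what such a split could look like, here is a rough sketch. It is not the code in split_dataset.py. The column name "answer" and the yes/no vocabulary are assumptions that hold at best for the English file; the translated files would need their own set of binary answers.

```python
import csv

# Assumed column name and answer vocabulary; adjust to the actual CSV headers
# and to the translated yes/no words of each language.
ANSWER_COLUMN = "answer"
BINARY_ANSWERS = {"yes", "no"}


def split_rows(rows, answer_column=ANSWER_COLUMN, binary_answers=BINARY_ANSWERS):
    """Partition QA rows into (binary, single_word) lists by their answer."""
    binary, single_word = [], []
    for row in rows:
        answer = row[answer_column].strip().lower()
        (binary if answer in binary_answers else single_word).append(row)
    return binary, single_word


def split_csv(in_path, binary_path, single_path):
    """Read one QA CSV and write the two partitions as separate CSV files."""
    with open(in_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        binary, single_word = split_rows(reader)
    for path, rows in ((binary_path, binary), (single_path, single_word)):
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
```

Keeping the row-level logic in split_rows separate from the file handling makes it easy to swap in a different answer vocabulary per language.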

Feature Extraction

Audio

We use the open-source openL3 Python library to compute deep audio embeddings. Install it with pip install openl3.
Run the extract_features.py script.
When it finishes, the audio embeddings are stored in dataset/features.
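
As an illustration of how openL3 embeddings can be computed for a single file, consider the sketch below. It is not the content of extract_features.py: the helper names are hypothetical, and the content_type, input_repr, and embedding_size values, as well as the .npy output format, are assumptions chosen for the example. It also requires the soundfile and numpy packages.

```python
from pathlib import Path


def feature_path(audio_path, feature_dir="dataset/features"):
    """Map an audio file to the .npy file that will hold its embedding."""
    return Path(feature_dir) / (Path(audio_path).stem + ".npy")


def extract_embedding(audio_path, feature_dir="dataset/features"):
    """Compute an openL3 embedding for one file and save it as .npy."""
    # Imported here so the path helper works without the heavy dependencies.
    import numpy as np
    import openl3
    import soundfile as sf

    audio, sample_rate = sf.read(audio_path)
    # Parameter values are illustrative assumptions, not the paper's settings.
    embedding, timestamps = openl3.get_audio_embedding(
        audio,
        sample_rate,
        content_type="env",
        input_repr="mel256",
        embedding_size=512,
    )
    out_path = feature_path(audio_path, feature_dir)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    np.save(out_path, embedding)  # one row per analysis frame
    return out_path
```

The timestamps returned by openL3 are discarded here; keep them if the downstream model needs frame times.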

Text QA

To encode the input question into word embeddings, we use FastText pre-trained word vectors, which are available for 157 languages from the FastText website.
Download the word vectors for the following languages and place them in the dataset/word_embedding/ directory: English (en), French (fr), Hindi (hi), German (de), Spanish (es), Italian (it), Dutch (nl), and Portuguese (pt).
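
To show how a question can be turned into a sequence of word vectors, here is a small sketch that reads the text (.vec) format of the FastText vectors. Whether this repository loads the .vec or the .bin files is not stated, so treat the format as an assumption. The function names, the whitespace tokenization, and the zero vector for unknown words are likewise illustrative choices.

```python
def load_vec(path, limit=None):
    """Read a fastText .vec text file into a {word: [floats]} dictionary.

    The first line holds the vocabulary size and the vector dimension;
    every following line holds a word followed by its vector.
    """
    vectors = {}
    with open(path, encoding="utf-8", newline="\n", errors="ignore") as f:
        count, dim = map(int, f.readline().split())
        for line in f:
            if limit is not None and len(vectors) >= limit:
                break
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return dim, vectors


def embed_question(question, vectors, dim):
    """Look up each whitespace-separated token; unknown words map to zeros."""
    return [vectors.get(tok, [0.0] * dim) for tok in question.lower().split()]
```

The limit argument is useful while experimenting, because the full files are large and a few hundred thousand frequent words already cover most questions.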

Training the model

To train the model, run train.py.
A model checkpoint is saved every 10 epochs. To continue training from a saved checkpoint, assign the checkpoint path to the pre_trained_model_path variable in train.py.
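
The checkpoint schedule and the choice of a checkpoint to resume from can be sketched independently of the training framework. The code below is only an illustration: the epoch_<n> file naming is hypothetical, epochs are assumed to be counted from 1, and train.py may organize its checkpoints differently.

```python
import re
from pathlib import Path

SAVE_EVERY = 10  # checkpoint interval stated above


def should_save(epoch, every=SAVE_EVERY):
    """True on epochs 10, 20, 30, ... when counting epochs from 1."""
    return epoch % every == 0


def latest_checkpoint(checkpoint_dir, pattern=r"epoch_(\d+)"):
    """Return the checkpoint path with the highest epoch number, or None.

    The file-name pattern is a hypothetical convention for this sketch.
    """
    best = None
    for path in Path(checkpoint_dir).glob("*"):
        match = re.search(pattern, path.name)
        if match and (best is None or int(match.group(1)) > best[0]):
            best = (int(match.group(1)), path)
    return best[1] if best else None
```

The path returned by latest_checkpoint is the kind of value you would assign to pre_trained_model_path before restarting training.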

Inference

Once the model is trained, update the model_dir and model_path variables in run_inference.py and run the script.

Acknowledgement

We'd like to acknowledge the AquaNet repository, which served as the basis of our project.
