Audio Pair with Difference dataset (APwD dataset)

The APwD dataset consists of pairs of similar sounds, each accompanied by text describing the difference between the two sounds. It was prepared by Daiki Takeuchi and members of NTT CS lab. The dataset is designed for research on introducing auxiliary textual information into content-based audio retrieval. The similar sound pairs are synthesized from two existing audio-tagging datasets, FSD50K and ESC-50, and each difference is described according to the synthesis method used. For details, please refer to the paper [1]. If you use the APwD dataset in your work, please cite this paper, where the dataset was introduced.

[1] Daiki Takeuchi, Yasunori Ohishi, Daisuke Niizumi, Noboru Harada, and Kunio Kashino, "Introducing auxiliary text query-modifier to content-based audio retrieval," in Proc. INTERSPEECH, 2022.

Usage

Preparing FSD50K and ESC-50
Download FSD50K and ESC-50 from the following URLs:
FSD50K: https://zenodo.org/record/4060432
ESC-50: https://github.com/karolpiczak/ESC-50
After downloading, note the directory where each dataset's wav files are saved (these directories are used in the next step).
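To confirm that a directory you noted really contains the audio, a quick check along the following lines may help. This is a minimal sketch and not part of this repository: the function name `count_wav_files` and the example paths are hypothetical.

```python
from pathlib import Path


def count_wav_files(directory):
    """Return the number of .wav files found under `directory`, searched recursively."""
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    # Compare suffixes case-insensitively so that .WAV files are counted too.
    return sum(1 for p in root.rglob("*") if p.is_file() and p.suffix.lower() == ".wav")


if __name__ == "__main__":
    # Hypothetical locations; replace them with the directories you noted.
    for name, path in [("FSD50K", "/path/to/FSD50K"), ("ESC-50", "/path/to/ESC-50")]:
        try:
            print(f"{name}: {count_wav_files(path)} wav files under {path}")
        except NotADirectoryError as err:
            print(f"{name}: {err}")
```

A count of zero usually means the path points at the wrong level of the extracted archive.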
Modifying settings
In utils.py, set the two variables FSD50K and ESC50 to match your environment. They are defined at the beginning of the file and hold the directories of FSD50K and ESC-50, respectively. Enter the directories where you saved the data in the previous step.
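As an illustration only, the two settings might look like the following after editing. The paths are placeholders, and the actual definitions in utils.py may differ in form. The `missing_dirs` helper is hypothetical and is not part of the repository; it is included only to show one way to catch a mistyped path before synthesis.

```python
from pathlib import Path

# Placeholder values: point these at the directories where the wav files were saved.
FSD50K = "/path/to/FSD50K"
ESC50 = "/path/to/ESC-50"


def missing_dirs(**named_dirs):
    """Return the names whose configured path is not an existing directory."""
    return [name for name, path in named_dirs.items() if not Path(path).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(FSD50K=FSD50K, ESC50=ESC50)
    if missing:
        print("Not found, please check:", ", ".join(missing))
    else:
        print("Both dataset directories exist.")
```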
Synthesizing the dataset
Run synthesize_dataset.sh.

License

See the license file (named LISENCE in this repository).

Authors

Daiki Takeuchi
Yasunori Ohishi
Daisuke Niizumi
Noboru Harada
Kunio Kashino
