active-fsl

This repo contains the scripts to generate the SONYC-FSD-SED dataset presented in the paper "Active Few-Shot Learning for Sound Event Detection" (INTERSPEECH 2022).

SONYC-FSD-SED

SONYC-FSD-SED is an open dataset of programmatically mixed audio clips that simulates the audio data of an environmental sound monitoring system, in which sound class occurrences and co-occurrences exhibit seasonal periodic patterns. We use recordings collected by the Sounds of New York City (SONYC) acoustic sensor network as backgrounds and single-labeled clips from the FSD50K dataset as foreground events, and generate 576,591 strongly labeled 10-second soundscapes with Scaper (including 111,294 additional test soundscapes for the sampling-window experiment). Instead of sampling foreground sound events uniformly, we simulate the occurrence probability of each class at different times of the year, which gives the dataset more realistic temporal characteristics.
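To make the idea of non-uniform, time-dependent sampling concrete, here is a toy sketch. It is not the procedure used to build SONYC-FSD-SED: the sinusoidal shape, the `base` and `amplitude` values, and the names `seasonal_prob`, `sample_classes`, and `peak_days` are all assumptions made for illustration. The released occ_prob_per_cl.pkl file holds the actual per-class probabilities.

```python
import math
import random


def seasonal_prob(day_of_year, peak_day, base=0.05, amplitude=0.04, period=365.0):
    """Hypothetical occurrence probability: a sinusoid that peaks at peak_day.

    The shape and constants are illustrative assumptions, not the dataset's values.
    """
    phase = 2.0 * math.pi * (day_of_year - peak_day) / period
    return base + amplitude * math.cos(phase)


def sample_classes(day_of_year, peak_days, rng):
    """Decide independently, per class, whether it occurs on the given day.

    peak_days maps a class name to the (hypothetical) day on which it is most likely.
    """
    return [
        name
        for name, peak in sorted(peak_days.items())
        if rng.random() < seasonal_prob(day_of_year, peak)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    peaks = {"class_a": 30, "class_b": 200}  # made-up class names and peak days
    print(sample_classes(200, peaks, rng))
```

With this kind of schedule, a class is drawn most often near its peak day and least often half a year away, which is the qualitative behaviour the paragraph above describes.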

Due to the large size of the dataset, instead of releasing the raw audio files, we release the source material and soundscape annotations in JAMS format, which can be used to reproduce SONYC-FSD-SED using Scaper.

To reproduce SONYC-FSD-SED:

Download all files from Zenodo.
Extract the .tar.gz files. You will get:

SONYC_FSD_SED.source: 96 SONYC backgrounds and 10,158 foreground sounds in .wav format, 2GB.
SONYC_FSD_SED.annotations: 465,467 annotation files, 57GB.
SONYC_FSD_SED_add_test.annotations: 111,294 annotation files for additional test data, 14GB.
vocab.json: 87 classes. In the experiments that follow, each class is labeled by its index in this list. 0-42: train, 43-56: val, 57-86: test.
occ_prob_per_cl.pkl: Occurrence probability for each foreground sound class.

Install Scaper.
Generate soundscapes from the JAMS files by running the command below. Set sourcepath and annpath to the extracted folders, and savepath to the path where the output audio files should be saved.

python generate_soundscapes.py \
--sourcepath PATH-TO-SONYC_FSD_SED.source \
--annpath PATH-TO-SONYC_FSD_SED.annotations \
--savepath PATH-TO-SAVE-OUTPUT

Note that this will generate 465,467 audio files, ~765GB in total, in the folder SONYC_FSD_SED.audio under the given savepath.
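Given the size of the full output, it can be useful to render a handful of soundscapes first. The sketch below is not the repository's generate_soundscapes.py; it is a minimal illustration that pairs annotation files with output paths and calls Scaper's `generate_from_jams` on them. The helper names `plan_outputs` and `render`, the `limit` parameter, and the `foreground` / `background` subfolder names inside the source folder are assumptions; check the extracted folder layout before relying on them.

```python
from pathlib import Path


def plan_outputs(annpath, savepath, limit=None):
    """Pair each .jams file under annpath with a .wav path under savepath.

    The relative folder structure of the annotations is mirrored in the output.
    """
    ann_root = Path(annpath)
    out_root = Path(savepath)
    jams_files = sorted(ann_root.rglob("*.jams"))
    if limit is not None:
        jams_files = jams_files[:limit]
    return [
        (j, out_root / j.relative_to(ann_root).with_suffix(".wav"))
        for j in jams_files
    ]


def render(pairs, sourcepath, fg_dir="foreground", bg_dir="background"):
    """Render each (jams, wav) pair with Scaper.

    fg_dir and bg_dir are assumed subfolder names, not confirmed by the README.
    """
    import scaper  # imported here so planning works without Scaper installed

    src = Path(sourcepath)
    for jams_file, wav_file in pairs:
        wav_file.parent.mkdir(parents=True, exist_ok=True)
        scaper.generate_from_jams(
            str(jams_file),
            str(wav_file),
            fg_path=str(src / fg_dir),
            bg_path=str(src / bg_dir),
        )
```

For example, `render(plan_outputs(annpath, savepath, limit=10), sourcepath)` would render only the first ten annotation files, which is enough to confirm that the paths and the Scaper installation work before starting the full run.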

If you also want to generate the additional test data (used in the paper for the sampling-window experiment), change annpath:

python generate_soundscapes.py \
--sourcepath PATH-TO-SONYC_FSD_SED.source \
--annpath PATH-TO-SONYC_FSD_SED_add_test.annotations \
--savepath PATH-TO-SAVE-OUTPUT

This will generate 111,294 audio files, ~191GB in total, in the folder SONYC_FSD_SED.audio under the given savepath. This is an additional two years' worth of test data (the past year and the future year).
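The train/val/test index ranges listed for vocab.json can be turned into per-split class maps with a few lines of Python. This sketch assumes that vocab.json is a flat JSON list of 87 class names, which the README implies but does not state outright; the names `SPLITS`, `load_vocab`, and `split_vocab` are illustrative.

```python
import json

# Index ranges as stated for vocab.json: 0-42 train, 43-56 val, 57-86 test.
SPLITS = {
    "train": range(0, 43),
    "val": range(43, 57),
    "test": range(57, 87),
}


def load_vocab(path):
    """Load the class list; assumes the file is a flat JSON list of names."""
    with open(path) as f:
        return json.load(f)


def split_vocab(vocab):
    """Map each split name to a {class_index: class_name} dict."""
    if len(vocab) != 87:
        raise ValueError(f"expected 87 classes, got {len(vocab)}")
    return {
        split: {i: vocab[i] for i in indices}
        for split, indices in SPLITS.items()
    }
```

Calling `split_vocab(load_vocab("vocab.json"))` should then give 43 training, 14 validation, and 30 test classes, keyed by the same indices used as labels.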
