DCASE 2019 Challenge: Task 5 - Urban Sound Tagging

This repository contains the code and annotation files needed to reproduce the system outputs (output.csv files) that we submitted to Task 5 (Urban Sound Tagging) of the DCASE 2019 Challenge.

Download the dataset

The dataset used to reproduce the system outputs can be found at the following link.

The dataset consists of the Task 5 development set and evaluation audio files, together with audio extracted from selected sound classes of the following open external datasets: FSDKaggle2018, FSDnoisy18k, UrbanSound8k, Urban-SED, ESC-50-master.

The audio extracted from those external datasets was stitched together and split into 10-second audio files so that it matches the clip length used for model training in this challenge.
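The exact stitching procedure is not described here, so the following is only a minimal sketch of the idea. It assumes mono clips that already share one sample rate, and it drops any trailing audio shorter than one segment; the function name `stitch_and_split` and both of those choices are assumptions, not the method used to build the dataset.

```python
import numpy as np


def stitch_and_split(clips, sample_rate, segment_seconds=10.0):
    """Concatenate mono clips and cut the result into fixed-length segments.

    Hypothetical helper: assumes every clip is a 1-D array at `sample_rate`.
    """
    segment_len = int(round(segment_seconds * sample_rate))
    if segment_len <= 0:
        raise ValueError("segment length must be positive")
    if len(clips) == 0:
        return []

    # Stitch: join all clips end to end.
    stitched = np.concatenate([np.asarray(c, dtype=np.float32) for c in clips])

    # Split: keep only whole segments; the remainder is dropped (assumption).
    n_full = len(stitched) // segment_len
    return [
        stitched[i * segment_len:(i + 1) * segment_len]
        for i in range(n_full)
    ]
```

With clips of 4, 5 and 13 seconds, the stitched signal is 22 seconds long, so this sketch yields two 10-second segments and discards the last 2 seconds.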

Installation

Please follow the installation guide of the DCASE 2019 Challenge: Task 5 - Urban Sound Tagging Baseline system, and confirm that you can reproduce the baseline system output before continuing.

Setting up

This setup guide assumes that your sonyc-ust directory is in your home directory and that you have already run the following command:

export SONYC_UST_PATH=~/sonyc-ust

Extract the downloaded dataset archive into the sonyc-ust/data directory.

tar xf ./sonyc_ust_linus_all_files.tar.gz -C ~/sonyc-ust/data/
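Before moving on, it can help to confirm that the extracted files are where the later commands expect them. The sketch below checks the two paths under SONYC_UST_PATH that the commands in this README refer to; the helper name `missing_paths` is hypothetical, and the list of paths is inferred from those commands rather than from a documented layout.

```python
from pathlib import Path

# Paths relative to SONYC_UST_PATH, taken from the commands in this README.
EXPECTED = (
    "data/dcase-ust-taxonomy.yaml",
    "data/sonyc_ust_linus_all_files",
)


def missing_paths(root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root).expanduser()
    return [rel for rel in expected if not (root / rel).exists()]
```

For example, `missing_paths(os.environ["SONYC_UST_PATH"])` (after `import os`) returns an empty list when both the taxonomy file and the extracted audio directory are present.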

Activate the sonyc-ust environment

source activate sonyc-ust

Once you are in the sonyc-ust environment, install the OpenL3 Python library with pip.

pip install openl3

Extract the audio embeddings with OpenL3

mkdir -p ~/sonyc-ust/features/l3mel256emb512
cd ~/sonyc-ust/data
openl3 ./sonyc_ust_linus_all_files --content-type env --input-repr mel256 --embedding-size 512 --output ~/sonyc-ust/features/l3mel256emb512
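Embedding extraction over the full dataset can take a long time, so it may be worth checking that every audio file produced an output before training. The sketch below assumes that OpenL3 writes one `<name>.npz` file per input into the output directory and that the audio files use the `.wav` extension; both are assumptions to verify against your own output, and `audio_without_embeddings` is a hypothetical helper name.

```python
from pathlib import Path


def audio_without_embeddings(audio_dir, emb_dir, audio_exts=(".wav",)):
    """Return the names of audio files with no matching .npz in `emb_dir`.

    Matching is by file stem, e.g. clip01.wav <-> clip01.npz (assumption).
    """
    emb_stems = {p.stem for p in Path(emb_dir).glob("*.npz")}
    missing = []
    for p in sorted(Path(audio_dir).rglob("*")):
        is_audio = p.is_file() and p.suffix.lower() in audio_exts
        if is_audio and p.stem not in emb_stems:
            missing.append(p.name)
    return missing
```

An empty result for the audio directory `~/sonyc-ust/data/sonyc_ust_linus_all_files` and the feature directory `~/sonyc-ust/features/l3mel256emb512` (expanded to full paths) suggests that the extraction completed for all files.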

Clone this repository

git clone https://github.com/linusng/sonyc-ust-challenge-2019.git
cd sonyc-ust-challenge-2019/urban-sound-tagging

Train our submitted model 1

# Fine-level
python classify_l3.py ./annotations_1.csv $SONYC_UST_PATH/data/dcase-ust-taxonomy.yaml $SONYC_UST_PATH/features/l3mel256emb512 $SONYC_UST_PATH/output baseline_fine --label_mode fine --num_hidden_layers 4

# Coarse-level
python classify_l3.py ./annotations_1.csv $SONYC_UST_PATH/data/dcase-ust-taxonomy.yaml $SONYC_UST_PATH/features/l3mel256emb512 $SONYC_UST_PATH/output baseline_coarse --label_mode coarse --num_hidden_layers 4

Train our submitted model 2

# Fine-level
python classify_l3.py ./annotations_2.csv $SONYC_UST_PATH/data/dcase-ust-taxonomy.yaml $SONYC_UST_PATH/features/l3mel256emb512 $SONYC_UST_PATH/output baseline_fine --label_mode fine --num_hidden_layers 3

# Coarse-level
python classify_l3.py ./annotations_2.csv $SONYC_UST_PATH/data/dcase-ust-taxonomy.yaml $SONYC_UST_PATH/features/l3mel256emb512 $SONYC_UST_PATH/output baseline_coarse --label_mode coarse --num_hidden_layers 3

Train our submitted model 3

# Fine-level
python classify_l3.py ./annotations_3.csv $SONYC_UST_PATH/data/dcase-ust-taxonomy.yaml $SONYC_UST_PATH/features/l3mel256emb512 $SONYC_UST_PATH/output baseline_fine --label_mode fine --num_hidden_layers 3 --num_epochs 20

# Coarse-level
python classify_l3.py ./annotations_3.csv $SONYC_UST_PATH/data/dcase-ust-taxonomy.yaml $SONYC_UST_PATH/features/l3mel256emb512 $SONYC_UST_PATH/output baseline_coarse --label_mode coarse --num_hidden_layers 3 --num_epochs 20

Train our submitted model 4

# Fine-level
python classify_l3.py ./annotations_3.csv $SONYC_UST_PATH/data/dcase-ust-taxonomy.yaml $SONYC_UST_PATH/features/l3mel256emb512 $SONYC_UST_PATH/output baseline_fine --label_mode fine --num_hidden_layers 3 --hidden_layer_size 256 --num_epochs 20

# Coarse-level
python classify_l3.py ./annotations_3.csv $SONYC_UST_PATH/data/dcase-ust-taxonomy.yaml $SONYC_UST_PATH/features/l3mel256emb512 $SONYC_UST_PATH/output baseline_coarse --label_mode coarse --num_hidden_layers 3 --hidden_layer_size 256 --num_epochs 20
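The eight training commands differ only in the annotation file and a few options. As a compact summary, the sketch below records those differences in a table and rebuilds the argument list for any model and label mode. The names `MODELS` and `build_command` are hypothetical and not part of this repository; options that a model does not list are left to the defaults of classify_l3.py.

```python
# Per-model settings, copied from the training commands in this README.
MODELS = {
    1: {"annotations": "annotations_1.csv", "num_hidden_layers": 4},
    2: {"annotations": "annotations_2.csv", "num_hidden_layers": 3},
    3: {"annotations": "annotations_3.csv", "num_hidden_layers": 3,
        "num_epochs": 20},
    4: {"annotations": "annotations_3.csv", "num_hidden_layers": 3,
        "hidden_layer_size": 256, "num_epochs": 20},
}


def build_command(model, label_mode, root):
    """Build the classify_l3.py argument list for one submitted model.

    `root` is the value of SONYC_UST_PATH.
    """
    if label_mode not in ("fine", "coarse"):
        raise ValueError("label_mode must be 'fine' or 'coarse'")
    cfg = dict(MODELS[model])
    annotations = cfg.pop("annotations")
    cmd = [
        "python", "classify_l3.py",
        f"./{annotations}",
        f"{root}/data/dcase-ust-taxonomy.yaml",
        f"{root}/features/l3mel256emb512",
        f"{root}/output",
        f"baseline_{label_mode}",
        "--label_mode", label_mode,
    ]
    # Append the remaining model-specific options in their listed order.
    for option, value in cfg.items():
        cmd += [f"--{option}", str(value)]
    return cmd
```

From inside the urban-sound-tagging directory, a list such as `build_command(3, "fine", os.environ["SONYC_UST_PATH"])` could be passed to `subprocess.run(..., check=True)` to launch the corresponding training run.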

Contact

Please feel free to contact me at linusng@outlook.com if you have any questions regarding replicating the system outputs.
