Entry to the Bird Audio Detection challenge

Based on Densely Connected Convolutional Networks (DenseNets), implemented with Theano/Lasagne

About the BAD challenge

For information about the challenge, please visit its website. The goal was to propose a solution to a binary classification task: detecting singing birds in 10-second audio files.

Requirements

Theano (0.9.0.dev3)
Lasagne (0.2.dev1)
h5py (2.6.0)
scikit-learn (0.17)
fuel (0.2.0)
MIR toolbox for feature extraction with Matlab

Usage

1- Feature extraction

Features: 56 log-Mel F-BANK coefficients, 58 bands, hop size: 50 ms, frame size: 100 ms, fmin: 50 Hz, fmax: 22050 Hz

MIR_extract_logSpectrumBands.m: extracts F-BANK coefficients from WAV files
create_hdf5_ff1010bird_public.py: creates an HDF5 file with Train, Valid and Test subsets
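The framing parameters above determine the length of the time axis of each feature matrix. The sketch below works through that arithmetic for a 10-second clip. It is only an illustration: the function name `n_frames` is hypothetical, and how the MIR toolbox treats the clip edges is not stated here, so the padded variant is an assumption.

```python
# Framing parameters quoted in the feature description above.
CLIP_MS = 10000   # 10-second audio clips
FRAME_MS = 100    # analysis frame size
HOP_MS = 50       # hop size between consecutive frames


def n_frames(clip_ms, frame_ms, hop_ms, pad=True):
    """Number of analysis frames for one clip (hypothetical helper).

    pad=True  : one frame per hop position, assuming the signal is padded
                so that the last frame is complete.
    pad=False : only frames that fit entirely inside the clip.
    """
    if pad:
        return clip_ms // hop_ms
    return 1 + (clip_ms - frame_ms) // hop_ms


frames_padded = n_frames(CLIP_MS, FRAME_MS, HOP_MS, pad=True)
frames_unpadded = n_frames(CLIP_MS, FRAME_MS, HOP_MS, pad=False)
print(frames_padded, frames_unpadded)
```

With these values the padded count is 200 and the unpadded count is 199, so a matrix of 200 frames by 56 coefficients per clip is consistent with the padded assumption.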

2- To train a densenet for 30 epochs:

python train.py densenet 30

3- To test a model:

python test.py densenet <model path> <HDF5 datafile path> <CSV id file path>

Example (the HDF5 data file is not provided in this repo):

python test.py densenet models/bad16_densenet_bn_static-fbank-0.019326000-sub4.npz hdf5/bad2016test_melLogSpec56.hdf5 hdf5/Test_files.csv

Model Architecture

The code builds the following model. It is based on this recipe.

input layer: (None, 1, 200, 56)
first conv layer: (None, 32, 200, 56)
dense block 0: (None, 107, 200, 56)
transition 0: (None, 107, 100, 28)
dense block 1: (None, 182, 100, 28)
transition 1: (None, 182, 50, 14)
dense block 2: (None, 257, 50, 14)
post Global pool layer: (None, 257)
output layer: (None, 2)

total number of layers: 74
number of parameters in model: 328004

Each dense block corresponds to 5x[BatchNorm - ReLU - Conv3x3]
Each transition block corresponds to 1x[Conv1x1 - Max-Pool2x2]
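The layer shapes listed above can be reproduced with a few lines of arithmetic. The sketch below is not taken from the repository code. It assumes a growth rate of 15 feature maps per layer, a value inferred from the listed channel counts (32 + 5 x 15 = 107) rather than stated in this README, and it assumes the 3x3 convolutions preserve the spatial size. The function name `trace_shapes` is hypothetical.

```python
def trace_shapes(height=200, width=56, first_conv=32, growth_rate=15,
                 layers_per_block=5, n_blocks=3):
    """Return (name, (channels, height, width)) pairs; batch axis omitted.

    growth_rate=15 is an assumption inferred from the listed shapes.
    """
    shapes = [("input layer", (1, height, width))]
    channels = first_conv
    shapes.append(("first conv layer", (channels, height, width)))
    for b in range(n_blocks):
        # Each BatchNorm-ReLU-Conv3x3 layer concatenates growth_rate new
        # feature maps to its input, so a block adds 5 * growth_rate channels.
        channels += layers_per_block * growth_rate
        shapes.append(("dense block %d" % b, (channels, height, width)))
        if b < n_blocks - 1:
            # Conv1x1 keeps the channel count here; the 2x2 max-pooling
            # halves both the time and frequency axes.
            height, width = height // 2, width // 2
            shapes.append(("transition %d" % b, (channels, height, width)))
    # Global pooling collapses the two spatial axes.
    shapes.append(("post Global pool layer", (channels,)))
    return shapes


shapes = trace_shapes()
for name, shape in shapes:
    print("%s: %s" % (name, (None,) + shape))
```

Under these assumptions the printed shapes match the listing, ending with 257 channels after global pooling.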

Saliency maps

To generate saliency maps with guided backprop (based on this recipe):

python saliency_maps.py densenet <modelpath>
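For readers unfamiliar with guided backpropagation, the sketch below shows the rule that distinguishes it from ordinary backpropagation at a ReLU: a gradient is passed back only where both the forward input and the incoming gradient are positive. This is a plain-Python illustration on toy lists, not the Theano/Lasagne code in saliency_maps.py. The function names and the toy values are hypothetical.

```python
def relu_backward(x, grad_out):
    """Ordinary ReLU gradient: pass the gradient where the input was positive."""
    return [g if v > 0 else 0.0 for v, g in zip(x, grad_out)]


def guided_relu_backward(x, grad_out):
    """Guided backprop: additionally block negative incoming gradients."""
    return [g if (v > 0 and g > 0) else 0.0 for v, g in zip(x, grad_out)]


# Toy values: pre-activation inputs of one ReLU layer and the gradient
# arriving from the layer above.
x = [-1.0, 2.0, 3.0, 0.5]
grad_out = [0.7, -0.4, 0.9, 0.2]

plain = relu_backward(x, grad_out)
guided = guided_relu_backward(x, grad_out)
print(plain)
print(guided)
```

The second unit shows the difference: its input is positive but its incoming gradient is negative, so ordinary backpropagation passes -0.4 while guided backpropagation passes 0.0.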

Example: see input_possaliency_0056c188-b8a5-46d7-ab1e.png in this repository.

For any questions, please email me: thomas.pellegrini@irit.fr

If you use this code, please consider citing my paper:

T. Pellegrini. Densely Connected CNNs for Bird Audio Detection. In Proc. European Signal Processing Conference (EUSIPCO 2017), EURASIP, pp. 1734-1738, September 2017, Kos, Greece.
