#### Clone Repository


In [2]:
!git clone https://github.com/josafatburmeister/BirdSongIdentification.git

Cloning into 'BirdSongIdentification'...
remote: Enumerating objects: 1556, done.
remote: Counting objects: 100% (678/678), done.
remote: Compressing objects: 100% (441/441), done.
remote: Total 1556 (delta 453), reused 448 (delta 234), pack-reused 878
Receiving objects: 100% (1556/1556), 510.85 KiB | 7.00 MiB/s, done.
Resolving deltas: 100% (1043/1043), done.


#### Install Dependencies

In [11]:
%cd /content/BirdSongIdentification
!python3 -m pip install -r requirements-colab.txt


/content/BirdSongIdentification
Collecting argparse>=1.4.0
  Using cached argparse-1.4.0-py2.py3-none-any.whl (23 kB)
Installing collected packages: argparse
Successfully installed argparse-1.4.0


**Important hint**: After installing the dependencies, restart the Google Colab runtime so that the newly installed packages are loaded. To do this, open the "Runtime" menu and select "Restart runtime".
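
After the restart, a quick check can confirm that the packages are importable. The following is a minimal sketch using only the standard library. The helper name `missing_modules` is hypothetical, and the module names in the usage note are assumptions for illustration, not taken from `requirements-colab.txt`.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]
```

For example, `missing_modules(["torch", "librosa"])` returns an empty list if both (assumed) packages can be imported in the restarted runtime.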

#### Set Import Path

In [1]:
import sys

sys.path.append('/content/BirdSongIdentification/src')
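
Re-running the cell above appends the same directory to `sys.path` each time. A minimal sketch of an idempotent variant is shown below. The helper name `add_to_import_path` is hypothetical and not part of the repository.

```python
import sys
from pathlib import Path


def add_to_import_path(directory):
    """Append `directory` to sys.path unless it is already listed."""
    path = str(Path(directory))
    if path not in sys.path:
        sys.path.append(path)
    return path
```

Calling `add_to_import_path('/content/BirdSongIdentification/src')` several times leaves a single entry in `sys.path`.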

#### Setting Up the File Manager

In [2]:
from general import FileManager

file_manager = FileManager("/content/data")

#### Setting Up the Logger

In [3]:
import logging

from general import logger

logger.setLevel(logging.VERBOSE)

#### Pipeline Stage 1: Data Download

In [4]:
# download audio files and metadata from Xeno-Canto

from downloader import XenoCantoDownloader

xc_downloader = XenoCantoDownloader(file_manager)

species_list = ["Turdus merula, song, call", "Erithacus rubecula, song, call"]

xc_downloader.create_datasets(
    species_list=species_list,
    use_nips4b_species_list=False,
    maximum_samples_per_class=10,
    maximum_recording_length=180,
    test_size=0.4,
    min_quality="A",
    sound_types=["song", "call"],
    sexes=None,
    life_stages=None,
    exclude_special_cases=True,
    maximum_number_of_background_species=0,
    clear_audio_cache=False,
    clear_label_cache=False)

Download label file for Turdus merula...:   0%|          | 0/10 [00:00<?, ?it/s]

Sound types for Turdus merula: {'song', 'call'}
[I 210911 10:23:44 utils:157] NumExpr defaulting to 2 threads.
6 train samples for Turdus merula_song
2 val samples for Turdus merula_song
2 test samples for Turdus merula_song
6 train samples for Turdus merula_call
2 val samples for Turdus merula_call
2 test samples for Turdus merula_call


Download label file for Erithacus rubecula...:   0%|          | 0/8 [00:00<?, ?it/s]

Sound types for Erithacus rubecula: {'song', 'call'}
6 train samples for Erithacus rubecula_song
2 val samples for Erithacus rubecula_song
2 test samples for Erithacus rubecula_song
6 train samples for Erithacus rubecula_call
2 val samples for Erithacus rubecula_call
2 test samples for Erithacus rubecula_call
Empty dir /content/data/train/audio.


Download train set...:   0%|          | 0/24 [00:00<?, ?it/s]

download https://www.xeno-canto.org/438242/download
download https://www.xeno-canto.org/36670/download
download https://www.xeno-canto.org/357491/download
download https://www.xeno-canto.org/434485/download
download https://www.xeno-canto.org/644950/download
download https://www.xeno-canto.org/446367/download
download https://www.xeno-canto.org/357612/download
download https://www.xeno-canto.org/360361/download
download https://www.xeno-canto.org/533277/download
download https://www.xeno-canto.org/113965/download
download https://www.xeno-canto.org/437188/download
download https://www.xeno-canto.org/321417/download
download https://www.xeno-canto.org/102796/download
download https://www.xeno-canto.org/616863/download
download https://www.xeno-canto.org/253288/download
download https://www.xeno-canto.org/467058/download
download https://www.xeno-canto.org/43568/download
download https://www.xeno-canto.org/513313/download
download https://www.xeno-canto.org/513312/download
download https

Download val set...:   0%|          | 0/8 [00:00<?, ?it/s]

download https://www.xeno-canto.org/445472/download
download https://www.xeno-canto.org/92898/download
download https://www.xeno-canto.org/62979/download
download https://www.xeno-canto.org/477788/download
download https://www.xeno-canto.org/100946/download
download https://www.xeno-canto.org/287909/download
download https://www.xeno-canto.org/596070/download
download https://www.xeno-canto.org/350923/download
Empty dir /content/data/test/audio.


Download test set...:   0%|          | 0/8 [00:00<?, ?it/s]

download https://www.xeno-canto.org/76795/download
download https://www.xeno-canto.org/363302/download
download https://www.xeno-canto.org/666898/download
download https://www.xeno-canto.org/28538/download
download https://www.xeno-canto.org/299253/download
download https://www.xeno-canto.org/260238/download
download https://www.xeno-canto.org/469490/download
download https://www.xeno-canto.org/669818/download
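
The logged counts (6 train, 2 val, and 2 test samples per class) are consistent with `maximum_samples_per_class=10` and `test_size=0.4` if the held-out fraction is divided equally between the validation and test sets. The sketch below reproduces that arithmetic. The equal division is an assumption inferred from the log, and `split_counts` is a hypothetical helper, not the downloader's implementation.

```python
def split_counts(n_samples, test_size):
    """Return (train, val, test) sizes, assuming the held-out part is halved."""
    n_held_out = round(n_samples * test_size)  # samples not used for training
    n_val = n_held_out // 2                    # assumed: half goes to validation
    n_test = n_held_out - n_val                # the remainder goes to test
    n_train = n_samples - n_held_out
    return n_train, n_val, n_test
```

With four classes (two species with two sound types each), `split_counts(10, 0.4)` gives 6, 2, and 2 samples per class, which matches the 24, 8, and 8 files shown in the download progress bars.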


In [5]:
# download NIPS4BPlus dataset

from downloader import NIPS4BPlusDownloader

nips4bplus_downloader = NIPS4BPlusDownloader(file_manager)

species_list = ["Turdus merula, song, call", "Erithacus rubecula, song, call"]

nips4bplus_downloader.download_nips4bplus_dataset(species_list=species_list)

Empty dir /content/data/nips4bplus/.
Download NIPS4BPlus audio files...
download http://sabiod.univ-tln.fr/nips4b/media/birds/NIPS4B_BIRD_CHALLENGE_TRAIN_TEST_WAV.tar.gz
Unzip NIPS4BPlus audio files...
Download NIPS4BPlus label files...
download https://ndownloader.figshare.com/files/16334603
Unzip NIPS4BPlus label files...
download https://ndownloader.figshare.com/files/13390469
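
Each entry of `species_list` combines a scientific name with a comma-separated list of sound types. The sketch below shows one way such an entry could be split into a name and its sound types, and how class names like `Turdus_merula_song` from the label distributions could be derived. Both helpers are hypothetical illustrations and not the repository's own parsing code.

```python
def parse_species_entry(entry):
    """Split 'Genus species, type1, type2' into (name, set of sound types)."""
    name, *sound_types = [part.strip() for part in entry.split(",")]
    return name, set(sound_types)


def class_names(entry):
    """Build one class name per sound type, e.g. 'Turdus_merula_song'."""
    name, sound_types = parse_species_entry(entry)
    return [f"{name.replace(' ', '_')}_{sound_type}"
            for sound_type in sorted(sound_types)]
```

For `"Turdus merula, song, call"` this yields the name `Turdus merula` with the sound types `song` and `call`, and the class names `Turdus_merula_call` and `Turdus_merula_song`.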


#### Pipeline Stage 2: Spectrogram Creation

In [6]:
from spectrograms import SpectrogramCreator

spectrogram_creator = SpectrogramCreator(
    chunk_length=1000,
    audio_file_manager=file_manager,
    spectrogram_file_manager=file_manager,
    include_noise_samples=True)

spectrogram_creator.create_spectrograms_for_datasets(datasets=["train", "val", "test"],
                                                     signal_threshold=3,
                                                     noise_threshold=1,
                                                     clear_spectrogram_cache=False)

spectrogram_creator.create_spectrograms_for_datasets(datasets=["nips4bplus", "nips4bplus_filtered"],
                                                     signal_threshold=0,
                                                     noise_threshold=0,
                                                     clear_spectrogram_cache=False)

Empty dir /content/data/train/spectrograms.


Create spectrograms for train set:   0%|          | 0/24 [00:00<?, ?it/s]

Empty dir /content/data/val/spectrograms.


Create spectrograms for val set:   0%|          | 0/8 [00:00<?, ?it/s]

Empty dir /content/data/test/spectrograms.


Create spectrograms for test set:   0%|          | 0/8 [00:00<?, ?it/s]

Empty dir /content/data/nips4bplus/spectrograms.


Create spectrograms for nips4bplus set:   0%|          | 0/569 [00:00<?, ?it/s]

Empty dir /content/data/nips4bplus_filtered/spectrograms.


Create spectrograms for nips4bplus_filtered set:   0%|          | 0/56 [00:00<?, ?it/s]
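
The `chunk_length=1000` argument suggests that each recording is cut into fixed-length chunks before spectrograms are computed. The sketch below illustrates that idea on a plain list of audio samples. It assumes that the chunk length is given in milliseconds and that an incomplete final chunk is dropped. Neither assumption is confirmed by the output above, and `split_into_chunks` is a hypothetical helper, not the `SpectrogramCreator` implementation.

```python
def split_into_chunks(samples, sample_rate, chunk_length_ms=1000):
    """Cut a sequence of samples into equally sized chunks, dropping the remainder."""
    chunk_size = int(sample_rate * chunk_length_ms / 1000)  # samples per chunk
    n_chunks = len(samples) // chunk_size                   # complete chunks only
    return [samples[i * chunk_size:(i + 1) * chunk_size]
            for i in range(n_chunks)]
```

Under these assumptions, a recording of 3.5 seconds sampled at 22050 Hz (an example rate) produces three chunks of 22050 samples each.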

#### Pipeline Stage 3: Model Training and Hyperparameter Tuning


In [7]:
# run hyperparameter tuning for batch size and learning rate

from training import hyperparameter_tuner

tuner = hyperparameter_tuner.HyperparameterTuner(
    file_manager,
    architecture="resnet18",
    experiment_name="Tuning of batch size and learning rate",
    batch_size=[32, 64, 128],
    early_stopping=True,
    include_noise_samples=True,
    layers_to_unfreeze=["layer3", "layer4", "avg_pool", "fc"],
    learning_rate=[0.01, 0.001, 0.0001],
    learning_rate_scheduler="cosine",
    monitor="f1-score",
    multi_label_classification=True,
    multi_label_classification_threshold=0.5,
    number_epochs=10,
    number_workers=0,
    optimizer="Adam",
    patience=3,
    p_dropout=0,
    track_metrics=False,
    wandb_entity_name="",
    wandb_key="",
    wandb_project_name="",
    weight_decay=0)

tuner.tune_model()

Hyperparameter Tuning 

-------------------------
batch_size = 32
learning_rate = 0.01


Device set to: cpu


Label distribution of val set
Erithacus_rubecula_call : 49
Erithacus_rubecula_song : 157
Turdus_merula_call : 45
Turdus_merula_song : 183
noise : 183
Total: 617




Label distribution of train set
Erithacus_rubecula_call : 182
Erithacus_rubecula_song : 340
Turdus_merula_call : 106
Turdus_merula_song : 338
noise : 340
Total: 1306


Device set to: cpu
Setup resnet18 model: 
Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /root/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth


  0%|          | 0.00/44.7M [00:00<?, ?B/s]

* layer3 has been unfrozen.
* layer4 has been unfrozen.
* fc has been unfrozen.


Number of species: 4


Epoch 1/10
----------

| metric          |  Erithacus_rubecula_call  |  Erithacus_rubecula_song  |  Turdus_merula_call  |  Turdus_merula_song  |
|-----------------|---------------------------|---------------------------|----------------------|----------------------|
| f1-score        |          0.7390           |          0.5705           |        0.0182        |        0.6600        |
| precision       |          0.7925           |          0.6540           |        0.2500        |        0.7509        |
| recall          |          0.6923           |          0.5059           |        0.0094        |        0.5888        |
|                 |                           |                           |                      |                      |
| true-positives  |         126.0000          |         172.0000          |        1.0000        |       199.0000       |
| true-negatives  
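
The tuner receives three batch sizes and three learning rates. Assuming it evaluates every combination (the log above only shows the first one), the search space can be sketched with `itertools.product`. The variable names below are illustrative.

```python
from itertools import product

batch_sizes = [32, 64, 128]
learning_rates = [0.01, 0.001, 0.0001]

# Cartesian product of both lists: one training run per combination.
grid = [{"batch_size": batch_size, "learning_rate": learning_rate}
        for batch_size, learning_rate in product(batch_sizes, learning_rates)]
```

This yields nine configurations, starting with `batch_size = 32` and `learning_rate = 0.01`, the combination reported at the top of the tuning log.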

In [4]:
# train model with fixed hyperparameters

from training import training

trainer = training.ModelTrainer(
    file_manager,
    architecture="resnet18",
    experiment_name="Test run",
    batch_size=64,
    early_stopping=False,
    is_hyperparameter_tuning=False,
    include_noise_samples=True,
    layers_to_unfreeze=["layer3", "layer4", "avg_pool", "fc"],
    learning_rate=0.0001,
    learning_rate_scheduler="cosine",
    multi_label_classification=True,
    multi_label_classification_threshold=0.5,
    number_epochs=10,
    number_workers=0,
    optimizer="Adam",
    p_dropout=0,
    track_metrics=False,
    wandb_entity_name="",
    wandb_key="",
    wandb_project_name="",
    weight_decay=0)

best_average_model, best_minimum_model, best_models_per_class = trainer.train_model()

Device set to: cpu


Label distribution of train set
Erithacus_rubecula_call : 182
Erithacus_rubecula_song : 340
Turdus_merula_call : 106
Turdus_merula_song : 338
[I 210911 18:24:11 utils:157] NumExpr defaulting to 2 threads.
noise : 340
Total: 1306




Label distribution of val set
Erithacus_rubecula_call : 49
Erithacus_rubecula_song : 157
Turdus_merula_call : 45
Turdus_merula_song : 183
noise : 183
Total: 617


Device set to: cpu
Setup resnet18 model: 
* layer3 has been unfrozen.
* layer4 has been unfrozen.
* fc has been unfrozen.


Number of species: 4


Epoch 1/10
----------

| metric          |  Erithacus_rubecula_call  |  Erithacus_rubecula_song  |  Turdus_merula_call  |  Turdus_merula_song  |
|-----------------|---------------------------|---------------------------|----------------------|----------------------|
| f1-score        |          0.3677           |          0.5961           |        0.
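
With `multi_label_classification=True`, several classes can be predicted for the same spectrogram. The sketch below shows how a threshold such as `multi_label_classification_threshold=0.5` is commonly applied. It assumes the model emits one logit per class and that a sigmoid turns each logit into an independent probability. `predict_labels` is a hypothetical helper, not the trainer's code.

```python
import math


def predict_labels(logits, class_names, threshold=0.5):
    """Return every class whose sigmoid probability reaches the threshold."""
    probabilities = [1.0 / (1.0 + math.exp(-logit)) for logit in logits]
    return [name for name, probability in zip(class_names, probabilities)
            if probability >= threshold]
```

Lowering the threshold can only add predicted classes, which typically raises recall at the cost of precision.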

#### Pipeline Stage 4: Model Evaluation

In [5]:
from training import model_evaluator

for confidence_threshold in [0.3, 0.4, 0.5]:
    evaluator = model_evaluator.ModelEvaluator(file_manager,
                                               architecture="resnet18",
                                               batch_size=32,
                                               include_noise_samples=True,
                                               multi_label_classification=True,
                                               multi_label_classification_threshold=confidence_threshold,
                                               track_metrics=False)

    evaluator.evaluate_model(model=best_average_model,
                             model_name=f"test_model_{confidence_threshold}_test",
                             dataset="test")
    evaluator.evaluate_model(model=best_average_model,
                             model_name=f"test_model_{confidence_threshold}_nips4bplus",
                             dataset="nips4bplus")
    evaluator.evaluate_model(model=best_average_model,
                             model_name=f"test_model_{confidence_threshold}_nips4bplus_filtered",
                             dataset="nips4bplus_filtered")

Device set to: cpu


Label distribution of test set
Erithacus_rubecula_call : 28
Erithacus_rubecula_song : 53
Turdus_merula_call : 39
Turdus_merula_song : 213
noise : 68
Total: 401


Model performance of test_model_0.3_test on test set:

| metric          |  Erithacus_rubecula_call  |  Erithacus_rubecula_song  |  Turdus_merula_call  |  Turdus_merula_song  |
|-----------------|---------------------------|---------------------------|----------------------|----------------------|
| f1-score        |          0.6000           |          0.3077           |        0.4865        |        0.8501        |
| precision       |          1.0000           |          0.2241           |        0.5143        |        0.8120        |
| recall          |          0.4286           |          0.4906           |        0.4615        |        0.8920        |
|                 |                           |                           |                      |                      |
| true-positives  |          1
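
The F1-scores in the table are the harmonic mean of the precision and recall values reported for each class. A minimal sketch of that formula follows. The function name `f1_score` is chosen for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For `Erithacus_rubecula_call`, a precision of 1.0000 and a recall of 0.4286 give an F1-score of about 0.6000, and for `Turdus_merula_song`, 0.8120 and 0.8920 give about 0.8501, both in line with the table.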