## Install dependencies

In [1]:
!python3 -m pip install -r requirements-notebook.txt

Collecting argparse>=1.4.0
  Using cached argparse-1.4.0-py2.py3-none-any.whl (23 kB)
Installing collected packages: argparse
Successfully installed argparse-1.4.0
You should consider upgrading via the '/Users/josafat/BirdSongIdentification/env-2/bin/python3 -m pip install --upgrade pip' command.


## Setup file path manager


In [2]:
from general import PathManager

path_manager = PathManager("./data")

## Setup logging

In [3]:
import logging

from general import logger

logger.setLevel(logging.VERBOSE)
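
The standard `logging` module defines no `VERBOSE` level, so the call above presumably relies on the project's `general` module registering one. A minimal sketch of how such a level could be registered, assuming a numeric value of 15 (between `DEBUG` and `INFO`); the value and the logger name `verbose_demo` are illustrative and not taken from the project:

```python
import logging

# Assumption: VERBOSE sits between DEBUG (10) and INFO (20).
VERBOSE = 15
logging.addLevelName(VERBOSE, "VERBOSE")
logging.VERBOSE = VERBOSE  # expose the level so that logging.VERBOSE resolves

demo_logger = logging.getLogger("verbose_demo")
demo_logger.setLevel(logging.VERBOSE)

# Messages at VERBOSE and above pass the level check; DEBUG messages do not.
verbose_enabled = demo_logger.isEnabledFor(logging.VERBOSE)
debug_enabled = demo_logger.isEnabledFor(logging.DEBUG)
```

With this in place, `demo_logger.log(logging.VERBOSE, "message")` emits a record labelled `VERBOSE` once a handler is attached.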

## Step 1: Data Download

In [3]:
# download audio files and metadata from Xeno-Canto

from downloader import XenoCantoDownloader

xc_downloader = XenoCantoDownloader(path_manager)

species_list = ["Turdus merula, song, call", "Erithacus rubecula, song, call"]

xc_downloader.create_datasets(
    species_list=species_list,
    use_nips4b_species_list=False,
    maximum_samples_per_class=10,
    maximum_recording_length=180,
    test_size=0.4,
    min_quality="A",
    sound_types=["song", "call"],
    sexes=None,
    life_stages=None,
    exclude_special_cases=True,
    maximum_number_of_background_species=0,
    clear_audio_cache=False,
    clear_label_cache=False,
)

Download label file for Turdus merula...:   0%|          | 0/10 [00:00<?, ?it/s]

Sound types for Turdus merula: {'call', 'song'}
6 train samples for Turdus merula_call
2 val samples for Turdus merula_call
2 test samples for Turdus merula_call
6 train samples for Turdus merula_song
2 val samples for Turdus merula_song
2 test samples for Turdus merula_song


Download label file for Erithacus rubecula...:   0%|          | 0/8 [00:00<?, ?it/s]

Sound types for Erithacus rubecula: {'call', 'song'}
6 train samples for Erithacus rubecula_call
2 val samples for Erithacus rubecula_call
2 test samples for Erithacus rubecula_call
6 train samples for Erithacus rubecula_song
2 val samples for Erithacus rubecula_song
2 test samples for Erithacus rubecula_song
empty dir ./data/train/audio


Download train set...:   0%|          | 0/24 [00:00<?, ?it/s]

download https://www.xeno-canto.org/623199/download
download https://www.xeno-canto.org/399869/download
download https://www.xeno-canto.org/442772/download
download https://www.xeno-canto.org/541500/download
download https://www.xeno-canto.org/113965/download
download https://www.xeno-canto.org/127260/download
download https://www.xeno-canto.org/36671/download
download https://www.xeno-canto.org/357493/download
download https://www.xeno-canto.org/434611/download
download https://www.xeno-canto.org/446369/download
download https://www.xeno-canto.org/647755/download
download https://www.xeno-canto.org/41657/download
download https://www.xeno-canto.org/513312/download
download https://www.xeno-canto.org/513311/download
download https://www.xeno-canto.org/619708/download
download https://www.xeno-canto.org/109534/download
download https://www.xeno-canto.org/435964/download
download https://www.xeno-canto.org/357342/download
download https://www.xeno-canto.org/642751/download
download https

Could not download file with id 400921. Reason: [Errno 17] File exists: './data/cache/audio'
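
The `[Errno 17] File exists` message corresponds to Python's `FileExistsError`, which is raised when a directory is created without tolerating an existing one, for example when several download workers try to create the cache directory simultaneously. Whether the downloader creates `./data/cache/audio` this way is an assumption. The sketch below only illustrates the idempotent alternative, `os.makedirs(..., exist_ok=True)`, using a temporary directory and a hypothetical helper `ensure_dir`:

```python
import os
import tempfile


def ensure_dir(path: str) -> str:
    """Create path (and parents) if missing; do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as root:
    cache_dir = os.path.join(root, "cache", "audio")
    ensure_dir(cache_dir)
    ensure_dir(cache_dir)  # second call is a no-op instead of raising
    directory_exists = os.path.isdir(cache_dir)

    # For comparison: plain os.mkdir raises on an existing directory.
    try:
        os.mkdir(cache_dir)
        mkdir_raised = False
    except FileExistsError:
        mkdir_raised = True
```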


empty dir ./data/val/audio


Download val set...:   0%|          | 0/8 [00:00<?, ?it/s]

download https://www.xeno-canto.org/92912/download
download https://www.xeno-canto.org/111123/download
download https://www.xeno-canto.org/62980/download
download https://www.xeno-canto.org/445473/download
download https://www.xeno-canto.org/596061/download
download https://www.xeno-canto.org/350243/download
download https://www.xeno-canto.org/283965/download
download https://www.xeno-canto.org/477052/download
empty dir ./data/test/audio


Download test set...:   0%|          | 0/8 [00:00<?, ?it/s]

download https://www.xeno-canto.org/76796/download
download https://www.xeno-canto.org/669411/download
download https://www.xeno-canto.org/363303/download
download https://www.xeno-canto.org/41721/download
download https://www.xeno-canto.org/669420/download
download https://www.xeno-canto.org/299243/download
download https://www.xeno-canto.org/217671/download
download https://www.xeno-canto.org/468476/download
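
The log above reports 6 train, 2 val and 2 test samples per class for `maximum_samples_per_class=10` and `test_size=0.4`. These counts are consistent with holding out 40% of the samples and dividing the held-out part evenly between validation and test. That reading is inferred from the logged numbers, not from the downloader's code, and the function name `split_counts` is hypothetical:

```python
from typing import Tuple


def split_counts(n_samples: int, test_size: float) -> Tuple[int, int, int]:
    """Return (train, val, test) counts, assuming the held-out fraction
    is shared equally between validation and test."""
    held_out = round(n_samples * test_size)
    n_val = held_out // 2
    n_test = held_out - n_val
    n_train = n_samples - held_out
    return n_train, n_val, n_test


per_class = split_counts(n_samples=10, test_size=0.4)

# Four classes (two species x two sound types) give the totals in the progress bars.
n_classes = 4
totals = tuple(count * n_classes for count in per_class)
```

Under this assumption the totals come out as 24, 8 and 8, which match the lengths of the train, val and test download progress bars.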


In [8]:
# download NIPS4BPlus dataset

from downloader import NIPS4BPlusDownloader

nips4bplus_downloader = NIPS4BPlusDownloader(path_manager)

species_list = ["Turdus merula, song, call", "Erithacus rubecula, song, call"]

nips4bplus_downloader.download_nips4bplus_dataset(species_list=species_list)

empty dir ./data/nips4bplus/
Download NIPS4BPlus audio files...
cached http://sabiod.univ-tln.fr/nips4b/media/birds/NIPS4B_BIRD_CHALLENGE_TRAIN_TEST_WAV.tar.gz
Unzip NIPS4BPlus audio files...
Download NIPS4BPlus label files...
download https://ndownloader.figshare.com/files/16334603
Unzip NIPS4BPlus label files...
download https://ndownloader.figshare.com/files/13390469



## Step 2: Spectrogram Creation

In [11]:
from spectrograms import SpectrogramCreator

spectrogram_creator = SpectrogramCreator(
    1000,
    path_manager,
    include_noise_samples=True)

spectrogram_creator.create_spectrograms_for_splits(
    splits=["train", "val", "test", "nips4bplus", "nips4bplus_all"],
    signal_threshold=3,
    noise_threshold=1,
    clear_spectrogram_cache=False,
)

empty dir ./data/train/spectrograms


Create spectrograms for train set:   0%|          | 0/23 [00:00<?, ?it/s]

empty dir ./data/val/spectrograms


Create spectrograms for val set:   0%|          | 0/8 [00:00<?, ?it/s]

empty dir ./data/test/spectrograms


Create spectrograms for test set:   0%|          | 0/8 [00:00<?, ?it/s]

empty dir ./data/nips4bplus/spectrograms


Create spectrograms for nips4bplus set:   0%|          | 0/56 [00:00<?, ?it/s]

empty dir ./data/nips4bplus_all/spectrograms


Create spectrograms for nips4bplus_all set:   0%|          | 0/569 [00:00<?, ?it/s]
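
The parameters `signal_threshold=3` and `noise_threshold=1` resemble the median-clipping heuristic often used in bird-audio pipelines, where a spectrogram pixel counts as signal if it exceeds a multiple of both its row median and its column median. Whether `SpectrogramCreator` works this way is an assumption. The sketch below shows the heuristic on a toy matrix in plain Python, with a hypothetical function name:

```python
from statistics import median


def median_clipping_mask(spectrogram, threshold):
    """Mark pixels that exceed threshold times both their row median
    and their column median."""
    row_medians = [median(row) for row in spectrogram]
    col_medians = [median(col) for col in zip(*spectrogram)]
    return [
        [
            value > threshold * row_medians[i] and value > threshold * col_medians[j]
            for j, value in enumerate(row)
        ]
        for i, row in enumerate(spectrogram)
    ]


# Toy magnitudes (illustrative only): one loud pixel on a flat background.
toy_spectrogram = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 9.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]

signal_mask = median_clipping_mask(toy_spectrogram, threshold=3)
signal_pixels = sum(sum(row) for row in signal_mask)
```

In such a scheme, chunks containing enough marked pixels would be kept as signal, and chunks with no pixels above the lower threshold could serve as noise samples.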

## Step 3: Model Training

In [3]:
# run hyperparameter tuning for batch size and learning rate

from training import hyperparameter_tuner

tuner = hyperparameter_tuner.HyperparameterTuner(
    path_manager,
    architecture="resnet18",
    experiment_name="Tuning of batch size and learning rate",
    batch_size=[32, 64, 128],
    early_stopping=True,
    include_noise_samples=True,
    layers_to_unfreeze=["layer3", "layer4", "avg_pool", "fc"],
    learning_rate=[0.01, 0.001, 0.0001],
    learning_rate_scheduler="cosine",
    monitor="f1_score",
    multi_label_classification=True,
    multi_label_classification_threshold=0.5,
    number_epochs=1,
    number_workers=0,
    optimizer="Adam",
    patience=3,
    p_dropout=0,
    track_metrics=False,
    wandb_entity_name="",
    wandb_key="",
    wandb_project_name="",
    weight_decay=0
)

tuner.tune_model()

Hyperparameter Tuning 

-------------------------
batch_size = 32
learning_rate = 0.01




Label distribution of train set
Erithacus_rubecula_call : 208
Erithacus_rubecula_song : 342
Turdus_merula_call : 121
Turdus_merula_song : 456
noise : 456
Total: 1583




Label distribution of val set
Erithacus_rubecula_call : 88
Erithacus_rubecula_song : 113
Turdus_merula_call : 36
Turdus_merula_song : 176
noise : 176
Total: 589


Device set to: cpu
Setup resnet18 model: 
* layer3 has been unfrozen.
* layer4 has been unfrozen.
* fc has been unfrozen.


Device set to: cpu
Number of species: 4


Epoch 1/1
----------

| metric          |  Erithacus_rubecula_call  |  Erithacus_rubecula_song  |  Turdus_merula_call  |  Turdus_merula_song  |
|-----------------|---------------------------|---------------------------|----------------------|----------------------|
| f1-score        |          0.7109           |          0.6557           |        0.1069        |        0.6515        |
| precision       |   
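
The tuner receives lists for `batch_size` and `learning_rate`, and the log starts with the combination `batch_size = 32`, `learning_rate = 0.01`. A plausible reading is an exhaustive grid over both lists. The sketch below enumerates such a grid with `itertools.product`; the actual iteration order and search strategy inside `HyperparameterTuner` are assumptions:

```python
from itertools import product

batch_sizes = [32, 64, 128]
learning_rates = [0.01, 0.001, 0.0001]

# Every pairing of one batch size with one learning rate.
grid = [
    {"batch_size": batch_size, "learning_rate": learning_rate}
    for batch_size, learning_rate in product(batch_sizes, learning_rates)
]

for configuration in grid:
    print(configuration)
```

With three values per parameter, a full grid means nine training runs, which is why `number_epochs=1` keeps this demonstration short.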

In [3]:
# train model with fixed hyperparameters

from training import training

trainer = training.ModelTrainer(
    path_manager,
    architecture="resnet18",
    experiment_name="Test run",
    batch_size=64,
    early_stopping=False,
    is_hyperparameter_tuning=False,
    include_noise_samples=True,
    layers_to_unfreeze=["layer3", "layer4", "avg_pool", "fc"],
    learning_rate=0.0001,
    learning_rate_scheduler="cosine",
    multi_label_classification=True,
    multi_label_classification_threshold=0.5,
    number_epochs=10,
    number_workers=0,
    optimizer="Adam",
    p_dropout=0,
    track_metrics=False,
    wandb_entity_name="",
    wandb_key="",
    wandb_project_name="",
    weight_decay=0
)

best_average_model, best_minimum_model, best_models_per_class = trainer.train_model()



Label distribution of train set
Erithacus_rubecula_call : 208
Erithacus_rubecula_song : 342
Turdus_merula_call : 121
Turdus_merula_song : 456
noise : 456
Total: 1583




Label distribution of val set
Erithacus_rubecula_call : 88
Erithacus_rubecula_song : 113
Turdus_merula_call : 36
Turdus_merula_song : 176
noise : 176
Total: 589


Device set to: cpu
Setup resnet18 model: 
* layer3 has been unfrozen.
* layer4 has been unfrozen.
* fc has been unfrozen.


Device set to: cpu
Number of species: 4


Epoch 1/10
----------

| metric          |  Erithacus_rubecula_call  |  Erithacus_rubecula_song  |  Turdus_merula_call  |  Turdus_merula_song  |
|-----------------|---------------------------|---------------------------|----------------------|----------------------|
| f1-score        |          0.7745           |          0.6740           |        0.3278        |        0.7143        |
| precision       |       
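
`train_model()` returns three objects named `best_average_model`, `best_minimum_model` and `best_models_per_class`. The names suggest selection by the best mean score across classes, the best worst-class score, and the best score for each class individually. The sketch below illustrates these three criteria on made-up per-epoch F1 values; the selection logic, the function name and the class names are all hypothetical and not taken from `ModelTrainer`:

```python
def select_best_epochs(f1_history):
    """Return the epoch indices that maximise the mean F1, the minimum F1,
    and the F1 of each individual class."""
    epochs = range(len(f1_history))
    best_average = max(
        epochs, key=lambda e: sum(f1_history[e].values()) / len(f1_history[e])
    )
    best_minimum = max(epochs, key=lambda e: min(f1_history[e].values()))
    best_per_class = {
        name: max(epochs, key=lambda e: f1_history[e][name])
        for name in f1_history[0]
    }
    return best_average, best_minimum, best_per_class


# Illustrative values only, not results from this notebook.
toy_history = [
    {"class_a": 0.90, "class_b": 0.20},
    {"class_a": 0.60, "class_b": 0.55},
    {"class_a": 0.95, "class_b": 0.30},
]

best_average, best_minimum, best_per_class = select_best_epochs(toy_history)
```

The toy values show why the criteria can disagree: the epoch with the highest mean is not the one whose weakest class performs best.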

## Step 4: Model Evaluation

In [4]:
from training import model_evaluator

for confidence_threshold in [0.3, 0.4, 0.5]:
    evaluator = model_evaluator.ModelEvaluator(path_manager,
                                               architecture="resnet18",
                                               batch_size=32,
                                               include_noise_samples=True,
                                               multi_label_classification=True,
                                               multi_label_classification_threshold=confidence_threshold,
                                               track_metrics=False)

    evaluator.evaluate_model(model=best_average_model, model_name=f"test_model_{confidence_threshold}", split="test")
    evaluator.evaluate_model(model=best_average_model, model_name=f"test_model_{confidence_threshold}",
                             split="nips4bplus")
    evaluator.evaluate_model(model=best_average_model, model_name=f"test_model_{confidence_threshold}",
                             split="nips4bplus_all")



Label distribution of test set
Erithacus_rubecula_call : 35
Erithacus_rubecula_song : 43
Turdus_merula_call : 2
Turdus_merula_song : 159
noise : 99
Total: 338


Model performance of test_model_0.3 on test set:

| metric          |  Erithacus_rubecula_call  |  Erithacus_rubecula_song  |  Turdus_merula_call  |  Turdus_merula_song  |
|-----------------|---------------------------|---------------------------|----------------------|----------------------|
| f1-score        |          0.5818           |          0.3900           |        0.1111        |        0.8800        |
| precision       |          0.8000           |          0.2484           |        0.0625        |        0.9362        |
| recall          |          0.4571           |          0.9070           |        0.5000        |        0.8302        |
|                 |                           |                           |                      |                      |
| true-positives  |          16.0000          |        
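
The F1 scores in the table above follow from the reported precision and recall as their harmonic mean, F1 = 2PR / (P + R). The sketch below recomputes them from the table values as a consistency check; the function name `f1_score` is chosen here for illustration and is not the evaluator's API:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the test_model_0.3 table above.
reported = {
    "Erithacus_rubecula_call": (0.8000, 0.4571),
    "Erithacus_rubecula_song": (0.2484, 0.9070),
    "Turdus_merula_call": (0.0625, 0.5000),
    "Turdus_merula_song": (0.9362, 0.8302),
}

recomputed = {name: round(f1_score(p, r), 4) for name, (p, r) in reported.items()}

# Recall for Erithacus_rubecula_call: 16 true positives out of the
# 35 entries listed for that class in the test label distribution.
recall_from_counts = round(16 / 35, 4)
```

The recomputed values agree with the f1-score row of the table. At this threshold, `Erithacus_rubecula_song` combines high recall with low precision, which is the usual effect of a low confidence threshold.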