### Test the performance of a bird species audio machine learning classifier on a human-labeled test set

Prepare the Python environment:

```bash
pip install opensoundscape==0.12.0 torch torchaudio torchvision timm
pip install git+https://github.com/kitzeslab/bioacoustics-model-zoo
pip install git+https://github.com/kitzeslab/name_conversions.git
```

In [29]:
import numpy as np
import pandas as pd
from glob import glob
from pathlib import Path

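from matplotlib import pyplot as plt


def figsize(w, h):
    """Set the default matplotlib figure size, in inches."""
    plt.rcParams['figure.figsize'] = [w, h]


figsize(15, 5)  # for big visuals
%config InlineBackend.figure_format = 'retina'
# embed TrueType (Type 42) fonts so text stays editable in saved PDF/PS figures
plt.rcParams['pdf.fonttype'] = 42
plt.rcParams['ps.fonttype'] = 42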

import bioacoustics_model_zoo as bmz
import opensoundscape as opso
import name_conversions # convert between bird species naming conventions

Download the labeled dataset:

In [None]:
!wget -O annotation_Files.zip https://datadryad.org/stash/downloads/file_stream/641805
!wget -O mp3_Files.zip https://datadryad.org/stash/downloads/file_stream/641807
!unzip annotation_Files.zip -d ./Annotation_Files
!unzip mp3_Files.zip -d ./Recordings

In [None]:
# set dataset_path to the directory where the dataset was downloaded
# CHANGE THIS to your data path, e.g. '.' if downloaded with the previous cell
dataset_path = "."

# make a list of all of the selection table files
selections = glob(f"{dataset_path}/Annotation_Files/*/*.txt")

# create a list of audio files, one corresponding to each Raven file
# (Audio files have the same names as selection files with a different extension)
audio_files = [
    f.replace("Annotation_Files", "Recordings").replace(
        ".Table.1.selections.txt", ".mp3"
    )
    for f in selections
]
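
Because the audio paths are derived purely by string replacement, a renamed or missing recording would only surface later as a loading error. The sketch below shows one way to check the pairing up front. The helper names `audio_path_for` and `find_missing_audio` and the example file name are hypothetical and are not part of the dataset or of opensoundscape.

```python
from pathlib import Path


def audio_path_for(selection_file):
    """Map a Raven selection-table path to its expected mp3 path."""
    return selection_file.replace("Annotation_Files", "Recordings").replace(
        ".Table.1.selections.txt", ".mp3"
    )


def find_missing_audio(selection_files):
    """Return the selection files whose paired audio file does not exist on disk."""
    return [s for s in selection_files if not Path(audio_path_for(s)).exists()]


# hypothetical file name, for illustration only
example = "./Annotation_Files/Recording_1/example.Table.1.selections.txt"
print(audio_path_for(example))  # ./Recordings/Recording_1/example.mp3
```

Calling `find_missing_audio(selections)` on the real list should return an empty list if every selection table has a matching recording.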

In [20]:
all_annotations = opso.BoxedAnnotations.from_raven_files(
    selections, annotation_column="Species", audio_files=audio_files
)
all_annotations.df.head()

| | audio_file | annotation_file | annotation | start_time | end_time | low_f | high_f | Selection | View | Channel |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | /Users/SML161/labeled_datasets/pnre_ecy3329/Re... | /Users/SML161/labeled_datasets/pnre_ecy3329/An... | BTNW | 0.913636 | 2.202273 | 4635.1 | 7439.0 | 1 | Spectrogram 1 | 1 |
| 1 | /Users/SML161/labeled_datasets/pnre_ecy3329/Re... | /Users/SML161/labeled_datasets/pnre_ecy3329/An... | EATO | 2.236363 | 2.693182 | 3051.9 | 4101.0 | 2 | Spectrogram 1 | 1 |
| 2 | /Users/SML161/labeled_datasets/pnre_ecy3329/Re... | /Users/SML161/labeled_datasets/pnre_ecy3329/An... | BTNW | 4.234091 | 6.054545 | 4196.4 | 7477.2 | 3 | Spectrogram 1 | 1 |
| 3 | /Users/SML161/labeled_datasets/pnre_ecy3329/Re... | /Users/SML161/labeled_datasets/pnre_ecy3329/An... | EATO | 5.870454 | 6.354545 | 2956.5 | 4101.0 | 4 | Spectrogram 1 | 1 |
| 4 | /Users/SML161/labeled_datasets/pnre_ecy3329/Re... | /Users/SML161/labeled_datasets/pnre_ecy3329/An... | BHCO | 6.87764 | 7.498095 | 6733.3 | 10376.5 | 5 | Spectrogram 1 | 1 |


The 4-letter codes used for annotations correspond to bird species names. These are called "Alpha" codes.

We can use the name_conversions package to convert them to English common names.

In [None]:
all_annotations.df["annotation"] = all_annotations.df["annotation"].apply(
    name_conversions.alpha_to_common
)

Create species presence/absence labels for each non-overlapping 3-second audio clip (takes about 20 seconds):

In [37]:
clip_labels = all_annotations.clip_labels(
    clip_duration=3, clip_overlap=0, min_label_overlap=0.25
)
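
To make the effect of `min_label_overlap` concrete, here is a minimal pure-Python sketch of the overlap rule, assuming the parameter is the minimum number of seconds an annotation must overlap a clip for that clip to receive the label. It is an illustration only, not opensoundscape's implementation, and the function name `clips_with_label` and the 9-second `audio_duration` are assumptions. The annotation times are taken from the first rows of the table above.

```python
def clips_with_label(start, end, clip_duration=3.0, min_label_overlap=0.25,
                     audio_duration=9.0):
    """Return start times of non-overlapping clips that overlap the
    annotation [start, end] by at least min_label_overlap seconds."""
    labeled = []
    clip_start = 0.0
    while clip_start + clip_duration <= audio_duration:
        clip_end = clip_start + clip_duration
        # seconds shared by the annotation and this clip (negative if disjoint)
        overlap = min(end, clip_end) - max(start, clip_start)
        if overlap >= min_label_overlap:
            labeled.append(clip_start)
        clip_start += clip_duration
    return labeled


print(clips_with_label(0.913636, 2.202273))  # [0.0]
print(clips_with_label(4.234091, 6.054545))  # [3.0]
print(clips_with_label(5.870454, 6.354545))  # [6.0]
```

The second annotation extends about 0.05 s into the 6-9 s clip, which is below the 0.25 s threshold, so only the 3-6 s clip is labeled. The third annotation overlaps the 3-6 s clip by only about 0.13 s, so it labels the 6-9 s clip instead.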

Number of labels for each species:

In [39]:
clip_labels.sum(0).sort_values(ascending=False)

Eastern Towhee                  4348
Wood Thrush                     1773
American Crow                   1662
Northern Cardinal               1337
Black-throated Green Warbler    1266
Black-capped Chickadee          1075
Tufted Titmouse                  917
Ovenbird                         658
Red-eyed Vireo                   456
Common Yellowthroat              452
Blue Jay                         422
Scarlet Tanager                  419
American Redstart                311
Kentucky Warbler                 254
Blue-gray Gnatcatcher            210
Black-and-white Warbler          192
Hermit Thrush                    172
Blue-headed Vireo                161
Brown-headed Cowbird             160
Red-bellied Woodpecker           108
Northern Flicker                 107
Hooded Warbler                   101
Yellow-billed Cuckoo              87
Ruby-crowned Kinglet              65
Louisiana Waterthrush             64
Blue-winged Warbler               54
Rose-breasted Grosbeak            47
...

## Load and apply a machine learning species classifier for bird sounds

The first time you run this line, it will download the model files to your computer.

In [40]:
classifier = bmz.HawkEars()

Downloading model from URL...
File hgnet1.ckpt already exists; skipping download.
Loading model from local checkpoint /Users/SML161/nb_opso/preprocessing/hgnet1.ckpt...
Downloading model from URL...
File hgnet2.ckpt already exists; skipping download.
Loading model from local checkpoint /Users/SML161/nb_opso/preprocessing/hgnet2.ckpt...

Downloading model from URL...
File hgnet3.ckpt already exists; skipping download.
Loading model from local checkpoint /Users/SML161/nb_opso/preprocessing/hgnet3.ckpt...
Downloading model from URL...
File hgnet4.ckpt already exists; skipping download.
Loading model from local checkpoint /Users/SML161/nb_opso/preprocessing/hgnet4.ckpt...
Downloading model from URL...
File hgnet5.ckpt already exists; skipping download.
Loading model from local checkpoint /Users/SML161/nb_opso/preprocessing/hgnet5.ckpt...


                    This architecture is not listed in opensoundscape.ml.cnn_architectures.ARCH_DICT.
                    It will not be available for loading after saving the model with .save() (unless using pickle=True). 
                    To make it re-loadable, define a function that generates the architecture from arguments: (n_classes, n_channels) 
                    then use opensoundscape.ml.cnn_architectures.register_architecture() to register the generating function.

                    The function can also set the returned object's .constructor_name to the registered string key in ARCH_DICT

                    See opensoundscape.ml.cnn_architectures module for examples of constructor functions
                    


Ask the classifier which species are present in the audio files:

In [None]:
predictions = classifier.predict(clip_labels, batch_size=64, num_workers=4)


Evaluate performance:

In [50]:
from sklearn.metrics import average_precision_score, roc_auc_score

# select prediction columns in the same order as the label columns,
# so each score is compared against the label for the same species
y_true = clip_labels.values
y_score = predictions[clip_labels.columns].values

# average='macro': score each species separately, then take the unweighted mean
print(
    f"class-averaged ROC AUC: {roc_auc_score(y_true, y_score, average='macro'):.3f}"
)
print(
    f"class-averaged Average Precision: {average_precision_score(y_true, y_score, average='macro'):.3f}"
)

# average='micro': pool every (clip, species) pair into a single score,
# so species with many labels dominate the result
print(
    f"sample-averaged ROC AUC: {roc_auc_score(y_true, y_score, average='micro'):.3f}"
)
print(
    f"sample-averaged Average Precision: {average_precision_score(y_true, y_score, average='micro'):.3f}"
)

class-averaged ROC AUC: 0.854
class-averaged Average Precision: 0.447
sample-averaged ROC AUC: 0.904
sample-averaged Average Precision: 0.709
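
The class-averaged scores are lower than the sample-averaged ones, which is a common pattern when label counts are unbalanced: macro averaging weights every species equally, while micro averaging pools all (clip, species) pairs so frequent species dominate. The sketch below illustrates this with a simple pure-Python average precision and made-up scores for two placeholder species. It assumes no tied scores and is not a substitute for the scikit-learn functions used above.

```python
def average_precision(y_true, y_score):
    """Mean of the precision values at the rank of each positive
    (assumes there are no tied scores)."""
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    n_pos = sum(y_true)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this positive
    return total / n_pos


# made-up labels and scores for four clips; species names are placeholders
labels = {"common": [1, 1, 1, 0], "rare": [0, 0, 0, 1]}
scores = {"common": [0.9, 0.8, 0.7, 0.1], "rare": [0.6, 0.5, 0.3, 0.4]}

# macro: one score per species, then an unweighted mean
per_class = {sp: average_precision(labels[sp], scores[sp]) for sp in labels}
macro = sum(per_class.values()) / len(per_class)

# micro: pool all (clip, species) pairs and score them together
pooled_labels = [y for sp in labels for y in labels[sp]]
pooled_scores = [s for sp in labels for s in scores[sp]]
micro = average_precision(pooled_labels, pooled_scores)

print(f"macro AP: {macro:.3f}")  # macro AP: 0.667
print(f"micro AP: {micro:.3f}")  # micro AP: 0.917
```

In this toy case the well-detected common species pulls the micro average up, while the poorly ranked rare species counts for half of the macro average.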
