dandavison/elaenia

Mountain Elaenia

Elaenia is a collection of transfer learning experiments for identifying bird species in audio recordings. In all experiments the classifier is trained on a lower-dimensional embedding of the input audio data, computed using a publicly-released model (either VGGish or BirdNet).

The experiments are implemented using a bare-bones ML training pipeline library called sylph, which I wrote for this project. Sylph allows a pipeline to be defined, comprising a series of data transformation / feature extraction steps, with a final step to train the classifier.

The following Sylph code defines a pipeline which performs preliminary transformations of the raw audio data, computes the spectrogram, computes the VGGish embeddings, trains a classifier, and computes metrics on the test set:

from sylph.learners.svm import SVMLearner
from sylph.pipeline import Compose
from sylph.pipeline import TrainingPipeline
from sylph.transforms.audio import Audio2Audio16Bit
from sylph.transforms.pca import PCA
from sylph.transforms.vggish import Audio2Spectrogram
from sylph.transforms.vggish import Spectrogram2VGGishEmbeddings


pipeline = TrainingPipeline(
    transform=Compose(
        [
            Audio2Audio16Bit(normalize_amplitude=True),
            Audio2Spectrogram(),
            Spectrogram2VGGishEmbeddings(),
            PCA(whiten=True),
        ]
    ),
    learn=SVMLearner(),
)
# `dataset` is assumed to have been loaded beforehand (loading is not shown here).
output = pipeline.run(dataset)
metrics = pipeline.get_metrics(dataset, output)
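
To illustrate the general pattern of composing transformation steps and then training on the result, here is a minimal pure-Python sketch. It is not Sylph's implementation: `ComposeSketch`, `TrainingPipelineSketch`, `scale`, `mean_abs` and `nearest_mean_learner` are all hypothetical names invented for this example, and the toy "embedding" (mean absolute amplitude) merely stands in for the spectrogram and VGGish steps.

```python
from typing import Any, Callable, Sequence


class ComposeSketch:
    """Chain transforms so that the output of each step feeds the next."""

    def __init__(self, transforms: Sequence[Callable[[Any], Any]]):
        self.transforms = list(transforms)

    def __call__(self, data: Any) -> Any:
        for transform in self.transforms:
            data = transform(data)
        return data


class TrainingPipelineSketch:
    """Apply the transform to every example, then hand the features to a learner."""

    def __init__(self, transform: Callable[[Any], Any], learn: Callable):
        self.transform = transform
        self.learn = learn

    def run(self, examples):
        features = [(self.transform(x), label) for x, label in examples]
        return self.learn(features)


def scale(samples):
    # Toy stand-in for amplitude normalisation of 16-bit audio.
    return [s / 32768.0 for s in samples]


def mean_abs(samples):
    # Toy one-dimensional "embedding": mean absolute amplitude.
    return sum(abs(s) for s in samples) / len(samples)


def nearest_mean_learner(features):
    # Toy classifier: remember the mean feature per label, predict the nearest.
    sums = {}
    for value, label in features:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + value, count + 1)
    means = {label: total / count for label, (total, count) in sums.items()}

    def predict(value):
        return min(means, key=lambda label: abs(means[label] - value))

    return predict


examples = [([16384, -16384], "loud"), ([328, -328], "quiet")]
sketch = TrainingPipelineSketch(
    transform=ComposeSketch([scale, mean_abs]),
    learn=nearest_mean_learner,
)
predict = sketch.run(examples)
print(predict(sketch.transform([20000, -20000])))
```

Keeping each step a plain callable means steps can be reordered or swapped (for example, replacing one embedding model with another) without touching the training code.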

Example: Melodious and Icterine Warbler contact zone

Melodious Warbler (H. polyglotta) and Icterine Warbler (H. icterina) are similar members of the genus Hippolais that come into contact in a narrow zone in western Europe. Results of classifying audio samples from xeno-canto are illustrated below.

image
Plot symbol type indicates a priori data labels (Melodious Warbler to the south-west, Icterine Warbler to the north-east; labels not used in classification); plot symbol color indicates results of applying the learned classifier. It appears that individuals whose vocalisations are classified ambiguously (pink/red) may be more common closer to the zone of contact (extreme western Germany / eastern France, south to Switzerland).
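
How "ambiguous" is defined for the plot is not stated here. As one possible reading, the sketch below assumes the classifier yields a probability that a recording is an Icterine Warbler and treats probabilities near 0.5 as ambiguous. The function names, the probability input and the 0.5 threshold are all assumptions made for illustration, not taken from the project.

```python
def ambiguity(p_icterine: float) -> float:
    """Return 0.0 for a fully confident call and 1.0 for a coin flip."""
    if not 0.0 <= p_icterine <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return 1.0 - 2.0 * abs(p_icterine - 0.5)


def label(p_icterine: float, threshold: float = 0.5) -> str:
    """Name the species, or 'ambiguous' when the score reaches the threshold."""
    # The threshold is a hypothetical choice; any cut-off in (0, 1] could be used.
    if ambiguity(p_icterine) >= threshold:
        return "ambiguous"
    return "Icterine Warbler" if p_icterine > 0.5 else "Melodious Warbler"
```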

Run the tests

git clone git@github.com:dandavison/elaenia.git
cd elaenia
make init
source env.sh
make test

Mountain Elaenia (Elaenia frantzi) by Daniel Uribe.
