Unsupervised meta-learning for improving clustering quality in unlabeled bird sound datasets

ear-team/darksound

Meta-Embedded Clustering (MEC)

A new method for improving clustering quality in unlabeled bird sound datasets

In recent years, ecoacoustics has offered an alternative to traditional biodiversity monitoring techniques through the development of passive acoustic monitoring (PAM) systems. Among other things, these systems make it possible to automatically detect and identify species, such as crepuscular and nocturnal tropical birds, that are difficult for human observers to track. PAM systems generate large audio datasets, but these monitoring techniques still face the challenge of inferring ecological information that can be transferred to conservationists. In most cases, several thousand hours of recordings need to be manually labeled by an expert, which limits the operability of the systems. Building on advances in meta-learning algorithms and unsupervised learning techniques, we propose Meta-Embedded Clustering (MEC), a new method to improve the quality of clustering in unlabeled bird sound datasets.

The MEC method is organized in two main steps: (a) fine-tuning of a pretrained convolutional neural network (CNN) backbone with different meta-learning algorithms using pseudo-labeled data, and (b) clustering of manually labeled bird sounds in the latent space based on vector embeddings extracted from the fine-tuned CNN. The MEC method significantly enhanced clustering performance, achieving an 85% improvement over the traditional approach of solely employing CNN features extracted from a general dataset. However, this enhanced performance came with the trade-off of excluding a portion of the data categorized as noise. By improving the quality of clustering in unlabeled bird sound datasets, the MEC method is designed to help ecoacousticians manage acoustic units of bird songs and calls clustered according to their similarities, and identify potential clusters of unknown species.

Installation

Download Anaconda and prepare your environment using the command line.

conda create --name darksound python=3.10
conda activate darksound

Install the required libraries using the package installer pip.

pip install -r requirements.txt
# If you cannot build wheels for hdbscan, install it with conda
# conda install -c conda-forge hdbscan

Install the darksound package. First go to the root directory of the package (where the pyproject.toml config file is located), then run:

pip install .

Usage

Darksound Dataset

Darksound is an open-source and code-based dataset for the evaluation of unsupervised meta-learning algorithms in the context of ecoacoustics. The dataset is composed of regions of interest (ROIs) of nocturnal and crepuscular bird species living in tropical environments that were automatically segmented using the Python package Bambird (Michaud et al., 2023). It is split into a training set and a test set.

The dataset can be downloaded from Zenodo or directly from Python:

from torchvision.models import DenseNet121_Weights
from torchvision import transforms
# Import the Darksound class
from dataset import Darksound 

# Load DenseNet weights for transformation
weights = DenseNet121_Weights.IMAGENET1K_V1 
# Download the Darksound training set
train_set = Darksound(split='train', transform=transforms.Compose([weights.transforms()]), download=True)

a) Fine-tuning pretrained CNN backbones with meta-learning algorithms

The experiments focus on metric-based meta-learning algorithms. Three meta-learning algorithms (Matching Networks, Prototypical Networks and Relation Networks) can be used to fine-tune four different CNN backbones (ResNet18, VGG16, DenseNet121, AlexNet). Set the desired parameters in the config.yaml file and run the following command:

python fine-tuning.py
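To give an intuition for the metric-based approach, here is a minimal sketch of the classification rule used by Prototypical Networks: each class is represented by the mean of its support embeddings, and a query is assigned to the class of the nearest prototype. This is an illustration in plain Python, not the code run by fine-tuning.py; the function names `prototypes` and `classify` and the toy 2-D embeddings are hypothetical, and Euclidean distance is assumed.

```python
import math


def prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype."""
    sums, counts = {}, {}
    for vec, label in zip(support_embeddings, support_labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(query, protos):
    """Return the label of the prototype closest to the query (Euclidean)."""
    return min(protos, key=lambda label: math.dist(query, protos[label]))


# Toy 2-way 2-shot episode with 2-D embeddings (illustrative values only).
support = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
labels = ["a", "a", "b", "b"]

protos = prototypes(support, labels)
prediction = classify([0.9, 0.9], protos)
```

In practice the embeddings would come from the CNN backbone, and training would backpropagate a loss over the distances to the prototypes rather than take a hard nearest-prototype decision.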

b) Clustering vector embeddings extracted from the fine-tuned CNN

Clustering of the vector embeddings is performed using the Meta Embedded Clustering (MEC) method. The objective of the MEC method is to improve the quality of clustering of unlabeled bird sound datasets in order to determine a number of clusters close to the ground truth. The MEC method can be run on the Darksound dataset by indicating the path to the fine-tuned CNN and running the following command:

python clustering.py --path "embeddings/prototypical-networks-5way-1shot-densenet.pt"

An example evaluation of the clustering performance of the MEC method is available in this notebook.
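Because part of the data may be set aside as noise, it can help to check how many clusters were found and what fraction of samples was excluded. The sketch below assumes the cluster labels follow the HDBSCAN convention, where noise points receive the label -1; whether clustering.py exposes its labels in this form is an assumption, and `summarize_clusters` is a hypothetical helper, not part of the darksound package.

```python
from collections import Counter


def summarize_clusters(labels, noise_label=-1):
    """Count clusters and the share of samples labeled as noise."""
    counts = Counter(labels)
    # Remove the noise label so it is not counted as a cluster.
    n_noise = counts.pop(noise_label, 0)
    return {
        "n_clusters": len(counts),
        "noise_fraction": n_noise / len(labels) if labels else 0.0,
        "cluster_sizes": dict(counts),
    }


# Illustrative labels only: three clusters (0, 1, 2) and two noise points.
labels = [0, 0, 1, -1, 1, 2, -1, 0]
summary = summarize_clusters(labels)
```

Comparing `n_clusters` with the number of species in the ground truth, alongside `noise_fraction`, makes the trade-off between cluster quality and discarded data explicit.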

Citing this work

If you find the MEC method useful for your research, please consider citing it as:

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change. Please make sure to update tests as appropriate.

Acknowledgements

We would like to thank the authors of EasyFSL for open-sourcing their code and publicly releasing checkpoints, and the contributors to Bambird for their excellent work in creating a labelling function to build cleaner bird song recording datasets.
