
MetaAudio-A-Few-Shot-Audio-Classification-Benchmark


Future Plans (Soon):

  • Release all pre-trained models for community use

News/Updates

  • 24/05/2024: Hyperparameter and original results files are now available in "Other Files"
  • 10/01/2023: New MetaAudio datasets released in the MT-SLVR Paper. The new sets revolve around few-shot speech classification.
  • 06/09/2022: Presented MetaAudio at ICANN22; slides available in the repo
  • 01/07/2022: MetaAudio accepted to ICANN22, to be presented in early September 2022

Citation & Blog Breakdown

A new comprehensive and diverse few-shot acoustic classification benchmark. If you use any code or results from this work, please cite the following: ICANN22 Link or arXiv Link

@InProceedings{10.1007/978-3-031-15919-0_19,
author="Heggan, Calum
and Budgett, Sam
and Hospedales, Timothy
and Yaghoobi, Mehrdad",
title="MetaAudio: A Few-Shot Audio Classification Benchmark",
booktitle="Artificial Neural Networks and Machine Learning -- ICANN 2022",
year="2022",
publisher="Springer International Publishing",
address="Cham",
pages="219--230",
isbn="978-3-031-15919-0"
}

A new and (hopefully) more easily digestible blog breakdown of MetaAudio can be found here!

Environment

IMPORTANT NOTE: The environment auto-configuration file appears to be broken (there is an old open issue for it), and unfortunately I do not have the time to properly fix it right now. My recommendation is to create a new Python 3.8.5 environment, install the key packages below using the versions listed in torch_gpu_env.txt, and then try to run an example, installing whatever else it asks for. Packages I would recommend pinning to the listed versions:

  • PyTorch
  • Learn2learn
  • NumPy
  • pandas
  • Pysoundfile
  • Torchaudio
  • cudatoolkit

We use miniconda for our experimental setup, and for reproduction purposes we include the environment file. It can be set up with the following command:

conda env create --name metaaudio --file torch_gpu_env.txt
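
If the environment file fails (see the note above), a manual fallback along the lines described there is:

conda create --name metaaudio python=3.8.5
conda activate metaaudio

then install the key packages listed above, pinned to the versions in torch_gpu_env.txt, run an example, and install anything else it asks for.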

Contents Overview

This repo contains the following:

  • Multiple problem-statement setups with accompanying results, which can be used moving forward as baselines for few-shot acoustic classification. These include:
    • Normal within-dataset generalisation
    • Joint training, evaluated in both within- and cross-dataset settings
    • Pre-training on additional data followed by a simple classifier for the cross-dataset setting
    • Length-shifted and length-stratified problems for the variable-length dataset setting
  • Standardised meta-learning/few-shot splits for 5 distinct datasets from a variety of sound domains. This includes both baseline splits (randomly generated) and more purposeful ones, such as those based on available meta-data and sample-length distributions
  • A variety of algorithm implementations designed for few-shot classification, ranging from 'cheap' traditional training pipelines to SOTA Gradient-Based Meta-Learning (GBML) models
  • Both fixed and variable-length dataset processing pipelines (see the sketch below)
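
As a rough illustration of the fixed-length side of the processing pipeline, here is a minimal sketch that zero-pads or trims a mono waveform (a 1-D torch tensor) to a target number of samples; to_fixed_length and the example lengths are illustrative, not the repo's actual API:

```python
import torch
import torch.nn.functional as F

def to_fixed_length(waveform: torch.Tensor, target_len: int) -> torch.Tensor:
    """Zero-pad or trim a 1-D waveform to exactly target_len samples."""
    n = waveform.shape[-1]
    if n >= target_len:
        return waveform[..., :target_len]        # trim clips that are too long
    return F.pad(waveform, (0, target_len - n))  # zero-pad clips that are too short

clip = torch.randn(22050)                 # 1 s of audio at a 22.05 kHz sample rate
fixed = to_fixed_length(clip, 5 * 22050)  # force every clip to 5 s
```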

Algorithm Implementations

Algorithms are custom built, operating on a shared framework with a common set of scripts. Those included in the paper are:

  • First-Order MAML
  • Meta-Curvature
  • Prototypical Networks
  • SimpleShot
  • Meta-Baseline

For both MAML & Meta-Curvature we also make use of the Learn2Learn framework.
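
To give a feel for how Learn2Learn slots in, here is a minimal first-order MAML loop over one dummy task per step. It is a sketch only: the toy model, task tensors and hyperparameters are placeholders rather than the repo's training code.

```python
import torch
import learn2learn as l2l

model = torch.nn.Linear(128, 5)                       # toy 5-way classifier head
maml = l2l.algorithms.MAML(model, lr=0.4, first_order=True)
opt = torch.optim.Adam(maml.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()

for step in range(100):
    opt.zero_grad()
    learner = maml.clone()                            # task-specific copy of the model
    support_x, support_y = torch.randn(5, 128), torch.arange(5)  # 5-way 1-shot support
    query_x, query_y = torch.randn(5, 128), torch.arange(5)      # matching query set
    for _ in range(5):                                # inner loop: adapt on the support set
        learner.adapt(loss_fn(learner(support_x), support_y))
    loss_fn(learner(query_x), query_y).backward()     # outer loop: loss on the query set
    opt.step()
```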

Datasets

We primarily cover 5 datasets for the majority of our experimentation; these are:

  • ESC-50
  • NSynth
  • FSDKaggle18
  • BirdClef 2020 (Pruned)
  • VoxCeleb1

In addition to these, we also include 2 extra datasets for cross-dataset testing:

  • Watkins Marine Mammal Sound Database
  • SpeechCommands V2

as well as a proprietary version of AudioSet that we use for pre-training simple classifiers. We obtained this dataset using the scraping code from here:

We include sources for all of these datasets in Dataset Processing.
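
To make the few-shot task setup behind these splits concrete, here is a minimal sketch of sampling a single N-way K-shot episode from class-indexed labels; the data layout, sample_episode and its defaults are hypothetical, not the repo's loaders:

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way=5, k_shot=1, q_queries=5):
    """Sample support/query sets for one N-way K-shot task.

    labels: list where labels[i] is the class of sample i.
    Returns (support, query) as lists of (sample index, episode label) pairs.
    """
    by_class = defaultdict(list)
    for idx, c in enumerate(labels):
        by_class[c].append(idx)

    classes = random.sample(sorted(by_class), n_way)      # pick the N episode classes
    support, query = [], []
    for ep_label, c in enumerate(classes):
        picks = random.sample(by_class[c], k_shot + q_queries)
        support += [(i, ep_label) for i in picks[:k_shot]]
        query += [(i, ep_label) for i in picks[k_shot:]]
    return support, query

# A 5-way 1-shot episode over a toy label list with 10 classes
support, query = sample_episode([i % 10 for i in range(200)])
```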
