
MetaAudio-A-Few-Shot-Audio-Classification-Benchmark


Future Plans (Soon):

  • Release all pre-trained models for community use

News/Updates

  • 24/05/2024: Hyperparameter and original results files are now available in "Other Files"
  • 10/01/2023: New MetaAudio datasets released in the MT-SLVR Paper. The new sets revolve around few-shot speech classification.
  • 06/09/2022: Presented MetaAudio at ICANN22; slides available in the repo
  • 01/07/2022: MetaAudio accepted to ICANN22, to be presented in early September 2022

Citation & Blog Breakdown

A new comprehensive and diverse few-shot acoustic classification benchmark. If you use any code or results from this work, please cite the following: ICANN22 Link or arXiv Link

@InProceedings{10.1007/978-3-031-15919-0_19,
author="Heggan, Calum
and Budgett, Sam
and Hospedales, Timothy
and Yaghoobi, Mehrdad",
title="MetaAudio: A Few-Shot Audio Classification Benchmark",
booktitle="Artificial Neural Networks and Machine Learning -- ICANN 2022",
year="2022",
publisher="Springer International Publishing",
address="Cham",
pages="219--230",
isbn="978-3-031-15919-0"
}

A new and (hopefully) more easily digestible blog breakdown of MetaAudio can be found here!

Environment

IMPORTANT NOTE: The environment auto-configuration file appears to be broken (there is an old open issue for it), and unfortunately I do not have the time to properly fix it right now. My recommendation is to create a new Python 3.8.5 environment, install the key packages below using the versions listed in torch_gpu_env.txt, and then try to run an example, installing whatever else it asks for. Packages I would recommend pinning to the listed versions:

  • PyTorch
  • Learn2learn
  • NumPy
  • pandas
  • Pysoundfile
  • Torchaudio
  • cudatoolkit

We use miniconda for our experimental setup, and for reproduction purposes we include the environment file. It can be set up with the following command:

conda env create --name metaaudio --file torch_gpu_env.txt
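
If the environment file fails (see the note above), a manual fallback along the lines described there is:

conda create --name metaaudio python=3.8.5
conda activate metaaudio

then install the key packages listed above, pinned to the versions in torch_gpu_env.txt, run an example, and install anything else it asks for.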

Contents Overview

This repo contains the following:

  • Multiple problem-statement setups with accompanying results, which can be used moving forward as baselines for few-shot acoustic classification. These include:
    • Normal within-dataset generalisation
    • Joint training, evaluated in both within- and cross-dataset settings
    • Pre-training on additional data followed by a simple classifier for the cross-dataset setting
    • Length-shifted and length-stratified problems for the variable-length dataset setting
  • Standardised meta-learning/few-shot splits for 5 distinct datasets from a variety of sound domains. This includes both baseline splits (randomly generated) and more purposeful ones, such as those based on available meta-data and sample-length distributions
  • A variety of algorithm implementations designed for few-shot classification, ranging from 'cheap' traditional training pipelines to SOTA Gradient-Based Meta-Learning (GBML) models
  • Both fixed and variable-length dataset processing pipelines (see the sketch below)
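
As a rough illustration of the fixed-length side of the processing pipeline, here is a minimal sketch that zero-pads or trims a mono waveform (a 1-D torch tensor) to a target number of samples; to_fixed_length and the example lengths are illustrative, not the repo's actual API:

```python
import torch
import torch.nn.functional as F

def to_fixed_length(waveform: torch.Tensor, target_len: int) -> torch.Tensor:
    """Zero-pad or trim a 1-D waveform to exactly target_len samples."""
    n = waveform.shape[-1]
    if n >= target_len:
        return waveform[..., :target_len]        # trim clips that are too long
    return F.pad(waveform, (0, target_len - n))  # zero-pad clips that are too short

clip = torch.randn(22050)                 # 1 s of audio at a 22.05 kHz sample rate
fixed = to_fixed_length(clip, 5 * 22050)  # force every clip to 5 s
```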

Algorithm Implementations

Algorithms are custom built, operating on a shared framework with a common set of scripts. Those included in the paper are:

  • First-Order MAML
  • Meta-Curvature
  • Prototypical Networks
  • SimpleShot
  • Meta-Baseline

For both MAML & Meta-Curvature we also make use of the Learn2Learn framework.
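
To give a feel for how Learn2Learn slots in, here is a minimal first-order MAML loop over one dummy task per step. It is a sketch only: the toy model, task tensors and hyperparameters are placeholders rather than the repo's training code.

```python
import torch
import learn2learn as l2l

model = torch.nn.Linear(128, 5)                       # toy 5-way classifier head
maml = l2l.algorithms.MAML(model, lr=0.4, first_order=True)
opt = torch.optim.Adam(maml.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()

for step in range(100):
    opt.zero_grad()
    learner = maml.clone()                            # task-specific copy of the model
    support_x, support_y = torch.randn(5, 128), torch.arange(5)  # 5-way 1-shot support
    query_x, query_y = torch.randn(5, 128), torch.arange(5)      # matching query set
    for _ in range(5):                                # inner loop: adapt on the support set
        learner.adapt(loss_fn(learner(support_x), support_y))
    loss_fn(learner(query_x), query_y).backward()     # outer loop: loss on the query set
    opt.step()
```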

Datasets

We primarily cover 5 datasets for the majority of our experimentation; these are:

  • ESC-50
  • NSynth
  • FSDKaggle18
  • BirdClef 2020 (Pruned)
  • VoxCeleb1

In addition to these, we also include 2 extra datasets for cross-dataset testing:

  • Watkins Marine Mammal Sound Database
  • SpeechCommands V2

as well as a proprietary version of AudioSet that we use for pre-training simple classifiers. We obtained this dataset using the scraping code from here:

We include sources for all of these datasets in Dataset Processing.
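
To make the few-shot task setup behind these splits concrete, here is a minimal sketch of sampling a single N-way K-shot episode from class-indexed labels; the data layout, sample_episode and its defaults are hypothetical, not the repo's loaders:

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way=5, k_shot=1, q_queries=5):
    """Sample support/query sets for one N-way K-shot task.

    labels: list where labels[i] is the class of sample i.
    Returns (support, query) as lists of (sample index, episode label) pairs.
    """
    by_class = defaultdict(list)
    for idx, c in enumerate(labels):
        by_class[c].append(idx)

    classes = random.sample(sorted(by_class), n_way)      # pick the N episode classes
    support, query = [], []
    for ep_label, c in enumerate(classes):
        picks = random.sample(by_class[c], k_shot + q_queries)
        support += [(i, ep_label) for i in picks[:k_shot]]
        query += [(i, ep_label) for i in picks[k_shot:]]
    return support, query

# A 5-way 1-shot episode over a toy label list with 10 classes
support, query = sample_episode([i % 10 for i in range(200)])
```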
