fede6590/BEATs-train

BEATs fine-tuning pipeline 🎵

This repository builds a preliminary fine-tuning pipeline for BEATs ("Audio Pre-Training with Acoustic Tokenizers"), an audio model developed by Microsoft. You can find the official repository here. The pipeline is a work in progress: it currently focuses on fine-tuning the model on the ESC-50 dataset, and support for custom datasets is planned afterwards.

The initial implementation of this pipeline was developed by the Norwegian Institute for Nature Research (NINA) using Lightning AI. Their work is available here.

UPDATES:

  • Docker image built with:
    • Python 3.10 (slim version image based on Debian bullseye)
    • PyTorch 2.0
    • Torchaudio 2.0
    • Lightning 2.0
  • Optimized Docker image with minimum requirements (fewer dependencies)
  • Fixing the warnings and errors raised while training (still in progress)

PENDING:

  • Prototypical network training orchestration with a config file, as in the fine-tuning case
  • Bash scripting for data preparation
  • Fine-tuning evaluation metrics on ESC-50 dataset
  • Fine-tuning on custom dataset

NOTE: fine-tuning/retraining the Tokenizer is NOT on the agenda at the moment. This pipeline is designed only for training the feature extractor and the prototypical network.

Data preparation

To get started, download the ESC-50 dataset and extract the contents of its ZIP file inside the data folder. The BEATs checkpoint file goes in data/BEATs. The folder structure within the data directory should be as follows:

  • data/
    • BEATs/
      • BEATs_iter3_plus_AS2M.pt
    • ESC-50-master/
      • audio/
      • meta/
      • ...
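
Before building the image, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the `EXPECTED` list and the `missing_entries` helper are hypothetical names, and the paths are copied from the tree above.

```python
from pathlib import Path

# Expected entries relative to the data folder, copied from the tree above.
EXPECTED = [
    "BEATs/BEATs_iter3_plus_AS2M.pt",
    "ESC-50-master/audio",
    "ESC-50-master/meta",
]


def missing_entries(data_dir):
    """Return the expected paths that do not exist under data_dir."""
    root = Path(data_dir)
    return [entry for entry in EXPECTED if not (root / entry).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    missing = missing_entries("data")
    if missing:
        print("Missing from data/:", ", ".join(missing))
    else:
        print("data/ layout looks complete.")
```

Run it from the repository root; an empty result means every entry listed in the tree was found.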

Getting started with training

To build the Docker image, use the following command:

docker build -t beats -f Dockerfile .

Fine-tuning

IMPORTANT: fine_tune/config.yaml contains all the customizable parameters for training.

To fine-tune BEATs on your dataset, use one of the following commands:

  • with available GPU(s)
    docker run -v "$PWD":/app \
                -v "data":/data \
                --gpus all \
                beats \
                python fine_tune/trainer.py fit --config fine_tune/config.yaml
  • without GPU
    docker run -v "$PWD":/app \
                -v "data":/data \
                beats \
                python fine_tune/trainer.py fit --config fine_tune/config.yaml
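
The two variants above differ only in the `--gpus all` flag. As a hedged illustration, the sketch below assembles either command from one function. The `docker_command` helper and its parameters are hypothetical; the mounts, image name, and script arguments are copied from the commands above.

```python
import os
import shlex


def docker_command(script_args, use_gpu, image="beats", workdir=None):
    """Build the `docker run` argument list shown in this README."""
    workdir = workdir or os.getcwd()  # stands in for "$PWD"
    # Volume arguments are reproduced as written in the commands above.
    cmd = ["docker", "run", "-v", f"{workdir}:/app", "-v", "data:/data"]
    if use_gpu:
        cmd += ["--gpus", "all"]  # only the GPU variant carries this flag
    cmd += [image, "python", *script_args]
    return cmd


if __name__ == "__main__":
    fit = ["fine_tune/trainer.py", "fit", "--config", "fine_tune/config.yaml"]
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(docker_command(fit, use_gpu=False)))
```

Passing the returned list to `subprocess.run` would launch the container; printing it first makes it easy to compare against the commands listed above.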

Prototypical network

To train the prototypical network, first create a miniESC50 dataset:

  • with available GPU(s)

    docker run -v "$PWD":/app \
                -v "data":/data \
                --gpus all \
                beats \
                python data_utils/miniESC50.py
  • without GPU

    docker run -v "$PWD":/app \
                -v "data":/data \
                beats \
                python data_utils/miniESC50.py

Then, start the training:

  • with available GPU(s)

    docker run -v "$PWD":/app \
                -v "data":/data \
                --gpus all \
                beats \
                python prototypicalbeats/trainer.py fit --data miniESC50DataModule
  • without GPU

    docker run -v "$PWD":/app \
                -v "data":/data \
                beats \
                python prototypicalbeats/trainer.py fit --data miniESC50DataModule
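
Because training depends on the miniESC50 dataset created in the first step, the two commands have to run in order. The sketch below chains them and stops at the first failure. It is an illustration only: `PREFIX`, `STEPS`, and `run_steps` are hypothetical names, and the commands are the "without GPU" variants copied from above.

```python
import os
import subprocess

# Shared `docker run` prefix, copied from the "without GPU" commands above.
PREFIX = ["docker", "run", "-v", f"{os.getcwd()}:/app", "-v", "data:/data", "beats"]

# Dataset creation must come before training.
STEPS = [
    PREFIX + ["python", "data_utils/miniESC50.py"],
    PREFIX + ["python", "prototypicalbeats/trainer.py", "fit",
              "--data", "miniESC50DataModule"],
]


def run_steps(steps, run=subprocess.call):
    """Run each step in order and return how many finished successfully.

    Stops at the first step whose exit code is non-zero, so training is
    never started if the dataset step failed.
    """
    completed = 0
    for step in steps:
        if run(step) != 0:
            break
        completed += 1
    return completed


if __name__ == "__main__":
    done = run_steps(STEPS)
    print(f"{done} of {len(STEPS)} steps completed")
```

The `run` parameter defaults to `subprocess.call` and can be swapped for a stub to dry-run the sequence without starting a container.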
