
# AudioLM (WIP)

PyTorch Lightning · Config: Hydra · Template · Paper

## Description

A PyTorch implementation of AudioLM. The project is still in its early stages and is not yet at the point of running anything.

## TODO

- Check for existing implementations of w2v-BERT
  - Nothing complete found so far, but lucidrains is working on an implementation of AudioLM here, which might provide some inspiration later
- Check for existing implementations of SoundStream
  - This repo contains a TFLite SoundStream model which it might be possible to inspect
  - This repo contains a vector quantization implementation which might be useful
  - This repo contains a SoundStream implementation which is missing some features from the original paper but will likely still be useful
- Have a look at audio-diffusion-pytorch and see if there is anything useful there
  - There is some good dataset info here; YoutubeDataset in particular sounds interesting and potentially useful
- Implement w2v-BERT (a rough module skeleton is sketched after this list)
  - Implement w2v-BERT network
    - Check the experimental setup in this paper, which matches w2v-BERT
    - Implement feature encoder
    - Implement contrastive module
    - Implement masked prediction module
    - Implement masked prediction loss
    - Implement contrastive loss
  - Implement w2v-BERT data module
  - Implement w2v-BERT training
- Implement SoundStream (a residual vector quantization sketch follows this list)
  - Implement SoundStream network
  - Implement SoundStream data module
  - Implement SoundStream training
- Implement AudioLM
  - Implement AudioLM network
  - Implement AudioLM data module
  - Implement AudioLM training
- Train on LibriSpeech (version available in torchaudio; a data-loading sketch follows this list)
- Train on a music dataset
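
The w2v-BERT items above break into a feature encoder, a contrastive module that also yields discrete targets, and a masked prediction module. The sketch below is only an illustrative skeleton under assumed sizes: the module names, dimensions, plain Transformer layers (the paper uses Conformer blocks), and the nearest-neighbour stand-in for the quantizer are placeholders, not this repository's eventual implementation.

```python
import torch
import torch.nn as nn


class FeatureEncoder(nn.Module):
    """Convolutional subsampling of log-mel features (roughly 4x in time)."""

    def __init__(self, n_mels: int = 80, dim: int = 512):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_mels, dim, kernel_size=3, stride=2, padding=1),
            nn.GELU(),
            nn.Conv1d(dim, dim, kernel_size=3, stride=2, padding=1),
            nn.GELU(),
        )

    def forward(self, mels: torch.Tensor) -> torch.Tensor:
        # mels: (batch, n_mels, time) -> (batch, time // 4, dim)
        return self.conv(mels).transpose(1, 2)


class ContrastiveModule(nn.Module):
    """Context network plus a (heavily simplified) codebook for discrete targets."""

    def __init__(self, dim: int = 512, depth: int = 4, codebook_size: int = 1024):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.codebook = nn.Embedding(codebook_size, dim)

    def forward(self, x: torch.Tensor):
        context = self.encoder(x)
        # Nearest-codebook assignment stands in for the paper's quantizer.
        codes = self.codebook.weight.unsqueeze(0).expand(x.size(0), -1, -1)
        token_ids = torch.cdist(x, codes).argmin(dim=-1)  # (batch, time)
        return context, token_ids


class MaskedPredictionModule(nn.Module):
    """Predicts the quantizer's token IDs from the contrastive context."""

    def __init__(self, dim: int = 512, depth: int = 4, codebook_size: int = 1024):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.to_logits = nn.Linear(dim, codebook_size)

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        return self.to_logits(self.encoder(context))  # (batch, time, codebook)


if __name__ == "__main__":
    mels = torch.randn(2, 80, 400)  # fake log-mel batch
    feats = FeatureEncoder()(mels)
    context, token_ids = ContrastiveModule()(feats)
    logits = MaskedPredictionModule()(context)
    # Masked prediction loss, here over all frames; real training would mask a
    # subset of frames before the contrastive module and only score those.
    mlm_loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), token_ids.reshape(-1)
    )
    print(feats.shape, logits.shape, mlm_loss.item())
```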
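
The SoundStream network item centres on residual vector quantization (RVQ), where each quantizer encodes the residual left by the previous one and the quantized outputs are summed. Below is a minimal sketch under assumed sizes; it omits the EMA codebook updates, commitment loss, and quantizer dropout used in the paper, and is not this repository's implementation.

```python
import torch
import torch.nn as nn


class ResidualVQ(nn.Module):
    """Minimal residual vector quantizer over encoder frames."""

    def __init__(self, num_quantizers: int = 8, codebook_size: int = 1024, dim: int = 512):
        super().__init__()
        self.codebooks = nn.ModuleList(
            nn.Embedding(codebook_size, dim) for _ in range(num_quantizers)
        )

    def forward(self, x: torch.Tensor):
        # x: (batch, time, dim) output of the SoundStream encoder
        residual = x
        quantized = torch.zeros_like(x)
        indices = []
        for codebook in self.codebooks:
            # Nearest codebook entry for the current residual.
            codes = codebook.weight.unsqueeze(0).expand(x.size(0), -1, -1)
            idx = torch.cdist(residual, codes).argmin(dim=-1)  # (batch, time)
            chosen = codebook(idx)                             # (batch, time, dim)
            quantized = quantized + chosen
            residual = residual - chosen
            indices.append(idx)
        # Straight-through estimator so gradients reach the encoder.
        quantized = x + (quantized - x).detach()
        return quantized, torch.stack(indices, dim=-1)


if __name__ == "__main__":
    rvq = ResidualVQ()
    z = torch.randn(2, 100, 512)
    z_q, codes = rvq(z)
    print(z_q.shape, codes.shape)  # (2, 100, 512), (2, 100, 8)
```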
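
For the LibriSpeech item, torchaudio ships the dataset directly. The snippet below is a rough sketch of loading it into fixed-length waveform batches; the root path, subset, crop length, and collate function are placeholder choices rather than this repo's data module.

```python
import os

import torch
import torchaudio
from torch.utils.data import DataLoader

# Downloads the train-clean-100 subset into ./data on first use.
os.makedirs("data", exist_ok=True)
dataset = torchaudio.datasets.LIBRISPEECH(root="data", url="train-clean-100", download=True)


def collate(batch, crop: int = 16000 * 3):
    """Crop or pad each waveform to a fixed 3 s so examples can be batched."""
    waves = []
    for waveform, sample_rate, *_ in batch:  # waveform: (1, num_samples) at 16 kHz
        wave = waveform[0, :crop]
        wave = torch.nn.functional.pad(wave, (0, crop - wave.size(0)))
        waves.append(wave)
    return torch.stack(waves)  # (batch, crop)


loader = DataLoader(dataset, batch_size=8, shuffle=True, collate_fn=collate)
print(next(iter(loader)).shape)  # torch.Size([8, 48000])
```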

## How to run

Install dependencies

```bash
# clone project
git clone https://github.com/RoganInglis/AudioLM
cd AudioLM

# [OPTIONAL] create conda environment
conda create -n myenv python=3.9
conda activate myenv

# install pytorch according to instructions
# https://pytorch.org/get-started/

# install requirements
pip install -r requirements.txt
```

Train model with default configuration

```bash
# train on CPU
python src/train.py trainer=cpu

# train on GPU
python src/train.py trainer=gpu
```

Train model with a chosen experiment configuration from configs/experiment/

```bash
python src/train.py experiment=experiment_name.yaml
```

You can override any parameter from the command line like this:

```bash
python src/train.py trainer.max_epochs=20 datamodule.batch_size=64
```
