ASA - (A)udio-to-(S)ong (A)lignment

An Audio-to-lyrics alignment with low memory footprint:

A Kaldi-based framework for audio-to-lyrics alignment and transcription with low RAM memory consumption.

Future work: There will be new scripts provided for aligning lyrics in shorter auio clips in a less-time consuming way.

Requirements

Ubuntu >= 14.04

Docker

~35GB empty space on harddisk

Setup

OPTION a) INSTALL and RUN using Docker

For easy setup, we create a Docker container and install everything inside. All the libraries, dependencies, scripts and models will be downloaded automatically.

Run below from the same directory with this README.md file. This process may take around an hour.

docker build --tag asa:latest -f Dockerfile .

SETUP the environment

Set path to where you store the test data:

DATASET='path-to-testset'

which should contain both the audio and lyrics text files at "$DATASET/wav" and "$DATASET/lyrics"

docker run -v $DATASET:/a2l/dataset -it asa:latest

then, once you are inside the Docker container run:

source /root/miniconda3/etc/profile.d/conda.sh
conda activate ASA

(You need to run the lines above every time you (re)start the Docker container.)

OPTION b) INSTALL and RUN locally

1) Kaldi installation

This framework is built as a Kaldi[1] recipe For instructions on Kaldi installation, please visit https://github.com/kaldi-asr/kaldi

2) Set up virtual environment and install dependencies

cd a2l
git clone https://github.com/facebookresearch/demucs
cp local/demucs/separate.py demucs/demucs/separate.py
conda env update -f environment.yml

3) Setup Kaldi environment

Modify KALDI_ROOT in a2l/path.sh according to where your Kaldi installation is.

PATH_TO_YOUR_KALDI_INSTALLATION=
sed -i -- 's/path-to-your-kaldi-installation/${PATH_TO_YOUR_KALDI_INSTALLATION}/g' a2l/path.sh

4) Setup the conda environment

Navigate to the working directory and activate the environment.

cd 'dir-of-this-repository'/a2l
conda activate ALA

HOW TO RUN

This pipeline was designed for retrieving word alignments from long music recordings using low computational resources. There is no limit for the length of the input music recording.

IMPORTANT NOTICE: You need a pretrained acoustic model and an ivector model to run the alignment scripts below. To train your own Kaldi lyrics transcriber, please refer to https://github.com/emirdemirel/ALTA/blob/master/run_mstrenet.sh.

Set variables:

wavpath='full-path-to-audio'        # i.e. /home/emir/ALTA/LyricsTranscription/wav/Bohemian_Rhapsody.mp3
lyricspath='full-path-to-lyrics'    # i.e /home/emir/ALTA/LyricsTranscription/lyrics/Bohemian_Rhapsody.raw.txt
savepath='output-folder-name'       # This will be saved at 'dir-of-this-repository'/a2l/$savepath

Run the pipeline:

./run_lyrics_alignment_long.sh $wavpath $lyricspath $savepath

Run the pipeline for accapella recordings:

./run_lyrics_alignment_long.sh --polyphonic false $wavpath $lyricspath $savepath

Note : If you have any problems during the pipeline, look up for the relevant process in run_lyrics_alignment_long.sh

REFERENCES

If you use this code in your work, please cite:

@INPROCEEDINGS{demirel2021_asa,
  author={Demirel, Emir and Ahlbäck, Sven and Dixon, Simon},
  booktitle={ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, 
  title={Low Resource Audio-To-Lyrics Alignment from Polyphonic Music Recordings}, 
  year={2021},
  pages={586-590},
  doi={10.1109/ICASSP39728.2021.9414395}}

If you use the pretrained models under a2l/models directory, please cite:

@INPROCEEDINGS{demirel2020_alta,
  author={Demirel, Emir and Ahlbäck, Sven and Dixon, Simon},
  booktitle={2020 International Joint Conference on Neural Networks (IJCNN)}, 
  title={Automatic Lyrics Transcription using Dilated Convolutional Neural Networks with Self-Attention}, 
  year={2020},
  pages={1-8},
  doi={10.1109/IJCNN48605.2020.9207052}}

Name		Name	Last commit message	Last commit date
Latest commit History 126 Commits
a2l		a2l
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ASA - (A)udio-to-(S)ong (A)lignment

An Audio-to-lyrics alignment with low memory footprint:

Requirements

Setup

OPTION a) INSTALL and RUN using Docker

SETUP the environment

OPTION b) INSTALL and RUN locally

1) Kaldi installation

2) Set up virtual environment and install dependencies

3) Setup Kaldi environment

4) Setup the conda environment

HOW TO RUN

REFERENCES

About

Releases

Packages

Languages

License

emirdemirel/ASA_ICASSP2021

Folders and files

Latest commit

History

Repository files navigation

ASA - (A)udio-to-(S)ong (A)lignment

An Audio-to-lyrics alignment with low memory footprint:

Requirements

Setup

OPTION a) INSTALL and RUN using Docker

SETUP the environment

OPTION b) INSTALL and RUN locally

1) Kaldi installation

2) Set up virtual environment and install dependencies

3) Setup Kaldi environment

4) Setup the conda environment

HOW TO RUN

REFERENCES

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages