
M4Singer MelVocoder

Description

This repository contains the project for the second homework of the course "Deep Learning for Music Analysis and Generation", taught by Prof. Yang at National Taiwan University. The main goal of this work is to train a mel vocoder on the M4Singer dataset: given the mel-spectrogram of a singing segment, the model should generate the waveform corresponding to that spectrogram. This project builds on HiFi-GAN and sobel-operator-pytorch; big thanks to the authors.

Create Environment

pip install -r requirements.txt

Training

The dataset is expected to have a file structure similar to this:

./dataset
    |- audios/
        |- 0001.mp3
        |- 0002.mp3
        |- 0003.mp3
        ...
    |- split
        |- train.txt
        |- valid.txt
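
The split files are plain text lists of the audio files, one entry per line. Below is a minimal sketch for generating them; the 90/10 ratio, the random seed, and whether entries keep their extension are assumptions, so check against how 'train.py' reads the lists:

import os
import random

# Hypothetical split generator: write train/valid lists with one
# audio filename per line. Ratio and seed are illustrative choices.
audio_dir = "dataset/audios"
split_dir = "dataset/split"
os.makedirs(split_dir, exist_ok=True)

files = sorted(f for f in os.listdir(audio_dir) if f.endswith(".mp3"))
random.seed(0)
random.shuffle(files)

cut = int(len(files) * 0.9)
for name, subset in (("train.txt", files[:cut]), ("valid.txt", files[cut:])):
    with open(os.path.join(split_dir, name), "w") as fp:
        fp.write("\n".join(subset))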

The following command starts training with the configuration 'configs/config_v1.json' and saves checkpoints to the 'checkpoints/test/' folder:

python train.py \
--config=configs/config_v1.json \
--input_wavs_dir=dataset/audios \
--input_training_file=./dataset/split/train.txt \
--input_validation_file=./dataset/split/valid.txt \
--checkpoint_path=./checkpoints/test
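
The config follows HiFi-GAN's JSON format. As a quick sanity check before a long run, a sketch like the following can print the audio settings; the field names are drawn from HiFi-GAN's config_v1.json, so verify them against this repository's file:

import json

# Load the training config and print the audio-related fields.
# Field names follow HiFi-GAN's config_v1.json; confirm against
# the actual configs/config_v1.json in this repository.
with open("configs/config_v1.json") as fp:
    config = json.load(fp)

for key in ("sampling_rate", "num_mels", "n_fft", "hop_size", "win_size"):
    print(key, "=", config.get(key))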

Inference

Preprocess Audio Files for Mel-Spectrogram

Please set the source and destination folders in 'preprocess.py', then run it to compute the mel-spectrograms:

python -m preprocess
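
For reference, here is a minimal sketch of what mel-spectrogram extraction typically looks like with librosa. Every parameter here (sample rate, FFT/hop/window sizes, number of mel bands) and the output path are assumptions for illustration, so match them to the values actually used by 'preprocess.py' and the training config:

import os

import librosa
import numpy as np

# Load one clip and compute its mel-spectrogram. All parameters are
# illustrative assumptions; the real values live in the config file.
wav, sr = librosa.load("dataset/audios/0001.mp3", sr=22050)
mel = librosa.feature.melspectrogram(
    y=wav, sr=sr, n_fft=1024, hop_length=256, win_length=1024, n_mels=80
)
log_mel = np.log(np.clip(mel, 1e-5, None))  # log compression, as in HiFi-GAN

os.makedirs("mels", exist_ok=True)  # hypothetical destination folder
np.save("mels/0001.npy", log_mel)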

Generate the Waveforms

  • Please download the model weights and config from Google Drive: config.json, weights
  • Inference with the following command:
# --input_mels_dir: the folder containing the mel-spectrograms
# --output_dir: the folder to save the generated audio files
# --checkpoint_file: the path that includes both the weights and the config
python -m inference \
--input_mels_dir=$input_mel_spec_folder \
--output_dir=$output_audio_folder \
--checkpoint_file=$generator_checkpoint_path
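
For reference, inference essentially follows the HiFi-GAN recipe: restore the generator, push each saved mel-spectrogram through it, and write out the waveform. A rough sketch, assuming HiFi-GAN-style Generator and AttrDict classes, checkpoint layout, and placeholder paths:

import json

import numpy as np
import torch
from scipy.io.wavfile import write

from env import AttrDict      # assumed: HiFi-GAN-style config wrapper
from models import Generator  # assumed: HiFi-GAN-style generator class

# Load the config shipped next to the checkpoint (paths are placeholders).
with open("checkpoints/test/config.json") as fp:
    h = AttrDict(json.load(fp))

# Restore the generator; the "generator" key matches HiFi-GAN checkpoints.
generator = Generator(h)
state = torch.load("checkpoints/test/g_latest", map_location="cpu")
generator.load_state_dict(state["generator"])
generator.eval()
generator.remove_weight_norm()  # standard HiFi-GAN step before inference

# Vocode one saved mel-spectrogram of shape (n_mels, T).
mel = torch.from_numpy(np.load("mels/0001.npy")).float().unsqueeze(0)
with torch.no_grad():
    audio = generator(mel).squeeze().numpy()

write("output/0001.wav", h.sampling_rate, (audio * 32768.0).astype("int16"))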
