# NNSVS

Before you begin, download kiritan_singing.zip from the [Tohoku Kiritan singing database login page for researchers](https://zunko.jp/kiridev/login.php) and extract it to a directory of your choice.

## Install requirements

In [None]:
! git clone https://github.com/r9y9/pysinsy
! cd pysinsy && export SINSY_INSTALL_PREFIX=/usr/ && pip3 install .
! git clone https://github.com/r9y9/nnmnkwii
! cd nnmnkwii && pip3 install .

In [None]:
! git clone https://github.com/r9y9/nnsvs
! cd nnsvs && pip3 install -e .

## Setup

In [None]:
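# Directory containing the wav files from the extracted kiritan_singing.zip
WAV_ROOT = '/workspace/kiritan_singing/wav'
# Recipe directory inside the cloned nnsvs repository
SVS_WORLD_CONV = 'nnsvs/egs/kiritan_singing/svs-world-conv/'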

In [None]:
! sed -i 's@\/home\/ryuichi\/data\/kiritan_singing\/wav@'"$WAV_ROOT"'@g' $SVS_WORLD_CONV/config.yaml
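
The `sed` call above rewrites the default wav path in `config.yaml` to `WAV_ROOT`. The same substitution can be sketched in Python, which makes it easy to confirm whether the file actually changed. The helper names `replace_wav_root` and `patch_config` are hypothetical; only the default path is taken from the `sed` command.

```python
from pathlib import Path

# Default path hard-coded in the recipe's config.yaml (copied from the sed command above).
DEFAULT_WAV_ROOT = '/home/ryuichi/data/kiritan_singing/wav'


def replace_wav_root(config_text, new_root, old_root=DEFAULT_WAV_ROOT):
    """Return config_text with every occurrence of old_root replaced by new_root."""
    return config_text.replace(old_root, new_root)


def patch_config(config_path, new_root):
    """Rewrite config_path in place and return True if its content changed."""
    path = Path(config_path)
    before = path.read_text()
    after = replace_wav_root(before, new_root)
    if after != before:
        path.write_text(after)
    return after != before
```

Calling `patch_config(join(SVS_WORLD_CONV, 'config.yaml'), WAV_ROOT)` should return `False` when the `sed` cell has already been run, because the default path is no longer present in the file.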

## Data download

In [None]:
! cd $SVS_WORLD_CONV && ./run.sh --stage -1 --stop-stage -1

## Data preparation

In [None]:
! cd $SVS_WORLD_CONV && rm -rf downloads/kiritan_singing/kiritan_singing_extra
! cd $SVS_WORLD_CONV/downloads/kiritan_singing && git clone https://github.com/r9y9/kiritan_singing_extra

In [None]:
! mkdir -p /usr/local/lib/sinsy
! ln -sfn /usr/lib/sinsy/dic /usr/local/lib/sinsy/dic

In [None]:
! cd $SVS_WORLD_CONV && ./run.sh --stage 0 --stop-stage 0

## Feature extraction

In [None]:
! cd $SVS_WORLD_CONV && ./run.sh --stage 1 --stop-stage 1

## Training models

### - Timelag model

In [None]:
! cd $SVS_WORLD_CONV && ./run.sh --stage 2 --stop-stage 2

### - Phoneme duration model

In [None]:
! cd $SVS_WORLD_CONV && ./run.sh --stage 3 --stop-stage 3

### - Acoustic model

In [None]:
! cd $SVS_WORLD_CONV && ./run.sh --stage 4 --stop-stage 4

## Synthesis

### - Generate features from timelag/duration/acoustic models

In [None]:
! cd $SVS_WORLD_CONV && ./run.sh --stage 5 --stop-stage 5

### - Synthesize waveforms

In [None]:
! cd $SVS_WORLD_CONV && ./run.sh --stage 6 --stop-stage 6
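
Every cell from the data download onward calls `run.sh` with the same `--stage` and `--stop-stage` options. The sketch below summarizes the stage numbers as they are used in this notebook and builds the equivalent command from Python. `STAGES`, `build_run_command`, and `run_stages` are hypothetical names introduced here, and the stage descriptions simply mirror the notebook headings.

```python
import subprocess

# Stage numbers as used in this notebook, labeled by the corresponding headings.
STAGES = {
    -1: 'Data download',
    0: 'Data preparation',
    1: 'Feature extraction',
    2: 'Train timelag model',
    3: 'Train phoneme duration model',
    4: 'Train acoustic model',
    5: 'Generate features from timelag/duration/acoustic models',
    6: 'Synthesize waveforms',
}


def build_run_command(stage, stop_stage=None):
    """Return the run.sh argument list for stages stage..stop_stage (inclusive)."""
    if stop_stage is None:
        stop_stage = stage
    if stage not in STAGES or stop_stage not in STAGES:
        raise ValueError('unknown stage: {} .. {}'.format(stage, stop_stage))
    if stop_stage < stage:
        raise ValueError('stop_stage must not be smaller than stage')
    return ['./run.sh', '--stage', str(stage), '--stop-stage', str(stop_stage)]


def run_stages(recipe_dir, stage, stop_stage=None):
    """Run the selected stages inside recipe_dir and raise if run.sh fails."""
    cmd = build_run_command(stage, stop_stage)
    print('Running:', ' '.join(cmd))
    subprocess.run(cmd, cwd=recipe_dir, check=True)
```

For example, `run_stages(SVS_WORLD_CONV, 2, 4)` would train the three models in one call instead of three separate cells. Running one stage per cell, as the notebook does, keeps the logs separated and makes it easier to resume after a failure.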

## Generated samples

In [None]:
from glob import glob
from os.path import join

from IPython.display import Audio, display

sample_rate = 48000
# Collect every synthesized waveform under the experiment directory
pattern = join(SVS_WORLD_CONV, 'exp/kiritan/synthesis/**/label_phone_score/*.wav')
synthesized_wav_paths = sorted(glob(pattern, recursive=True))

for wav_path in synthesized_wav_paths:
    print(wav_path)
    display(Audio(wav_path, rate=sample_rate))

## Synthesize your own songs

### Generate labels

In [None]:
from os.path import join

sample_dir = 'sample/'
song_list_path = join(sample_dir, 'song.list')
sample_score = join(sample_dir, 'score')
sample_label = join(sample_dir, 'label')
sample_wav = join(sample_dir, 'wav')

In [None]:
import pysinsy
from os import makedirs
from os.path import basename, join, splitext
from glob import glob

sinsy = pysinsy.sinsy.Sinsy()
assert sinsy.setLanguages('j', '/usr/local/lib/sinsy/dic')

song_list = []
musicxml_files = glob(join(sample_score, '*.*xml'))

makedirs(sample_label, exist_ok=True)

for musicxml_file in musicxml_files:
    assert sinsy.loadScoreFromMusicXML(musicxml_file)
    # Request full-context labels rather than monophone labels
    is_mono = False
    labels = sinsy.createLabelData(is_mono, 1, 1).getData()
    song_name = splitext(basename(musicxml_file))[0]
    song_list.append(song_name)
    lab_file_path = join(sample_label, song_name + '.lab')

    with open(lab_file_path, 'w') as f:
        f.write('\n'.join(labels))

    # Reset the loaded score before processing the next file
    sinsy.clearScore()

with open(song_list_path, 'w') as f:
    f.write('\n'.join(song_list))
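
Before moving on to synthesis, it can help to confirm that every song named in `song.list` has a non-empty label file. The following is a minimal sketch of such a check. `find_missing_labels` is a hypothetical helper, not part of pysinsy or nnsvs.

```python
from os.path import getsize, isfile, join


def find_missing_labels(song_list_path, label_dir):
    """Return the song names in song_list_path that lack a non-empty .lab file."""
    with open(song_list_path) as f:
        song_names = [line.strip() for line in f if line.strip()]

    missing = []
    for name in song_names:
        lab_path = join(label_dir, name + '.lab')
        # A missing or zero-byte file suggests label generation failed for this song.
        if not isfile(lab_path) or getsize(lab_path) == 0:
            missing.append(name)
    return missing
```

With the variables defined above, `find_missing_labels(song_list_path, sample_label)` should return an empty list if label generation succeeded for every score.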

### Synthesize

In [None]:
from os.path import join

spk = 'kiritan'
question_path = 'nnsvs/egs/_common/hed/jp_qst001_nnsvs.hed'
expdir = join(SVS_WORLD_CONV, 'exp/kiritan')
dump_norm_dir = join(SVS_WORLD_CONV, 'dump', spk, 'norm')

! nnsvs-synthesis question_path=$question_path \
timelag.checkpoint=$expdir/timelag/latest.pth \
timelag.in_scaler_path=$dump_norm_dir/in_timelag_scaler.joblib \
timelag.out_scaler_path=$dump_norm_dir/out_timelag_scaler.joblib \
timelag.model_yaml=$expdir/timelag/model.yaml \
duration.checkpoint=$expdir/duration/latest.pth \
duration.in_scaler_path=$dump_norm_dir/in_duration_scaler.joblib \
duration.out_scaler_path=$dump_norm_dir/out_duration_scaler.joblib \
duration.model_yaml=$expdir/duration/model.yaml \
acoustic.checkpoint=$expdir/acoustic/latest.pth \
acoustic.in_scaler_path=$dump_norm_dir/in_acoustic_scaler.joblib \
acoustic.out_scaler_path=$dump_norm_dir/out_acoustic_scaler.joblib \
acoustic.model_yaml=$expdir/acoustic/model.yaml \
utt_list=$song_list_path \
in_dir=$sample_label \
out_dir=$sample_wav \
ground_truth_duration=false
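
The command above expects a checkpoint, a model definition, and input/output scalers for each of the three models. A small pre-flight check, sketched below, lists any of those files that are missing so that a skipped training stage shows up before synthesis starts. The helper names are hypothetical; the file layout is copied from the arguments passed to `nnsvs-synthesis` above.

```python
from os.path import isfile, join

# The three models referenced by the nnsvs-synthesis arguments above.
MODELS = ('timelag', 'duration', 'acoustic')


def required_synthesis_files(expdir, dump_norm_dir):
    """List the checkpoint, model.yaml, and scaler paths for each model."""
    paths = []
    for model in MODELS:
        paths.append(join(expdir, model, 'latest.pth'))
        paths.append(join(expdir, model, 'model.yaml'))
        paths.append(join(dump_norm_dir, 'in_{}_scaler.joblib'.format(model)))
        paths.append(join(dump_norm_dir, 'out_{}_scaler.joblib'.format(model)))
    return paths


def missing_synthesis_files(expdir, dump_norm_dir):
    """Return the required paths that do not exist on disk."""
    return [p for p in required_synthesis_files(expdir, dump_norm_dir)
            if not isfile(p)]
```

Calling `missing_synthesis_files(expdir, dump_norm_dir)` before the synthesis cell should return an empty list once stages 1 through 4 have completed. The question file (`question_path`) is not covered by this sketch and can be checked separately with `isfile`.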