# VQ-CPC: Training
[![Generic badge](https://img.shields.io/badge/GitHub-VQ_CPC-9cf.svg)][github]
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)][notebook]

Notebook Author: [tarepan]

[github]:https://github.com/tarepan/VectorQuantizedCPC
[notebook]:https://colab.research.google.com/github/tarepan/VectorQuantizedCPC/blob/master/VQ_CPC_training.ipynb
[tarepan]:https://github.com/tarepan

Train a vocoder on top of a pretrained VQ-CPC model, then evaluate it by reconstruction and voice conversion on the ZeroSpeech 2019 English dataset.

## Setup

In [None]:
# Clone the repository
!git clone https://github.com/tarepan/VectorQuantizedCPC.git
%cd VectorQuantizedCPC

# Install dependencies
!pip install -r requirements.txt

In [None]:
# Download ZeroSpeech2019En & model weights
## Dataset
!wget --no-check-certificate https://download.zerospeech.com/2019/english.tgz
!mkdir -p zerospeech/2019
!tar xvzf english.tgz -C ./zerospeech/2019
## Train/Test split
!wget https://github.com/bshall/VectorQuantizedCPC/releases/download/v0.1/datasets.zip
!unzip datasets.zip
## Pretrained weights
!wget https://github.com/bshall/VectorQuantizedCPC/releases/download/v0.1/checkpoints.zip
!unzip checkpoints.zip

# Preprocess dataset
!python preprocess.py in_dir=zerospeech/2019 dataset=2019/english

## Training

### Vocoder

In [None]:
!python train_vocoder.py \
    cpc_checkpoint=checkpoints/cpc/english2019/model.ckpt-22000.pt \
    checkpoint_dir=checkpoints/vocoder/english2019/version1 \
    dataset=2019/english
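
The evaluation cells need the path of a vocoder checkpoint produced by this training run. The sketch below picks the checkpoint with the highest step number from a directory. It assumes the vocoder checkpoints follow the same `model.ckpt-<step>.pt` naming as the pretrained CPC checkpoint; that naming is inferred from the paths in this notebook, not verified against the training script. The helper names `checkpoint_step` and `latest_checkpoint` are hypothetical.

```python
import re
from pathlib import Path

# Assumed naming scheme, inferred from "model.ckpt-22000.pt" used for the CPC model.
CKPT_PATTERN = re.compile(r"model\.ckpt-(\d+)\.pt$")


def checkpoint_step(name):
    """Return the training step encoded in a checkpoint filename, or None."""
    match = CKPT_PATTERN.search(str(name))
    return int(match.group(1)) if match else None


def latest_checkpoint(paths):
    """Return the path (as a string) with the highest step, or None if none match."""
    candidates = [(checkpoint_step(p), str(p)) for p in paths]
    candidates = [(step, p) for step, p in candidates if step is not None]
    return max(candidates)[1] if candidates else None


# Directory taken from the `checkpoint_dir` argument of the training cell.
ckpt_dir = Path("checkpoints/vocoder/english2019/version1")
vocoder_checkpoint = latest_checkpoint(ckpt_dir.glob("*.pt"))
print(vocoder_checkpoint)  # None until training has written a checkpoint
```

Steps are compared as integers, so `model.ckpt-10000.pt` ranks above `model.ckpt-900.pt`, which a plain string sort would get wrong.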

## Evaluation

### Reconstruction

In [None]:
# Data selection: Reconstruct 5 utterances from 5 speakers.

import json

origin_A = [
  ["S021", "0333236104"], 
  ["S023", "0136759263"], 
  ["S027", "0007162015"],
  ["S031", "0024528166"],
  ["S032", "0057067061"],
]

reconstruction_spec = []
for (spk_org, utter_id) in origin_A:
  spk_tgt = spk_org
  reconstruction_spec.append([f"english/train/unit/{spk_org}_{utter_id}", spk_tgt, f"{spk_org}_to_{spk_tgt}_{utter_id}"])
content_reconstruction = json.dumps(reconstruction_spec)

with open("./target.json", mode="w") as f:
    f.write(content_reconstruction)
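
Each entry of the synthesis list written above is a triple: input utterance path, target speaker, output name. Before launching a long conversion run, it can help to sanity-check that shape. The following is a minimal sketch; `check_spec` is a hypothetical helper, and the rule that output names should be unique is an assumption (duplicates would presumably overwrite each other), not something the repository documents.

```python
def check_spec(spec):
    """Return a list of human-readable problems found in a synthesis list."""
    problems = []
    seen_outputs = set()
    for i, entry in enumerate(spec):
        # Every entry should be [input_path, target_speaker, output_name].
        is_triple = isinstance(entry, list) and len(entry) == 3
        if not is_triple or not all(isinstance(x, str) for x in entry):
            problems.append(f"entry {i}: expected three strings, got {entry!r}")
            continue
        output_name = entry[2]
        # Assumption: output names must be unique to avoid clobbering results.
        if output_name in seen_outputs:
            problems.append(f"entry {i}: duplicate output name {output_name!r}")
        seen_outputs.add(output_name)
    return problems


example = [
    ["english/train/unit/S021_0333236104", "S021", "S021_to_S021_0333236104"],
]
print(check_spec(example))
```

To check the real file, load it with `json.load` and pass the result to `check_spec`; an empty list means no problems were found.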

In [None]:
# Reconstruction
# Replace `xxxxxx` with the step number of a vocoder checkpoint saved by the training cell.
!python convert.py \
    cpc_checkpoint=checkpoints/cpc/english2019/model.ckpt-22000.pt \
    vocoder_checkpoint=checkpoints/vocoder/english2019/version1/model.ckpt-xxxxxx.pt \
    in_dir=zerospeech/2019 \
    out_dir=results/z2019en/reconstruction \
    synthesis_list=./target.json \
    dataset=2019/english

### Voice Conversion

In [None]:
# Speaker pairs: reproduce the results of the official demo (https://bshall.github.io/VectorQuantizedCPC/)

import json

origin_A = [
  ["S022", "0799854662"], 
  ["S008", "2684330882"],
  ["S007", "0204997433"],
  ["S011", "3385425823"],
  ["S006", "2068766372"],
]
target_A = ["V001", "S040", "S056", "S074", "S090"]

origin_B = [
  ["S003", "1178890909"],
  ["S022", "0598465739"],
  ["S019", "2784269462"],
  ["S030", "1756493637"],
  ["S009", "2963764176"],
]
target_B = ["V002", "S040", "S056", "S074", "S090"]


conversion_spec = []
for (spk_org, utter_id) in origin_A:
  for spk_tgt in target_A:
    conversion_spec.append([f"english/test/{spk_org}_{utter_id}", spk_tgt, f"{spk_org}_to_{spk_tgt}_{utter_id}"])
for (spk_org, utter_id) in origin_B:
  for spk_tgt in target_B:
    conversion_spec.append([f"english/test/{spk_org}_{utter_id}", spk_tgt, f"{spk_org}_to_{spk_tgt}_{utter_id}"])
content_vc = json.dumps(conversion_spec)

with open("./target_vc.json", mode="w") as f:
    f.write(content_vc)
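
The two nested loops above build the same kind of list twice: every source utterance paired with every target speaker. As a sketch, that pattern can be factored into one function. `build_spec` is a hypothetical helper name; the entry layout and the `english/test` prefix are copied from the cell above.

```python
import json


def build_spec(origins, targets, subset):
    """Pair every (speaker, utterance) in `origins` with every speaker in `targets`."""
    return [
        [f"{subset}/{spk_org}_{utter_id}", spk_tgt, f"{spk_org}_to_{spk_tgt}_{utter_id}"]
        for spk_org, utter_id in origins
        for spk_tgt in targets
    ]


# One source utterance converted to two target speakers.
example = build_spec([["S022", "0799854662"]], ["V001", "S040"], "english/test")
print(json.dumps(example, indent=2))
```

With this helper, the full list would be `build_spec(origin_A, target_A, "english/test") + build_spec(origin_B, target_B, "english/test")`, which gives 50 entries for the 5 x 5 pairs in each group.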

In [None]:
# Voice conversion
# Replace `xxxxxx` with the step number of a vocoder checkpoint saved by the training cell.
!python convert.py \
    cpc_checkpoint=checkpoints/cpc/english2019/model.ckpt-22000.pt \
    vocoder_checkpoint=checkpoints/vocoder/english2019/version1/model.ckpt-xxxxxx.pt \
    in_dir=zerospeech/2019 \
    out_dir=results/z2019en/vc \
    synthesis_list=./target_vc.json \
    dataset=2019/english