# LPCNet
[![Generic badge](https://img.shields.io/badge/GitHub-LPCNet-9cf.svg)][github]
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)][notebook]  
LPCNet inference and training demo

[github]:https://github.com/tarepan/LPCNet
[notebook]:https://colab.research.google.com/github/tarepan/LPCNet/blob/master/LPCNet.ipynb

## Env Check

In [None]:
!cat /proc/uptime | awk '{print $1 /60 /60 /24 " days (" $1 " sec)"}'
!head -n 1 /proc/driver/nvidia/gpus/**/information
!python --version
!pip show torch | sed '2!d'
!/usr/local/cuda/bin/nvcc --version | sed '4!d'

## Setup

In [None]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/gdrive')

In [None]:
!apt install -y autoconf automake libtool
!git clone https://github.com/tarepan/LPCNet.git
%cd LPCNet

!pip install git+https://github.com/tarepan/speechdatasety.git

Run the next cell only if you need the pretrained model:

In [None]:
# Step 0 - Model data
!./download_model.sh

### Build

In [None]:
# Step 1 - Env
%env CFLAGS=-Ofast -g -march=native
!echo $CFLAGS

# Step 2 - Build
!./autogen.sh    # Download the latest model & run `autoreconf`
!./configure     # Run the generated configure script
!make

## Inference

### Setup

Input preparation (wav file => pcm blob file)

In [None]:
import librosa
import numpy as np

from speechdatasety.helper.process import unit_to_s16pcm


# ========= Change this wave path =========
p = "../test_02.wav"
# =========================================


# `i_inference_wave.s16` should be 16bit/16kHz PCM
audio_unit, _ = librosa.load(p, sr=16000, mono=True)
audio_s16 = unit_to_s16pcm(audio_unit)
audio_s16.tofile("./i_inference_wave.s16")
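
The cell above converts floating-point samples into headerless 16-bit PCM. As a rough illustration of what such a conversion involves, here is a minimal NumPy sketch. `float_to_s16` is a hypothetical stand-in written for this note; the scaling constant and the clipping behavior are assumptions and may differ from what `unit_to_s16pcm` actually does.

```python
import numpy as np


def float_to_s16(wave: np.ndarray) -> np.ndarray:
    """Convert float samples in [-1.0, 1.0] to int16 PCM (hypothetical helper)."""
    # Clip first so out-of-range samples saturate instead of wrapping around.
    clipped = np.clip(wave, -1.0, 1.0)
    # Scale by 32767 (assumed convention) and round to the nearest integer.
    return np.round(clipped * 32767.0).astype(np.int16)


# Small demonstration: silence, half scale, negative full scale, and an overshoot.
samples = np.array([0.0, 0.5, -1.0, 1.5], dtype=np.float32)
pcm = float_to_s16(samples)
print(pcm.dtype, pcm.tolist())
```

Clipping before the cast matters: casting an out-of-range float straight to `int16` would not saturate cleanly and can produce audible clicks.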

### Demo - Speech Compression
wave -> (compression) -> codes -> (decompression) -> wave

In [None]:
# Encode `i_inference_wave.s16` (16bit/16kHz PCM, machine endian)
#   to `compressed.bin` (8 bytes per 40-ms packet, raw, no header)
!./lpcnet_demo -encode i_inference_wave.s16 compressed.bin

# Decode `compressed.bin` to `output.pcm` (16bit/16kHz PCM)
!./lpcnet_demo -decode compressed.bin output.pcm


from IPython.display import Audio, display

i = np.fromfile("./i_inference_wave.s16",  dtype=np.int16)
o = np.fromfile("./output.pcm", dtype=np.int16)

print("Before:")
display(Audio(i,   rate=16000))
print("After:")
display(Audio(o,   rate=16000))
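
The encoder comment above states 8 bytes per 40-ms packet. Taking that figure at face value, the bitrate and the compression ratio against the 16-bit/16 kHz input can be worked out with a little arithmetic. The constant and function names below are illustrative only and are not part of LPCNet.

```python
PACKET_BYTES = 8       # bytes per packet, as stated in the cell above
PACKET_MS = 40         # packet duration in milliseconds, as stated above
SAMPLE_RATE = 16000    # input sampling rate in Hz
BITS_PER_SAMPLE = 16   # input sample width


def bitrate_bps(n_bytes: int, duration_ms: int) -> float:
    """Bits per second for `n_bytes` of data covering `duration_ms` of audio."""
    return n_bytes * 8 * 1000 / duration_ms


codec_bps = bitrate_bps(PACKET_BYTES, PACKET_MS)   # compressed stream
pcm_bps = SAMPLE_RATE * BITS_PER_SAMPLE            # raw PCM stream
ratio = pcm_bps / codec_bps

print(f"codec: {codec_bps:.0f} bit/s")   # codec: 1600 bit/s
print(f"pcm:   {pcm_bps} bit/s")         # pcm:   256000 bit/s
print(f"ratio: {ratio:.0f}x")            # ratio: 160x
```

Comparing the sizes of `i_inference_wave.s16` and `compressed.bin` should give a ratio close to this value, up to rounding at the last packet.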

### Demo - Speech Synthesis
wave -> (analysis) -> uncompressed_feature -> (synthesis) -> wave

In [None]:
import time


# Feature extraction (analysis)
!./lpcnet_demo -features  i_inference_wave.s16 uncompressed.bin

# Synthesis
t_start = time.perf_counter()
!./lpcnet_demo -synthesis uncompressed.bin output_resynth.pcm
t_end = time.perf_counter()
t_sec = t_end - t_start


from IPython.display import Audio, display

i = np.fromfile("./i_inference_wave.s16", dtype=np.int16)
o = np.fromfile("./output_resynth.pcm",   dtype=np.int16)

print("Before:")
display(Audio(i,   rate=16000))
print("After:")
display(Audio(o,   rate=16000))


sr = 16000
audio_length_sec = o.shape[0] / sr
print(f"time: {round(t_sec, 2)} sec for {round(audio_length_sec, 2)} sec audio")
print(f"RTF: {round(t_sec/audio_length_sec, 2)}")
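
The real-time factor (RTF) printed above is processing time divided by audio duration, so a value below 1 means synthesis runs faster than real time. The calculation can be wrapped in a small function that also guards against an empty output file, which would otherwise cause a division by zero. `real_time_factor` is a hypothetical helper name used only for this sketch.

```python
def real_time_factor(elapsed_sec: float, n_samples: int, sample_rate: int = 16000) -> float:
    """Return processing time divided by audio duration (hypothetical helper)."""
    if n_samples <= 0 or sample_rate <= 0:
        raise ValueError("audio must contain at least one sample at a positive rate")
    audio_sec = n_samples / sample_rate
    return elapsed_sec / audio_sec


# Example: 2 seconds of processing for 64000 samples (4 seconds at 16 kHz).
rtf = real_time_factor(2.0, 64000)
print(f"RTF: {rtf:.2f}")  # RTF: 0.50
```

Note that timing a `!` shell command from the notebook also counts process start-up and model loading, so the measured RTF is an upper bound on the synthesis cost itself.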

## Training

### Preprocessing

In [None]:
!pip install git+https://github.com/tarepan/speechcorpusy.git

In [None]:
import librosa
import numpy as np
import soundfile as sf
import resampy
from speechcorpusy import load_preset
from speechdatasety.helper.process import unit_to_s16pcm


corpus = load_preset("Act100TKYM", root="/content/gdrive/MyDrive/ML_data")
corpus.get_contents()
all_utterances = corpus.get_identities()


path_outfile = "./train_pcm.s16"
sr_target = 16000
# `train_pcm.s16` should be 16bit/16kHz PCM
# Open with "wb" so that re-running this cell overwrites the file instead of duplicating the data
with open(path_outfile, mode="wb") as f:
    for p in map(corpus.get_item_path, all_utterances):
        wave_unit, _ = librosa.load(p, sr=sr_target, mono=True)
        wave_s16 = unit_to_s16pcm(wave_unit)
        # Append headless 16-bit PCM
        wave_s16.tofile(f)
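
Because the concatenated file has no header, its total duration can only be inferred from its size: bytes divided by 2 (16-bit samples) divided by the sampling rate. The sketch below is a quick sanity check along those lines, using only the standard library. `pcm_duration_sec` is a hypothetical helper, and the demo writes a temporary file of silence rather than touching `train_pcm.s16`.

```python
import os
import tempfile

SAMPLE_RATE = 16000     # Hz, matching `sr_target` above
BYTES_PER_SAMPLE = 2    # 16-bit PCM


def pcm_duration_sec(path: str, sample_rate: int = SAMPLE_RATE) -> float:
    """Duration of a headerless 16-bit mono PCM file, inferred from its size."""
    n_bytes = os.path.getsize(path)
    if n_bytes % BYTES_PER_SAMPLE != 0:
        raise ValueError(f"{path}: size {n_bytes} is not a whole number of samples")
    return n_bytes / BYTES_PER_SAMPLE / sample_rate


# Self-contained demo: write 2 seconds of silence, then measure it.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.s16")
    with open(demo_path, "wb") as f:
        f.write(b"\x00\x00" * SAMPLE_RATE * 2)
    demo_duration = pcm_duration_sec(demo_path)

print(f"{demo_duration} sec")
```

Calling `pcm_duration_sec("./train_pcm.s16")` after preprocessing and comparing the result with the known corpus length is a cheap way to catch an accidentally duplicated or truncated file.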

In [None]:
!./dump_data -train train_pcm.s16 train_features.f32 train_waves.s16

### Train

In [None]:
# Launch TensorBoard
%load_ext tensorboard
%tensorboard --logdir /content/gdrive/MyDrive/ML_results/lpcnet/original

# From scratch
!python ./training_tf2/train_lpcnet.py \
    train_features.f32 train_waves.s16 \
    /content/gdrive/MyDrive/ML_results/lpcnet/original/test_01/original \
    # --batch-size=64

# Resume
# !python ./training_tf2/train_lpcnet.py \
#     train_features.f32 train_waves.s16 \
#     /content/gdrive/MyDrive/ML_results/lpcnet/original/test_01/original \
#     --resume-model=ckpt_02.h5 --from-epoch=2 --from-step=5000 # --batch-size=64

### Dump for Inference

In [None]:
# Replace `<ckpt>` with the name of the checkpoint file to dump before running
!python training_tf2/dump_lpcnet.py /content/gdrive/MyDrive/ML_results/lpcnet/original/test_01/<ckpt>.h5

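# Replace the bundled model data in `src` with the newly dumped files.
# Rebuild with `make` afterwards so that `lpcnet_demo` uses the new weights.
!rm -f ./src/nnet_data.c ./src/nnet_data.h
!cp nnet_data.c nnet_data.h ./src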
!ls src
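
After copying, it can be worth confirming that the files in `src` really are the freshly dumped ones, since a stale `nnet_data.c` would silently build the old model. A minimal sketch of such a check, using the standard library's `filecmp`, follows. `is_synced` is a hypothetical helper, and the demo compares temporary stand-in files rather than the real model data.

```python
import filecmp
import os
import tempfile


def is_synced(generated: str, installed: str) -> bool:
    """Return True if both files exist and are byte-for-byte identical."""
    return (
        os.path.isfile(generated)
        and os.path.isfile(installed)
        and filecmp.cmp(generated, installed, shallow=False)
    )


# Self-contained demo with temporary stand-in files.
with tempfile.TemporaryDirectory() as tmp_dir:
    dumped = os.path.join(tmp_dir, "nnet_data.c")
    copied = os.path.join(tmp_dir, "nnet_data_copy.c")
    for path in (dumped, copied):
        with open(path, "w") as f:
            f.write("int placeholder;\n")
    same_before = is_synced(dumped, copied)

    # Modify one file to simulate a stale copy.
    with open(copied, "a") as f:
        f.write("/* changed */\n")
    same_after = is_synced(dumped, copied)

print(same_before, same_after)
```

In the notebook, the equivalent call would be `is_synced("nnet_data.c", "src/nnet_data.c")`, and likewise for `nnet_data.h`.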