# Speech synthesis inference using the fine-tuned [VITS](https://arxiv.org/pdf/2106.06103.pdf) model

## Installation and Setup

Install Git LFS, which is required to download the model files correctly:

In [None]:
!pip install git-lfs
!git lfs install

Clone the [GitHub repository](https://github.com/GerrySant/VITS_finetuned) where the model is stored:

In [None]:
!git clone https://github.com/GerrySant/VITS_finetuned.git
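If Git LFS is not set up correctly, the clone can leave small text pointer files in place of the real model weights. The sketch below is one way to check for that. It assumes the checkpoint path used later in this notebook, and `is_lfs_pointer` is a hypothetical helper introduced here, not part of the repository or of the TTS library.

```python
from pathlib import Path

# Git LFS pointer files are small text files that start with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"

def is_lfs_pointer(path):
    """Return True if the file at `path` looks like a Git LFS pointer, not real content."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX

# Assumed location of the checkpoint (the same path used for inference later on).
checkpoint = Path("./VITS_finetuned/vits_BSC_Gerard_Sant/best_model.pth")
if not checkpoint.exists():
    print("Checkpoint not found; check that the clone succeeded.")
elif is_lfs_pointer(checkpoint):
    print("Checkpoint is an LFS pointer; try running `git lfs pull` inside VITS_finetuned.")
else:
    print(f"Checkpoint looks complete ({checkpoint.stat().st_size} bytes).")
```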

Clone and set up the [GitHub repository](https://github.com/coqui-ai/TTS) of the text-to-speech library:

In [None]:
# Clone
!git clone https://github.com/coqui-ai/TTS.git

# Installation of the library's required packages
!pip install -q -e TTS
!cd TTS && python setup.py develop

# Upgrade numpy to fix a version conflict. This requires restarting the runtime, which exit() triggers automatically.
!pip install --upgrade numpy
exit()

## Exercise 2: Inference

Create the folder where the output audios will be saved:

In [None]:
import os

os.makedirs('./OUTPUT_AUDIOS', exist_ok=True)

Define a function that keeps the path to the `speakers.json` file up to date in the model config:

In [None]:
import json

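def update_speaker_path(config_path, speakers_path):
  """Point both speakers_file entries of the config at speakers_path."""
  # Load the existing config
  with open(config_path, "r") as config_file:
    config = json.load(config_file)

  # The path is stored in two places; update both
  config['model_args']['speakers_file'] = speakers_path
  config['speakers_file'] = speakers_path

  # Write the updated config back to the same file
  with open(config_path, "w") as config_file:
    json.dump(config, config_file)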
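
To confirm that the update took effect, the config can be read back. This is a minimal sketch, assuming the config holds the same two `speakers_file` entries that the function above writes. `read_speaker_paths` is a hypothetical helper introduced here for illustration.

```python
import json

def read_speaker_paths(config_path):
    """Return the two speakers_file entries stored in a config (None if missing)."""
    with open(config_path, "r") as config_file:
        config = json.load(config_file)
    return {
        "model_args.speakers_file": config.get("model_args", {}).get("speakers_file"),
        "speakers_file": config.get("speakers_file"),
    }
```

After calling `update_speaker_path(config_path, speaker_path)`, both values returned by `read_speaker_paths(config_path)` should equal `speaker_path`.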

Set the arguments needed to perform inference:

In [None]:
message = "It's always darkest before it becomes totally black" # Enter the text you want to convert to speech.

model_path = "./VITS_finetuned/vits_BSC_Gerard_Sant/best_model.pth" # Path to best_model.pth

config_path = "./VITS_finetuned/vits_BSC_Gerard_Sant/config.json" # Path to config.json

speaker_path = "./VITS_finetuned/vits_BSC_Gerard_Sant/speakers.json" # Path to speakers.json
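
Before running inference, it can help to verify that all three files exist, since a wrong path only surfaces later as an error from the synthesis script. The sketch below assumes the paths match the variables defined above. `find_missing` is a hypothetical helper, not part of the TTS library.

```python
from pathlib import Path

def find_missing(paths):
    """Return the subset of `paths` that do not exist as regular files."""
    return [p for p in paths if not Path(p).is_file()]

# Paths assumed to match model_path, config_path and speaker_path above.
required = [
    "./VITS_finetuned/vits_BSC_Gerard_Sant/best_model.pth",
    "./VITS_finetuned/vits_BSC_Gerard_Sant/config.json",
    "./VITS_finetuned/vits_BSC_Gerard_Sant/speakers.json",
]
missing = find_missing(required)
print("All files found." if not missing else f"Missing files: {missing}")
```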

Perform inference using the following code cell:

In [None]:
update_speaker_path(config_path, speaker_path)

!python ./TTS/TTS/bin/synthesize.py --text "{message}" \
      --model_path {model_path} \
      --config_path {config_path} \
      --speaker_id my_speaker \
      --out_path OUTPUT_AUDIOS/output.wav
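
Once the command finishes, a quick sanity check on the generated file can catch an empty or truncated result. The sketch below uses only the standard-library `wave` module and assumes the output is a PCM WAV file at the path passed to `--out_path`. `wav_summary` is a hypothetical helper introduced here.

```python
import os
import wave

def wav_summary(path):
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
    return rate, frames / float(rate)

# Assumed output location (the value given to --out_path above).
out_path = "OUTPUT_AUDIOS/output.wav"
if os.path.exists(out_path):
    rate, seconds = wav_summary(out_path)
    print(f"{out_path}: {rate} Hz, {seconds:.2f} s")
else:
    print(f"{out_path} not found; check the output of the synthesis cell.")
```

In a notebook, `IPython.display.Audio(out_path)` can then be used to listen to the result directly in the output cell.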

## Exercise 3: Inference through a local website


Install the packages needed to run the website locally:

In [None]:
!pip install flask redis

Run the following cell and click the address that appears in the output:

In [None]:
!python ./VITS_finetuned/vits_web/app.py