# Controllable TalkNet
To run TalkNet, click on Runtime -> Run all. The interface will appear at the bottom of the page when it's ready.

## Instructions

*   Once the notebook is running, click on Files (the folder icon on the left edge). 
*   Upload audio clips of a singing or speaking voice by dragging and dropping them onto the sidebar.
*   Click on "Update file list" in the TalkNet interface. Select an audio file from the dropdown, and type what it says into the Transcript box.
*   Select a character, and press Generate. The first line will take a little longer to generate.

## Tips and tricks
*   If you want to use TalkNet as a regular text-to-speech system, without any reference audio, tick the "Disable reference audio" checkbox.
*   You can use [ARPABET](http://www.speech.cs.cmu.edu/cgi-bin/cmudict) to override the pronunciation of words, like this: *She made a little bow, then she picked up her {B OW}.*
*   If you run out of memory while generating lines, try working with shorter clips.
*   The singing models are trained on very little data, and can have a hard time pronouncing certain words. Try experimenting with ARPABET and punctuation.
*   If the voice is off-key, the problem is usually with the extracted pitch. Press "Debug pitch" to listen to it. Reference audio with lots of echo/reverb or background noise, or a singer with a very high vocal range, can cause issues.
*   If the singing voice sounds strained, try enabling "Change input pitch" and adjusting it up or down a few semitones. If you're remixing a song, remember to pitch-shift your background track as well.
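
Two of the tips above can be sketched in plain Python. This is only an illustration, not part of TalkNet: `arpabet_override` and `semitone_ratio` are hypothetical helper names, and the frequency ratio assumes standard equal temperament (12 semitones per octave). How TalkNet itself applies "Change input pitch" is not shown here.

```python
import re


def arpabet_override(transcript, word, phonemes, occurrence=1):
    """Replace one occurrence (1-based) of `word` with an ARPABET block.

    Hypothetical helper: it only builds the curly-brace syntax shown in the
    tips, e.g. "bow" -> "{B OW}". It does not validate the phonemes.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", flags=re.IGNORECASE)
    matches = list(pattern.finditer(transcript))
    if not 1 <= occurrence <= len(matches):
        raise ValueError(f"'{word}' does not occur {occurrence} time(s)")
    m = matches[occurrence - 1]
    return transcript[:m.start()] + "{" + phonemes.upper() + "}" + transcript[m.end():]


def semitone_ratio(semitones):
    """Frequency ratio for a pitch shift, assuming equal temperament."""
    return 2.0 ** (semitones / 12.0)


# Override only the second "bow" (the one that rhymes with "go").
line = arpabet_override(
    "She made a little bow, then she picked up her bow.",
    "bow",
    "B OW",
    occurrence=2,
)
print(line)

# Shifting the input by +3 semitones scales every frequency by this factor.
print(semitone_ratio(3))
```

If you shift the input pitch by some number of semitones, shifting the background track by the same number keeps the two in the same key.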

In [None]:
#@markdown **Step 1:** Check which GPU you've been allocated.

!nvidia-smi -L
!nvidia-smi

In [None]:
#@markdown **Step 2:** Download dependencies.
import os

custom_lists = [
    #"https://gist.githubusercontent.com/SortAnon/997cda157954a189259c9876fd804e53/raw/example_models.json",
]

!apt-get install sox libsndfile1 ffmpeg
!pip install tensorflow==2.4.1 dash==1.21.0 dash-bootstrap-components==0.13.0 jupyter-dash==0.4.0 psola wget unidecode pysptk frozendict kaldiio pydub pyannote.audio g2p_en pesq pystoi crepe resampy ffmpeg-python torchcrepe einops taming-transformers-rom1504==0.0.6 tensorflow-hub torchmetrics==0.6.0 omegaconf==2.2.3 hmmlearn==0.2.7 flask torch_stft==0.1.4
!pip uninstall gdown -y
!pip install git+https://github.com/wkentaro/gdown.git
!python -m pip install git+https://github.com/SortAnon/NeMo.git
if not os.path.exists("hifi-gan"):
    !git clone -q --recursive https://github.com/SortAnon/hifi-gan
!git clone -q https://github.com/SortAnon/ControllableTalkNet
os.chdir("/content/ControllableTalkNet")
!git archive --output=./files.tar --format=tar HEAD
os.chdir("/content")
!tar xf ControllableTalkNet/files.tar
!rm -rf ControllableTalkNet

# PESQ fix
!pip install --pre torchtext==0.6.0 --no-deps --quiet
!pip install --upgrade --no-cache-dir Werkzeug==2.0.3
!python -m pip uninstall -y pesq
!python -m pip uninstall -y numpy
!python -m pip install numpy==1.22.0
!python -m pip --no-cache-dir install --no-build-isolation --no-binary :all: pesq==0.0.2

os.chdir("/content/model_lists")
for c in custom_lists:
    !wget "{c}"
os.chdir("/content")

# Terminate the kernel so the runtime restarts with the newly installed packages.
exit()


In [None]:
# @markdown **Step 3:** Run the interface. 

## @markdown If you get a VersionConflict error,
## @markdown click on Runtime -> Restart runtime, and then run this cell again.
using_inline = True
import pkg_resources
from pkg_resources import DistributionNotFound, VersionConflict
"""dependencies = [
"tensorflow==2.4.1", 
"dash", 
"jupyter-dash", 
"psola", 
"wget", 
"unidecode", 
"pysptk", 
"frozendict", 
"torchvision==0.9.1", 
"torchaudio==0.8.1", 
"torchtext==0.9.1", 
"torch_stft", 
"kaldiio", 
"pydub", 
"pyannote.audio", 
"g2p_en", 
"pesq", 
"pystoi", 
"crepe", 
"resampy", 
"ffmpeg-python",
"numpy",
"scipy",
"nemo_toolkit",
"tqdm",
"gdown",
]
pkg_resources.require(dependencies)"""

from controllable_talknet import *
app.run_server(
    mode="inline",
    #dev_tools_ui=True,
    #dev_tools_hot_reload=True,
    threaded=True,
)

In [None]:
# @markdown **Step 3B:** If the above fails with a 403 error, do the following:
# @markdown * Go to Runtime -> Restart runtime
# @markdown * Run this cell (click the play button)
# @markdown * Click on the googleusercontent.com link to use TalkNet in a separate tab
try:
    using_inline
except NameError:
    # Step 3 has not run in this session, so use external mode instead.
    using_inline = False
if not using_inline:
    from controllable_talknet import *
    from google.colab.output import eval_js

    print(eval_js("google.colab.kernel.proxyPort(8050)"))
    app.run_server(
        mode="external",
        debug=False,
        #dev_tools_ui=True,
        #dev_tools_hot_reload=True,
        threaded=True,
    )