# Easy transfer learning with 🐸 STT ⚡

You want to train a Coqui (🐸) STT model, but you don't have a lot of data. What do you do?

The answer 💡: Grab a pre-trained model and fine-tune it to your data. This is called `"Transfer Learning"` ⚡

🐸 STT comes with transfer learning support out of the box.

You can even take a pre-trained model and fine-tune it to _any new language_, even if the alphabets are completely different. Likewise, you can fine-tune a model to your own data and improve performance if the language is the same.

In this notebook, we will:

1. Download a pre-trained English STT model.
2. Download data for the Russian language.
3. Fine-tune the English model to the Russian language.
4. Test the new Russian model and display its performance.

So, let's jump right in!

*PS - If you just want a working, off-the-shelf model, check out the [🐸 Model Zoo](https://www.coqui.ai/models)*

In [None]:
## Install Coqui STT
! pip install -U pip
! pip install coqui_stt_training

## ✅ Download pre-trained English model

We're going to download a very small (but very accurate) pre-trained STT model for English. This model was trained to only transcribe the English words "yes" and "no", but with transfer learning we can train a new model which could transcribe any words in any language. In this notebook, we will turn this "constrained vocabulary" English model into an "open vocabulary" Russian model.

Coqui STT models are typically stored as checkpoints (for training) and protobufs (for deployment). For transfer learning, we want the **model checkpoints**.


In [None]:
### Download pre-trained model
import os
import tarfile
from coqui_stt_training.util.downloader import maybe_download

def download_pretrained_model():
    model_dir = "english/"
    checkpoint_dir = os.path.join(model_dir, "coqui-yesno-checkpoints")
    if not os.path.exists(checkpoint_dir):
        maybe_download("model.tar.gz", model_dir, "https://github.com/coqui-ai/STT-models/releases/download/english%2Fcoqui%2Fyesno-v0.0.1/coqui-yesno-checkpoints.tar.gz")
        print('\nNo extracted pre-trained model found. Extracting now...')
        # The context manager closes the archive even if extraction fails
        with tarfile.open(os.path.join(model_dir, "model.tar.gz")) as tar:
            tar.extractall(model_dir)
    else:
        print(f'Found "{checkpoint_dir}" - not extracting.')

# Download + extract pre-trained English model
download_pretrained_model()

## ✅ Download data for Russian

**First things first**: we need some data.

We're training a Speech-to-Text model, so we need some _speech_ and we need some _text_. Specifically, we want _transcribed speech_. Let's download a Russian audio file and its transcript, pre-formatted for 🐸 STT.

**Second things second**: we want a Russian alphabet. The output layer of a typical* 🐸 STT model represents letters in the alphabet. Let's download a Russian alphabet from Coqui and use that.

*_If you are working with languages with large character sets (e.g. Chinese), you can set `bytes_output_mode=True` instead of supplying an `alphabet.txt` file. In this case, the output layer of the STT model will correspond to individual UTF-8 bytes instead of individual characters._
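
To make the character-versus-byte trade-off concrete, here is a minimal sketch in plain Python. It does not call 🐸 STT at all, and `label_stats` is a hypothetical helper written for this illustration. It simply counts how many distinct labels a set of transcripts would need in each mode.

```python
# Illustrative only: compares character-level labels with UTF-8 byte labels.
# Nothing here is part of the Coqui STT API.

def label_stats(transcripts):
    """Return (number of distinct characters, number of distinct UTF-8 bytes)."""
    chars = set()
    byte_values = set()
    for text in transcripts:
        chars.update(text)                        # one label per character
        byte_values.update(text.encode("utf-8"))  # one label per byte value
    return len(chars), len(byte_values)

n_chars, n_bytes = label_stats(["привет мир"])
print(n_chars, n_bytes)

# Each Cyrillic letter takes two bytes in UTF-8, so the byte sequence is
# longer than the character sequence.
word = "привет"
print(len(word), len(word.encode("utf-8")))  # 6 12
```

A byte-level output never needs more than 256 distinct labels, however many characters the language has. The cost is longer output sequences, since one character may span several bytes.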

In [None]:
### Download sample data
from coqui_stt_training.util.downloader import maybe_download

def download_sample_data():
    data_dir="russian/"
    maybe_download("ru.wav", data_dir, "https://raw.githubusercontent.com/coqui-ai/STT/main/data/smoke_test/russian_sample_data/ru.wav")
    maybe_download("ru.csv", data_dir, "https://raw.githubusercontent.com/coqui-ai/STT/main/data/smoke_test/russian_sample_data/ru.csv")
    maybe_download("alphabet.txt", data_dir, "https://raw.githubusercontent.com/coqui-ai/STT/main/data/smoke_test/russian_sample_data/alphabet.ru")

# Download sample Russian data
download_sample_data()
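
Before training, it can help to check that every character in the transcripts appears in the alphabet. The sketch below is a rough sanity check and not part of 🐸 STT. It assumes the CSV has a `transcript` column and that the alphabet file lists one character per line, with lines starting with `#` treated as comments. The helper names and the in-memory sample data are made up for illustration.

```python
import csv
import io

def load_alphabet(lines):
    """Collect one label per line, skipping empty lines and '#' comment lines."""
    alphabet = set()
    for line in lines:
        line = line.rstrip("\n")
        if line == "" or line.startswith("#"):
            continue
        alphabet.add(line)
    return alphabet

def find_unknown_characters(csv_file, alphabet):
    """Return the transcript characters that are missing from the alphabet."""
    unknown = set()
    for row in csv.DictReader(csv_file):
        unknown.update(ch for ch in row["transcript"] if ch not in alphabet)
    return unknown

# Made-up, in-memory stand-ins for alphabet.txt and ru.csv
alphabet = load_alphabet(io.StringIO("# sample alphabet\n \nа\nд\n"))
csv_text = "wav_filename,wav_filesize,transcript\nru.wav,12345,да да\n"
print(find_unknown_characters(io.StringIO(csv_text), alphabet))  # set()
```

To run the same check on the downloaded files, open `russian/alphabet.txt` and `russian/ru.csv` with `encoding="utf-8"` and pass the file objects in place of the `io.StringIO` stand-ins.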

## ✅ Configure the training run

Coqui STT comes with a long list of hyperparameters you can tweak. We've set default values, but you can use `initialize_globals_from_args()` to set your own. 

You must **always** configure the paths to your data, and you must **always** configure your alphabet. For transfer learning, it's good practice to define different `load_checkpoint_dir` and `save_checkpoint_dir` paths so that you keep your new model (Russian STT) separate from the old one (English STT). The parameter `drop_source_layers` allows you to remove layers from the original (aka "source") model, and re-initialize them from scratch. If you are fine-tuning to a new alphabet you will have to use _at least_ `drop_source_layers=1` to remove the output layer and add a new output layer which matches your new alphabet.
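
As a mental model for `drop_source_layers`, the sketch below splits a stack of layers into the ones kept from the source model and the ones re-initialized from scratch. It illustrates the idea only and is not how 🐸 STT implements it. The layer names and the `split_layers` helper are hypothetical.

```python
# Hypothetical layer names, ordered from input to output. They are placeholders
# for illustration, not names read from a Coqui STT checkpoint.
SOURCE_LAYERS = ["layer_1", "layer_2", "layer_3", "recurrent", "layer_5", "output"]

def split_layers(layers, drop_source_layers):
    """Return (layers kept from the source model, layers re-initialized)."""
    if drop_source_layers < 0:
        raise ValueError("drop_source_layers must be non-negative")
    if drop_source_layers == 0:
        return list(layers), []
    # Layers are dropped from the output end of the stack
    return list(layers[:-drop_source_layers]), list(layers[-drop_source_layers:])

kept, reinitialized = split_layers(SOURCE_LAYERS, 1)
print(kept)           # ['layer_1', 'layer_2', 'layer_3', 'recurrent', 'layer_5']
print(reinitialized)  # ['output']
```

Dropping more layers throws away more of what the source model learned, so a reasonable starting point is to drop as few as the new alphabet allows.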

We are fine-tuning a pre-existing model, so `n_hidden` should be the same as the original English model.

In [None]:
from coqui_stt_training.util.config import initialize_globals_from_args

initialize_globals_from_args(
    n_hidden=64,
    load_checkpoint_dir="english/coqui-yesno-checkpoints",
    save_checkpoint_dir="russian/checkpoints",
    drop_source_layers=1,
    alphabet_config_path="russian/alphabet.txt",
    train_files=["russian/ru.csv"],
    dev_files=["russian/ru.csv"],
    epochs=100,
    load_cudnn=True,
)

### View all Config settings (*Optional*) 

In [None]:
from coqui_stt_training.util.config import Config

print(Config.to_json())

## ✅ Train a new Russian model

Let's kick off a training run 🚀🚀🚀 (using the configuration you set above).

This notebook should work on either a GPU or a CPU. However, if you're running it on a machine with _multiple_ GPUs, we want to use only one, because the sample dataset (one audio file) is too small to split across multiple GPUs.

In [None]:
import os

from coqui_stt_training.train import train

# Use at most one GPU
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

train()
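
Hard-coding device `"0"` is fine for this notebook. On a shared machine, though, a job scheduler may already have set `CUDA_VISIBLE_DEVICES` to the GPUs assigned to you. The sketch below shows one way to respect that setting while still using a single device. `first_visible_gpu` is a hypothetical helper, not part of 🐸 STT.

```python
def first_visible_gpu(environ):
    """Pick one device from CUDA_VISIBLE_DEVICES, defaulting to device 0."""
    visible = environ.get("CUDA_VISIBLE_DEVICES")
    if visible is None:
        return "0"  # variable not set: fall back to the first GPU
    devices = [d.strip() for d in visible.split(",") if d.strip()]
    # An empty value hides every GPU, so keep it empty (CPU-only run)
    return devices[0] if devices else ""

print(first_visible_gpu({"CUDA_VISIBLE_DEVICES": "2,3"}))  # 2
print(first_visible_gpu({}))                               # 0
```

To use it, set `os.environ["CUDA_VISIBLE_DEVICES"] = first_visible_gpu(os.environ)` before training starts.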

## ✅ Configure the testing run

Let's add the path to our testing data and update `load_checkpoint_dir` to our new model checkpoints.

In [None]:
from coqui_stt_training.util.config import Config

Config.test_files=["russian/ru.csv"]
Config.load_checkpoint_dir="russian/checkpoints"

## ✅ Test the new Russian model

We made it! 🙌

Let's kick off the testing run, which displays performance metrics.

We're committing the cardinal sin of ML 😈 (aka - testing on our training data) so you don't want to deploy this model into production. In this notebook we're focusing on the workflow itself, so it's forgivable 😇

You can see from the test output that our tiny model has overfit to the data, and basically memorized this one sentence.

When you start training your own models, make sure your testing data doesn't include your training data 😅
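
One simple way to keep the sets apart is to shuffle your samples once and cut the list into non-overlapping pieces. The sketch below is a minimal example with made-up clip names. `split_rows` and its default fractions are assumptions chosen for illustration, not 🐸 STT settings.

```python
import random

def split_rows(rows, dev_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle rows once, then cut them into disjoint train/dev/test lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # fixed seed keeps the split reproducible
    n_test = int(len(rows) * test_fraction)
    n_dev = int(len(rows) * dev_fraction)
    test_rows = rows[:n_test]
    dev_rows = rows[n_test:n_test + n_dev]
    train_rows = rows[n_test + n_dev:]
    return train_rows, dev_rows, test_rows

# Made-up clip names standing in for rows of a CSV file
rows = [f"clip_{i}.wav" for i in range(20)]
train_rows, dev_rows, test_rows = split_rows(rows)
print(len(train_rows), len(dev_rows), len(test_rows))  # 16 2 2
print(set(train_rows) & set(test_rows))                # set()
```

Each piece can then be written to its own CSV file and passed to `train_files`, `dev_files`, and `test_files`.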

In [None]:
from coqui_stt_training.evaluate import test

test()
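
Speech recognition is commonly scored with word error rate (WER) and character error rate (CER), both based on edit distance. Assuming the test report includes metrics of this kind, the sketch below shows how they can be computed by hand. It is a plain-Python illustration, not the scoring code 🐸 STT uses, and the example sentences are made up.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]

def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def char_error_rate(reference, hypothesis):
    """Character-level edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)

# One of three reference words is missing from the hypothesis
print(round(word_error_rate("да нет да", "да да"), 2))  # 0.33
print(word_error_rate("да нет", "да нет"))              # 0.0
```

A WER of 0.0 on the training sentence is what memorization looks like. The number only becomes meaningful when it is measured on held-out data.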