# Train Turkish STT Model with Coqui & Common Voice Dataset v8.0
Based on: https://arxiv.org/abs/2105.04674

Run #1 with Transfer Learning - Baseline

*   Epochs = 25
*   Learning Rate = 0.001
*   Dropout Rate = 0.5
*   SpecAugment = OFF
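
The four settings above reappear later in the exported model name (`cv_v8.0-r1_tl-cq_0.001_0.5_nosa_25`). The sketch below keeps them in one place and derives that name from them. The helper `run_name` and the reading of the name parts (`tl-cq` as "transfer learning from the Coqui checkpoint", `nosa` as "no SpecAugment") are assumptions inferred from the export cell, not something Coqui STT defines.

```python
# Hypothetical helper: the naming scheme is inferred from the export cell
# of this notebook, it is not part of Coqui STT.
def run_name(cv_version, run_id, learning_rate, dropout_rate, spec_augment, epochs):
    """Build a run/model name from the hyperparameters of one training run."""
    sa = "sa" if spec_augment else "nosa"
    return f"cv_v{cv_version}-r{run_id}_tl-cq_{learning_rate}_{dropout_rate}_{sa}_{epochs}"


# Settings of this baseline run, as listed above
baseline = dict(
    cv_version="8.0",
    run_id=1,
    learning_rate=0.001,
    dropout_rate=0.5,
    spec_augment=False,
    epochs=25,
)

print(run_name(**baseline))
```

Deriving the name from the same dictionary that feeds the training flags is one way to keep later runs from being exported under a stale name.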

In [1]:
# Switch Colab back to TensorFlow 1.x - See: https://colab.research.google.com/notebooks/tensorflow_version.ipynb#scrollTo=NeWVBhf1VxlH
%tensorflow_version 1.x

TensorFlow 1.x selected.


## Mount Google Drive

In [2]:
# Mount your private Google Drive
from google.colab import drive
import shutil
drive.mount('/content/drive')

Mounted at /content/drive


## Basic Setup

In [3]:
# Change softlink to CUDA version --> 10.0
!rm -rf /usr/local/cuda
!ln -s /usr/local/cuda-10.0 /usr/local/cuda
!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
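
The prebuilt TensorFlow 1.15 GPU wheels are built against CUDA 10.0, which is why the symlink is switched. As a minimal sketch, the release can be checked programmatically instead of by eye; the helper name `cuda_release` is hypothetical and the sample string is the last line of the output above.

```python
import re


def cuda_release(nvcc_output):
    """Return the CUDA release (e.g. '10.0') found in `nvcc --version` output, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


# Last line of the nvcc output shown above
sample = "Cuda compilation tools, release 10.0, V10.0.130"
print(cuda_release(sample))

# In the notebook the text could come from the real command, for example:
#   import subprocess
#   out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout
#   assert cuda_release(out) == "10.0", "CUDA symlink does not point at 10.0"
```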


In [4]:
# Install Coqui STT 
!git clone --depth 1 --branch v1.0.0 https://github.com/coqui-ai/STT.git
!cd STT; pip install -U pip wheel setuptools; pip install .

Cloning into 'STT'...
remote: Enumerating objects: 2162, done.
remote: Counting objects: 100% (2162/2162), done.
remote: Compressing objects: 100% (1358/1358), done.
remote: Total 2162 (delta 844), reused 1609 (delta 709), pack-reused 0
Receiving objects: 100% (2162/2162), 12.49 MiB | 29.06 MiB/s, done.
Resolving deltas: 100% (844/844), done.
Note: checking out '27584037f879442fb45f9064dc772dbcb6ba6372'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

Collecting pip
  Downloading pip-21.3.1-py3-none-any.whl (1.7 MB)
     |████████████████████████████████| 1.7 MB 5.3 MB/s 
Collecting setuptools
  Downloading setuptools-60.

In [5]:
# Install the GPU build of TensorFlow 1.15
!pip install tensorflow-gpu==1.15.4

Collecting tensorflow-gpu==1.15.4
  Downloading tensorflow_gpu-1.15.4-cp37-cp37m-manylinux2010_x86_64.whl (411.0 MB)
     |████████████████████████████████| 411.0 MB 23 kB/s              
Installing collected packages: tensorflow-gpu
Successfully installed tensorflow-gpu-1.15.4


## Check GPU

In [6]:
# Re-Check CUDA
!nvcc --version
!ls -alF /usr/local/
!cat /usr/include/cudnn.h | grep CUDNN_MAJOR -A 2

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
total 96
drwxr-xr-x  1 root root 4096 Jan 27 03:36 ./
drwxr-xr-x  1 root root 4096 Jan 27 03:15 ../
drwxr-xr-x  1 root root 4096 Jan 27 03:38 bin/
lrwxrwxrwx  1 root root   20 Jan 27 03:36 cuda -> /usr/local/cuda-10.0/
drwxr-xr-x 16 root root 4096 Jan  7 14:19 cuda-10.0/
drwxr-xr-x 15 root root 4096 Jan  7 14:21 cuda-10.1/
lrwxrwxrwx  1 root root   25 Jan  7 14:27 cuda-11 -> /etc/alternatives/cuda-11/
drwxr-xr-x 15 root root 4096 Jan  7 14:23 cuda-11.0/
drwxr-xr-x  1 root root 4096 Jan  7 14:26 cuda-11.1/
drwxr-xr-x  1 root root 4096 Jan 25 14:13 etc/
drwxr-xr-x  2 root root 4096 Nov 19  2020 games/
drwxr-xr-x  2 root root 4096 Jan 25 14:25 _gcs_config_ops.so/
drwxr-xr-x  1 root root 4096 Jan 25 14:34 include/
drwxr-xr-x  1 root root 4096 Jan 25 14:34 lib/
-rw-r--r--  1 root root 1636 Jan 25 14:30 LICENSE.txt
drwxr-xr-x  

In [7]:
# Check the number of GPUs with their names
!nvidia-smi -L
!nvidia-smi -q
# Show current GPU utilization, memory use and running processes
!nvidia-smi

GPU 0: Tesla P100-PCIE-16GB (UUID: GPU-7bb06587-834d-8ecc-9250-5c1c1debc67f)


Timestamp                                 : Thu Jan 27 03:38:08 2022
Driver Version                            : 460.32.03
CUDA Version                              : 11.2

Attached GPUs                             : 1
GPU 00000000:00:04.0
    Product Name                          : Tesla P100-PCIE-16GB
    Product Brand                         : Tesla
    Product Architecture                  : Pascal
    Display Mode                          : Enabled
    Display Active                        : Disabled
    Persistence Mode                      : Disabled
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : Disabled
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                

### Check TensorFlow GPU

In [8]:
import os
# Expose only the first GPU (device 0) to TensorFlow
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
!python --version

Python 3.7.12


In [9]:
import tensorflow as tf
print([tf.__version__, tf.test.is_gpu_available()])

['1.15.4', True]


In [10]:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()

[name: "/device:CPU:0"
 device_type: "CPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 8737774075042231129, name: "/device:XLA_CPU:0"
 device_type: "XLA_CPU"
 memory_limit: 17179869184
 locality {
 }
 incarnation: 3515491917543970419
 physical_device_desc: "device: XLA_CPU device", name: "/device:XLA_GPU:0"
 device_type: "XLA_GPU"
 memory_limit: 17179869184
 locality {
 }
 incarnation: 1028489735928665626
 physical_device_desc: "device: XLA_GPU device", name: "/device:GPU:0"
 device_type: "GPU"
 memory_limit: 15964005991
 locality {
   bus_id: 1
   links {
   }
 }
 incarnation: 11695696222161316755
 physical_device_desc: "device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0"]

In [11]:
tf.test.gpu_device_name()

'/device:GPU:0'

In [12]:
!pip freeze | grep tensorflow

mesh-tensorflow==0.1.12
tensorflow @ file:///tensorflow-2.7.0-cp37-cp37m-linux_x86_64.whl
tensorflow-datasets==4.0.1
tensorflow-estimator==1.15.1
tensorflow-gan==2.0.0
tensorflow-gcs-config==2.7.0
tensorflow-gpu==1.15.4
tensorflow-hub==0.12.0
tensorflow-io-gcs-filesystem==0.23.1
tensorflow-metadata==1.6.0
tensorflow-probability==0.7.0


## GET COMMON VOICE DATASET & SUPPLEMENTS

In [13]:
# Prep directories
!mkdir -p /content/data/tr
!mkdir -p /content/data/en
# create export & checkpoint folders
!mkdir -p /content/data/tr/checkpoints
!mkdir -p /content/data/tr/exports

In [14]:
# Get Alphabet & Validator
!cp drive/MyDrive/cv-datasets/tr/alphabet.txt /content/data/tr/alphabet.txt
!cp drive/MyDrive/cv-datasets/tr/validate_label_tr.py /content/data/tr/validate_label_tr.py
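
Training fails if a transcript contains a character that is missing from `alphabet.txt`, so it can be worth checking the CSV files against the alphabet before starting a long run. The sketch below assumes the usual Coqui STT conventions: one character per line in the alphabet file, lines starting with `#` treated as comments (with `\#` standing for a literal `#`), and a `transcript` column in the CSV. The helper names are hypothetical, and the demo uses throwaway files rather than the real dataset.

```python
import csv
import tempfile
from pathlib import Path


def load_alphabet(path):
    """Read an alphabet file: one character per line, '#' lines are comments."""
    chars = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\r\n")  # keep a lone space, it is a valid label
            if line.startswith("\\#"):
                chars.add("#")
            elif line == "" or line.startswith("#"):
                continue
            else:
                chars.add(line)
    return chars


def out_of_alphabet(csv_path, alphabet):
    """Return the set of transcript characters that are not in the alphabet."""
    missing = set()
    with open(csv_path, encoding="utf-8", newline="") as f:
        for row in csv.DictReader(f):
            missing.update(ch for ch in row["transcript"] if ch not in alphabet)
    return missing


# Demo on throwaway files
with tempfile.TemporaryDirectory() as tmp:
    alphabet_file = Path(tmp) / "alphabet.txt"
    alphabet_file.write_text("# demo alphabet\n \na\nb\nç\n", encoding="utf-8")
    csv_file = Path(tmp) / "train.csv"
    csv_file.write_text(
        "wav_filename,wav_filesize,transcript\n"
        "x.wav,1,ab ça\n"
        "y.wav,1,abq\n",
        encoding="utf-8",
    )
    demo_alphabet = load_alphabet(alphabet_file)
    demo_missing = out_of_alphabet(csv_file, demo_alphabet)

print(sorted(demo_missing))
```

Once the converted corpus is restored, the same two calls could be pointed at `/content/data/tr/alphabet.txt` and `/content/data/tr/v8.0/clips/train.csv`.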

In [15]:
# CASE-1: Get Checkpoint for English for transfer learning
shutil.copytree("/content/drive/MyDrive/cv-datasets/en/coqui-stt-1.0.0-checkpoint", "/content/data/en/coqui-stt-1.0.0-checkpoint")

'/content/data/en/coqui-stt-1.0.0-checkpoint'

In [None]:
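# CASE-2: Resume from checkpoints saved by an earlier run
# (shutil.copy cannot copy a directory, and the target directory already exists, so copy its contents with cp -r)
#!cp -r /content/drive/MyDrive/cv-datasets/tr/checkpoints/. /content/data/tr/checkpoints/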

## RESTORE CONVERTED DATASET FROM BACKUP

In [16]:
!tar -xzvf "/content/drive/MyDrive/cv-datasets/tr/v8.0/converted-tr-corpus-v8.0.tar.gz" -C "/"

Streaming output truncated to the last 5000 lines.
content/data/tr/v8.0/clips/common_voice_tr_19291864.wav
content/data/tr/v8.0/clips/common_voice_tr_18352812.wav
content/data/tr/v8.0/clips/common_voice_tr_19433428.wav
content/data/tr/v8.0/clips/common_voice_tr_28978898.wav
content/data/tr/v8.0/clips/common_voice_tr_30461433.wav
content/data/tr/v8.0/clips/common_voice_tr_28925593.wav
content/data/tr/v8.0/clips/common_voice_tr_30462270.wav
content/data/tr/v8.0/clips/common_voice_tr_30413159.wav
content/data/tr/v8.0/clips/common_voice_tr_26552077.wav
content/data/tr/v8.0/clips/common_voice_tr_19291485.wav
content/data/tr/v8.0/clips/common_voice_tr_24840059.wav
content/data/tr/v8.0/clips/common_voice_tr_17346071.wav
content/data/tr/v8.0/clips/common_voice_tr_30421915.wav
content/data/tr/v8.0/clips/common_voice_tr_20342119.wav
content/data/tr/v8.0/clips/common_voice_tr_25019934.wav
content/data/tr/v8.0/clips/common_voice_tr_17525765.wav
content/data/tr/v8.0/clips/common_voice

In [17]:
# Check the directory layout after copying and restoring
!echo "---------------------"
!echo "/content/data"
!echo "---------------------"
!ls /content/data
!echo "---------------------"
!echo "/content/data/en"
!echo "---------------------"
!ls /content/data/en
!echo "---------------------"
!echo "/content/data/en/coqui-stt-1.0.0-checkpoint"
!echo "---------------------"
!ls /content/data/en/coqui-stt-1.0.0-checkpoint
!echo "---------------------"
!echo "/content/data/tr"
!echo "---------------------"
!ls /content/data/tr
!echo "---------------------"
!echo "/content/data/tr/v8.0"
!echo "---------------------"
!ls /content/data/tr/v8.0
!echo "---------------------"
!echo "/content/data/tr/v8.0/clips/*.csv"
!echo "---------------------"
!ls /content/data/tr/v8.0/clips/*.csv
!echo "---------------------"
!echo "/content/data/tr/checkpoints"
!echo "---------------------"
!ls /content/data/tr/checkpoints

---------------------
/content/data
---------------------
en  tr
---------------------
/content/data/en
---------------------
coqui-stt-1.0.0-checkpoint
---------------------
/content/data/en/coqui-stt-1.0.0-checkpoint
---------------------
alphabet.txt			      best_dev_checkpoint  PaxHeader
best_dev-3663881.data-00000-of-00001  checkpoint	   TESTING_NO_LM
best_dev-3663881.index		      LICENSE
best_dev-3663881.meta		      MODEL_CARD.md
---------------------
/content/data/tr
---------------------
alphabet.txt  checkpoints  exports  v8.0  validate_label_tr.py
---------------------
/content/data/tr/v8.0
---------------------
clips	 invalidated.tsv  reported.tsv	train.tsv
dev.tsv  other.tsv	  test.tsv	validated.tsv
---------------------
/content/data/tr/v8.0/clips/*.csv
---------------------
/content/data/tr/v8.0/clips/dev.csv
/content/data/tr/v8.0/clips/other.csv
/content/data/tr/v8.0/clips/test.csv
/content/data/tr/v8.0/clips/train-all.csv
/content/data/tr/v8.0/clips/train.csv
/content/d

# COQUI TRAINING


In [20]:
# See https://github.com/mozilla/DeepSpeech/issues/3088
# `!export` would only affect a throwaway subshell, so set the variable on the notebook process
%env TF_CUDNN_RESET_RND_GEN_STATE=1
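
In IPython and Colab, each `!` line runs in its own short-lived shell, so a variable exported there is gone before the training command starts. Only variables set on the notebook process itself (via `%env` or `os.environ`) are inherited by later `!` commands. The sketch below demonstrates the difference with plain `subprocess` calls; `STT_DEMO_FLAG` is a made-up variable name used only for illustration.

```python
import os
import subprocess

# A variable exported inside a child shell disappears when that shell exits.
subprocess.run(["sh", "-c", "export STT_DEMO_FLAG=1"], check=True)
before = os.environ.get("STT_DEMO_FLAG")
print("after child export:", before)

# Setting it on the current process makes every later child inherit it.
os.environ["STT_DEMO_FLAG"] = "1"
result = subprocess.run(
    ["sh", "-c", "echo $STT_DEMO_FLAG"],
    capture_output=True,
    text=True,
    check=True,
)
seen_by_child = result.stdout.strip()
print("seen by later child:", seen_by_child)
```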

## SEPARATE TRAIN AND TEST

In [24]:
# TRAIN
#!python /content/STT/train.py \
!python -m coqui_stt_training.train \
  --log_level 0 \
  --show_progressbar true \
  --train_cudnn true \
  --force_initialize_learning_rate true \
  --epochs 25 \
  --learning_rate 0.001 \
  --dropout_rate 0.5 \
  --max_to_keep 1 \
  --drop_source_layers 2 \
  --train_batch_size 32 \
  --dev_batch_size 8 \
  --alphabet_config_path /content/data/tr/alphabet.txt \
  --save_checkpoint_dir /content/data/tr/checkpoints \
  --load_checkpoint_dir /content/data/en/coqui-stt-1.0.0-checkpoint \
  --train_files /content/data/tr/v8.0/clips/train.csv \
  --dev_files /content/data/tr/v8.0/clips/dev.csv

2022-01-27 04:47:39.898606: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2022-01-27 04:47:39.903109: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2299995000 Hz
2022-01-27 04:47:39.903283: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5567925fa1c0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-01-27 04:47:39.903311: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2022-01-27 04:47:39.904874: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2022-01-27 04:47:40.118991: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-01-27 04:47:40.119895: 

In [25]:
# SAVE CHECKPOINTS DIR
!rm -rf drive/MyDrive/cv-datasets/tr/v8.0-r1
!mkdir drive/MyDrive/cv-datasets/tr/v8.0-r1
!cp -r /content/data/tr/checkpoints drive/MyDrive/cv-datasets/tr/v8.0-r1
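
The same backup can be done from Python, which makes it easier to reuse after every run. This is a minimal sketch, not the notebook's method: the helper `replace_backup` is hypothetical, and unlike the shell cell above it removes only the previous copy of the directory being backed up rather than the whole run folder. The demo works on temporary directories.

```python
import shutil
import tempfile
from pathlib import Path


def replace_backup(src, dest_parent):
    """Copy directory `src` into `dest_parent`, replacing any earlier copy."""
    src = Path(src)
    dest = Path(dest_parent) / src.name
    if dest.exists():
        shutil.rmtree(dest)  # drop the stale backup first
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(src, dest)
    return dest


# Demo on throwaway directories
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "checkpoints"
    src.mkdir()
    (src / "best_dev_checkpoint").write_text("new", encoding="utf-8")

    stale = Path(tmp) / "backup" / "checkpoints"
    stale.mkdir(parents=True)
    (stale / "old_file").write_text("old", encoding="utf-8")

    dest = replace_backup(src, Path(tmp) / "backup")
    backed_up = sorted(p.name for p in dest.iterdir())

print(backed_up)
```

In the notebook this would be called as `replace_backup("/content/data/tr/checkpoints", "/content/drive/MyDrive/cv-datasets/tr/v8.0-r1")`.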

In [26]:
# TEST
!python -m coqui_stt_training.evaluate \
  --show_progressbar true \
  --train_cudnn true \
  --test_batch_size 32 \
  --test_files /content/data/tr/v8.0/clips/test.csv \
  --checkpoint_dir /content/data/tr/checkpoints

I --alphabet_config_path not specified, but found an alphabet file alongside specified checkpoint (/content/data/tr/checkpoints/alphabet.txt). Will use this alphabet file for this run.
I Loading best validating checkpoint from /content/data/tr/checkpoints/best_dev-3664387
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_6/bias
I Loading vari
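
The evaluation step reports word error rate (WER) and character error rate (CER) for the test set. For reference, the sketch below computes both for a single utterance with a plain Levenshtein distance. It illustrates the metric only; it is not Coqui STT's own implementation, and the corpus-level figures in the report may be aggregated over all utterances rather than averaged per sentence.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute = 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# One dropped word out of four reference words
print(wer("bugün hava çok güzel", "bugün hava güzel"))
```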

## Export model or checkpoints to Drive

In [27]:
!python3 -m coqui_stt_training.export \
  --export_tflite true \
  --export_author_id "bozden" \
  --export_contact_info "developer@harikalarkutusu.com" \
  --export_model_name "cv_v8.0-r1_tl-cq_0.001_0.5_nosa_25" \
  --export_model_version "0.0.1" \
  --checkpoint_dir /content/data/tr/checkpoints \
  --export_dir /content/data/tr/exports/model

I --alphabet_config_path not specified, but found an alphabet file alongside specified checkpoint (/content/data/tr/checkpoints/alphabet.txt). Will use this alphabet file for this run.
I Exporting the model...
I Loading best validating checkpoint from /content/data/tr/checkpoints/best_dev-3664387
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_6/bias
I Loading variable from checkpoint: l

In [28]:
# Delete all WAV files
!find /content/data/tr/v8.0/clips/ -name "*.wav" -delete

In [29]:
# Compress
!tar czf /content/data/v8.0-r1.tar.gz /content/data/tr
# Copy file to Google Drive.
shutil.move("/content/data/v8.0-r1.tar.gz", "/content/drive/MyDrive/cv-datasets/tr/v8.0")

tar: Removing leading `/' from member names


'/content/drive/MyDrive/cv-datasets/tr/v8.0/v8.0-r1.tar.gz'
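
The `tar: Removing leading '/'` message appears because the archive was given an absolute path, so its members are stored as `content/data/tr/...`. As an alternative sketch, Python's `tarfile` module can store paths relative to the data directory via `arcname`. The helper `make_archive` is hypothetical, and note that this changes the layout: such an archive unpacks to `tr/...` under whatever directory it is extracted into, instead of being restored with `-C "/"`.

```python
import tarfile
import tempfile
from pathlib import Path


def make_archive(src_dir, archive_path):
    """Gzip-compress src_dir, storing member paths relative to its parent."""
    src_dir = Path(src_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(src_dir, arcname=src_dir.name)
    return Path(archive_path)


# Demo on a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "tr"
    (data / "exports").mkdir(parents=True)
    (data / "alphabet.txt").write_text("a\n", encoding="utf-8")

    archive = make_archive(data, Path(tmp) / "v8.0-r1.tar.gz")
    with tarfile.open(archive) as tar:
        member_names = sorted(tar.getnames())

print(member_names)
```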

In [30]:
# Flush disk to Google Drive
drive.flush_and_unmount()