<a href="https://colab.research.google.com/github/Cirediallo/Models/blob/main/icefall_asr_librispeech_pretrained_tdnn_lstm_ctc_usage.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Description
This notebook shows how to use a pre-trained tdnn-lstm_ctc model with [icefall](https://github.com/k2-fsa/icefall/).

## Environment setup

To use a pre-trained model with icefall, we have to install the following dependencies:

- [k2][k2], for FSA operations
- [torchaudio][audio], for reading sound files
- [kaldifeat][kaldifeat], for extracting features from a single sound
  file or multiple sound files

**NOTE**: [lhotse][lhotse] is used only at training time, for data preparation.


[k2]: https://github.com/k2-fsa/k2
[audio]: https://github.com/pytorch/audio
[kaldifeat]: https://github.com/csukuangfj/kaldifeat
[lhotse]: https://github.com/lhotse-speech/lhotse

### Install PyTorch and torchaudio

In [1]:
! nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0


These instructions were written when Colab provided CUDA 11.0, which is why the commented-out commands below install torch==1.7.1+cu110. The `nvcc` output above reports CUDA 11.8, so check your own runtime before picking a version.

torch 1.7.1 was selected because k2 is installed using `pip install`, which depends on torch 1.7.1.

If you want to use a different version of PyTorch, refer to
<https://k2.readthedocs.io/en/latest/installation/index.html> to install k2 either from source or with `conda install`.
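As a minimal sketch of the version check described above, the CUDA release can be read from the `nvcc --version` output before choosing a PyTorch build. The helper names `cuda_release` and `detect_cuda_release` are hypothetical and are not part of k2 or icefall.

```python
import re
import shutil
import subprocess
from typing import Optional


def cuda_release(nvcc_output: str) -> Optional[str]:
    """Return the CUDA release (e.g. '11.8') found in `nvcc --version` output."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def detect_cuda_release() -> Optional[str]:
    """Run nvcc if it is on PATH; return None when it is unavailable."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=False
    )
    return cuda_release(result.stdout)


# The line below is copied from the nvcc output shown earlier in this notebook.
sample = "Cuda compilation tools, release 11.8, V11.8.89"
print(cuda_release(sample))  # prints 11.8
```

The parsed release only tells you which CUDA toolkit the runtime ships; which torch and k2 builds match it still has to be looked up in the k2 installation guide.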

In [21]:
#! pip install torch==1.7.1+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

### Install k2

In [22]:
#! pip install k2==1.17

In [12]:
#! pip install k2==1.17.dev20220725+cuda11.0.torch1.7.1 -f https://k2-fsa.org/nightly/index.html

Check that k2 was installed successfully:

In [23]:
#! python3 -m k2.version

### All previous steps in one

In [2]:
!pip install torch==1.13.1
!pip install torchaudio==0.13.1

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting torch==1.13.1
  Downloading torch-1.13.1-cp39-cp39-manylinux1_x86_64.whl (887.4 MB)
Collecting nvidia-cuda-runtime-cu11==11.7.99
  Downloading nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
Collecting nvidia-cudnn-cu11==8.5.0.96
  Downloading nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
Collecting nvidia-cuda-nvrtc-cu11==11.7.99
  Downloading nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)

In [3]:
! pip install k2

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting k2
  Downloading k2-1.23.4-py3.9-none-any.whl (103.1 MB)
Installing collected packages: k2
Successfully installed k2-1.23.4


In [4]:
! python3 -m k2.version

Collecting environment information...

k2 version: 1.23.4
Build type: Release
Git SHA1: 62e404dd3f3a811d73e424199b3408e309c06e1a
Git date: Mon Jan 30 02:26:16 2023
Cuda used to build k2: 11.7
cuDNN used to build k2: 8.2.0
Python version used to build k2: 3.9
OS used to build k2: Ubuntu 18.04.6 LTS
CMake version: 3.25.1
GCC version: 7.5.0
CMAKE_CUDA_FLAGS:  -Wno-deprecated-gpu-targets   -lineinfo --expt-extended-lambda -use_fast_math -Xptxas=-w  --expt-extended-lambda -gencode arch=compute_35,code=sm_35  -lineinfo --expt-extended-lambda -use_fast_math -Xptxas=-w  --expt-extended-lambda -gencode arch=compute_50,code=sm_50  -lineinfo --expt-extended-lambda -use_fast_math -Xptxas=-w  --expt-extended-lambda -gencode arch=compute_60,code=sm_60  -lineinfo --expt-extended-lambda -use_fast_math -Xptxas=-w  --expt-extended-lambda -gencode arch=compute_61,code=sm_61  -lineinfo --expt-extended-lambda -use_fast_math -Xptxas=-w  --expt-extended-lambda -gencode arch=compute_70,code=sm_70  -lineinfo -

### Install kaldifeat

In [5]:
! pip install kaldifeat

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting kaldifeat
  Downloading kaldifeat-1.21.tar.gz (482 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: kaldifeat
  Building wheel for kaldifeat (setup.py) ... done
  Created wheel for kaldifeat: filename=kaldifeat-1.21-cp39-cp39-linux_x86_64.whl size=263529 sha256=f9d7c27e98071ad5d4b0452307cff00fe952edc09bc1db5f9ac3132a6574ac55
  Stored in directory: /root/.cache/pip/wheels/d0/db/70/915de4b2e80a9aa90097d52e978ec5eb5c7393c1ee32f2618c
Successfully built kaldifeat
Installing collected packages: kaldifeat
Successfully installed kaldifeat-1.21


To check that kaldifeat was installed successfully, run:

In [6]:
! python3 -c "import kaldifeat; print(kaldifeat.__version__)"

1.21
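The same kind of check can be done for all dependencies at once. The following is a minimal sketch using only the standard library; the function name `installed_versions` is hypothetical, and the package names are assumed to match the distribution names used by `pip`.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Distribution names as installed by pip in this notebook (an assumption;
# adjust them if you installed a package from another source).
PACKAGES = ["torch", "torchaudio", "k2", "kaldifeat", "lhotse"]


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

A `None` entry means the package still has to be installed before the decoding cells will run.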


### Install icefall

icefall is a collection of Python scripts. All you need to do is
download its source code and set the `PYTHONPATH` environment variable.

In [7]:
! git clone https://github.com/k2-fsa/icefall

Cloning into 'icefall'...
remote: Enumerating objects: 11702, done.
remote: Counting objects: 100% (75/75), done.
remote: Compressing objects: 100% (59/59), done.
remote: Total 11702 (delta 26), reused 40 (delta 10), pack-reused 11627
Receiving objects: 100% (11702/11702), 13.61 MiB | 28.11 MiB/s, done.
Resolving deltas: 100% (7973/7973), done.
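With the repository cloned, `PYTHONPATH` can also be set from Python rather than on each command line. This is a minimal sketch, assuming the clone lives at `/content/icefall` (the path used by the decoding cells later in this notebook); the helper name `add_to_pythonpath` is hypothetical.

```python
import os
import sys
from pathlib import Path


def add_to_pythonpath(repo_dir: str) -> str:
    """Prepend repo_dir to sys.path and to the PYTHONPATH environment variable."""
    repo = str(Path(repo_dir).resolve())
    # Affects imports in the current Python process.
    if repo not in sys.path:
        sys.path.insert(0, repo)
    # Affects child processes, such as `! python3 ...` cells.
    current = os.environ.get("PYTHONPATH", "")
    parts = [p for p in current.split(os.pathsep) if p and p != repo]
    os.environ["PYTHONPATH"] = os.pathsep.join([repo] + parts)
    return os.environ["PYTHONPATH"]


# Assumed location of the clone in a Colab session.
add_to_pythonpath("/content/icefall")
```

Setting the variable once this way is an alternative to prefixing each command with `PYTHONPATH=/content/icefall`.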


In [8]:
! pip install -q kaldialign

### Install lhotse

In [9]:
! pip install git+https://github.com/lhotse-speech/lhotse

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting git+https://github.com/lhotse-speech/lhotse
  Cloning https://github.com/lhotse-speech/lhotse to /tmp/pip-req-build-t_1mtvqt
  Running command git clone --filter=blob:none --quiet https://github.com/lhotse-speech/lhotse /tmp/pip-req-build-t_1mtvqt
  Resolved https://github.com/lhotse-speech/lhotse to commit 0f812851aefb1dc0560e76641f86bbcfcd96a47c
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting intervaltree>=3.1.0
  Downloading intervaltree-3.1.0.tar.gz (32 kB)
  Preparing metadata (setup.py) ... done
Collecting dataclasses
  Downloading dataclasses-0.6-py3-none-any.whl (14 kB)
Collecting lilcom>=1.1.0
  Downloading lilcom-1.6-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (87 kB)

In [1]:
! cd icefall && \
  pip install -r requirements.txt

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting kaldifst
  Downloading kaldifst-1.6-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.1 MB)
Collecting kaldilm
  Downloading kaldilm-1.15-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)
Collecting sentencepiece>=0.1.96
  Downloading sentencepiece-0.1.98-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
Collecting typeguard
  Downloading typeguard-3.0.2-py3-none-any.whl (30 kB)
Collecting dill
  Downloading dill-0.3.6-py3-none-any.whl (110 kB)

## Download the pre-trained TDNN-LSTM CTC model

To make the following steps easier, we download the model
to `icefall/egs/librispeech/ASR/tmp-lstm`.

In [2]:
! apt-get install -y -qq tree sox git-lfs

Selecting previously unselected package libopencore-amrnb0:amd64.
(Reading database ... 122349 files and directories currently installed.)
Preparing to unpack .../0-libopencore-amrnb0_0.1.5-1_amd64.deb ...
Unpacking libopencore-amrnb0:amd64 (0.1.5-1) ...
Selecting previously unselected package libopencore-amrwb0:amd64.
Preparing to unpack .../1-libopencore-amrwb0_0.1.5-1_amd64.deb ...
Unpacking libopencore-amrwb0:amd64 (0.1.5-1) ...
Selecting previously unselected package libsox3:amd64.
Preparing to unpack .../2-libsox3_14.4.2+git20190427-2+deb11u2build0.20.04.1_amd64.deb ...
Unpacking libsox3:amd64 (14.4.2+git20190427-2+deb11u2build0.20.04.1) ...
Selecting previously unselected package libsox-fmt-alsa:amd64.
Preparing to unpack .../3-libsox-fmt-alsa_14.4.2+git20190427-2+deb11u2build0.20.04.1_amd64.deb ...
Unpacking libsox-fmt-alsa:amd64 (14.4.2+git20190427-2+deb11u2build0.20.04.1) ...
Selecting previously unselected package libsox-fmt-base:amd64.
Preparing to unpack .../4-libsox-fmt-b

**CAUTION**: You have to run `sudo apt-get install git-lfs`
and `git lfs install` in order to download the pre-trained model.

In [None]:
! cd icefall/egs/librispeech/ASR && rm -rf tmp-lstm

In [None]:
! cd icefall/egs/librispeech/ASR && \
  mkdir tmp-lstm && \
  cd tmp-lstm && \
  git lfs install && \
  git clone https://huggingface.co/pkufool/icefall_asr_librispeech_tdnn-lstm_ctc && \
  cd icefall_asr_librispeech_tdnn-lstm_ctc && \
  cd ../.. && \
  tree tmp-lstm
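If `git-lfs` was not set up before cloning, large files such as the checkpoint are checked out as small text pointers instead of the real data. The following is a minimal sketch for detecting that case; the helper name `is_lfs_pointer` is hypothetical, and the check relies on Git LFS pointer files starting with a `version https://git-lfs.github.com/spec/v1` line.

```python
from pathlib import Path

# First bytes of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: str) -> bool:
    """Return True if the file is a Git LFS pointer rather than real content."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


# Checkpoint path as used by the decoding cells in this notebook.
checkpoint = Path(
    "icefall/egs/librispeech/ASR/tmp-lstm/"
    "icefall_asr_librispeech_tdnn-lstm_ctc/exp/pretrained.pt"
)
if checkpoint.exists():
    if is_lfs_pointer(str(checkpoint)):
        print("pretrained.pt is an LFS pointer; run `git lfs pull` inside the clone")
    else:
        print(f"pretrained.pt looks complete ({checkpoint.stat().st_size} bytes)")
```

Running this before decoding avoids a confusing checkpoint-loading error caused by a pointer file.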

In [None]:
! ffprobe -show_format icefall/egs/librispeech/ASR/tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/test_wavs/1089-134686-0001.flac

In [None]:
! ffprobe -show_format icefall/egs/librispeech/ASR/tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/test_wavs/1221-135766-0001.flac

In [None]:
! ffprobe -show_format icefall/egs/librispeech/ASR/tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/test_wavs/1221-135766-0002.flac

Because Colab provides limited memory, you have to upgrade to Colab Pro
to run `HLG decoding + LM rescoring`.

In [None]:
! pip install lhotse sentencepiece

## HLG decoding (1best)

In [None]:
! cd icefall/egs/librispeech/ASR && \
    PYTHONPATH=/content/icefall python3 ./tdnn_lstm_ctc/pretrained.py \
      --method 1best \
      --checkpoint ./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/exp/pretrained.pt \
      --words-file ./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/data/lang_phone/words.txt \
      --HLG ./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/data/lang_phone/HLG.pt \
      ./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/test_wavs/1089-134686-0001.flac \
      ./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/test_wavs/1221-135766-0001.flac \
      ./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/test_wavs/1221-135766-0002.flac 

  '"sox" backend is being deprecated. '
2022-11-02 14:32:26,910 INFO [pretrained.py:162] {'feature_dim': 80, 'subsampling_factor': 3, 'num_classes': 72, 'sample_rate': 16000, 'search_beam': 20, 'output_beam': 5, 'min_active_states': 30, 'max_active_states': 10000, 'use_double_scores': True, 'checkpoint': './tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/exp/pretrained.pt', 'words_file': './tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/data/lang_phone/words.txt', 'HLG': './tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/data/lang_phone/HLG.pt', 'method': '1best', 'G': None, 'ngram_lm_scale': 0.8, 'sound_files': ['./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/test_wavs/1089-134686-0001.flac', './tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/test_wavs/1221-135766-0001.flac', './tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/test_wavs/1221-135766-0002.flac']}
2022-11-02 14:32:26,987 INFO [pretrained.py:168] device: cuda:0
2022-11-02 14:32:26,987 INFO [pretrained.py:170] Creating model


## HLG decoding + LM rescoring

In [None]:
! cd icefall/egs/librispeech/ASR && \
    PYTHONPATH=/content/icefall python3 ./tdnn_lstm_ctc/pretrained.py \
      --method whole-lattice-rescoring \
      --checkpoint ./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/exp/pretrained.pt \
      --words-file ./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/data/lang_phone/words.txt \
      --HLG ./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/data/lang_phone/HLG.pt \
      --G ./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/data/lm/G_4_gram.pt \
      --ngram-lm-scale 0.08 \
      ./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/test_wavs/1089-134686-0001.flac \
      ./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/test_wavs/1221-135766-0001.flac \
      ./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/test_wavs/1221-135766-0002.flac 

  '"sox" backend is being deprecated. '
2022-11-02 14:35:30,881 INFO [pretrained.py:162] {'feature_dim': 80, 'subsampling_factor': 3, 'num_classes': 72, 'sample_rate': 16000, 'search_beam': 20, 'output_beam': 5, 'min_active_states': 30, 'max_active_states': 10000, 'use_double_scores': True, 'checkpoint': './tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/exp/pretrained.pt', 'words_file': './tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/data/lang_phone/words.txt', 'HLG': './tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/data/lang_phone/HLG.pt', 'method': 'whole-lattice-rescoring', 'G': './tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/data/lm/G_4_gram.pt', 'ngram_lm_scale': 0.08, 'sound_files': ['./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/test_wavs/1089-134686-0001.flac', './tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/test_wavs/1221-135766-0001.flac', './tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc/test_wavs/1221-135766-0002.flac']}
2022-11-02 14:35:30,966 INFO [pretrained.py
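The two decoding commands differ only in `--method` and in the extra `--G` and `--ngram-lm-scale` arguments. As a minimal sketch, the argument list can be assembled in Python; the helper names `build_decode_command` and `run_decode` are hypothetical, while the flags and paths are the ones used in the cells above.

```python
import os
import subprocess
from typing import List, Optional

MODEL_DIR = "./tmp-lstm/icefall_asr_librispeech_tdnn-lstm_ctc"


def build_decode_command(
    method: str,
    sound_files: List[str],
    model_dir: str = MODEL_DIR,
    ngram_lm_scale: Optional[float] = None,
) -> List[str]:
    """Assemble the argument list for tdnn_lstm_ctc/pretrained.py."""
    cmd = [
        "python3", "./tdnn_lstm_ctc/pretrained.py",
        "--method", method,
        "--checkpoint", f"{model_dir}/exp/pretrained.pt",
        "--words-file", f"{model_dir}/data/lang_phone/words.txt",
        "--HLG", f"{model_dir}/data/lang_phone/HLG.pt",
    ]
    # The 4-gram LM is only passed for LM rescoring, as in the cell above.
    if method == "whole-lattice-rescoring":
        cmd += ["--G", f"{model_dir}/data/lm/G_4_gram.pt"]
        if ngram_lm_scale is not None:
            cmd += ["--ngram-lm-scale", str(ngram_lm_scale)]
    return cmd + list(sound_files)


def run_decode(
    cmd: List[str],
    asr_dir: str = "icefall/egs/librispeech/ASR",
    icefall_root: str = "/content/icefall",
) -> subprocess.CompletedProcess:
    """Run the command from the recipe directory with PYTHONPATH set."""
    env = dict(os.environ, PYTHONPATH=icefall_root)
    return subprocess.run(cmd, cwd=asr_dir, env=env, check=True)


wavs = [f"{MODEL_DIR}/test_wavs/1089-134686-0001.flac"]
print(" ".join(build_decode_command("1best", wavs)))
```

Calling `run_decode(build_decode_command("whole-lattice-rescoring", wavs, ngram_lm_scale=0.08))` would then mirror the rescoring cell, subject to the same memory limits.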

## Training

### Data preparation

In [3]:
! cd icefall/egs/librispeech/ASR && \
  ./prepare.sh

Streaming output truncated to the last 5000 lines.
Downloading train-clean-360.tar.gz:  37% 7.90G/21.5G [05:09<08:20, 29.1MB/s]
Downloading train-clean-360.tar.gz:  37% 7.93G/21.5G [05:10<08:26, 28.7MB/s]

### Launch training

In [None]:
# https://k2-fsa.github.io/icefall/recipes/Non-streaming-ASR/librispeech/tdnn_lstm_ctc.html
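The recipe page linked above describes how to launch training. As a hedged sketch only, the command can be assembled as shown below. The `--world-size` and `--num-epochs` flags are assumed from that documentation and should be verified with `./tdnn_lstm_ctc/train.py --help`; the helper names are hypothetical.

```python
from typing import List, Optional


def visible_devices(world_size: int) -> str:
    """Value for CUDA_VISIBLE_DEVICES exposing the first `world_size` GPUs."""
    return ",".join(str(i) for i in range(world_size))


def build_train_command(
    world_size: int = 1, num_epochs: Optional[int] = None
) -> List[str]:
    """Assemble the training command; flags are assumptions, see the lead-in."""
    cmd = ["./tdnn_lstm_ctc/train.py", "--world-size", str(world_size)]
    if num_epochs is not None:
        cmd += ["--num-epochs", str(num_epochs)]
    return cmd


# Single-GPU setup, which is what a Colab runtime typically offers.
print("CUDA_VISIBLE_DEVICES=" + visible_devices(1), " ".join(build_train_command(1)))
```

The printed line is meant to be run from `icefall/egs/librispeech/ASR` after `./prepare.sh` has finished.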