<a href="https://colab.research.google.com/github/danielsj95/ASR_with_NeMo/blob/master/NeMo_ASR_Pretrained_Model_Script.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# NeMo ASR Pretrained Model
*Notebook authored by: Daniel Leong Shao Jun*

[Neural Modules (NeMo)](https://developer.nvidia.com/nvidia-nemo) is a toolkit developed by NVIDIA for creating Conversational AI applications, with prebuilt modules for automatic speech recognition (ASR), among many others. NVIDIA recently released a pretrained Jasper model that was trained on part of the [Singapore English National Speech Corpus](https://www.imda.gov.sg/programme-listing/digital-services-lab/national-speech-corpus). More information about the model and corpus can be found further below.

This notebook explores how to use the pretrained NeMo ASR models to transcribe Singaporean-accented English.

## Download Dependencies and Import Packages
This notebook was written with Google Colab in mind. Alternatively, you can use NeMo's Docker container, which has all of the dependencies pre-installed.

```
docker pull nvcr.io/nvidia/nemo:v0.11
docker run --runtime=nvidia -it --rm --shm-size=16g -p 8888:8888 -p 6006:6006 --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/nemo:v0.11
```

We will be using the latest stable version of NeMo (0.11) as of writing, as it is tested and compatible with the latest version of the Jasper pretrained model (multidataset_jasper10x5dr_5).

For more information, please visit [NeMo's GitHub page](https://github.com/NVIDIA/NeMo).

### Install Dependencies

In [None]:
# If you're using Google Colab and not running locally, run this cell.
!pip install wget
!apt-get install sox libsndfile1 ffmpeg
!pip install unidecode

# This installs only the modules required for the asr module, we will be using v0.11 of the module
!pip install nemo_toolkit[asr]==0.11.0

# downgrade pytorch, as nemo_toolkit only works on torch 1.4.*, torchaudio 0.4
!pip install torch==1.4.0 torchvision==0.5.0 torchaudio==0.4.0 -f https://download.pytorch.org/whl/torch_stable.html

### Import Packages

In [2]:
# NeMo's "core" package
import nemo
# NeMo's ASR collection
import nemo.collections.asr as nemo_asr
# For preprocessing
import os
import json
import librosa
# For processing results
from nemo.collections.asr.helpers import word_error_rate, post_process_predictions, post_process_transcripts

################################################################################
###          (please add 'export KALDI_ROOT=<your_path>' in your $HOME/.profile)
###          (or run as: KALDI_ROOT=<your_path> python <your_script>.py)
################################################################################

[NeMo W 2020-08-24 06:07:11 audio_preprocessing:61] Unable to import APEX. Mixed precision and distributed training will not work.


## Instantiating the Model

### Instantiating Neural Factory

In NeMo, **both the training and inference pipelines are handled by the NeuralModuleFactory class.** It oversees the neural modules (such as the encoders and decoders), checkpointing, callbacks, logs and many other important details. 

In this demo, we will be using a GPU instance, so we explicitly state the device in the *placement* argument.


In [3]:
neural_factory = nemo.core.NeuralModuleFactory(placement=nemo.core.DeviceType.GPU)

### Introduction to Jasper Model
Just Another SPeech Recognizer (Jasper) models are end-to-end neural acoustic models for ASR. The term "end-to-end" implies that the model transcribes without any additional alignment information. Instead, connectionist temporal classification (CTC) is used to learn the alignment between the audio and text. The architecture consists of a repeated block structure that uses 1D convolutions. In this example, we will be using a variant of the regular Jasper model, *Jasper Dense Residual*. A pictorial representation of the model is shown below.

![Jasper Dense Residual](https://i.imgur.com/pRdztFo.png)

The blocks are shown on the left, while the sub-blocks are shown on the right. Each sub-block applies a 1D convolution, batch norm, ReLU and dropout. Each block input is connected directly to the last sub-block via a residual connection. In this Jasper Dense Residual variant, the output of each block is added to the inputs of all the following blocks. The pretrained model has a total of 10 blocks of 5 sub-blocks, i.e. JasperNet**10**x**5**.

For more detailed information about the model, please take a look at the [paper on Jasper](https://arxiv.org/abs/1904.03288).

### Instantiating the Pretrained JasperNet10x5 model





In [4]:
asr_model = nemo_asr.models.ASRConvCTCModel.from_pretrained(model_info='JasperNet10x5-En')

[NeMo I 2020-08-24 06:07:11 helpers:171] Downloading from: https://api.ngc.nvidia.com/v2/models/nvidia/multidataset_jasper10x5dr/versions/5/files/JasperNet10x5-En-Base.nemo to /root/.cache/torch/NeMo/NEMO_0.11.0/JasperNet10x5-En-Base.nemo
[NeMo I 2020-08-24 06:08:12 asrconvctcmodel:226] Instantiating model from pre-trained checkpoint
[NeMo I 2020-08-24 06:08:24 neural_modules:343] Loading configuration of a new Neural Module from the `L9VV3Q7IAJGFLFQU/.nemo_tmp/module.yaml` file
[NeMo I 2020-08-24 06:08:24 features:144] PADDING: 16
[NeMo I 2020-08-24 06:08:24 features:152] STFT using conv
[NeMo I 2020-08-24 06:08:27 neural_modules:443] Instantiated a new Neural Module named `audiotomelspectrogrampreprocessor0` of type `AudioToMelSpectrogramPreprocessor`
[NeMo I 2020-08-24 06:08:33 neural_modules:443] Instantiated a new Neural Module named `jasperencoder0` of type `JasperEncoder`
[NeMo I 2020-08-24 06:08:34 neural_modules:443] Instantiated a new Neural Module named `jasperdecoderforctc0

The pretrained model we have just loaded has 4 main modules in its pipeline.

**1. Audio to Mel Spectrogram Preprocessor:**
The audio input is processed through several steps (normalizing, windowing and FFT computation) to generate a mel spectrogram as features.

**2. Spectrogram Augmentation:** The features are augmented by frequency masking, time masking (according to the [SpecAugment](https://arxiv.org/abs/1904.08779) paper), and zeroing out rectangles (from the paper [Cutout](https://arxiv.org/abs/1708.04552)). 
*Note that only training inputs will be augmented.*

**3. Jasper Encoder:** 
The features are then passed into the encoder with the Jasper architecture as described in the above section.

**4. Jasper Decoder for CTC:** 
The encoder output is then passed through a decoder, which is simply a 1x1 convolution whose number of output channels equals the vocabulary size (26 letters, space, apostrophe, and the CTC blank symbol, which separates repeated letters).


The entire pipeline can be accessed using the `modules` property.

In [5]:
labels = asr_model.vocabulary
data_preprocessor, spectrogram_aug, encoder, decoder = asr_model.modules

## Running Through the Model Pipeline
Now that we have our model instantiated, we need to pass our inputs through the model pipeline, and then into the CTC decoder.

### Getting Sample Data
For this demo, I have prepared 5 short Singaporean-accented audio samples. The samples are accompanied by their transcripts in another folder.

When preparing WAV files as input, note that the sampling rate must be 16 kHz and the audio must be mono.
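As a quick sanity check before feeding your own clips to the model, the format requirement can be verified with Python's standard `wave` module. This is a minimal sketch, not part of the original notebook: the helper name `check_wav_format` is hypothetical, and it assumes the files are uncompressed PCM WAV (the only kind `wave` can open).

```python
import wave

def check_wav_format(path, expected_rate=16000, expected_channels=1):
    """Return (ok, sample_rate, channels) for an uncompressed PCM WAV file.

    Hypothetical helper: `ok` is True only if the file matches the
    16 kHz mono format described above.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    ok = (rate == expected_rate) and (channels == expected_channels)
    return ok, rate, channels
```

The helper can be called on each clip before building the manifest; any file for which `ok` is `False` would need to be resampled or downmixed first, for example with `sox` or `ffmpeg`.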

In [6]:
# download sample wav files
!wget -q -O pretrained_samples.zip "https://github.com/danielsj95/ASR_with_NeMo/blob/master/samples/pretrained_samples.zip?raw=true"
!unzip pretrained_samples.zip

Archive:  pretrained_samples.zip
   creating: pretrained_samples/audio_clips/
  inflating: pretrained_samples/audio_clips/pretrained_sample_1.wav  
  inflating: pretrained_samples/audio_clips/pretrained_sample_2.wav  
  inflating: pretrained_samples/audio_clips/pretrained_sample_3.wav  
  inflating: pretrained_samples/audio_clips/pretrained_sample_4.wav  
  inflating: pretrained_samples/audio_clips/pretrained_sample_5.wav  
   creating: pretrained_samples/transcripts/
  inflating: pretrained_samples/transcripts/pretrained_sample_1.txt  
  inflating: pretrained_samples/transcripts/pretrained_sample_2.txt  
  inflating: pretrained_samples/transcripts/pretrained_sample_3.txt  
  inflating: pretrained_samples/transcripts/pretrained_sample_4.txt  
  inflating: pretrained_samples/transcripts/pretrained_sample_5.txt  


There are a total of 5 clips in this sample folder. All 5 clips are extracted from YouTube videos of speeches by Minister for Trade and Industry Chan Chun Sing.

Clips from Minister Chan Chun Sing were chosen because he is well known for having the strongest Singaporean accent among his political peers.

Let's listen to one of the audio samples.

In [7]:
import IPython.display as ipd
ipd.Audio('pretrained_samples/audio_clips/pretrained_sample_4.wav')

### Creating Data Manifest
NeMo requires a manifest for our data, which contains the metadata of our audio files. NeMo data layers take in a standardized manifest format, with each line corresponding to one audio sample. An example of a manifest line is shown below:

```
{"audio_filepath": "path/to/audio_sample.wav", "duration": 6.39, "text": "this is a sample"}
```
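Because each manifest line is an independent JSON object, the file can be read back with the standard `json` module. The sketch below is illustrative only: `read_manifest` is a hypothetical helper, not a NeMo function, and it is run here on the example line above rather than on a real file.

```python
import json

def read_manifest(lines):
    """Parse manifest lines (one JSON object per line) into a list of dicts."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entries.append(json.loads(line))
    return entries

# The example line from above, used as stand-in data
sample_lines = [
    '{"audio_filepath": "path/to/audio_sample.wav", "duration": 6.39, "text": "this is a sample"}',
]

entries = read_manifest(sample_lines)
total_seconds = sum(entry["duration"] for entry in entries)
print(len(entries), "entries,", total_seconds, "seconds of audio")
```

Since an open file handle also yields lines, the same helper can be used as `read_manifest(open('manifest.json'))` once a manifest exists on disk.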

The helper function below is used to create the manifest file for our sample set.

In [8]:
# Helper function to build sample manifest
def build_manifest(transcripts_path, manifest_path, wav_path):
    
    with open(manifest_path, 'w') as fout:
        
        count = 0

        # both transcript file and wav file should have the same name
        for transcript_file in os.listdir(transcripts_path):
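            with open(os.path.join(transcripts_path, transcript_file), 'r') as fin:
                # each transcript file is expected to hold one line of text;
                # strip whitespace so no trailing newline ends up in the manifest
                transcript = fin.read().strip()
            
            wav_file = os.path.splitext(transcript_file)[0] + '.wav'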
            audio_path = os.path.join(wav_path, wav_file)
            duration = librosa.core.get_duration(filename=audio_path)

            # Write the metadata to the manifest
            metadata = {
                "audio_filepath": audio_path,
                "duration": duration,
                "text": transcript
            }
            json.dump(metadata, fout)
            fout.write('\n')

            count += 1

    print('{} items recorded in manifest'.format(count))
    print('Manifest saved at {}'.format(manifest_path))

    return manifest_path    

In [9]:
# build manifest
manifest = build_manifest('pretrained_samples/transcripts', 'manifest.json', 'pretrained_samples/audio_clips')

5 items recorded in manifest
Manifest saved at manifest.json


In [10]:
!cat manifest.json

{"audio_filepath": "pretrained_samples/audio_clips/pretrained_sample_2.wav", "duration": 11.8214375, "text": "the exercise to distribute the mask to the household will commence this saturday at two pm in every constituency"}
{"audio_filepath": "pretrained_samples/audio_clips/pretrained_sample_1.wav", "duration": 15.728875, "text": "we are entirely focused on helping our country overcome the economic challenges and saving the jobs at this point in time er we have no plans to do otherwise and we have no plans er no discussion on any change in plan"}
{"audio_filepath": "pretrained_samples/audio_clips/pretrained_sample_4.wav", "duration": 9.87625, "text": "sir deputy speaker sir i don't think we have anything to hide we have just shared the data"}
{"audio_filepath": "pretrained_samples/audio_clips/pretrained_sample_5.wav", "duration": 18.5875625, "text": "the ultimate competition is not about competing singaporeans against the p r it is about the team singapore comprising of singaporeans a

### Processing Inputs
With the manifest created, we have all the pieces needed to run the data through the model and obtain predictions. The following helper function streamlines the process.

In [11]:
def process_inputs(manifest_path, labels, data_preprocessor, spectrogram_aug, encoder, decoder, batch_size=16, train=False, lm=False):

  # Create data layer with the audio inputs
  # batch_size sets how many samples are grouped into each batch
  data_layer = nemo_asr.AudioToTextDataLayer(manifest_path, labels, batch_size)

  # Set up for input into preprocessor
  (audio_signal, audio_len, transcript, transcript_len) = data_layer()

  # Process the data into features for the model
  processed_signal, processed_len = data_preprocessor(input_signal=audio_signal, length=audio_len)

  if train:
    # only augment the data with spectrogram augmentation if training
    processed_signal = spectrogram_aug(input_spec=processed_signal)

  # Send the processed features through the encoder
  encoded, encoded_len = encoder(audio_signal=processed_signal,length=processed_len)

  # Retrieve probabilities from decoded input
  log_probs = decoder(encoder_output=encoded)

  # Instantiate the greedy decoder, and obtain predictions in tensor form
  greedy_decoder = nemo_asr.GreedyCTCDecoder()
  preds = greedy_decoder(log_probs=log_probs)

  # Instantiate ctc loss calculator, and output loss
  ctc_loss = nemo_asr.CTCLossNM(num_classes=len(labels))
  loss = ctc_loss(
      log_probs=log_probs,
      targets=transcript,
      input_length=encoded_len,
      target_length=transcript_len)
  
  # return list of tensors as output
  if lm:
    return [log_probs, preds, transcript, transcript_len, encoded_len]
  else:
    return [loss, preds, transcript, transcript_len]
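To make the greedy decoding step in the helper above more concrete, here is a plain-Python sketch of what a greedy CTC decoder conceptually does: take the best label per frame, merge consecutive repeats, then drop blanks. It is an illustration, not NeMo's implementation. The function name and the toy vocabulary are hypothetical, and it assumes the blank symbol sits at the last index (`len(labels)`).

```python
def greedy_ctc_decode(frame_scores, labels):
    """Collapse a frame-by-frame argmax path into text.

    frame_scores: list of per-frame score lists, each of length len(labels) + 1.
    Assumption: the final index is the CTC blank symbol.
    """
    blank = len(labels)
    # Pick the highest-scoring index for every frame
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]

    decoded = []
    previous = blank
    for idx in best_path:
        # Emit a label only when it differs from the previous frame and is not blank
        if idx != previous and idx != blank:
            decoded.append(labels[idx])
        previous = idx
    return "".join(decoded)

# Toy vocabulary and scores, made up for illustration
toy_labels = ["a", "b", " "]
toy_scores = [
    [0.90, 0.05, 0.00, 0.05],  # a
    [0.80, 0.10, 0.00, 0.10],  # a (repeat, merged with the previous frame)
    [0.10, 0.10, 0.00, 0.80],  # blank (separates the two a's)
    [0.70, 0.10, 0.10, 0.10],  # a
    [0.10, 0.80, 0.00, 0.10],  # b
]
print(greedy_ctc_decode(toy_scores, toy_labels))
```

The blank frame between the second and third "a" is what allows a genuine double letter to survive the merge of repeated frames.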

In [12]:
# obtain predictions as list of tensors
tensors = process_inputs(manifest, labels, data_preprocessor, spectrogram_aug, encoder, decoder)

[NeMo I 2020-08-24 06:09:38 collections:158] Dataset loaded with 5 files totalling 0.02 hours
[NeMo I 2020-08-24 06:09:38 collections:159] 0 files were filtered totalling 0.00 hours


### Running Inference
We finally run inference to get the results.

In [13]:
evaluated_tensors = neural_factory.infer(tensors)

[NeMo I 2020-08-24 06:09:40 actions:695] Evaluating batch 0 out of 1


## Evaluate the Results
With the evaluated tensors, let's now view the results in a readable format. NeMo has a set of helpers that can help us with that.

In [30]:
from nemo.collections.asr.helpers import word_error_rate, post_process_predictions, post_process_transcripts

def process_outputs(evaluated_tensors, labels, no_of_samples=None):

  greedy_hypotheses = post_process_predictions(evaluated_tensors[1], labels)
  references = post_process_transcripts(evaluated_tensors[2], evaluated_tensors[3], labels)
  wer = word_error_rate(hypotheses=greedy_hypotheses, references=references)

  print('#########################')
  print('Greedy WER: {:.2f}'.format(wer * 100))

  if not no_of_samples:
    no_of_samples = len(greedy_hypotheses)

  for ind in range(no_of_samples):
    print('###### Sample {} ############'.format(ind+1))
    print('Predicted : \n {}'.format(greedy_hypotheses[ind]))
    print('Transcript: \n {}'.format(references[ind]))
  
  return greedy_hypotheses
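For readers unfamiliar with the metric, word error rate is commonly computed as the word-level edit distance between hypothesis and reference, summed over all samples and divided by the total number of reference words. The sketch below illustrates that definition in plain Python. The function names are hypothetical, and it is not claimed to match NeMo's `word_error_rate` in every detail.

```python
def word_edit_distance(hyp_words, ref_words):
    """Levenshtein distance between two lists of words."""
    previous = list(range(len(ref_words) + 1))
    for i, hyp_word in enumerate(hyp_words, start=1):
        current = [i]
        for j, ref_word in enumerate(ref_words, start=1):
            cost = 0 if hyp_word == ref_word else 1
            current.append(min(
                previous[j] + 1,         # extra word in the hypothesis
                current[j - 1] + 1,      # word missing from the hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]

def simple_wer(hypotheses, references):
    """Total word errors divided by total reference words.

    Assumes the references contain at least one word in total.
    """
    errors = 0
    words = 0
    for hyp, ref in zip(hypotheses, references):
        ref_words = ref.split()
        errors += word_edit_distance(hyp.split(), ref_words)
        words += len(ref_words)
    return errors / words

# Two substitutions ("mass" for "mask", "we" for "will") over 8 reference words
hyp = "distribute the mass to the household we commence"
ref = "distribute the mask to the household will commence"
print(simple_wer([hyp], [ref]))
```

Note that under this definition a single wrong letter counts as a whole word error, which is why small spelling slips can have a large effect on WER.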

In [31]:
greedy_hypo = process_outputs(evaluated_tensors, labels = asr_model.vocabulary)

#########################
Greedy WER: 21.26
###### Sample 1 ############
Predicted : 
 the above circumstances provide us opportunities but also challenges
Transcript: 
 the above circumstances provide us opportunities but also challenges
###### Sample 2 ############
Predicted : 
 we entirely focus on helping our country overcome economic challenges and saving the jobs at this point in time we have no plans to do otherwise and we have no plans no discussion on any changing plan
Transcript: 
 we are entirely focused on helping our country overcome the economic challenges and saving the jobs at this point in time er we have no plans to do otherwise and we have no plans er no discussion on any change in plan
###### Sample 3 ############
Predicted : 
 the exercise to distribute the mass to the household we commence this saturday at two pm in every constituency
Transcript: 
 the exercise to distribute the mask to the household will commence this saturday at two pm in every constituency
####

The model produced a fairly good transcription of the given samples. However, we observe that there are several spelling mistakes. A language model would likely help us with that. Let's try it out!

## Implementing a Language Model

We will be adding NeMo's BeamSearchDecoderWithLM module into our pipeline.


### Install Dependencies

In [None]:
# This installs only the modules required for the nlp module, we will be using v0.11 of the module
!pip install nemo_toolkit[nlp]==0.11.0

In [None]:
# If you are using Google Colab, run this cell.
# The following is mostly copied from `NeMo/scripts/install_decoders.sh`.
# This will take a little while.
!apt-get install swig
!git clone https://github.com/PaddlePaddle/DeepSpeech
!cd DeepSpeech; git checkout b3c728d
!mv DeepSpeech/decoders/swig_wrapper.py DeepSpeech/decoders/swig/ctc_decoders.py
!mv DeepSpeech/decoders/swig ./decoders
!cd decoders; sed -i "s/\.decode('utf-8')//g" ctc_decoders.py; \
  sed -i 's/\.decode("utf-8")//g' ctc_decoders.py; \
  sed -i "s/name='swig_decoders'/name='ctc_decoders'/g" setup.py; \
  sed -i "s/-space_prefixes\[i\]->approx_ctc/space_prefixes\[i\]->score/g" decoder_utils.cpp; \
  sed -i "s/py_modules=\['swig_decoders'\]/py_modules=\['ctc_decoders', 'swig_decoders'\]/g" setup.py; \
  chmod +x setup.sh; \
  ./setup.sh

# The following is a bit of a hack to get the import to work.
# If the path is wrong, check the last few lines of installer output for the correct path.
# (There should be a line like: "Installed <path>".)
os.sys.path.append('/usr/local/lib/python3.6/dist-packages/ctc_decoders-1.1-py3.6-linux-x86_64.egg')

### Importing Language Model
We will be using a 3-gram language model built from the [LibriSpeech corpus](http://www.openslr.org/11). We will also be converting the letters to lowercase, since our existing decoder only expects lowercase.

In [20]:
import gzip
import shutil
import wget
data_dir = '.'

lm_gzip_path = os.path.join(data_dir, '3-gram.pruned.1e-7.arpa.gz')
if not os.path.exists(lm_gzip_path):
    print("Downloading pruned 3-gram model.")
    lm_url = 'http://www.openslr.org/resources/11/3-gram.pruned.1e-7.arpa.gz'
    lm_gzip_path = wget.download(lm_url, data_dir)
    print("Downloaded the 3-gram language model.")
else:
    print("Pruned .arpa.gz already exists.")

uppercase_lm_path = os.path.join(data_dir, '3-gram.pruned.1e-7.arpa')
if not os.path.exists(uppercase_lm_path):
    with gzip.open(lm_gzip_path, 'rb') as f_zipped:
        with open(uppercase_lm_path, 'wb') as f_unzipped:
            shutil.copyfileobj(f_zipped, f_unzipped)
    print("Unzipped the 3-gram language model.")
else:
    print("Unzipped .arpa already exists.")

lm_path = os.path.join(data_dir, 'lowercase_3-gram.pruned.1e-7.arpa')
if not os.path.exists(lm_path):
    with open(uppercase_lm_path, 'r') as f_upper:
        with open(lm_path, 'w') as f_lower:
            for line in f_upper:
                f_lower.write(line.lower())
    print("Converted language model file to lowercase.")
else:
    print("Lowercase .arpa already exists.")

Downloading pruned 3-gram model.
Downloaded the 3-gram language model.
Unzipped the 3-gram language model.
Converted language model file to lowercase.


### Instantiating Module
We then instantiate our BeamSearchDecoderWithLM module with the path to our downloaded language model.

In [21]:
### Instantiating the module ###
beam_search_lm = nemo_asr.BeamSearchDecoderWithLM(
    vocab=labels,
    beam_width=32,
    alpha=2, beta=1.5,
    lm_path=lm_path,
    num_cpus=os.cpu_count() or 1,  # os.cpu_count() can return None
    input_tensor=False  # We will be inputting numpy values rather than PT tensors.
)
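The `alpha` and `beta` arguments are easier to reason about with a small example. In DeepSpeech-style CTC beam search decoders, `alpha` conventionally weights the language model's log-probability and `beta` adds a per-word bonus, giving a combined score of the form `log P_ctc + alpha * log P_lm + beta * word_count`. The sketch below assumes that convention; `combined_score` is a hypothetical function and every number in it is made up for illustration.

```python
def combined_score(ctc_log_prob, lm_log_prob, text, alpha=2.0, beta=1.5):
    """Score one hypothesis: acoustic evidence + weighted LM + word bonus."""
    word_count = len(text.split())
    return ctc_log_prob + alpha * lm_log_prob + beta * word_count

# (text, acoustic log-prob, LM log-prob); every number here is made up
candidates = [
    ("we commence this saturday", -4.0, -9.0),
    ("recommended this saturday", -5.5, -6.0),
]

# Rank hypotheses from best to worst combined score
ranked = sorted(
    candidates,
    key=lambda c: combined_score(c[1], c[2], c[0]),
    reverse=True,
)
best_text = ranked[0][0]
print(best_text)
```

With these illustrative numbers, the second candidate wins even though its acoustic score is lower, because a large `alpha` lets the language model outweigh the acoustic evidence. Lowering `alpha` shifts the balance back towards what the acoustic model heard.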

### Processing through the Pipeline
We now run through our pipeline again, this time with a twist. Instead of the model loss, we will be extracting the log probabilities from the output of our CTC decoder, which will be used as inputs for the beam search decoder.

In [22]:
# obtain predictions as list of tensors
# lm parameter is set to True, to specify that we want the log probabilities to be returned instead of the loss
tensors = process_inputs(manifest, labels, data_preprocessor, spectrogram_aug, encoder, decoder, lm=True)

# Infer again to get the info we need!
evaluated_tensors = neural_factory.infer(
    tensors=tensors
)

eval_log_probs, eval_preds, eval_transcript, eval_transcript_len, eval_encoded_len = evaluated_tensors

import numpy as np

# Convert our log probs from inference to a list of numpy arrays for beam search with LM
np_log_probs = []
for i, batch in enumerate(eval_log_probs):  # Iterate through batches
    for j in range(batch.shape[0]):         # Iterate through each batch entry
        # Get the log-probs for each entry, but mask off data longer than the entry
        np_log_probs.append(batch[j][: eval_encoded_len[i][j], :].cpu().numpy())

# Exponentiate -- the BeamSearchDecoderWithLM class assumes we've done it already if we pass in numpy values.
np_log_probs_exp = [np.exp(p) for p in np_log_probs]

# Get predictions!
beam_predictions = beam_search_lm(
    log_probs=np_log_probs_exp,
    log_probs_length=None,
    force_pt=True)

[NeMo I 2020-08-24 06:36:43 collections:158] Dataset loaded with 5 files totalling 0.02 hours
[NeMo I 2020-08-24 06:36:43 collections:159] 0 files were filtered totalling 0.00 hours
[NeMo I 2020-08-24 06:36:43 actions:695] Evaluating batch 0 out of 1


### Evaluating our Results
Finally, we convert our results to something more readable and observe the WER.

In [None]:
# Get the top beam search hypothesis for each sample
beam_hypotheses = []

for mini_batch in beam_predictions:
    for sample_hypotheses in mini_batch:
        # sample_hypotheses is a set of (probability, prediction) pairs
        beam_hypotheses.append(sample_hypotheses[0][1]) # Take top prediction

# Process our reference transcripts
references = post_process_transcripts(eval_transcript, labels=labels,
                                      transcript_len_list=eval_transcript_len)

# Calculate top beam search prediction WERs!
wer = word_error_rate(hypotheses=beam_hypotheses, references=references)
print("BEAM WER {:.2f}".format(wer*100))

BEAM WER 29.13


It seems our WER has worsened! Why is this so? Let's compare the beam hypotheses with the previous greedy hypotheses and the actual transcript.

Although the beam hypotheses are all spelled perfectly, the language model has also taken the liberty of changing some words. (In sample 3: we commence -> recommended)

The language model also handles Singapore-specific context poorly (Sample 5: "singapore" is known, but not "singaporean").

In [32]:
for ind, (pred, actual) in enumerate(zip(beam_hypotheses, references)):
    print('###### Sample {} ############'.format(ind+1))
    print('Predicted (with LM) : \n {}'.format(pred))
    print('Predicted (without LM): \n {}'.format(greedy_hypo[ind]))
    print('Transcript: \n {}'.format(actual))

###### Sample 1 ############
Predicted (with LM) : 
 the above circumstances provide us opportunities but also challenges
Predicted (without LM): 
 the above circumstances provide us opportunities but also challenges
Transcript: 
 the above circumstances provide us opportunities but also challenges
###### Sample 2 ############
Predicted (with LM) : 
 we entirely focus on helping our country overcome economic challenges and saving the jobs at this point in time we have no plans to do otherwise and we have no plans no discussion on any changing plan
Predicted (without LM): 
 we entirely focus on helping our country overcome economic challenges and saving the jobs at this point in time we have no plans to do otherwise and we have no plans no discussion on any changing plan
Transcript: 
 we are entirely focused on helping our country overcome the economic challenges and saving the jobs at this point in time er we have no plans to do otherwise and we have no plans er no discussion on any ch

## Conclusion
From this experiment, we can conclude that NeMo's pretrained ASR model is able to transcribe speech, including Singaporean-accented English, to a certain extent.

We also attempted to improve the results with a language model, but on this small sample set the WER worsened from 21.26 to 29.13, so the results are inconclusive. Perhaps as a future exercise, we can explore other language models, or even train our own language model using the Part 2 transcripts of the National Speech Corpus, which contains Singapore-context words.

## References

#### Referenced Notebook
https://github.com/NVIDIA/NeMo/blob/master/examples/asr/notebooks/1_ASR_tutorial_using_NeMo.ipynb

#### Jasper: An End-to-End Convolutional Neural Acoustic Model
https://arxiv.org/abs/1904.03288

#### Pretrained Jasper Model Directory
https://ngc.nvidia.com/catalog/models/nvidia:multidataset_jasper10x5dr

#### National Speech Corpus
https://www.imda.gov.sg/programme-listing/digital-services-lab/national-speech-corpus

#### YouTube Video Links for Audio Clips
- https://www.youtube.com/watch?v=gb6kTJWRTL8
- https://www.youtube.com/watch?v=q4DdOWvAIn0
- https://www.youtube.com/watch?v=DSCCnr6yPCU


## Citations

```
@misc{nemo2019,
    title={NeMo: a toolkit for building AI applications using Neural Modules},
    author={Oleksii Kuchaiev and Jason Li and Huyen Nguyen and Oleksii Hrinchuk and Ryan Leary and Boris Ginsburg and Samuel Kriman and Stanislav Beliaev and Vitaly Lavrukhin and Jack Cook and Patrice Castonguay and Mariya Popova and Jocelyn Huang and Jonathan M. Cohen},
    year={2019},
    eprint={1909.09577},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```