**This is the task for the second homework (HW2)**

Obtain the Armenian MCV dataset and train an Armenian ASR model.

The quality metric is WER on the Armenian MCV test subset.
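
For reference, WER is the word-level edit distance between reference and hypothesis, divided by the number of reference words. The block below is a plain-Python sketch of that definition, not NeMo's implementation (the notebook imports NeMo's `word_error_rate` later); the names `edit_distance` and `corpus_wer` are illustrative.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def corpus_wer(references: List[str], hypotheses: List[str]) -> float:
    """Total word edits divided by total reference words."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    errors = 0
    words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = ref.split()
        errors += edit_distance(ref_tokens, hyp.split())
        words += len(ref_tokens)
    if words == 0:
        raise ValueError("references contain no words")
    return errors / words


# One inserted word against a three-word reference
print(corpus_wer(["the cat sat"], ["the cat sat down"]))
```

Because the denominator counts reference words only, WER can exceed 1.0 when the hypothesis contains many insertions.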

In [3]:
"""
You can either run this notebook locally (if you have all the dependencies and a GPU) or on Google Colab.

Instructions for setting up Colab are as follows:
1. Open a new Python 3 notebook.
2. Import this notebook from GitHub (File -> Upload Notebook -> "GITHUB" tab -> copy/paste GitHub URL)
3. Connect to an instance with a GPU (Runtime -> Change runtime type -> select "GPU" for hardware accelerator)
4. Run this cell to set up dependencies.
5. Restart the runtime (Runtime -> Restart Runtime) for any upgraded packages to take effect


NOTE: User is responsible for checking the content of datasets and the applicable licenses and determining if suitable for the intended use.
"""

# Install dependencies
!pip install wget
!apt-get install sox libsndfile1 ffmpeg libsox-fmt-mp3
!pip install text-unidecode
# Quote the requirement so the shell does not treat ">=" as an output redirect
!pip install "matplotlib>=3.3.2"

## Install NeMo
BRANCH = 'main'
!python -m pip install git+https://github.com/NVIDIA/NeMo.git@$BRANCH#egg=nemo_toolkit[all]

"""
Remember to restart the runtime for the kernel to pick up any upgraded packages (e.g. matplotlib)!
Alternatively, you can uncomment the exit() below to crash and restart the kernel, in the case
that you want to use the "Run All Cells" (or similar) option.
"""
# exit()

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
libsndfile1 is already the newest version (1.0.31-2ubuntu0.1).
ffmpeg is already the newest version (7:4.4.2-0ubuntu0.22.04.1).
libsox-fmt-mp3 is already the newest version (14.4.2+git20190427-2+deb11u2ubuntu0.22.04.1).
sox is already the newest version (14.4.2+git20190427-2+deb11u2ubuntu0.22.04.1).
0 upgraded, 0 newly installed, 0 to remove and 38 not upgraded.
DEPRECATION: git+https://github.com/NVIDIA/NeMo.git@main#egg=nemo_toolkit[all] contains an egg fragment with a non-PEP 508 name pip 25.0 will enforce this behaviour change. A possible replacement is to use the req @ url syntax, and remove the egg fragment. Discussion can be found at https://github.com/pypa/pip/issues/11617
Collecting nemo_toolkit[all]
  Cloning https://github.com/NVIDIA/NeMo.git (to revision main) to /tmp/pip-install-7jt8zbsg/nemo-toolkit_645d7dd6aa49422288d47565fa60c0b1
  Running command git clone


In [None]:
import os
import glob
import subprocess
import tarfile
import wget
import copy
from omegaconf import OmegaConf, open_dict


In [4]:
data_dir = 'datasets/'

# exist_ok=True makes a separate existence check unnecessary
os.makedirs(data_dir, exist_ok=True)
os.makedirs("scripts", exist_ok=True)

import nemo
import nemo.collections.asr as nemo_asr
from nemo.collections.asr.metrics.wer import word_error_rate
from nemo.utils import logging, exp_manager

**Download dataset**

We will use the NeMo script in the scripts directory to download and prepare the Mozilla Common Voice (MCV) dataset for Armenian.

The data preparation script will download the audio files and respective transcripts and then process the audio into mono-channel 16 kHz wave files that can be easily used for training ASR models.
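
A NeMo manifest is a text file with one JSON object per line. The sketch below assumes each entry carries at least `audio_filepath`, `duration` (in seconds), and `text`; the helper names and the demo entries are made up for illustration. It can be used to sanity-check a manifest before training.

```python
import json
import os
import tempfile


def read_manifest(path):
    """Read a JSON-lines manifest into a list of dicts."""
    entries = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                entries.append(json.loads(line))
    return entries


def total_hours(entries):
    """Sum the 'duration' field (seconds) and convert to hours."""
    return sum(e["duration"] for e in entries) / 3600.0


# Demo with two made-up entries written to a temporary file
demo = [
    {"audio_filepath": "clips/a.wav", "duration": 3.6, "text": "բարեւ"},
    {"audio_filepath": "clips/b.wav", "duration": 7.2, "text": "աշխարհ"},
]
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_manifest.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        for entry in demo:
            # ensure_ascii=False keeps Armenian text readable in the file
            f.write(json.dumps(entry, ensure_ascii=False) + "\n")
    loaded = read_manifest(demo_path)

print(len(loaded), "entries,", round(total_hours(loaded), 4), "hours")
```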

**Hugging Face**

Now, let's download the Mozilla Common Voice Armenian dataset. We will fetch every available split (train, validation, test, other, and invalidated). For good results, you will likely need other datasets too, bringing the total to over 1k hours.

Website steps:

Visit https://huggingface.co/settings/profile

Open "Access Tokens" in the list of items.

Create a new token: provide a name for it; "read" access is sufficient.

PRESERVE THAT TOKEN API KEY. You will paste it in the next step.

Visit the HuggingFace Dataset page for [Mozilla Common Voice 16.1](https://huggingface.co/datasets/mozilla-foundation/common_voice_16_1)

There should be a section that asks you for your approval.

Make sure you are logged in and then read that agreement.

If and only if you agree to the text, then accept the terms.

Code steps:

* Now below, run login()

* Paste your preserved HF TOKEN API KEY into the text box.

In [5]:
from huggingface_hub import login
login()


In [6]:
VERSION = "mozilla-foundation/common_voice_16_1"
LANGUAGE = "hy-AM"

In [9]:
tokenizer_dir = os.path.join('tokenizers', LANGUAGE)
manifest_dir = os.path.join('datasets', LANGUAGE, VERSION, LANGUAGE)

In [7]:
# If something goes wrong during data processing, un-comment the following line to delete the cached dataset
# !rm -rf datasets/$LANGUAGE
!mkdir -p datasets

The following cell will download the Armenian MCV corpus, preprocess the audio, and prepare manifest files that can be used directly by NeMo models.

We will use the convert_hf_dataset_to_nemo.py script, which is located in the scripts/speech_recognition directory of the NeMo repository if you have cloned it.

In [8]:
if not os.path.exists("convert_hf_dataset_to_nemo.py"):
    !wget https://raw.githubusercontent.com/NVIDIA/NeMo/main/scripts/speech_recognition/convert_hf_dataset_to_nemo.py

--2024-03-11 17:43:30--  https://raw.githubusercontent.com/NVIDIA/NeMo/main/scripts/speech_recognition/convert_hf_dataset_to_nemo.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14735 (14K) [text/plain]
Saving to: ‘convert_hf_dataset_to_nemo.py’


2024-03-11 17:43:30 (8.22 MB/s) - ‘convert_hf_dataset_to_nemo.py’ saved [14735/14735]



In [10]:
!python convert_hf_dataset_to_nemo.py \
    output_dir=datasets/$LANGUAGE \
    path=$VERSION \
    name=$LANGUAGE \
    split="train" \
    ensure_ascii=False \
    use_auth_token=True

!python convert_hf_dataset_to_nemo.py \
    output_dir=datasets/$LANGUAGE \
    path=$VERSION \
    name=$LANGUAGE \
    split="validation" \
    ensure_ascii=False \
    use_auth_token=True

!python convert_hf_dataset_to_nemo.py \
    output_dir=datasets/$LANGUAGE \
    path=$VERSION \
    name=$LANGUAGE \
    split="test" \
    ensure_ascii=False \
    use_auth_token=True

!python convert_hf_dataset_to_nemo.py \
    output_dir=datasets/$LANGUAGE \
    path=$VERSION \
    name=$LANGUAGE \
    split="other" \
    ensure_ascii=False \
    use_auth_token=True

!python convert_hf_dataset_to_nemo.py \
    output_dir=datasets/$LANGUAGE \
    path=$VERSION \
    name=$LANGUAGE \
    split="invalidated" \
    ensure_ascii=False \
    use_auth_token=True

The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
  @hydra.main(config_name='hfds_config', config_path=None)
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  ret = run_job(
You can avoid this message in future by passing the argument `trust_remote_code=True`.
Passing `trust_remote_code=True` will be mandatory to load this dataset from the next major release of `datasets`.
Downloading builder script: 100% 8.17k/8.17k [00:00<00:00, 25.1MB/s]
Downloading readme: 100% 12.3k/12.3k [00:00<00:00, 30.4MB/s]
Downloading extra modules: 100% 3.74k/3.74k [00:00<00:00, 17.5MB/s]
Downloading extra modules: 100% 77.3k/77.3k [00:00<00:00, 391kB/s]
Downloading data: 100% 14.6k/14.6k [00:00<00:00, 26.3MB/s]
Downloading data: 100% 123M/123M [00:04<00:00, 25.7MB/s]
Downloading data: 100% 89.6M/89.6M [00:03<00:00, 24.8MB/s]
Downloading data: 100% 101M/101M [00:04<00:00

In [11]:
train_manifest = f"{manifest_dir}/train/train_mozilla-foundation_common_voice_16_1_manifest.json"
dev_manifest = f"{manifest_dir}/validation/validation_mozilla-foundation_common_voice_16_1_manifest.json"
test_manifest = f"{manifest_dir}/test/test_mozilla-foundation_common_voice_16_1_manifest.json"
other_manifest = f"{manifest_dir}/other/other_mozilla-foundation_common_voice_16_1_manifest.json"
invalidated_manifest = f"{manifest_dir}/invalidated/invalidated_mozilla-foundation_common_voice_16_1_manifest.json"

In [12]:
train_manifest_full = f"{manifest_dir}/train_full_mozilla-foundation_common_voice_16_1_manifest.json"
!cat $train_manifest $other_manifest $invalidated_manifest > $train_manifest_full

**Hint**: Convert texts to lowercase and remove punctuation to improve WER.
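
One possible way to follow this hint is sketched below. It lowercases the text and drops every character whose Unicode category is punctuation, which covers the Armenian marks (։ ՝ ՞ ՜ ՛) as well as ASCII punctuation and « ». The function names and the `text` key are assumptions for illustration. Removing hyphens this way also joins hyphenated words, and symbols or combining marks are left untouched, so inspect the result on your data.

```python
import json
import unicodedata


def normalize_text(text: str) -> str:
    """Lowercase, strip Unicode punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFC", text).lower()
    kept = [ch for ch in text if not unicodedata.category(ch).startswith("P")]
    return " ".join("".join(kept).split())


def normalize_manifest(src_path: str, dst_path: str, text_key: str = "text") -> int:
    """Write a normalized copy of a JSON-lines manifest; return entries kept."""
    kept = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if not line.strip():
                continue
            entry = json.loads(line)
            entry[text_key] = normalize_text(entry[text_key])
            if entry[text_key]:  # skip utterances that become empty
                dst.write(json.dumps(entry, ensure_ascii=False) + "\n")
                kept += 1
    return kept


print(normalize_text("Այդ օրը այլեւս ոչ մի թռչուն չնկատվեց։"))
```

If you normalize the training text, apply the same function to the validation and test manifests; otherwise the references keep punctuation and capital letters that the model was never trained to produce.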

In [13]:
if not os.path.exists("scripts/process_asr_text_tokenizer.py"):
  !wget -P scripts/ https://raw.githubusercontent.com/NVIDIA/NeMo/$BRANCH/scripts/tokenizers/process_asr_text_tokenizer.py

--2024-03-11 18:37:57--  https://raw.githubusercontent.com/NVIDIA/NeMo/main/scripts/tokenizers/process_asr_text_tokenizer.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16631 (16K) [text/plain]
Saving to: ‘scripts/process_asr_text_tokenizer.py’


2024-03-11 18:37:57 (4.44 MB/s) - ‘scripts/process_asr_text_tokenizer.py’ saved [16631/16631]



**Hint**: Play with `VOCAB_SIZE` to improve WER.
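
When choosing `VOCAB_SIZE`, it can help to know how many distinct characters the transcripts contain: with full character coverage every one of them takes a vocabulary slot, and only the remainder is available for multi-character subwords. The sketch below counts them; `char_inventory` is an illustrative helper, and the sample strings stand in for the `text` fields of your manifests.

```python
from collections import Counter


def char_inventory(texts):
    """Count every non-whitespace character across the given texts."""
    counts = Counter()
    for text in texts:
        counts.update(ch for ch in text if not ch.isspace())
    return counts


# Stand-in transcripts; replace with the texts read from your manifests
sample = ["Այդ օրը", "այդ օրը։"]
raw = char_inventory(sample)
lowered = char_inventory(text.lower() for text in sample)

print("distinct characters (raw):", len(raw))
print("distinct characters (lowercased):", len(lowered))
```

The vocabulary printed further down, after `change_vocabulary`, shows this effect: a large share of the 130 entries are single characters, including uppercase letters and punctuation.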

In [14]:
TOKENIZER_TYPE = "bpe" # "bpe", "unigram"
VOCAB_SIZE = 128 + 2

In [15]:
!python scripts/process_asr_text_tokenizer.py \
  --manifest=$train_manifest_full,$dev_manifest \
  --vocab_size=$VOCAB_SIZE \
  --data_root=$tokenizer_dir \
  --tokenizer="spe" \
  --spe_type=$TOKENIZER_TYPE \
  --spe_character_coverage=1.0 \
  --no_lower_case \
  --log

INFO:root:Finished extracting manifest : datasets/hy-AM/mozilla-foundation/common_voice_16_1/hy-AM/train_full_mozilla-foundation_common_voice_16_1_manifest.json
INFO:root:Finished extracting manifest : datasets/hy-AM/mozilla-foundation/common_voice_16_1/hy-AM/validation/validation_mozilla-foundation_common_voice_16_1_manifest.json
INFO:root:Finished extracting all manifests ! Number of sentences : 12707
[NeMo I 2024-03-11 18:38:15 sentencepiece_tokenizer:317] Processing tokenizers/hy-AM/text_corpus/document.txt and store at tokenizers/hy-AM/tokenizer_spe_bpe_v130
sentencepiece_trainer.cc(177) LOG(INFO) Running command: --input=tokenizers/hy-AM/text_corpus/document.txt --model_prefix=tokenizers/hy-AM/tokenizer_spe_bpe_v130/tokenizer --vocab_size=130 --shuffle_input_sentence=true --hard_vocab_limit=false --model_type=bpe --character_coverage=1.0 --bos_id=-1 --eos_id=-1
sentencepiece_trainer.cc(77) LOG(INFO) Starts training with : 
trainer_spec {
  input: tokenizers/hy-AM/text_corpus/docu

**Hint**: Try different models.

In [16]:
model = nemo_asr.models.ASRModel.from_pretrained("stt_en_fastconformer_ctc_large", map_location='cpu')

[NeMo I 2024-03-11 18:38:19 cloud:68] Downloading from: https://api.ngc.nvidia.com/v2/models/nvidia/nemo/stt_en_fastconformer_ctc_large/versions/1.0.0/files/stt_en_fastconformer_ctc_large.nemo to /root/.cache/torch/NeMo/NeMo_1.23.0rc0/stt_en_fastconformer_ctc_large/00a071a9dac048acc3aeea942b0bfa40/stt_en_fastconformer_ctc_large.nemo
[NeMo I 2024-03-11 18:38:41 common:815] Instantiating model from pre-trained checkpoint
[NeMo I 2024-03-11 18:38:50 mixins:172] Tokenizer SentencePieceTokenizer initialized with 1024 tokens


[NeMo W 2024-03-11 18:38:52 modelPT:165] If you intend to do training or fine-tuning, please call the ModelPT.setup_training_data() method and provide a valid configuration file to setup the train data loader.
    Train config : 
    manifest_filepath: null
    sample_rate: 16000
    batch_size: 1
    shuffle: true
    num_workers: 8
    pin_memory: true
    use_start_end_token: false
    trim_silence: false
    max_duration: 20
    min_duration: 0.1
    is_tarred: false
    tarred_audio_filepaths: null
    shuffle_n: 2048
    bucketing_strategy: fully_randomized
    bucketing_batch_size: null
    
[NeMo W 2024-03-11 18:38:52 modelPT:172] If you intend to do validation, please call the ModelPT.setup_validation_data() or ModelPT.setup_multiple_validation_data() method and provide a valid configuration file to setup the validation data loader(s). 
    Validation config : 
    manifest_filepath: null
    sample_rate: 16000
    batch_size: 32
    shuffle: false
    num_workers: 8
    pin_m

[NeMo I 2024-03-11 18:38:52 features:289] PADDING: 0
[NeMo I 2024-03-11 18:38:56 save_restore_connector:263] Model EncDecCTCModelBPE was successfully restored from /root/.cache/torch/NeMo/NeMo_1.23.0rc0/stt_en_fastconformer_ctc_large/00a071a9dac048acc3aeea942b0bfa40/stt_en_fastconformer_ctc_large.nemo.


In [17]:
import torch
import torch.nn as nn

freeze_encoder = True # set to False if dare lol

def enable_bn_se(m):
    if type(m) == nn.BatchNorm1d:
        m.train()
        for param in m.parameters():
            param.requires_grad_(True)

    if 'SqueezeExcite' in type(m).__name__:
        m.train()
        for param in m.parameters():
            param.requires_grad_(True)

if freeze_encoder:
  model.encoder.freeze()
  model.encoder.apply(enable_bn_se)
  logging.info("Model encoder has been frozen")
else:
  model.encoder.unfreeze()
  logging.info("Model encoder has been un-frozen")

[NeMo I 2024-03-11 18:39:14 <ipython-input-17-180610cd99a3>:20] Model encoder has been frozen


In [18]:
TOKENIZER_DIR = os.path.join(tokenizer_dir, f"tokenizer_spe_{TOKENIZER_TYPE}_v{VOCAB_SIZE}")

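# NeMo expects "bpe" here for any SentencePiece tokenizer (BPE or unigram),
# so pass the literal instead of TOKENIZER_TYPE
model.change_vocabulary(new_tokenizer_dir=TOKENIZER_DIR, new_tokenizer_type="bpe")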

[NeMo W 2024-03-11 18:39:17 modelPT:258] You tried to register an artifact under config key=tokenizer.model_path but an artifact for it has already been registered.
[NeMo W 2024-03-11 18:39:17 modelPT:258] You tried to register an artifact under config key=tokenizer.vocab_path but an artifact for it has already been registered.
[NeMo W 2024-03-11 18:39:17 modelPT:258] You tried to register an artifact under config key=tokenizer.spe_tokenizer_vocab but an artifact for it has already been registered.


[NeMo I 2024-03-11 18:39:17 mixins:172] Tokenizer SentencePieceTokenizer initialized with 130 tokens
[NeMo I 2024-03-11 18:39:17 ctc_bpe_models:248] 
    Replacing old number of classes (1024) with new number of classes - 130
[NeMo I 2024-03-11 18:39:17 ctc_bpe_models:290] Changed tokenizer to ['<unk>', 'ու', 'ան', 'եր', 'ար', 'ակ', 'ում', '▁է', 'ներ', '▁հ', 'ին', 'այ', 'որ', '▁ե', '▁մ', 'ել', 'ութ', 'ուն', '▁կ', 'ամ', 'ությ', 'ատ', 'ական', '▁ա', '▁ն', 'ած', 'աց', 'աս', 'ով', '▁տ', 'ավ', '▁բ', '▁գ', 'են', 'ություն', 'ա', '▁', 'ն', 'ր', 'ո', 'ե', 'ւ', 'ի', 'մ', 'կ', 'տ', 'յ', 'վ', 'ս', '։', 'լ', 'ց', 'հ', 'ը', 'է', 'թ', 'գ', 'դ', 'ղ', 'ք', 'բ', 'պ', 'ծ', 'շ', ',', 'ռ', 'զ', 'խ', 'չ', 'ջ', 'ձ', 'Ա', 'ժ', 'փ', 'ճ', 'Ն', 'Հ', 'օ', ':', '՝', 'Մ', 'ֆ', 'Ս', 'Բ', 'Կ', '.', 'Ե', '-', 'Գ', 'Դ', 'Պ', 'Տ', 'Ո', 'Ի', '«', '»', 'Վ', 'Շ', '՞', 'Լ', 'Զ', 'Թ', 'Ք', ')', '(', 'Խ', '՛', 'Փ', 'Ծ', 'Ը', 'Ֆ', 'Է', 'Չ', 'Օ', 'Ռ', 'Ձ', 'Յ', 'Ջ', 'Ց', 'Ժ', '՜', 'Ճ', 'Ղ', '՚', '֊', '`', '́', 'Ր', '’', 'Ւ'] voc

In [19]:
cfg = copy.deepcopy(model.cfg)

# Setup new tokenizer
cfg.tokenizer.dir = TOKENIZER_DIR
cfg.tokenizer.type = "bpe"

# Set tokenizer config
model.cfg.tokenizer = cfg.tokenizer

In [20]:
# Setup train/val/test configs
print(OmegaConf.to_yaml(cfg.train_ds))

manifest_filepath: null
sample_rate: 16000
batch_size: 1
shuffle: true
num_workers: 8
pin_memory: true
use_start_end_token: false
trim_silence: false
max_duration: 20
min_duration: 0.1
is_tarred: false
tarred_audio_filepaths: null
shuffle_n: 2048
bucketing_strategy: fully_randomized
bucketing_batch_size: null



In [21]:
# Setup train, validation, test configs
with open_dict(cfg):
  # Train dataset
  cfg.train_ds.manifest_filepath = f"{train_manifest_full},{dev_manifest}"
  cfg.train_ds.batch_size = 32
  cfg.train_ds.num_workers = 8
  cfg.train_ds.pin_memory = True
  cfg.train_ds.use_start_end_token = False
  cfg.train_ds.trim_silence = True

  # Validation dataset
  cfg.validation_ds.manifest_filepath = test_manifest
  cfg.validation_ds.batch_size = 8
  cfg.validation_ds.num_workers = 8
  cfg.validation_ds.pin_memory = True
  cfg.validation_ds.use_start_end_token = False
  cfg.validation_ds.trim_silence = True

  # Test dataset
  cfg.test_ds.manifest_filepath = test_manifest
  cfg.test_ds.batch_size = 8
  cfg.test_ds.num_workers = 8
  cfg.test_ds.pin_memory = True
  cfg.test_ds.use_start_end_token = False
  cfg.test_ds.trim_silence = True

In [22]:
# setup model with new configs
model.setup_training_data(cfg.train_ds)
model.setup_multiple_validation_data(cfg.validation_ds)
model.setup_multiple_test_data(cfg.test_ds)

[NeMo I 2024-03-11 18:39:28 collections:196] Dataset loaded with 12702 files totalling 18.82 hours
[NeMo I 2024-03-11 18:39:28 collections:197] 5 files were filtered totalling 0.13 hours


    


[NeMo I 2024-03-11 18:39:29 collections:196] Dataset loaded with 2853 files totalling 4.55 hours
[NeMo I 2024-03-11 18:39:29 collections:197] 0 files were filtered totalling 0.00 hours
[NeMo I 2024-03-11 18:39:29 collections:196] Dataset loaded with 2853 files totalling 4.55 hours
[NeMo I 2024-03-11 18:39:29 collections:197] 0 files were filtered totalling 0.00 hours


In [23]:
print(OmegaConf.to_yaml(cfg.optim))

name: adamw
lr: 0.001
betas:
- 0.9
- 0.98
weight_decay: 0.001
sched:
  name: CosineAnnealing
  warmup_steps: 15000
  warmup_ratio: null
  min_lr: 0.0001



In [24]:
with open_dict(model.cfg.optim):
  model.cfg.optim.lr = 0.025
  model.cfg.optim.weight_decay = 0.001
  model.cfg.optim.sched.warmup_steps = None  # Remove default number of steps of warmup
  model.cfg.optim.sched.warmup_ratio = 0.10  # 10 % warmup
  model.cfg.optim.sched.min_lr = 1e-9

with open_dict(model.cfg.spec_augment):
  model.cfg.spec_augment.freq_masks = 2
  model.cfg.spec_augment.freq_width = 25
  model.cfg.spec_augment.time_masks = 10
  model.cfg.spec_augment.time_width = 0.05

model.spec_augmentation = model.from_config_dict(model.cfg.spec_augment)
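
To get a feel for these scheduler settings, the sketch below approximates a linear warmup followed by cosine decay. It illustrates the general shape only and is not NeMo's `CosineAnnealing` code; `warmup_cosine_lr` is a hypothetical helper. The step count uses numbers from the logs in this notebook: 12702 training files at batch size 32 give 397 steps per epoch, and the 19850 maximum steps reported in the training log equal 397 × 50, which suggests the logged run used 50 epochs.

```python
import math


def warmup_cosine_lr(step, max_steps, base_lr=0.025, min_lr=1e-9, warmup_ratio=0.10):
    """Approximate learning rate at a given optimizer step."""
    warmup_steps = int(warmup_ratio * max_steps)
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from near zero up to base_lr
        return base_lr * (step + 1) / (warmup_steps + 1)
    # Fraction of the decay phase completed, clamped to [0, 1]
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    progress = min(max(progress, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


steps_per_epoch = math.ceil(12702 / 32)  # files / batch size
max_steps = steps_per_epoch * 50         # assumed epoch count, see note above

for s in (0, int(0.10 * max_steps), max_steps // 2, max_steps):
    print(s, f"{warmup_cosine_lr(s, max_steps):.6g}")
```

With a 10% warmup ratio, the peak rate of 0.025 is reached about one tenth of the way into training, so shortening the run also shortens the warmup.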

In [25]:
use_cer = False
log_prediction = True

model.wer.use_cer = use_cer
model.wer.log_prediction = log_prediction

In [27]:
import torch
import pytorch_lightning as ptl

if torch.cuda.is_available():
  accelerator = 'gpu'
else:
  accelerator = 'cpu'  # fall back to CPU when no GPU is available

EPOCHS = 30  # will take approximately 4 hours

trainer = ptl.Trainer(devices=1,
                      accelerator=accelerator,
                      max_epochs=EPOCHS,
                      accumulate_grad_batches=1,
                      enable_checkpointing=False,
                      logger=False,
                      log_every_n_steps=5,
                      check_val_every_n_epoch=10)

# Setup model with the trainer
model.set_trainer(trainer)

# finally, update the model's internal config
model.cfg = model._cfg

In [None]:
from nemo.utils import exp_manager

# Environment variable generally used for multi-node multi-gpu training.
# In notebook environments, this flag is unnecessary and can cause logs of multiple training runs to overwrite each other.
os.environ.pop('NEMO_EXPM_VERSION', None)

config = exp_manager.ExpManagerConfig(
    exp_dir=f'experiments/lang-{LANGUAGE}/',
    name=f"ASR-Model-Language-{LANGUAGE}",
    checkpoint_callback_params=exp_manager.CallbackParams(
        monitor="val_wer",
        mode="min",
        always_save_nemo=True,
        save_best_model=True,
    ),
)

config = OmegaConf.structured(config)

logdir = exp_manager.exp_manager(trainer, config)

In [None]:
try:
  from google import colab
  COLAB_ENV = True
except (ImportError, ModuleNotFoundError):
  COLAB_ENV = False

# Load the TensorBoard notebook extension
if COLAB_ENV:
  %load_ext tensorboard
  %tensorboard --logdir /content/experiments/lang-$LANGUAGE/ASR-Model-Language-$LANGUAGE/
else:
  print("To use tensorboard, please use this notebook in a Google Colab environment.")

In [None]:
%%time
trainer.fit(model)

INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


[NeMo I 2024-03-11 14:58:07 modelPT:724] Optimizer config = AdamW (
    Parameter Group 0
        amsgrad: False
        betas: [0.9, 0.98]
        capturable: False
        differentiable: False
        eps: 1e-08
        foreach: None
        fused: None
        lr: 0.025
        maximize: False
        weight_decay: 0.001
    )
[NeMo I 2024-03-11 14:58:07 lr_scheduler:915] Scheduler "<nemo.core.optim.lr_scheduler.CosineAnnealing object at 0x7fb552beb6a0>" 
    will be used during training (effective maximum steps = 19850) - 
    Parameters : 
    (warmup_steps: null
    warmup_ratio: 0.1
    min_lr: 1.0e-09
    max_steps: 19850
    )


INFO:pytorch_lightning.callbacks.model_summary:
  | Name              | Type                              | Params
------------------------------------------------------------------------
0 | preprocessor      | AudioToMelSpectrogramPreprocessor | 0     
1 | encoder           | ConformerEncoder                  | 115 M 
2 | spec_augmentation | SpectrogramAugmentation           | 0     
3 | wer               | WER                               | 0     
4 | decoder           | ConvASRDecoder                    | 67.2 K
5 | loss              | CTCLoss                           | 0     
------------------------------------------------------------------------
85.6 K    Trainable params
115 M     Non-trainable params
115 M     Total params
460.567   Total estimated model params size (MB)


Sanity Checking: |          | 0/? [00:00<?, ?it/s]

    


[NeMo I 2024-03-11 14:58:30 wer:334] 
    
[NeMo I 2024-03-11 14:58:30 wer:335] reference:Միջպետական հարաբերություններում կարեւոր դեր էին խաղում քաղաքական ամուսնությունները։
[NeMo I 2024-03-11 14:58:30 wer:336] predicted:.կ նչղՇացվէքէթՓՀյՈատածժչզքթ նֆյվ տէժացվէքովտԺՆտյէ.Պֆ.
[NeMo I 2024-03-11 14:58:30 wer:334] 
    
[NeMo I 2024-03-11 14:58:30 wer:335] reference:Այդ օրը այլեւս ոչ մի թռչուն չնկատվեց։
[NeMo I 2024-03-11 14:58:30 wer:336] predicted:ֆ.հտ.ցՃենէատւէՒԳՊԴց նՊացատտ նՉռԽղենժղ նպտՂունՊֆով


Training: |          | 0/? [00:00<?, ?it/s]

[NeMo I 2024-03-11 14:58:31 preemption:56] Preemption requires torch distributed to be initialized, disabling preemption
[NeMo I 2024-03-11 14:58:52 wer:334] 
    
[NeMo I 2024-03-11 14:58:52 wer:335] reference:Տնակը սովորաբար կազմված է մի քանի շերտերից (չորս «պատից» ու տանիքից)։
[NeMo I 2024-03-11 14:58:52 wer:336] predicted:ությունկությունֆՊ հԳզամումամժ՚ամՈօամօումժ՚ամումվռությ.որվռյումՅԱՅթ՛շ հ։ումամություն բՓՈթՓՏածամԱ. հյՀՃանԴԽՊԽություն
[NeMo I 2024-03-11 14:58:55 wer:334] 
    
[NeMo I 2024-03-11 14:58:55 wer:335] reference:Մենք ժամանակավոր տերություն ենք։
[NeMo I 2024-03-11 14:58:55 wer:336] predicted:ֆ.ամատ նՖատ տմ գժ.ներԹենամա.ու.ՊտՆինունինԶ.ֆ.ֆ
[NeMo I 2024-03-11 14:58:58 wer:334] 
    
[NeMo I 2024-03-11 14:58:58 wer:335] reference:Այն համարվում է հնդկական խոհանոցի գլխավոր ուտեստներից մեկը։
[NeMo I 2024-03-11 14:58:58 wer:336] predicted:Պեն.ությՃտք էատու տ՜ենկՀվէակԺ նԺս՜ածվովզցքԱում« հ նՃրԺ՛տինվինզ հհ բ
[NeMo I 2024-03-11 14:59:01 wer:334] 
    
[NeMo I 2024-03-11 14:59:01 wer:

Validation: |          | 0/? [00:00<?, ?it/s]

[NeMo I 2024-03-11 15:45:23 wer:334] 
    
[NeMo I 2024-03-11 15:45:23 wer:335] reference:Միջպետական հարաբերություններում կարեւոր դեր էին խաղում քաղաքական ամուսնությունները։
[NeMo I 2024-03-11 15:45:23 wer:336] predicted:Մինչպետական հարաբերություններում կարեործդեր եին խղումքաղաական ամուսնությունները։
[NeMo I 2024-03-11 15:45:23 wer:334] 
    
[NeMo I 2024-03-11 15:45:23 wer:335] reference:Այդ օրը այլեւս ոչ մի թռչուն չնկատվեց։
[NeMo I 2024-03-11 15:45:23 wer:336] predicted:Այդ օրաայլեւը բոչջ միդթրջն ջինակազվեց։
[NeMo I 2024-03-11 15:45:23 wer:334] 
    
[NeMo I 2024-03-11 15:45:23 wer:335] reference:Երբե՜ք այդ մարդը դաժան չէ, այլ...
[NeMo I 2024-03-11 15:45:23 wer:336] predicted:Երբեկ այդ մարդըդաժանչէ՞, լ:
[NeMo I 2024-03-11 15:45:24 wer:334] 
    
[NeMo I 2024-03-11 15:45:24 wer:335] reference:Ագրեգատային վիճակը պայմանավորված է մոլեկուլային զանգվածով։
[NeMo I 2024-03-11 15:45:24 wer:336] predicted:Արեգատայինվիճակըպման աորված է մոեկուլայ զագվածով։
[NeMo I 2024-03-11 15:45:24 wer:334] 
 

INFO:pytorch_lightning.utilities.rank_zero:Epoch 9, global step 3970: 'val_wer' reached 0.75020 (best 0.75020), saving model to '/content/experiments/lang-hy-AM/ASR-Model-Language-hy-AM/2024-03-11_14-57-07/checkpoints/ASR-Model-Language-hy-AM--val_wer=0.7502-epoch=9.ckpt' as top 3


[NeMo I 2024-03-11 15:46:19 nemo_model_checkpoint:217] New best .nemo model saved to: /content/experiments/lang-hy-AM/ASR-Model-Language-hy-AM/2024-03-11_14-57-07/checkpoints/ASR-Model-Language-hy-AM.nemo
[NeMo I 2024-03-11 15:46:38 wer:334] 
    
[NeMo I 2024-03-11 15:46:38 wer:335] reference:Ուսումը շարունակելու մասին առայժմ կարելի չէ մտածել անգամ:
[NeMo I 2024-03-11 15:46:38 wer:336] predicted:Ոուսումը շարունակելու աասին առաայժմ կարելիչը մատազել անգամ։
[NeMo I 2024-03-11 15:46:41 wer:334] 
    
[NeMo I 2024-03-11 15:46:41 wer:335] reference:- Ինչի՞ց եզրակացրիք, չեմ հասկանում, թե նկատողությունս արի նրա համար, որ պատվական հյուրերիս չզբաղեցնեք մասնավոր բաներով:
[NeMo I 2024-03-11 15:46:41 wer:336] predicted:Եիւ զագար իչության հասկանում  հեննկատողություն սարներա համար, պործական իրորի  տավեցներե, մասնավոր բաններով։
[NeMo I 2024-03-11 15:46:44 wer:334] 
    
[NeMo I 2024-03-11 15:46:44 wer:335] reference:Թանգարանի գոյության առաջին տասը տարիների ընթացքում նրա հավաքածուն գրեթե կրկնապատկվեց։

Validation: |          | 0/? [00:00<?, ?it/s]

[NeMo I 2024-03-11 16:32:28 wer:334] 
    
[NeMo I 2024-03-11 16:32:28 wer:335] reference:Միջպետական հարաբերություններում կարեւոր դեր էին խաղում քաղաքական ամուսնությունները։
[NeMo I 2024-03-11 16:32:28 wer:336] predicted:Միչպտական հարաբերություններ, կարւորդեր եին խղումքաղական ամուսնությունները։
[NeMo I 2024-03-11 16:32:28 wer:334] 
    
[NeMo I 2024-03-11 16:32:28 wer:335] reference:Այդ օրը այլեւս ոչ մի թռչուն չնկատվեց։
[NeMo I 2024-03-11 16:32:28 wer:336] predicted:Այդ օրաայլեւը ոչ միթչան չինակալվեց։
[NeMo I 2024-03-11 16:32:28 wer:334] 
    
[NeMo I 2024-03-11 16:32:28 wer:335] reference:Երբե՜ք այդ մարդը դաժան չէ, այլ...
[NeMo I 2024-03-11 16:32:28 wer:336] predicted:Երբեք այդ մարդը դարժան չէ, այլ։
[NeMo I 2024-03-11 16:32:28 wer:334] 
    
[NeMo I 2024-03-11 16:32:28 wer:335] reference:Ագրեգատային վիճակը պայմանավորված է մոլեկուլային զանգվածով։
[NeMo I 2024-03-11 16:32:28 wer:336] predicted:Ագրգդային իճակըպայմն  աարորված է ոլկուլային զագվածով։
[NeMo I 2024-03-11 16:32:28 wer:334] 
  

INFO:pytorch_lightning.utilities.rank_zero:Epoch 19, global step 7940: 'val_wer' reached 0.74520 (best 0.74520), saving model to '/content/experiments/lang-hy-AM/ASR-Model-Language-hy-AM/2024-03-11_14-57-07/checkpoints/ASR-Model-Language-hy-AM--val_wer=0.7452-epoch=19.ckpt' as top 3


[NeMo I 2024-03-11 16:33:27 nemo_model_checkpoint:217] New best .nemo model saved to: /content/experiments/lang-hy-AM/ASR-Model-Language-hy-AM/2024-03-11_14-57-07/checkpoints/ASR-Model-Language-hy-AM.nemo
[NeMo I 2024-03-11 16:33:47 wer:334] 
    
[NeMo I 2024-03-11 16:33:47 wer:335] reference:Նրանք մեծ հետաքրքրություն են ներկայացնում հետազոտողների եւ սոցիոլոգների համար։
[NeMo I 2024-03-11 16:33:47 wer:336] predicted:Նրանք մեծ է աքրություն եններկայացում հետաասոտողների եսոցողոկների համար։
[NeMo I 2024-03-11 16:33:50 wer:334] 
    
[NeMo I 2024-03-11 16:33:50 wer:335] reference:Կազմում է կազանի փոքր օղակի մի հատվածը։
[NeMo I 2024-03-11 16:33:50 wer:336] predicted:Կազմանը կազանի փոքր օղագի մի հատվածը։
[NeMo I 2024-03-11 16:33:53 wer:334] 
    
[NeMo I 2024-03-11 16:33:53 wer:335] reference:Ռոբինզոնը կարգավորում է իր կենցաղը։
[NeMo I 2024-03-11 16:33:53 wer:336] predicted:Ռո բիինձզնը կարգավորում է ից  եենցավղը։
[NeMo I 2024-03-11 16:33:56 wer:334] 
    
[NeMo I 2024-03-11 16:33:56 wer:335]

Please save and download your model.

In [28]:
save_path = f"Model-{LANGUAGE}.nemo"
model.save_to(save_path)
print(f"Model saved at path : {os.path.join(os.getcwd(), save_path)}")

Model saved at path : /content/Model-hy-AM.nemo
