# Clone repo

In [None]:
!git clone https://github.com/oleja1shpep/ASR.git

# Install requirements

In [9]:
!pip install -r ASR/requirements.txt -q

  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


# Download checkpoints

In [10]:
!gdown "https://drive.google.com/uc?id=18NkqGrdo5GEPKDfBSRvbInnYG2kPDskY" -O model_best.pth

Downloading...
From (original): https://drive.google.com/uc?id=18NkqGrdo5GEPKDfBSRvbInnYG2kPDskY
From (redirected): https://drive.google.com/uc?id=18NkqGrdo5GEPKDfBSRvbInnYG2kPDskY&confirm=t&uuid=1037d6b4-cc4a-440e-a3f0-955b8ccacaeb
To: /content/model_best.pth
100% 221M/221M [00:05<00:00, 43.0MB/s]


# Run inference on src datasets

The command template looks like this:

```
!python ASR/inference.py inferencer.from_pretrained="./model_best.pth" \
inferencer.save_preds_path="predictions"  \
datasets.dataset._target_="src.datasets.<name>" \
datasets.dataset.part="<part>" datasets.dataset.arg1="..."
```

Predictions are saved to the predictions/dataset folder.

To also save the targets to a directory, add the inferencer.save_targets_path argument to the command.
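As a rough illustration of how these overrides combine, the sketch below assembles the same command from Python. `build_inference_command` is a hypothetical helper and is not part of the ASR repository; it only assumes the `key=value` override syntax shown in the template above.

```python
import shlex


def build_inference_command(checkpoint, dataset_target, part,
                            preds_path="predictions", targets_path=None):
    """Assemble the inference command as an argument list (hypothetical helper)."""
    overrides = {
        "inferencer.from_pretrained": checkpoint,
        "inferencer.save_preds_path": preds_path,
        "datasets.dataset._target_": dataset_target,
        "datasets.dataset.part": part,
    }
    # The targets path is optional, so it is added only when requested.
    if targets_path is not None:
        overrides["inferencer.save_targets_path"] = targets_path
    return ["python", "ASR/inference.py"] + [f"{k}={v}" for k, v in overrides.items()]


cmd = build_inference_command(
    "./model_best.pth",
    "src.datasets.LibrispeechDataset",
    "test-clean",
    targets_path="targets",
)
# Print a shell-safe version of the command for copy-pasting into a cell.
print(shlex.join(cmd))
```

The returned list can also be passed to `subprocess.run(cmd, check=True)` when the notebook `!` syntax is not available.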

Example:

In [None]:
# preds will be stored in predictions/dataset folder, targets will be stored in targets/dataset folder

!python ASR/inference.py inferencer.from_pretrained="./model_best.pth" \
inferencer.save_preds_path="predictions" \
inferencer.save_targets_path="targets" \
datasets.dataset._target_="src.datasets.LibrispeechDataset" \
datasets.dataset.part="test-clean"

[2025-10-16 20:22:31,107][torchaudio.utils.download][INFO] - The local file (/root/.cache/torch/hub/torchaudio/decoder-assets/librispeech-3-gram/lexicon.txt) exists. Skipping the download.
[2025-10-16 20:22:31,108][torchaudio.utils.download][INFO] - The local file (/root/.cache/torch/hub/torchaudio/decoder-assets/librispeech-3-gram/tokens.txt) exists. Skipping the download.
[2025-10-16 20:22:31,108][torchaudio.utils.download][INFO] - The local file (/root/.cache/torch/hub/torchaudio/decoder-assets/librispeech-3-gram/lm.bin) exists. Skipping the download.
Conformer(
  (conv_subsampling): Conv1dSubsampling(
    (layers): Sequential(
      (0): Conv1d(128, 128, kernel_size=(5,), stride=(3,), padding=(1,))
      (1): ReLU()
    )
  )
  (linear): Linear(in_features=128, out_features=256, bias=True)
  (dropout1): Dropout(p=0.1, inplace=False)
  (conformer_blocks): ModuleList(
    (0-11): 12 x ConformerBlock(
      (ffn1): FeedForward(
        (layers): Sequential(
          (0): LayerNorm((2

# Calculate metrics on src datasets

Calculating metrics requires the path to the targets directory, so add the inferencer.save_targets_path argument to the inference command above.

In [None]:
!python ASR/calc_metrics.py target_dir="targets/dataset" predictions_dir="predictions/dataset"

[2025-10-16 20:24:55,102][torchaudio.utils.download][INFO] - The local file (/root/.cache/torch/hub/torchaudio/decoder-assets/librispeech-3-gram/lexicon.txt) exists. Skipping the download.
[2025-10-16 20:24:55,102][torchaudio.utils.download][INFO] - The local file (/root/.cache/torch/hub/torchaudio/decoder-assets/librispeech-3-gram/tokens.txt) exists. Skipping the download.
[2025-10-16 20:24:55,102][torchaudio.utils.download][INFO] - The local file (/root/.cache/torch/hub/torchaudio/decoder-assets/librispeech-3-gram/lm.bin) exists. Skipping the download.
    CER_(Beam_Search): 0.06525026260966799
    WER_(Beam_Search): 0.14263695438547502
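For readers unfamiliar with the two reported metrics, here is a minimal sketch of the standard definitions: word error rate (WER) and character error rate (CER) are the Levenshtein distance between target and prediction, divided by the target length in words or characters. This is an illustration only; `ASR/calc_metrics.py` may normalize text or handle edge cases differently, and the function names below are assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(target, prediction):
    """Word error rate: word-level edit distance over target word count."""
    ref, hyp = target.split(), prediction.split()
    if not ref:
        # Assumed convention for an empty target.
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


def cer(target, prediction):
    """Character error rate: char-level edit distance over target length."""
    if not target:
        # Assumed convention for an empty target.
        return 0.0 if not prediction else 1.0
    return edit_distance(target, prediction) / len(target)


print(wer("the cat sat", "the cat sit"))
print(cer("the cat sat", "the cat sit"))
```

A single wrong character costs a whole word in WER but only one character in CER, which is why the WER reported above is higher than the CER.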


# Run inference on custom dataset

In [None]:
# download some dataset

!gdown "https://drive.google.com/uc?id=1s2f_IhxJUV7RxmExwx_rCvvPSSG81nMi"
!unzip sanity_test_data.zip

Downloading...
From: https://drive.google.com/uc?id=1s2f_IhxJUV7RxmExwx_rCvvPSSG81nMi
To: /content/sanity_test_data.zip
100% 747k/747k [00:00<00:00, 51.0MB/s]
Archive:  sanity_test_data.zip
   creating: test_data/
   creating: test_data/audio/
  inflating: test_data/audio/84-121550-0000.flac  
  inflating: test_data/audio/84-121550-0001.flac  
  inflating: test_data/audio/84-121550-0002.flac  
  inflating: test_data/audio/84-121550-0003.flac  
  inflating: test_data/audio/84-121550-0004.flac  
   creating: test_data/transcriptions/
  inflating: test_data/transcriptions/84-121550-0000.txt  
  inflating: test_data/transcriptions/84-121550-0001.txt  
  inflating: test_data/transcriptions/84-121550-0002.txt  
  inflating: test_data/transcriptions/84-121550-0003.txt  
  inflating: test_data/transcriptions/84-121550-0004.txt  


In [None]:
# simulate the transcriptions absence
# !rm -rf ./test_data/transcriptions

Basically the command template looks like this:


```
!python ASR/inference.py -cn=inference_custom \
inferencer.from_pretrained="./model_best.pth" \
datasets.custom.data_dir="<path_to_dataset>" \
inferencer.save_preds_path="predictions" \
inferencer.save_targets_path="targets"
```

The inferencer.save_targets_path argument is optional because your transcriptions are already in the \<path_to_dataset\>/transcriptions folder.
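Before running inference on your own data, it can help to check that the folder follows the same layout as the sample archive: an `audio` subfolder and an optional `transcriptions` subfolder whose `.txt` files share their stem with the audio files. The sketch below pairs the two; `pair_audio_with_transcriptions` is a hypothetical helper written for this notebook, not a function from the repository, and the matching-by-stem rule is inferred from the file names in the sample data.

```python
from pathlib import Path


def pair_audio_with_transcriptions(data_dir):
    """Map each audio file stem to (audio_path, transcription_path or None)."""
    root = Path(data_dir)
    audio_dir = root / "audio"
    text_dir = root / "transcriptions"
    pairs = {}
    for audio_path in sorted(audio_dir.iterdir()):
        if not audio_path.is_file():
            continue
        # A transcription is expected to share the audio file's stem.
        text_path = text_dir / f"{audio_path.stem}.txt"
        pairs[audio_path.stem] = (
            audio_path,
            text_path if text_path.is_file() else None,
        )
    return pairs


# Only inspect the sample data if it has been downloaded and unzipped.
if (Path("test_data") / "audio").is_dir():
    for stem, (audio, text) in pair_audio_with_transcriptions("test_data").items():
        status = "ok" if text is not None else "no transcription"
        print(f"{stem}: {audio.name} ({status})")
```

Files reported with no transcription can still be transcribed, but they cannot contribute to the metrics.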

In [None]:
# preds will be stored in predictions/custom folder, targets will be stored in targets/custom folder

!python ASR/inference.py -cn=inference_custom inferencer.from_pretrained="./model_best.pth" \
datasets.custom.data_dir="test_data" \
inferencer.save_preds_path="predictions" \
inferencer.save_targets_path="targets"

# Calculate metrics on custom data

In [None]:
# note the 'custom' subfolder at the end of both target_dir and predictions_dir
!python ASR/calc_metrics.py target_dir="targets/custom" predictions_dir="predictions/custom"