# Predict on downsampled tracks and save to disk/bucket

In [1]:
import utils

In [2]:
from multiprocessing import set_start_method, cpu_count
set_start_method("spawn")
num_cpus = cpu_count()
print('{} available cpus'.format(num_cpus))

4 available cpus
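
Re-running the cell above in the same kernel raises `RuntimeError: context has already been set`, because `set_start_method` can be called only once per process unless `force=True` is passed. A minimal sketch of a re-runnable guard follows; the helper name `ensure_start_method` is hypothetical and is not part of `utils`.

```python
import multiprocessing as mp


def ensure_start_method(method="spawn"):
    """Select the multiprocessing start method; safe to call repeatedly."""
    # get_start_method(allow_none=True) returns None if no method has been fixed yet.
    if mp.get_start_method(allow_none=True) != method:
        # force=True overrides a previously fixed method instead of raising RuntimeError.
        mp.set_start_method(method, force=True)
    return mp.get_start_method()
```

Calling `ensure_start_method()` at the top of the notebook in place of the bare `set_start_method("spawn")` keeps the cell idempotent.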


## 1. Load data

In [3]:
!gsutil -m cp -n -r gs://capstone_datasets/librispeech/test/predictions/* ./predictions/

Skipping existing item: file://./predictions/lr-clean-test-w2v2-base-960h.hf/dataset.arrow
Skipping existing item: file://./predictions/lr-clean-test-w2v2-base-960h.hf/dataset_info.json
Skipping existing item: file://./predictions/lr-clean-test-w2v2-base-960h.hf/state.json


In [4]:
dataset = utils.load_from_disk(utils.os.path.join(utils.predictions_path, 'lr-clean-test-w2v2-base-960h.hf'))

## 2. Downsample

In [5]:
ds_rate = 500

In [6]:
print('downsampling to ' + str(ds_rate) + 'Hz...')
dataset = dataset.map(utils.map_to_downsampled, fn_kwargs={"input_sr": 16000, "output_sr": ds_rate}, num_proc=num_cpus, writer_batch_size=50) # decrease writer_batch_size to avoid OOM issues

downsampling to 500Hz...
     

#0:   0%|          | 0/655 [00:00<?, ?ex/s]
#1:   0%|          | 0/655 [00:00<?, ?ex/s]
#2:   0%|          | 0/655 [00:00<?, ?ex/s]
#3:   0%|          | 0/655 [00:00<?, ?ex/s]
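
The implementation of `utils.map_to_downsampled` is not shown in this notebook, so the following is only a sketch of what a 16000 Hz to 500 Hz downsampling step involves, not the project's actual code. It assumes an integer decimation factor and uses block averaging as a crude low-pass filter; the function names are hypothetical, and a real pipeline would more likely use a proper resampler from an audio library.

```python
def downsample_by_averaging(samples, input_sr, output_sr):
    """Decimate `samples` from input_sr to output_sr by averaging blocks (sketch)."""
    if input_sr % output_sr != 0:
        raise ValueError("this sketch only handles integer decimation factors")
    factor = input_sr // output_sr  # 16000 // 500 == 32
    # Averaging each block of `factor` samples acts as a crude low-pass filter
    # followed by decimation; a trailing partial block is dropped.
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, len(samples) - factor + 1, factor)
    ]


def nyquist_hz(sample_rate):
    """Highest frequency representable at the given sample rate."""
    return sample_rate / 2
```

At `ds_rate = 500` the Nyquist frequency is 250 Hz, so most of the spectral content of speech is discarded, and each second of audio shrinks from 16000 samples to 500.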

## 3. Compute prediction

In [7]:
tokenizer, model = utils.load_wav2vec_model("facebook/wav2vec2-base-960h")

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'Wav2Vec2CTCTokenizer'. 
The class this function is called from is 'Wav2Vec2Tokenizer'.
Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [8]:
print('computing prediction...')
dataset = dataset.map(utils.map_to_pred, fn_kwargs={"model": model, "tokenizer": tokenizer}, writer_batch_size=1000)

computing prediction...


  0%|          | 0/2620 [00:00<?, ?ex/s]

## 4. Save to disk

In [9]:
dataset.save_to_disk(utils.os.path.join(utils.predictions_path, 'lr_clean_test_ds_' + str(ds_rate) + 'Hz_w2v2_base_960h.hf'))

## 5. Compute WER

In [10]:
wer = utils.wer(dataset["ground_truth"], dataset["transcription"])
print('wer=', round(100 * wer, 1), '%.')

wer= 100.0 %.
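
For reference, word error rate is conventionally defined as the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. The internals of `utils.wer` are not shown here, so the code below is a sketch of that standard definition under the assumption of whitespace tokenization; `word_errors` and `corpus_wer` are hypothetical names.

```python
def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1], len(ref)


def corpus_wer(references, hypotheses):
    """Total word errors divided by total reference words."""
    errors = words = 0
    for ref, hyp in zip(references, hypotheses):
        e, n = word_errors(ref, hyp)
        errors += e
        words += n
    if words == 0:
        raise ValueError("references contain no words")
    return errors / words
```

Under this definition an empty transcription for every utterance scores exactly 1.0, and insertions can push the value above 1.0, so a WER of 100 % means that no reference word was effectively recovered.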
