# OpenAI Whisper - CPU
Improving the CPU-deployment performance of the OpenAI Whisper model with dynamic quantization, following the PyTorch quantization flowchart:
https://pytorch.org/assets/images/quantization-practice/quantization-flowchart2.png

## Load Model

In [1]:
import whisper
import torch

test_path  = "C:\\Users\\win8t\\Music\\"
test_path += "Fugees - Killing Me Softly With His Song (Official Video).mp3"

model_fp32 = whisper.load_model(
    name="base",
    device="cpu")

## Dynamically Quantize Model

In [2]:
quantized_model = torch.quantization.quantize_dynamic(
    model_fp32, {torch.nn.Linear}, dtype=torch.qint8
)
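To make the `dtype=torch.qint8` argument more concrete, here is a minimal pure-Python sketch of symmetric per-tensor int8 quantization. It is a simplified illustration under stated assumptions, not PyTorch's actual implementation: the helper names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integers in [-128, 127] using one shared scale (assumed scheme)."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-128, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 codes."""
    return [q * scale for q in quantized]


# Hypothetical weights, for illustration only.
weights = [0.5, -1.27, 0.01, 1.0]
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("int8 codes:", q_weights)
print("scale:", scale)
print("max round-trip error:", max_error)
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most about half the scale.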

In [3]:
import os

def print_size_of_model(model):
    # Serialize the weights to a temporary file, report its size in MB, then clean up.
    torch.save(model.state_dict(), "temp.p")
    size = os.path.getsize("temp.p") / 1e6
    print("Size (MB):", size)
    os.remove("temp.p")
    return size

print_size_of_model(model_fp32)
print_size_of_model(quantized_model)

Size (MB): 290.459479
Size (MB): 158.410839


158.410839
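The two sizes printed above can be turned into a compression ratio with a little arithmetic. The sketch below uses only the numbers from this output. The final estimate assumes each quantized weight shrinks from 4 bytes to 1 byte and ignores the extra scale and zero-point metadata, so treat it as a rough figure.

```python
# Sizes reported by print_size_of_model above, in MB.
fp32_mb = 290.459479
int8_mb = 158.410839

saved_mb = fp32_mb - int8_mb
compression_ratio = fp32_mb / int8_mb
reduction_pct = 100 * saved_mb / fp32_mb

# Assumption: every quantized weight saves 3 bytes (float32 -> int8).
approx_quantized_weights_millions = saved_mb / 3

print(f"Saved: {saved_mb:.1f} MB")
print(f"Compression ratio: {compression_ratio:.2f}x")
print(f"Size reduction: {reduction_pct:.1f}%")
print(f"Roughly {approx_quantized_weights_millions:.0f}M weights quantized")
```

The ratio stays well below 4x because only `torch.nn.Linear` modules were passed to `quantize_dynamic`; all other parameters remain in FP32.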

## Run Dynamically Quantized Model

In [4]:
audio = whisper.load_audio(test_path)
audio = whisper.pad_or_trim(audio)

mel   = whisper.log_mel_spectrogram(audio).to(model_fp32.device)
options = whisper.DecodingOptions(fp16=False)
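`whisper.pad_or_trim` fits the audio to Whisper's fixed 30-second window at 16 kHz, so the spectrogram built here covers only the start of the song. A minimal sketch of that behavior on a plain Python list follows; `pad_or_trim_sketch` is a hypothetical stand-in, not the library function.

```python
SAMPLE_RATE = 16_000          # samples per second used by Whisper
CHUNK_SECONDS = 30            # length of one input window
N_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS


def pad_or_trim_sketch(samples, length=N_SAMPLES):
    """Cut long audio to `length` samples, or zero-pad short audio at the end."""
    if len(samples) > length:
        return samples[:length]
    return samples + [0.0] * (length - len(samples))


short_clip = [0.1] * SAMPLE_RATE            # 1 second of audio
long_clip = [0.1] * (N_SAMPLES + 5_000)     # slightly over 30 seconds

print(len(pad_or_trim_sketch(short_clip)))
print(len(pad_or_trim_sketch(long_clip)))
```

Both calls return exactly `N_SAMPLES` values, which is why language detection below looks only at the first 30 seconds of the file.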

In [5]:
# regular
_, probs = model_fp32.detect_language(mel)
print(f"Detected language: {max(probs, key=probs.get)}")

Detected language: en


In [6]:
# quantized
_, probs = quantized_model.detect_language(mel)
print(f"Detected language: {max(probs, key=probs.get)}")

Detected language: en
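Both models agree on the top language, but the top label alone does not show how much quantization shifted the probabilities. One possible check, sketched below, compares the two probability dictionaries directly. The helper names and the sample values are hypothetical and are not output from this notebook; in practice the two `probs` results would need to be stored under separate names.

```python
def top_languages(probs, k=3):
    """Return the k most probable (language, probability) pairs."""
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]


def max_prob_shift(probs_a, probs_b):
    """Largest absolute probability difference over the languages in probs_a."""
    return max(abs(probs_a[lang] - probs_b[lang]) for lang in probs_a)


# Hypothetical values for illustration only.
fp32_probs = {"en": 0.97, "de": 0.02, "fr": 0.01}
int8_probs = {"en": 0.95, "de": 0.03, "fr": 0.02}

print("FP32 top languages:", top_languages(fp32_probs))
print("INT8 top languages:", top_languages(int8_probs))
print("Largest shift:", max_prob_shift(fp32_probs, int8_probs))
```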


In [7]:
import time

def time_model_evaluation(model, mel, options):
    # Transcribes the whole file at the global test_path. The mel and options
    # arguments are only used by the commented-out whisper.decode call below.
    eval_start_time = time.perf_counter()
    # result = whisper.decode(model, mel, options)
    result = whisper.transcribe(model, test_path)  # , options)
    eval_end_time = time.perf_counter()
    eval_duration_time = eval_end_time - eval_start_time
    print(result["text"])
    print("Evaluate total time (seconds): {0:.1f}".format(eval_duration_time))

# Evaluate the original FP32 Whisper model
time_model_evaluation(model_fp32, mel, options)

# Evaluate the INT8 Whisper model after the dynamic quantization
time_model_evaluation(quantized_model, mel, options)



 Strum in my pain with his fingers, singing my life with his words. Killing me softly with his song, killing me softly with his song, telling my whole life. With his words killing me softly with his song. This is why I clap for refuge. I'll help you up in the prize where you sit on the base, sit on the beat. While I'm on this road, I got my girl, El. One time, one time, pay your El. You know you got the lyrics. I heard he sang a good song. I heard he had a style. And so I came to see him and listen for a while. And there he was, this young boy, straightened to my eyes. Strumming my pain with his finger, singing my life with his words. Killing me softly with his song, killing me softly with his song, telling my whole life. With his words killing me softly with his song. I felt all flush with the rust, and merrised by the crown. I felt he found my letter, and read each one out loud. I prayed that he would finish, but he just kept writing on. Strumming my pain with his finger, singing my 

KeyboardInterrupt: 
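The run above was interrupted before either timing was printed, so no speedup can be reported from it. As a sketch of how the comparison could be collected once both runs finish, the helper below times any callable and returns the duration instead of only printing it. The names `time_call`, `speedup`, and `slow_sum` are hypothetical, and `slow_sum` is a stand-in workload so the block runs without Whisper installed.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


def slow_sum(n):
    """Stand-in workload used only for this illustration."""
    return sum(range(n))


result, seconds = time_call(slow_sum, 100_000)
print(f"Result: {result}, elapsed: {seconds:.4f} s")

# In the notebook this might look like (assumes the models loaded above):
#   _, fp32_s = time_call(whisper.transcribe, model_fp32, test_path)
#   _, int8_s = time_call(whisper.transcribe, quantized_model, test_path)
#   print(f"Speedup: {speedup(fp32_s, int8_s):.2f}x")
```

Returning the duration keeps the measurement separate from the printing, which makes it easier to repeat runs and average them.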