# OpenAI Whisper - CPU
This notebook improves the CPU inference performance of the OpenAI Whisper model with dynamic quantization, following the PyTorch quantization flowchart:
https://pytorch.org/assets/images/quantization-practice/quantization-flowchart2.png

## Load Model

In [1]:
import whisper
import torch

test_path  = "audio.wav"

model_fp32 = whisper.load_model(
    name="base",
    device="cpu")

100%|███████████████████████████████████████| 139M/139M [00:15<00:00, 9.65MiB/s]


## Dynamically Quantize Model

In [2]:
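# Request INT8 dynamic quantization for torch.nn.Linear modules:
# weights are converted ahead of time, activations are quantized at inference time.
quantized_model = torch.quantization.quantize_dynamic(
    model_fp32, {torch.nn.Linear}, dtype=torch.qint8
)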
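To make the idea behind INT8 quantization concrete, here is a minimal pure-Python sketch of mapping float weights to 8-bit integers and back. It is a simplified per-tensor affine scheme written for illustration, not PyTorch's actual implementation; the helper names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
def quantize_int8(values):
    """Map floats to int8 with a per-tensor affine scheme (illustration only)."""
    lo = min(min(values), 0.0)              # the range must contain 0.0
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255 or 1.0          # fall back to 1.0 for an all-zero tensor
    zero_point = round(-128 - lo / scale)   # int8 value that represents 0.0
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate floats from the int8 values."""
    return [(x - zero_point) * scale for x in q]


weights = [-0.42, 0.0, 0.13, 0.97]          # made-up sample weights
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_err = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 values:", q)
print("scale:", scale, "zero point:", zero_point)
print("max round-trip error:", max_err)
```

Each weight is stored in one byte instead of four, at the cost of a rounding error on the order of `scale`.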

In [3]:
import os

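def print_size_of_model(model):
    """Save the model's state_dict to a temporary file and report its size in MB."""
    tmp_path = "temp.p"
    torch.save(model.state_dict(), tmp_path)
    size = os.path.getsize(tmp_path) / 1e6
    print('Size (MB):', size)
    os.remove(tmp_path)
    return size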

print_size_of_model(model_fp32)
print_size_of_model(quantized_model)

Size (MB): 290.444061
Size (MB): 290.444061


290.444061
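As a rough sanity check on the sizes above, the expected checkpoint size can be estimated from the parameter count. The sketch below assumes 4 bytes per FP32 parameter and 1 byte per INT8 parameter, ignores file metadata and quantization parameters, and uses hypothetical shares of INT8 weights; `estimate_size_mb` is an illustrative helper, not part of Whisper or PyTorch.

```python
BYTES_FP32 = 4
BYTES_INT8 = 1


def estimate_size_mb(n_params, int8_fraction=0.0):
    """Rough checkpoint size in MB for a given share of INT8 parameters."""
    n_int8 = n_params * int8_fraction
    n_fp32 = n_params - n_int8
    return (n_fp32 * BYTES_FP32 + n_int8 * BYTES_INT8) / 1e6


fp32_size_mb = 290.444061                    # FP32 size measured in this notebook
n_params = fp32_size_mb * 1e6 / BYTES_FP32   # approximate parameter count

# Hypothetical shares of parameters stored as INT8.
for fraction in (0.0, 0.5, 0.9):
    size_mb = estimate_size_mb(n_params, fraction)
    print(f"{fraction:.0%} int8 -> ~{size_mb:.0f} MB")
```

Under these assumptions a quantized checkpoint should be noticeably smaller than the FP32 one, so two identical sizes may indicate that few or no layers were actually converted.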

## Run Dynamically Quantized Model

In [5]:
audio = whisper.load_audio(test_path)
audio = whisper.pad_or_trim(audio)

mel   = whisper.log_mel_spectrogram(audio).to(model_fp32.device)
options = whisper.DecodingOptions(fp16=False)

In [6]:
# regular (FP32) model
_, probs = model_fp32.detect_language(mel)
print(f"Detected language: {max(probs, key=probs.get)}")

Detected language: en


In [7]:
# quantized (INT8) model
_, probs = quantized_model.detect_language(mel)
print(f"Detected language: {max(probs, key=probs.get)}")

Detected language: en


In [8]:
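import time

def time_model_evaluation(model, mel, options):
    # `mel` and `options` are only used by the commented-out `whisper.decode` call;
    # `whisper.transcribe` reads the audio directly from `test_path`.
    eval_start_time = time.time()
    # result = whisper.decode(model, mel, options)
    result = whisper.transcribe(model, test_path) # , options)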
    eval_end_time = time.time()
    eval_duration_time = eval_end_time - eval_start_time
    print(result["text"])
    print("Evaluate total time (seconds): {0:.1f}".format(eval_duration_time))

# Evaluate the original FP32 Whisper model
time_model_evaluation(model_fp32, mel, options)

# Evaluate the INT8 Whisper model after the dynamic quantization
time_model_evaluation(quantized_model, mel, options)



 Nothing is so expensive as their caprices, flowers, boxes at the theater, suppers, days in the country, which one can never refuse to one's mistress. As I have told you, I had little money.
Evaluate total time (seconds): 2.4
 Nothing is so expensive as their caprices, flowers, boxes at the theater, suppers, days in the country, which one can never refuse to one's mistress. As I have told you, I had little money.
Evaluate total time (seconds): 1.6
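The timings above come from a single run per model, with the FP32 model evaluated first, so one-time costs such as caching could affect the comparison. A minimal sketch of a more robust measurement, using warm-up runs and the median of several repeats, might look like this; the `benchmark` helper, its parameters, and the `dummy_workload` stand-in are hypothetical and not part of Whisper.

```python
import statistics
import time


def benchmark(fn, warmup=1, repeats=3):
    """Return the median wall-clock time of fn() in seconds after warm-up runs."""
    for _ in range(warmup):
        fn()                                  # discarded: absorbs one-time setup costs
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Stand-in workload so the sketch runs without Whisper installed.
def dummy_workload():
    sum(i * i for i in range(100_000))


median_s = benchmark(dummy_workload)
print(f"Median time (seconds): {median_s:.4f}")
```

With the models from this notebook, the workload could be passed as `lambda: whisper.transcribe(quantized_model, test_path)` and likewise for `model_fp32`.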
