With CPU, it's quite slow. A 2-minute MP3 takes 6 minutes to finish. #279

Closed · zxl777 opened this issue Jun 5, 2023 · 5 comments

@zxl777 commented Jun 5, 2023

Indeed, it's quite slow. A 2-minute MP3 takes 6 minutes to finish. Meanwhile, OpenAI's Whisper finishes within 2 minutes. Did I do something wrong? Here's the code I used:

        model = WhisperModel("small.en", device="cpu", compute_type="int8")
        segments, info = model.transcribe(self.src_filename, word_timestamps=True, beam_size=5, language="en")

Originally posted by @zxl777 in #271 (comment)

@guillaumekln (Contributor)

How are you running the original implementation from OpenAI? Are you comparing with the same beam size and number of CPU threads?

Also, what is your CPU?

@zxl777 (Author) commented Jun 5, 2023

Here is how I run the original implementation from OpenAI:

        model = whisper.load_model("small.en")
        result = model.transcribe(
            self.src_filename,
            verbose=True,
            word_timestamps=True,
            initial_prompt="Welcome to the software engineering courses channel.",
        )

The default beam size could be 5.

CPU: Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
RAM: 22GB

Attachment: small.mp3.zip

@guillaumekln (Contributor)

> The default beam size could be 5.

Actually, the default beam size is 1 when you use OpenAI Whisper from Python (cf. #6). So for a fair comparison you should set the same beam size, number of threads, and initial prompt.

That being said, faster-whisper should take much less than 6 minutes for this audio file. Did you also measure the time to download the model?

Here's the result on my system with an Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz:

import os
import time

# OMP_NUM_THREADS must be set before whisper and faster_whisper are
# imported, otherwise the OpenMP runtime ignores the value.
os.environ["OMP_NUM_THREADS"] = "6"

import whisper
import faster_whisper

def run_openai_whisper():
    model = whisper.load_model("small.en", device="cpu")
    result = model.transcribe(
        "small.mp3",
        beam_size=5,
        word_timestamps=True,
        initial_prompt="Welcome to the software engineering courses channel.",
    )

def run_faster_whisper():
    model = faster_whisper.WhisperModel("small.en", device="cpu", compute_type="int8")
    segments, _ = model.transcribe(
        "small.mp3",
        beam_size=5,
        word_timestamps=True,
        initial_prompt="Welcome to the software engineering courses channel.",
    )
    # transcribe() returns a generator: the transcription only runs when
    # the segments are consumed.
    segments = list(segments)


# Run each function once as a warm-up so the timed run does not include
# the model download.
run_openai_whisper()
start = time.time()
run_openai_whisper()
end = time.time()
print("openai-whisper: %f seconds" % (end - start))

run_faster_whisper()
start = time.time()
run_faster_whisper()
end = time.time()
print("faster-whisper: %f seconds" % (end - start))

Output:

openai-whisper: 195.410477 seconds
faster-whisper: 18.481551 seconds

@brajeshvisio01

What is the maximum limit of os.environ["OMP_NUM_THREADS"]? Is it 6, or can it be higher?

@guillaumekln (Contributor)

There is no maximum, but using too many threads can negatively impact performance. In general, you should not use more threads than the number of physical cores. Run some benchmarks on your system to find the best value.
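
A minimal sketch of such a benchmark, assuming a local small.mp3 test file; it uses the cpu_threads argument of WhisperModel, which plays the same role for faster-whisper as OMP_NUM_THREADS:

import time

import faster_whisper

# Try a few candidate thread counts; the best value is usually at or
# below the number of physical cores.
for num_threads in (1, 2, 4, 6):
    model = faster_whisper.WhisperModel(
        "small.en", device="cpu", compute_type="int8", cpu_threads=num_threads
    )
    start = time.time()
    segments, _ = model.transcribe("small.mp3", beam_size=5)
    list(segments)  # consume the generator so the transcription actually runs
    print("cpu_threads=%d: %.1f seconds" % (num_threads, time.time() - start))

Constructing the model outside the timed region keeps the one-time download and load out of the measurement.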
