GitHub - aviks/Whisper.jl: Implementation of OpenAI Whisper model based on whisper.cpp

Whisper

Whisper.jl is a Julia package for automatic speech recognition, based on OpenAI's Whisper model. This package wraps the whisper.cpp code, which is a C/C++ implementation of the model. It uses the model weights which were published by OpenAI. The weights are downloaded on demand.

Quick Start Guide

The transcribe function is the simplest way to run speech recognition on an audio signal. It takes as input the model name, and a float32 array with the sound signal. The signal is expected to be sampled at 16kHz.

using Whisper, LibSndFile, FileIO, SampledSignals

s = load("/path/to/audio_file.ogg")  

# Whisper expects 16kHz sample rate and Float32 data
sout = SampleBuf(Float32, 16000, round(Int, length(s)*(16000/samplerate(s))), nchannels(s))  
write(SampleBufSink(sout), SampleBufSource(s))  # Resample

if nchannels(sout) == 1
    data  = sout.data
elseif nchannels(sout) == 2
    sd = sout.data
    data = [sd[i,1] + sd[i,2] for i in 1:size(sd)[1]] #convert stereo to mono
end

result = transcribe("base.en", data)

For more control, the whisper.cpp C interface is available within the Whisper.LibWhisper module.

Models

The following models are available, and should be referenced by name. The models with the .en suffix are specialised for the english language.

Model	Disk	Mem
tiny / tiny.en	75 MB	~125 MB
base / base.en	142 MB	~210 MB
small / small.en	466 MB	~600 MB
medium / medium.en	1.5 GB	~1.7 GB
large-v1 / large.en	2.9 GB	~3.3 GB

TODO

The biggest missing functionality is a streaming interface.

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
.github/workflows		.github/workflows
docs		docs
gen		gen
src		src
test		test
.gitignore		.gitignore
LICENSE		LICENSE
Project.toml		Project.toml
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.github/workflows

.github/workflows

docs

docs

gen

gen

src

src

test

test

.gitignore

.gitignore

LICENSE

LICENSE

Project.toml

Project.toml

README.md

README.md

Repository files navigation

Whisper

Quick Start Guide

Models

TODO

About

Releases 1

Packages

Contributors 2

Languages

License

aviks/Whisper.jl

Folders and files

Latest commit

History

Repository files navigation

Whisper

Quick Start Guide

Models

TODO

About

Resources

License

Stars

Watchers

Forks

Languages