The README shows how to run language detection on its own. Scroll down to the "Python usage" section near the bottom; the relevant part starts with:

Below is an example usage of whisper.detect_language() and whisper.decode() which provide lower-level access to the model.

import whisper

# load a model ("base" here; any available model size works)
model = whisper.load_model("base")

# load audio and pad/trim it to fit 30 seconds
audio = whisper.load_audio("audio.mp3")
audio = whisper.pad_or_trim(audio)

# make log-Mel spectrogram and move to the same device as the model
mel = whisper.log_mel_spectrogram(audio).to(model.device)

# detect the spoken language
_, probs = model.detect_language(mel)
print(f"Detected language: {max(probs, key=probs.get)}")
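If you want more than the single best guess, you can rank the probabilities yourself. This is a minimal sketch that assumes probs is a plain dict mapping language codes to probabilities, as the max(probs, key=probs.get) call above implies. The helper name top_languages and the sample values are made up for illustration and are not part of whisper.

```python
def top_languages(probs, k=3):
    """Return the k most probable (language, probability) pairs, highest first."""
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]


# Made-up probabilities for illustration only; in practice, pass the dict
# returned by model.detect_language(mel).
example_probs = {"en": 0.91, "de": 0.05, "nl": 0.02, "fr": 0.01}

for lang, p in top_languages(example_probs):
    print(f"{lang}: {p:.2f}")
```

Looking at the runner-up languages can help when the audio is short or noisy and the top probability is not much higher than the rest.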

Answer selected by schotek