Release v1.0.0: SenseVoice — Multilingual Speech Understanding · FunAudioLLM/SenseVoice

SenseVoice v1.0.0

The first official release of SenseVoice, a speech foundation model for multilingual speech understanding.

Highlights

Multilingual ASR — 50+ languages, superior to Whisper on Chinese and Cantonese
Speech Emotion Recognition — Happy, Sad, Angry, Neutral detection
Audio Event Detection — Background music, applause, laughter, crying, coughing
Ultra-fast inference — Non-autoregressive, about 70 ms to process 10 seconds of audio (15x faster than Whisper-Large)
Speaker Diarization — Works with FunASR's VAD + SPK pipeline for who-said-what

Quick Start

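from funasr import AutoModel

# Load SenseVoice-Small; pass device="cpu" if no GPU is available
model = AutoModel(model="iic/SenseVoiceSmall", device="cuda")
# Transcribe a local audio file; generate() returns a list of result dicts
result = model.generate(input="audio.wav")
print(result[0]["text"])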
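
The text returned above may carry inline markers for language, emotion and audio events alongside the transcript. The following is a minimal sketch of separating those markers from the plain text, assuming they appear as `<|...|>` tags at the front of the string. The tag format, the sample string and the helper name `split_tags` are assumptions for illustration and are not stated in these release notes.

```python
import re

# Assumed marker format: "<|en|><|HAPPY|><|Speech|><|withitn|>text".
# This layout is an assumption, not something the release notes confirm.
TAG_PATTERN = re.compile(r"<\|([^|]*)\|>")


def split_tags(raw_text):
    """Return (tags, plain_text) for a string that may contain <|...|> markers."""
    tags = TAG_PATTERN.findall(raw_text)                  # contents of every marker, in order
    plain_text = TAG_PATTERN.sub("", raw_text).strip()    # transcript with markers removed
    return tags, plain_text


# Hypothetical model output, used only to exercise the helper.
sample = "<|en|><|HAPPY|><|Speech|><|withitn|>Nice to meet you."
tags, text = split_tags(sample)
print(tags)
print(text)
```

In practice you would pass `result[0]["text"]` instead of `sample`. FunASR also provides `rich_transcription_postprocess` in `funasr.utils.postprocess_utils`, which may be the better choice when you only need a readable transcript.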

Models

Model	Languages	Parameters	Download
SenseVoice-Small	5 (zh/en/ja/ko/yue)	234M	ModelScope · HuggingFace
