Label different speakers #202

savchenko · 2022-11-29T07:39:13Z

Might be a stretch, but would it be possible to label different speakers if the audio has more than one person talking?

This would come in handy for conference recordings with multiple presenters, etc.

savchenko · 2022-11-29T12:19:34Z

Thinking about a possible implementation, the simplest approach might be to label speakers based on the audio channel.

Say we have a stereo recording:

1. Recognize L/R separately
2. Label accordingly
3. Done
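
The steps above can be sketched roughly as follows. This is only an illustration of the per-channel idea, not how whisper.cpp does it: the `Segment` tuple shape, the `split_stereo` and `label_by_channel` helpers, and the `transcribe` callable are all hypothetical names, and the audio is assumed to be interleaved stereo samples (`[L0, R0, L1, R1, ...]`).

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical segment shape: (start_seconds, end_seconds, text)
Segment = Tuple[float, float, str]


def split_stereo(interleaved: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split interleaved stereo samples [L0, R0, L1, R1, ...] into two mono lists."""
    if len(interleaved) % 2 != 0:
        raise ValueError("interleaved stereo data must have an even number of samples")
    return list(interleaved[0::2]), list(interleaved[1::2])


def label_by_channel(
    interleaved: Sequence[float],
    transcribe: Callable[[List[float]], List[Segment]],
) -> List[Tuple[float, float, str, str]]:
    """Transcribe each channel on its own and tag every segment with its channel."""
    left, right = split_stereo(interleaved)
    labeled = []
    for name, samples in (("L", left), ("R", right)):
        # `transcribe` is a stand-in for whatever speech recognizer is used.
        for start, end, text in transcribe(samples):
            labeled.append((start, end, name, text))
    # Merge both channels into a single timeline ordered by start time.
    labeled.sort(key=lambda seg: seg[0])
    return labeled


if __name__ == "__main__":
    # Fake recognizer for demonstration only: it returns canned segments.
    def fake_transcribe(samples: List[float]) -> List[Segment]:
        if samples and samples[0] > 0:
            return [(0.0, 1.0, "hello")]
        return [(0.5, 1.5, "hi")]

    for start, end, channel, text in label_by_channel([1.0, -1.0, 1.0, -1.0], fake_transcribe):
        print(f"[{start:.1f} - {end:.1f}] ({channel}) {text}")
```

This only works when each speaker is recorded on their own channel; a mono recording or a stereo mix with both voices on both channels would need real diarization.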

ggerganov · 2022-12-01T17:23:34Z

Stereo diarization is already implemented, see #64.
I have some other ideas in mind for general diarization, but they are low priority for the moment.

ggerganov added the duplicate label Dec 1, 2022

ggerganov closed this as completed Dec 1, 2022

mattsta pushed a commit to mattsta/whisper.cpp that referenced this issue Apr 1, 2023

Add model_dir to arguments (ggerganov#202)

0b1ba3d

* Add model_dir to arguments * minor formatting change Co-authored-by: Jong Wook Kim <jongwook@openai.com>
