Any Idea for feature update for targeted speaker transcription? #53

Closed
Jeevi10 opened this issue Jan 25, 2024 · 3 comments

Comments

@Jeevi10

Jeevi10 commented Jan 25, 2024

No description provided.

@Jeevi10 Jeevi10 changed the title from "Any Idea feature update for Target speaker transcription?" to "Any Idea for feature update for targeted speaker transcription?" Jan 25, 2024
@Gldkslfmsd
Collaborator

Hi, what do you mean by "targeted speaker transcription"?
I think that if you use an underlying model that supports it instead of the default Whisper, then it will work.

@Jeevi10
Author

Jeevi10 commented Jan 25, 2024

Thank you for the prompt response. Suppose multiple speakers are talking in a noisy environment, with people speaking in the background.
I would like to transcribe only the main speaker (the one close to the mic). I tested "mic_test_whisper_simple.py" and it works very well; however, it tends to capture all the background noise (still speech, but far from the mic) in between my speech.

@Gldkslfmsd
Collaborator

OK, I understand.

This is not a streaming issue, but one of audio pre-processing or ASR modelling. You need to look elsewhere for such a model, and then integrate it for streaming the same way as Whisper.
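
To illustrate what "audio pre-processing" could mean here, below is a minimal sketch of a crude energy gate that silences quiet frames before the audio reaches the ASR model. It assumes the main speaker is clearly louder than the background speech, which will not always hold. The function name `energy_gate` and its parameters (`frame_ms`, `threshold_db`) are hypothetical and not part of this project; a proper target-speaker or speech-enhancement model would be more robust.

```python
import numpy as np

def energy_gate(audio, sample_rate=16000, frame_ms=30, threshold_db=-30.0):
    """Zero out frames whose RMS level (in dBFS) is below threshold_db.

    Hypothetical helper: assumes `audio` is a mono float array in [-1, 1]
    and that far-field speech is markedly quieter than the main speaker.
    """
    audio = np.asarray(audio, dtype=np.float32)
    frame_len = int(sample_rate * frame_ms / 1000)
    out = audio.copy()
    for start in range(0, len(audio), frame_len):
        frame = audio[start:start + frame_len]
        rms = np.sqrt(np.mean(frame ** 2))
        # Guard against log10(0) on fully silent frames.
        level_db = 20 * np.log10(max(rms, 1e-10))
        if level_db < threshold_db:
            # Silence the frame instead of dropping it, so timing is preserved.
            out[start:start + frame_len] = 0.0
    return out
```

The gated array has the same length as the input, so it could be passed on to the recognizer in place of the raw microphone audio. The threshold would need tuning per microphone and room.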


2 participants