New features: Automatically start and stop recording. Beep sound once text is inserted. #40

Closed
ossossosso opened this issue Apr 28, 2024 · 2 comments

Comments

@ossossosso

Hi,
My name is Daniele. I'm an Italian stenographer.
Basically, I transcribe what I hear into text using a steno keyboard.

I'm interested in better understanding OpenAI Whisper and its capabilities.
I don't have good hardware, so I'd like to use the OpenAI Speech to Text API.

Thank you very much for your project, WhisperWriter.

I tried it and it works for me.

Not many applications on GitHub that use Whisper for speech-to-text allow all of the following at the same time:

  1. Use a microphone as the audio source.
  2. Use the Whisper API instead of a local model.
  3. Transcribe directly into any text editor.

So thanks for this opportunity.

It would be interesting to have two new features:

  1. No need to press a shortcut to start recording again.
    I mean: after I press a shortcut such as Ctrl+Shift+Space the first time to start recording, and the recording has stopped automatically and the text has been transcribed, it would be great if a new recording started automatically and waited for my words, without my having to press the shortcut again.
    I would only need a separate shortcut to stop recording for good.

  2. Because I'm a blind user, a sort of "beep" sound that informs me when the text has been transcribed would be useful, so that I know I can speak again.
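
The two requests above could be combined roughly as in the sketch below. It is only an illustration under stated assumptions: `record_until_silence`, `transcribe`, `type_text` and `beep` are hypothetical callables supplied by the caller, not WhisperWriter's actual functions, and the "stop for good" shortcut is assumed to set a `threading.Event`.

```python
import threading
from typing import Callable


def dictation_loop(
    record_until_silence: Callable[[], bytes],  # hypothetical: blocks until the speaker pauses
    transcribe: Callable[[bytes], str],         # hypothetical: e.g. a call to a speech-to-text API
    type_text: Callable[[str], None],           # hypothetical: writes text into the active editor
    beep: Callable[[], None],                   # hypothetical: plays a short notification sound
    stop_event: threading.Event,                # set by the "stop for good" shortcut
) -> int:
    """Record, transcribe and type repeatedly until stop_event is set.

    Returns the number of completed record/transcribe cycles.
    """
    cycles = 0
    while not stop_event.is_set():
        audio = record_until_silence()
        if stop_event.is_set():
            break  # the stop shortcut was pressed during this recording
        text = transcribe(audio)
        if text:
            type_text(text)
        beep()  # audible cue: the text is written, it is safe to speak again
        cycles += 1
    return cycles
```

With this shape, the start shortcut would launch `dictation_loop` once (for example in a background thread), and the second shortcut would only call `stop_event.set()`.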

Thanks a lot.

Daniele.

@savbell
Owner

savbell commented May 4, 2024

Hi Daniele, thank you for your comments! I'm happy to hear that WhisperWriter has worked well for you :)

I appreciate your feature requests and I went ahead and added the option for a "beep" sound to play once the transcription has finished writing to the screen. After downloading my latest commit, you can turn the feature on by setting the noise_on_completion configuration option to true in src\config.json. If you would like to change the sound that is made, you can replace "beep.wav" in the assets folder, or change the file path on line 102 of main.py.
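
For illustration, checking such an option might look something like the sketch below. It assumes `noise_on_completion` sits at the top level of the JSON file; `load_config`, `notify_if_enabled` and the `play` callable are hypothetical names chosen for this example, not the code in main.py.

```python
import json
from pathlib import Path
from typing import Callable


def load_config(path: Path) -> dict:
    """Read a JSON configuration file into a dictionary."""
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def notify_if_enabled(
    config: dict,
    play: Callable[[Path], None],              # hypothetical: any function that plays a sound file
    sound: Path = Path("assets") / "beep.wav",  # default sound named in the comment above
) -> bool:
    """Play the completion sound only when noise_on_completion is true.

    Returns True if the sound was played, False otherwise.
    """
    # Assumption: the option is a top-level boolean; a missing key means "off".
    if config.get("noise_on_completion", False):
        play(sound)
        return True
    return False
```

Passing the player in as a callable keeps the sketch independent of any particular audio library; swapping the sound then only means passing a different `sound` path.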

Although there is not currently a pipelining feature like you described with the default voice activity detection method, you can change the way the app starts and stops recording to a key toggle. If you change recording_mode in the configuration options to press_to_toggle, the app will start listening when you press the keyboard shortcut and stop listening when you press it a second time, rather than waiting for you to finish speaking.
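
The toggle behaviour described here can be pictured as a two-state switch. The sketch below is a simplified model, not the app's implementation: `ToggleRecorder` and its `start`/`stop` callbacks are hypothetical names, and wiring `on_hotkey` to a real keyboard shortcut is left out.

```python
from typing import Callable


class ToggleRecorder:
    """Start recording on the first hotkey press, stop on the second."""

    def __init__(self, start: Callable[[], None], stop: Callable[[], None]) -> None:
        self._start = start  # hypothetical callback that begins capturing audio
        self._stop = stop    # hypothetical callback that ends capture and transcribes
        self.recording = False

    def on_hotkey(self) -> bool:
        """Flip the recording state and return the new state."""
        if self.recording:
            self._stop()
        else:
            self._start()
        self.recording = not self.recording
        return self.recording
```

Each press of the shortcut would call `on_hotkey()` once, so the speaker decides when a recording ends instead of relying on silence detection.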

I hope you find these changes useful! Please let me know if there are any other features that would make the app work better for you :)

savbell closed this as completed May 4, 2024
@ossossosso
Author

Thanks a lot! It works perfectly.
