Quick Whisper Typer

Super simple python script to start recording sound, send it to whisper then have it type for you anywhere.

Can also modify text according to voice commands.
Latency is as low as I could (instant if deepgram is used, <1s for openai's whisper).
It can be seen as a minimalist alternative to AquaVoice

The way each task works

write

starts recording
when you're done press shift (escape or spacebar to cancel)
whisper will transcribe your speech 4.a if --auto_paste is True: your current clipboard will be saved, replaced by the transcription, "ctrl+v" will automatically be pressed, then your old clipboard will replace again like nothing happened. 4.b if --auto_paste is False: your clipboard will be replaced by the transcription

transform_clipboard

starts recording
when you're done press shift (escape or spacebar to cancel)
whisper will transcribe your speech
the transcription will be interpreted as an instruction for --llm_model on how to transform the text found in your clipboard
the result will either be pasted or stored in the clipboard like for --task=write

new_voice_chat

starts recording
when you're done press shift (escape or spacebar to cancel)
whisper will transcribe your speech
the transcription will be interpreted as the first user message in a conversation with --llm_model
the result will either be pasted or stored in the clipboard like for --task=write, and optionaly read aloud if --voice_engine is set
To continue the conversation, use the task --task=continue_voice_chat

Examples

I want to write text: python quick_whisper_typer.py --task=write --auto_paste
I want to translate text: copy the text in to the clipboard then python quick_whisper_typer.py --task=transform_clipboard --auto_paste
I want to start a vocal conversation: python quick_whisper_typer.py --task="new_voice_chat" --voice_engine='openai'
I want to continue the conversation: python quick_whisper_typer.py --task="continue_voice_chat" --voice_engine='openai'
I want to call it from anywhere without setting up keybindings, use --loop then press shift key several times from anywhere and you'll see a notification appear to trigger the tasks.

Features

Supports any spoken languages supported by whisper
Supports both openai's whisper and deepgram's whisper
Minimalist code
Low latency: it starts as fast as possible to be ready to listen to you
Four supported voice_engine: openai, piper, deepgram, espeak (fallback if any of the other fails)
Optional audio cleanup and long silence removal via sox
--loop to trigger the script from anywhere just by pressing shift multiple times. You can define any king of argument to customize your loop shortcuts by passing a dict to --loop_tasks
Support virtually any type of LLM (ChatGPT, Claude, Huggingface, Llama, etc) thanks to litellm.
Supposedly multiplatform, but I can't test it on anything else than Linux so please open an issue to tell me how it went!

How to

Make sure your environment contains the appropriate api keys (eg as OPENAI_API_KEY, MISTRAL_API_KEY, DEEPGRAM_API_KEY etc)
optional: add a keyboard shortcut to call this script. See my i3 bindings below.
If using deepgram: make sure you are on python 3.10+
chmod +x ./quick_whisper_typer.py

i3 bindings

mode "$mode_launch_microphone" {
    # enter text
    bindsym f exec /PATH/TO/quick_whisper_typer.py --task write, mode "default
    # edit clipboard
    bindsym e exec /PATH/TO/quick_whisper_typer.py --task=transform_clipboard, mode "default"
    bindsym v exec /PATH/TO/quick_whisper_typer.py --task=continue_voice_chat, mode "default"
    bindsym shift+V exec /PATH/TO/quick_whisper_typer.py --task=new_voice_chat, mode "default"

    bindsym Return mode "default"
    bindsym Escape mode "default"
    }

Credits

.ogg files were in my /usr/share/sounds/ubuntu/notifications folder.

Name		Name	Last commit message	Last commit date
Latest commit History 206 Commits
obsolete		obsolete
sounds		sounds
.gitignore		.gitignore
LICENSE.md		LICENSE.md
README.md		README.md
quick_whisper_typer.py		quick_whisper_typer.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

obsolete

obsolete

sounds

sounds

.gitignore

.gitignore

LICENSE.md

LICENSE.md

README.md

README.md

quick_whisper_typer.py

quick_whisper_typer.py

requirements.txt

requirements.txt

Repository files navigation

Quick Whisper Typer

The way each task works

write

transform_clipboard

new_voice_chat

Examples

Features

How to

i3 bindings

Credits

About

Releases

Packages

Languages

License

thiswillbeyourgithub/Quick-Whisper-Typer

Folders and files

Latest commit

History

Repository files navigation

Quick Whisper Typer

The way each task works

write

transform_clipboard

new_voice_chat

Examples

Features

How to

i3 bindings

Credits

About

Resources

License

Stars

Watchers

Forks

Languages