
Releases: KoljaB/RealtimeSTT

v0.2.0

28 Jun 20:53

v0.2.0 with OpenWakeWord Support

Training models

See the OpenWakeWord documentation for information about how to train your own wake word models. You can start with a simple Google Colab notebook, or use a more detailed notebook that allows more customization (it can produce high-quality models, but requires more development experience).

Convert model to ONNX format

You might need to use tf2onnx to convert TensorFlow tflite models to ONNX format:

pip install -U tf2onnx
python -m tf2onnx.convert --tflite my_model_filename.tflite --output my_model_filename.onnx

Configure RealtimeSTT

Suggested starting parameters for OpenWakeWord usage:

    from RealtimeSTT import AudioToTextRecorder

    with AudioToTextRecorder(
        wakeword_backend="oww",
        wake_words_sensitivity=0.35,
        openwakeword_model_paths="word1.onnx,word2.onnx",
        wake_word_buffer_duration=1,
        ) as recorder:
        print(recorder.text())  # wait for the wake word, then transcribe the following speech

OpenWakeWord Test

  1. Set up the openwakeword test project:

    mkdir samantha_wake_word && cd samantha_wake_word
    curl -O https://raw.githubusercontent.com/KoljaB/RealtimeSTT/master/tests/openwakeword_test.py
    curl -L https://huggingface.co/KoljaB/SamanthaOpenwakeword/resolve/main/suh_mahn_thuh.onnx -o suh_mahn_thuh.onnx
    curl -L https://huggingface.co/KoljaB/SamanthaOpenwakeword/resolve/main/suh_man_tuh.onnx -o suh_man_tuh.onnx

    Ensure you have curl installed for downloading files. If not, you can manually download the files from the provided URLs.

  2. Create and activate a virtual environment:

    python -m venv venv
    • For Windows:
      venv\Scripts\activate
    • For Unix-like systems (Linux/macOS):
      source venv/bin/activate
    • For macOS:
      Use python3 instead of python and pip3 instead of pip if needed.
  3. Install dependencies:

    python -m pip install --upgrade pip
    python -m pip install RealtimeSTT
    python -m pip install -U torch torchaudio --index-url https://download.pytorch.org/whl/cu121

    The PyTorch installation command includes CUDA 12.1 support. Adjust if a different version is required.

  4. Run the test script:

    python openwakeword_test.py

    On the very first start, openwakeword downloads some of its models.
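
If you prefer to embed the wake word detection in your own script instead of the downloaded test, a minimal sketch could look like the following. It reuses the constructor parameters suggested above and assumes the two Samantha model files are in the working directory; the actual downloaded test script may differ.

    from RealtimeSTT import AudioToTextRecorder

    with AudioToTextRecorder(
        wakeword_backend="oww",
        wake_words_sensitivity=0.35,
        openwakeword_model_paths="suh_mahn_thuh.onnx,suh_man_tuh.onnx",
        wake_word_buffer_duration=1,
        ) as recorder:
        while True:
            # Blocks until the wake word is detected, then transcribes the following speech
            print(recorder.text())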

v0.1.16

02 Jun 09:47
  • explicitly set the multiprocessing start method to 'spawn' (due to changes in torch.multiprocessing)
  • updated faster_whisper to the newest version

v0.1.15

14 Apr 12:05
  • added parameter beam_size
    (int, default=5)
    The beam size to use for beam search decoding.
  • added parameter beam_size_realtime
    (int, default=3)
    The beam size to use for real-time transcription beam search decoding.
  • added parameter initial_prompt
    (str or iterable of int, default=None)
    Initial prompt to be fed to the transcription models.
  • added parameter suppress_tokens
    (list of int, default=[-1])
    Tokens to be suppressed from the transcription output.
  • added method set_microphone(microphone_on=True)
    This method allows dynamic switching between recording from the input device configured in RealtimeSTT and audio chunks injected into the processing pipeline via the feed_audio method (see the sketch after this list).
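
A minimal sketch of how these parameters and the new method fit together; the prompt string, the chunk source, and the shown values are placeholders, not part of the release:

    from RealtimeSTT import AudioToTextRecorder

    recorder = AudioToTextRecorder(
        beam_size=5,             # beam width for the final transcription
        beam_size_realtime=3,    # smaller beam for faster real-time transcription
        initial_prompt="RealtimeSTT, faster_whisper",  # placeholder prompt to bias decoding
        suppress_tokens=[-1],    # default: suppress the model's special tokens
    )

    # Switch from the configured input device to externally fed audio
    recorder.set_microphone(microphone_on=False)
    # recorder.feed_audio(chunk)  # chunk: raw audio bytes from your own source
    print(recorder.text())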

v0.1.13

08 Apr 20:54
  • added beam_size: int = 5 and beam_size_realtime: int = 3 parameters to the AudioToTextRecorder constructor, allowing faster (real-time) transcriptions by lowering the beam sizes
  • added last_transcription_bytes containing the raw bytes from the last transcription
    You can retrieve those bytes with recorder.last_transcription_bytes for further analysis, saving to a file, etc. (see the sketch below)
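
For example, the raw bytes can be written to a WAV file for later analysis. This is a sketch; the channel count, sample width, and sample rate below are assumptions and should match your recorder configuration:

    import wave
    from RealtimeSTT import AudioToTextRecorder

    recorder = AudioToTextRecorder(beam_size=5, beam_size_realtime=3)
    print(recorder.text())

    with wave.open("last_utterance.wav", "wb") as f:
        f.setnchannels(1)       # mono (assumed)
        f.setsampwidth(2)       # 16-bit samples (assumed)
        f.setframerate(16000)   # 16 kHz (assumed)
        f.writeframes(recorder.last_transcription_bytes)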

v0.1.12

30 Mar 15:30
  • fixed qsize issue for macOS
  • upgraded requirements to torch 2.2.2

v0.1.11

16 Mar 19:47
  • added an on_recorded_chunk callback that lets the client process audio chunks recorded from the microphone (see the sketch below)
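
A minimal sketch of wiring up the callback, assuming it is called with the raw chunk bytes:

    from RealtimeSTT import AudioToTextRecorder

    def handle_chunk(chunk):
        # chunk: raw audio bytes recorded from the microphone
        print(f"received {len(chunk)} bytes")

    recorder = AudioToTextRecorder(on_recorded_chunk=handle_chunk)
    print(recorder.text())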

v0.1.9

29 Jan 17:09
  • switched to torch.multiprocessing
  • added compute_type (#14), input_device_index (select the audio input device), and gpu_device_index (select the GPU device) parameters
  • recorder.text() is now interruptible with recorder.abort() (see the sketch after this list)
  • fix for #20
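
A sketch of the new parameters and the abort pattern; the compute type and device indices are placeholders to adjust for your hardware:

    import threading
    from RealtimeSTT import AudioToTextRecorder

    recorder = AudioToTextRecorder(
        compute_type="float16",   # computation type for the Whisper backend (placeholder)
        input_device_index=1,     # audio input device to record from (placeholder)
        gpu_device_index=0,       # GPU device to run on (placeholder)
    )

    # Abort a pending recorder.text() call from another thread after 10 seconds
    threading.Timer(10.0, recorder.abort).start()
    print(recorder.text())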

v0.1.8

15 Dec 12:51
  • added an example of how to transcribe in real time from the browser microphone
  • the large-v3 Whisper model is now supported (upgraded to faster_whisper 0.10.0)
  • added feed_audio() and the use_microphone parameter for feeding audio chunks (see the sketch below)
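
A sketch of feeding your own chunks instead of recording from the local microphone; the chunk source is a placeholder, and the expected audio format depends on your setup:

    from RealtimeSTT import AudioToTextRecorder

    recorder = AudioToTextRecorder(use_microphone=False)

    # for chunk in incoming_audio_chunks():  # placeholder: e.g. audio streamed from a browser
    #     recorder.feed_audio(chunk)

    print(recorder.text())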

Bugfixes and KeyboardInterrupt support

09 Nov 15:36
  • Bugfix for macOS installation (occurred in the context of multiprocessing with the usage of queue.qsize(); changed to use multiprocessing.Manager().Queue(), which should work on macOS)
  • KeyboardInterrupt handling (the recorder can now be aborted with CTRL+C)
  • Bugfix for spinner handling (could lead to an exception in some cases: AttributeError: 'NoneType' object has no attribute '_interval')

v0.1.6

17 Oct 10:03

Implements the context manager protocol, which enables the recorder instance to be used in a with statement and ensures proper resource management. The constructor now waits for the transcription process to start. Fixed a bug in the shutdown method.
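
A minimal sketch of the with-statement usage:

    from RealtimeSTT import AudioToTextRecorder

    # The context manager ensures the recorder's resources are released on exit
    with AudioToTextRecorder() as recorder:
        print(recorder.text())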