Releases: KoljaB/RealtimeSTT
v0.3.100
v0.3.99
RealtimeSTT 0.3.99
1. Enhanced Logging Configuration
- Introduced a dedicated named logger, `realtimestt`, instead of using the root logger.
- Added structured logging with handlers for both console (level set by user) and file (always DEBUG).
- Logging no longer propagates to the root logger by default (`logger.propagate = False`).
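The logging setup described above can be sketched with the standard `logging` module. This is a minimal illustration, not the library's actual code: the function name `build_logger`, the log file path, and the format string are assumptions made for the example.

```python
import logging


def build_logger(console_level=logging.WARNING, log_path="realtimestt.log"):
    """Sketch of a named logger with a console handler and a DEBUG file handler.

    `build_logger`, `log_path`, and the format string are hypothetical;
    only the logger name and propagate setting come from the release notes.
    """
    logger = logging.getLogger("realtimestt")
    # The logger itself passes everything; each handler filters by its own level.
    logger.setLevel(logging.DEBUG)
    # Keep records away from the root logger, as the release notes describe.
    logger.propagate = False
    # Avoid stacking duplicate handlers if this function is called twice.
    logger.handlers.clear()

    console = logging.StreamHandler()
    console.setLevel(console_level)  # level chosen by the user

    file_handler = logging.FileHandler(log_path)
    file_handler.setLevel(logging.DEBUG)  # file output is always DEBUG

    formatter = logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    for handler in (console, file_handler):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

With this arrangement, an application that configures the root logger does not receive RealtimeSTT records twice, and the file log stays detailed even when the console is set to a quiet level.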
2. Added option to disable the Faster-Whisper VAD filter
- Added the `faster_whisper_vad_filter` parameter (default: `True`) to enable voice activity detection (VAD) from the `faster_whisper` library.
- Improves robustness against background noise at the cost of additional GPU resources.
- Integrated into both real-time and main transcription workflows.
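As a rough sketch of how the new parameter might be used, the snippet below builds keyword arguments and passes them to `AudioToTextRecorder`. It assumes the parameter is accepted by the recorder's constructor; the helper names `recorder_kwargs` and `make_recorder` are hypothetical.

```python
def recorder_kwargs(use_vad_filter: bool = True) -> dict:
    """Build keyword arguments for the recorder (hypothetical helper).

    The default of True mirrors the default stated in the release notes.
    """
    return {"faster_whisper_vad_filter": use_vad_filter}


def make_recorder(use_vad_filter: bool = True):
    """Create a recorder with the VAD filter switched on or off.

    The import is done lazily so this sketch can be loaded without
    RealtimeSTT installed. Passing the parameter to the constructor
    is an assumption of this example.
    """
    from RealtimeSTT import AudioToTextRecorder

    return AudioToTextRecorder(**recorder_kwargs(use_vad_filter))
```

Calling `make_recorder(use_vad_filter=False)` would trade some robustness against background noise for lower GPU usage, per the notes above.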
3. Audio Worker Improvements
- Added improved, detailed debug logging for audio device initialization, sample rate handling, and resampling.
4. VAD Callback Adjustments
- Fixes #215.
- Moved the `on_vad_detect_start` and `on_vad_detect_stop` callbacks to trigger directly during voice activity checks instead of state transitions.
- Ensures callbacks align more accurately with actual speech/silence events.
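To illustrate the idea of firing callbacks at the point where voice activity is checked, here is a toy sketch. It is not the library's implementation: `run_vad_checks` and its boolean input stream are hypothetical stand-ins for the real per-chunk VAD results.

```python
from typing import Callable, Iterable, Optional


def run_vad_checks(
    speech_flags: Iterable[bool],
    on_start: Optional[Callable[[], None]] = None,
    on_stop: Optional[Callable[[], None]] = None,
) -> None:
    """Fire callbacks as soon as a voice activity check changes its result.

    `speech_flags` stands in for one VAD result per audio chunk
    (a hypothetical input for this sketch).
    """
    was_speaking = False
    for is_speech in speech_flags:
        if is_speech and not was_speaking:
            # Speech just began according to the check itself.
            if on_start is not None:
                on_start()
        elif not is_speech and was_speaking:
            # Silence just began according to the check itself.
            if on_stop is not None:
                on_stop()
        was_speaking = is_speech
```

Because the callbacks are tied to the check result rather than to a separate recorder state, they follow the actual speech and silence edges in the audio.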
v0.3.98
v0.3.97
v0.3.95
v0.3.94
RealtimeSTT 0.3.94
- New parameters for the `stop()` method of `AudioToTextRecorder`:
  - `backdate_stop_seconds` (float, default=0.0):
    - Description: Specifies the number of seconds to backdate the stop time when ending a recording.
    - Usage: When invoking `stop()` due to a wake word detection or a speaker diarization change event, this parameter compensates for any latency, ensuring that only relevant audio is included in the recording and transcription.
  - `backdate_resume_seconds` (float, default=0.0):
    - Description: Specifies the number of seconds to backdate the resume time when restarting listening after a recording has stopped.
    - Usage: Typically set to the same value as `backdate_stop_seconds`; this parameter allows for fine-tuning.
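The effect of backdating the stop time can be pictured as trimming samples from the end of the recorded buffer. The sketch below shows only that arithmetic; `trim_backdated` is a hypothetical helper, and the 16 kHz default sample rate is an assumption, not something stated in these notes.

```python
def trim_backdated(samples, backdate_seconds, sample_rate=16000):
    """Drop the last `backdate_seconds` of audio from a sample sequence.

    Hypothetical illustration of backdating; the library's internal
    buffer handling may differ.
    """
    n_drop = int(round(backdate_seconds * sample_rate))
    if n_drop <= 0:
        return samples
    # Never slice past the start of the buffer.
    return samples[: max(0, len(samples) - n_drop)]


# In application code the parameters would be passed to stop(), for example:
#   recorder.stop(backdate_stop_seconds=0.3, backdate_resume_seconds=0.3)
# The 0.3 values are placeholders chosen for illustration.
```

At an assumed 16 kHz, backdating by 0.3 seconds corresponds to dropping 4800 samples, which is roughly the latency a wake word detector might add before `stop()` is called.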
v0.3.93
- Fixed stt-server, which broke after a webservers dependency upgrade changed an API.
- Added `initial_prompt_realtime` to `AudioToTextRecorder` so that the final and real-time models can be given different prompts.
- Added new parameters to client/server (download root, batch sizes).
v0.3.92
v0.3.91
v0.3.9
RealtimeSTT v0.3.9 Release Notes
🚀 New Features
Batched Transcription
- Added support for batched transcription in both main and real-time models, which improves performance and efficiency.
- New parameters introduced:
  - `batch_size`: Controls the batch size for main transcription tasks.
  - `realtime_batch_size`: Configures batch size for real-time transcription.
This feature is designed to speed up processing. I can't say yet whether there are cases where batching overhead hurts performance. It looked promising in my initial tests, but I need your feedback! Please report any issues, or any cases where transcription gets slower due to batching.
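For anyone who wants to check whether batching helps on their setup, a small timing helper along these lines may be useful. It is only a sketch: `time_batch_sizes` is a hypothetical function, and the `transcribe` callable is something you would supply yourself, for example a function that builds a recorder with the given `batch_size` and transcribes a fixed audio clip.

```python
import time


def time_batch_sizes(transcribe, batch_sizes, repeats=3):
    """Return the best wall-clock time per batch size.

    `transcribe` is a user-supplied callable taking a batch size
    (hypothetical for this sketch). The minimum over `repeats` runs is
    kept to reduce noise from warm-up and background load.
    """
    results = {}
    for size in batch_sizes:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            transcribe(size)
            best = min(best, time.perf_counter() - start)
        results[size] = best
    return results
```

Comparing the timings for a few batch sizes on the same clip gives a concrete number to include in a report if batching turns out slower.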