Releases: ds-kimi/Auris
Auris v1.2.2
Add polyphase resampler
Introduce resampler.h, which implements a Kaiser-windowed polyphase FIR resampler (RS_L=2, RS_M=3, RS_K=16) and a ResamplePolyphase() routine that converts int16 input to float output (normalized by 32768). Replace the previous linear Resample implementation and the WHISPER_RATE constant in steam_voice.cpp with an include of resampler.h and a call to ResamplePolyphase(), improving sample-rate conversion from 24 kHz to 16 kHz.
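To illustrate the technique, here is a minimal sketch of a 2/3 polyphase resampler with a Kaiser-windowed sinc prototype. It is not the contents of resampler.h: the function names, the Kaiser beta of 8.0, the reading of RS_K as "taps per polyphase branch", and the zero-padded edges are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Constants named after the release note; RS_K as taps per branch is an assumption.
constexpr int RS_L = 2;   // upsampling factor
constexpr int RS_M = 3;   // downsampling factor
constexpr int RS_K = 16;  // taps per polyphase branch

// Zeroth-order modified Bessel function of the first kind (power series).
inline double BesselI0(double x) {
    double sum = 1.0, term = 1.0;
    for (int k = 1; k < 50; ++k) {
        term *= (x / (2.0 * k)) * (x / (2.0 * k));
        sum += term;
    }
    return sum;
}

// Kaiser-windowed sinc low-pass with RS_L * RS_K taps, scaled to a DC gain of RS_L.
inline std::vector<float> DesignPrototype(double beta = 8.0) {  // beta is an assumed value
    const int n_taps = RS_L * RS_K;
    const double pi = 3.14159265358979323846;
    const double fc = 0.5 / std::max(RS_L, RS_M);  // cutoff in cycles/sample at the upsampled rate
    const double center = (n_taps - 1) / 2.0;
    std::vector<double> h(n_taps);
    double sum = 0.0;
    for (int n = 0; n < n_taps; ++n) {
        const double t = n - center;
        const double x = 2.0 * pi * fc * t;
        const double sinc = (x == 0.0) ? 1.0 : std::sin(x) / x;
        const double r = t / center;
        const double w = BesselI0(beta * std::sqrt(std::max(0.0, 1.0 - r * r))) / BesselI0(beta);
        h[n] = sinc * w;
        sum += h[n];
    }
    std::vector<float> taps(n_taps);
    for (int n = 0; n < n_taps; ++n) taps[n] = static_cast<float>(h[n] * RS_L / sum);
    return taps;
}

// Converts int16 samples to float at 2/3 of the input rate (e.g. 24 kHz to 16 kHz).
// Samples before the start of the buffer are treated as zero; filter delay is not compensated.
inline std::vector<float> ResamplePolyphaseSketch(const std::vector<std::int16_t>& in) {
    static const std::vector<float> h = DesignPrototype();
    assert(h.size() == static_cast<std::size_t>(RS_L * RS_K));
    const std::size_t out_len = in.size() * RS_L / RS_M;
    std::vector<float> out(out_len, 0.0f);
    for (std::size_t m = 0; m < out_len; ++m) {
        const std::size_t t = m * RS_M;      // position in the (virtual) upsampled stream
        const std::size_t phase = t % RS_L;  // which polyphase branch applies
        const std::size_t base = t / RS_L;   // newest input sample that contributes
        float acc = 0.0f;
        for (int k = 0; k < RS_K; ++k) {
            if (base < static_cast<std::size_t>(k)) break;  // ran off the start of the input
            const std::size_t idx = base - k;
            if (idx >= in.size()) continue;
            acc += h[phase + RS_L * k] * (in[idx] / 32768.0f);
        }
        out[m] = acc;
    }
    return out;
}
```

Only the taps that line up with real input samples are evaluated, so the zero-stuffed 48 kHz intermediate signal is never materialized.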
Auris v1.2.1
fix(linux32): correct x87 FP precision and BMI2 guard for i386 builds
On Linux 32-bit, two independent issues caused garbage (!!!!) output:
1. GCC defaults to x87 (80-bit extended precision) for scalar floats on i386 even with -mavx2. Whisper's softmax/logit values diverge from every other platform. Fixed by adding -mfpmath=sse -msse2 to the Linux x86 build flags in premake5.lua.
2. _pdep_u64 is an x86_64-only intrinsic. The #ifdef __BMI2__ guard in quants.c was insufficient: on i386 with BMI2 the intrinsic is absent, producing silent garbage. Fixed by guarding with defined(__BMI2__) && defined(__x86_64__) so 32-bit hits the correct scalar fallback.
Patch applied to vendor submodule, CI linux32 jobs, and RUN_ONCE setup scripts.
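The guard pattern from item 2 can be sketched as follows. This is an illustration only, not the patched quants.c: the helper names PdepScalar and Pdep64 are hypothetical, and the scalar loop is one straightforward way to write a bit-deposit fallback.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__BMI2__) && defined(__x86_64__)
#include <immintrin.h>
#endif

// Portable bit deposit: scatter the low bits of src into the set-bit positions of mask.
inline std::uint64_t PdepScalar(std::uint64_t src, std::uint64_t mask) {
    std::uint64_t result = 0;
    for (std::uint64_t bit = 1; mask != 0; bit <<= 1) {
        const std::uint64_t lowest = mask & (~mask + 1);  // isolate lowest set bit of mask
        if (src & bit) result |= lowest;
        mask &= mask - 1;                                 // clear that bit
    }
    return result;
}

// Use the intrinsic only when BMI2 is enabled AND the target is 64-bit;
// 32-bit builds take the scalar path even if -mbmi2 is passed.
inline std::uint64_t Pdep64(std::uint64_t src, std::uint64_t mask) {
#if defined(__BMI2__) && defined(__x86_64__)
    const std::uint64_t fast = _pdep_u64(src, mask);
    assert(fast == PdepScalar(src, mask));  // debug-only cross-check against the fallback
    return fast;
#else
    return PdepScalar(src, mask);
#endif
}
```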
Auris v1.2.0
Add energy-based VAD with high-pass filter
Introduce an energy-based voice activity detector to reduce spurious transcriptions on silence/noise. Adds vad_thold and freq_thold config options (defaults: 0.6 and 100 Hz) to Lua configs and exposes them via the Lua API. Implements auris::vad_simple (vendored from whisper.cpp examples) in new source/vad.{cpp,h}, applies an optional high-pass filter before VAD, and wires the gate into the WorkerLoop to skip low-energy chunks. Behavior: the energy of the chunk's tail is compared against the whole-chunk average. Set vad_thold <= 0 to disable VAD and freq_thold = 0 to disable filtering.
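A rough sketch of the tail-versus-average energy comparison, with an optional first-order high-pass in front, is shown below. It is written from the description above rather than copied from source/vad.cpp; the function names, the last_ms tail-length parameter, and the textbook RC filter are assumptions, and how WorkerLoop acts on the result is not shown.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// First-order RC high-pass, applied in place: y[i] = a * (y[i-1] + x[i] - x[i-1]).
inline void HighPassFilter(std::vector<float>& data, float cutoff_hz, float sample_rate) {
    if (data.empty()) return;
    const float pi = 3.14159265358979f;
    const float rc = 1.0f / (2.0f * pi * cutoff_hz);
    const float dt = 1.0f / sample_rate;
    const float alpha = rc / (rc + dt);
    float prev_in = data[0];
    float prev_out = 0.0f;
    data[0] = 0.0f;
    for (std::size_t i = 1; i < data.size(); ++i) {
        const float x = data[i];
        prev_out = alpha * (prev_out + x - prev_in);
        prev_in = x;
        data[i] = prev_out;
    }
}

// Returns true when the mean |sample| of the last `last_ms` milliseconds is at most
// vad_thold times the mean |sample| of the whole chunk (i.e. the tail is quiet).
// pcm is taken by value so the caller's buffer is not modified by the filter.
inline bool TailIsQuiet(std::vector<float> pcm, int sample_rate, int last_ms,
                        float vad_thold, float freq_thold) {
    assert(sample_rate > 0 && last_ms > 0);
    const std::size_t n = pcm.size();
    const std::size_t n_last = static_cast<std::size_t>(sample_rate) * last_ms / 1000;
    if (n_last == 0 || n_last >= n) return false;  // not enough audio to compare

    if (freq_thold > 0.0f) {                       // freq_thold = 0 skips the filter
        HighPassFilter(pcm, freq_thold, static_cast<float>(sample_rate));
    }

    double energy_all = 0.0, energy_last = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double a = std::fabs(pcm[i]);
        energy_all += a;
        if (i >= n - n_last) energy_last += a;
    }
    energy_all /= static_cast<double>(n);
    energy_last /= static_cast<double>(n_last);
    return energy_last <= vad_thold * energy_all;
}
```

With the defaults from the release note (0.6 and 100 Hz), a chunk whose tail is as loud as the rest is not reported as quiet, while a chunk that ends in silence is.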
Auris v1.1.3
Add OpenAI remote backend and FlushRaw binding
Introduce a remote transcription backend that forwards captured audio to OpenAI when openai_api_key is set. Adds config options (openai_api_key, openai_model) and docs (README/API) describing local (whisper.cpp) vs remote modes. New server module sv_auris_openai.lua builds multipart WAV requests, posts to OpenAI /v1/audio/transcriptions, and fires Auris_Transcription with the returned text.
C++ bindings: auris.Init gains a skipWhisper flag to avoid loading whisper.cpp; a new auris.FlushRaw binding returns raw float32 PCM for the remote path; module registration updated accordingly. Server boot/feed/initialization logic updated to branch between local and remote flows, and shared version/api files were reorganized to server-side includes.
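The shape of such a multipart request can be sketched as below. The actual builder lives in Lua (sv_auris_openai.lua) and is not reproduced here; this C++ version, its function and struct names, the boundary string, and the voice.wav filename are all hypothetical. The "file" and "model" form fields are the ones the transcription endpoint expects.

```cpp
#include <cassert>
#include <string>

struct MultipartRequest {
    std::string content_type;  // value for the Content-Type request header
    std::string body;          // raw request body
};

// Builds a multipart/form-data body carrying a model name and a WAV payload.
inline MultipartRequest BuildTranscriptionRequest(
        const std::string& wav_bytes,
        const std::string& model,
        const std::string& boundary = "----AurisSketchBoundary") {  // hypothetical boundary
    // The boundary must not occur inside the payload.
    assert(wav_bytes.find(boundary) == std::string::npos);

    const std::string crlf = "\r\n";
    std::string body;
    body += "--" + boundary + crlf;
    body += "Content-Disposition: form-data; name=\"model\"" + crlf + crlf;
    body += model + crlf;
    body += "--" + boundary + crlf;
    body += "Content-Disposition: form-data; name=\"file\"; filename=\"voice.wav\"" + crlf;
    body += "Content-Type: audio/wav" + crlf + crlf;
    body += wav_bytes + crlf;
    body += "--" + boundary + "--" + crlf;  // closing delimiter

    return {"multipart/form-data; boundary=" + boundary, body};
}
```

The caller would send the body with the returned Content-Type plus an Authorization header carrying the API key.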
Auris v1.1.2
Expose raw audio and add PCM->WAV helper
Add end-to-end raw audio support: the Auris transcription hook and Subscribe callbacks now receive a 4th `audio` argument (raw 16 kHz mono float32 PCM, nil on rare cache miss). Implement a Lua helper Auris.PCMToWAV(pcm) to wrap the raw PCM into a WAV container. On the native side, add an audio cache (source/audio_cache.*) and store audio in auris_context before running transcription; lua_bindings.Poll now returns the audio binary as a third return value.
Update the auris-discord submodule to optionally attach audio as a multipart WAV upload (build multipart body + content type), converting with Auris.PCMToWAV. Docs and examples updated (API.md, README.md) to show audio usage and warn to guard the 4th arg. Bump Auris.VERSION to 1.1.2 and include a sample voice.wav file.
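For readers unfamiliar with the container, the following sketch shows one way to wrap 16 kHz mono float PCM in a WAV file. It is not Auris.PCMToWAV (which is a Lua helper); the function names are hypothetical, and converting to 16-bit integer samples is an assumption, since the note does not say which sample format the helper writes.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <string>
#include <vector>

// Appends the low `bytes` bytes of value in little-endian order.
inline void AppendLE(std::string& out, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.push_back(static_cast<char>((value >> (8 * i)) & 0xFF));
    }
}

// Wraps mono float samples in [-1, 1] into a 16-bit PCM WAV byte string.
inline std::string PcmToWavSketch(const std::vector<float>& pcm,
                                  std::uint32_t sample_rate = 16000) {
    assert(pcm.size() <= (0xFFFFFFFFu - 36u) / 2u);  // sizes must fit the 32-bit header fields
    const std::uint32_t channels = 1;
    const std::uint32_t bits = 16;
    const std::uint32_t data_bytes = static_cast<std::uint32_t>(pcm.size()) * 2u;

    std::string wav;
    wav.reserve(44 + data_bytes);
    wav += "RIFF";
    AppendLE(wav, 36 + data_bytes, 4);                    // RIFF chunk size
    wav += "WAVE";
    wav += "fmt ";
    AppendLE(wav, 16, 4);                                 // fmt chunk size
    AppendLE(wav, 1, 2);                                  // format tag 1 = integer PCM
    AppendLE(wav, channels, 2);
    AppendLE(wav, sample_rate, 4);
    AppendLE(wav, sample_rate * channels * bits / 8, 4);  // byte rate
    AppendLE(wav, channels * bits / 8, 2);                // block align
    AppendLE(wav, bits, 2);
    wav += "data";
    AppendLE(wav, data_bytes, 4);

    for (float s : pcm) {
        const float clamped = std::max(-1.0f, std::min(1.0f, s));
        const auto v = static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
        AppendLE(wav, static_cast<std::uint16_t>(v), 2);
    }
    return wav;
}
```

The result is a 44-byte header followed by two bytes per sample, which can be written to disk or attached to an upload as-is.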
Auris v1.1.1
Remove -fno-stack-protector from build options
Remove the -fno-stack-protector compiler flag and add an explanatory comment. EDR/AV services (CrowdSec, VirusTotal) flag binaries built without stack protection as exploit-facilitating, producing false CVE-2023-4911 detections; dropping this flag avoids those false positives while retaining other optimization and ISA flags (e.g. -O3, -flto, -mavx2, -mfma, -mf16c). Link options remain unchanged.
Auris v1.1.0
Update VERSION
Auris v1.0.8
Expose extended Whisper config options
Add many new Whisper/decoding configuration options and wire them through Lua and C++ so users can tune sampling, output, timestamps, filtering, decoding and context behavior.
Changes:
- garrysmod_addon/auris/lua/auris/config.lua: expand default config with groups for sampling, output, token timestamps, filtering, decoding and context.
- garrysmod_addon/auris/lua/auris/server/sv_auris_boot.lua: map new Lua config fields into the whisper config table returned to the backend.
- garrysmod_addon/auris/lua/auris/server/sv_auris_config.lua: add the same expanded DEFAULTS so server-side defaults include new options.
- source/auris_config.h: extend WhisperConfig struct with many new fields (booleans, floats, ints and strings) and reorganize default values.
- source/auris_context.cpp: apply the new config fields to whisper_full_params (including conditional beam search selection, string null checks, and new numeric params) and minor formatting cleanups.
- source/lua_config.cpp: add GetFloatField helper, update string/int/bool helpers, and read/write all new config fields between Lua and the WhisperConfig struct.
Why: expose finer control over transcription behavior (beam search vs greedy, token timestamp thresholds, suppression options, decoding temperatures/penalties, context sizing, etc.) so server admins/developers can better tune real-time voice transcription behavior.
Auris v1.0.7
Add GPU/CPU CI builds, Vulkan opt-in & logger
Split CI into separate GPU and CPU build jobs for Windows/Linux x86/x64, update artifact names and release packaging to produce GPU/CPU zips. Add a --with-vulkan premake option that switches the module name to auris-gpu and gates Vulkan-specific sources/links so Vulkan is opt-in (CPU-only builds no longer depend on libvulkan). Update server init to prefer auris-gpu and fall back to auris, and add a simple auris_logger addon (autorun + server logger) to print transcriptions to the server console. Add BUILD.md with full build instructions and reference it from README; bump VERSION to 1.0.7.
Auris v1.0.6
Fix runner permissions and gmsv calls