Release v0.7.6 by MapleEve · Pull Request #29 · MapleEve/VoScript

MapleEve · 2026-05-07T04:02:31Z

Change Type

fix
docs
chore

Description of Change

Release v0.7.6 with fixes for health-check responsiveness during GPU cleanup, the WhisperX alignment runtime, the stock outro hallucination guard, and embedding audio slicing/runtime reliability.

Public documentation and changelog updates use anonymized validation wording only. Internal validation is described by behavior category, without sample names, job IDs, speaker IDs, hosts, paths, or logs.

Verification

Public release scan passed.
Dependency audit passed with the accepted ignores: PYSEC-2022-42969, CVE-2026-1839.
`ruff check app/ --ignore E501` passed.
`ruff format --check app/` passed.
`PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 pytest tests/unit/ tests/test_security.py -v --tb=short --no-header` passed in a CI-equivalent Python 3.11 environment.
`git diff --check` passed.

github-actions · 2026-05-07T04:02:41Z

👍 @MapleEve

Thank you for raising your pull request and contributing to voscript.
Please make sure you have followed our contributing guidelines. We will review it as soon as possible.
If you encounter any problems, please feel free to connect with us.

codecov · 2026-05-07T04:03:19Z

Codecov Report

❌ Patch coverage is 86.17886% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.18%. Comparing base (80b61dd) to head (439a5d2).
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
app/nltk/tokenize/punkt.py	86.36%	6 Missing ⚠️
app/providers/diarization/default.py	83.33%	5 Missing ⚠️
app/providers/embedding/default.py	87.50%	3 Missing ⚠️
app/infra/job_runtime.py	60.00%	2 Missing ⚠️
app/providers/asr/default.py	90.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #29      +/-   ##
==========================================
- Coverage   91.32%   91.18%   -0.15%     
==========================================
  Files          75       76       +1     
  Lines        2906     3016     +110     
==========================================
+ Hits         2654     2750      +96     
- Misses        252      266      +14

Flag	Coverage Δ
unit	`91.18% <86.17%> (-0.15%)`	⬇️

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 92a7af54ad

chatgpt-codex-connector · 2026-05-07T04:05:02Z

 # whisperx: forced word-level alignment via wav2vec2.
-# 3.3.1 is the newest non-yanked WhisperX release that still supports the
-# current torch 2.6 / numpy<2 / SciPy 1.11.x runtime without moving to the
-# WhisperX 3.3.3+ numpy>=2 or 3.8.x torch~=2.8 dependency stack.
-whisperx==3.3.1
+# Docker installs whisperx==3.3.1 with --no-deps after this file is resolved.
+# WhisperX 3.3.1 still hard-pins faster-whisper==1.1.0 / ctranslate2<4.5.0,
+# which would replace the cuDNN9-compatible ASR stack above and look for
+# cuDNN8 runtime libraries in the CUDA 12.4 + cuDNN9 base image.
+nltk>=3.9,<4.0


Restore WhisperX to non-Docker installs

The local quickstart still tells users to run only `pip install -r app/requirements.txt`, but this file now contains only comments for WhisperX and no installable `whisperx` requirement. Those environments will start without the package and then hit `ModuleNotFoundError` at the `import whisperx` in the diarization alignment path; the Dockerfile's separate `pip install --no-deps whisperx==3.3.1` does not cover local venvs or CI jobs that consume this requirements file directly.
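One way to make this failure mode easier to diagnose, independent of how the dependency ends up being installed, is to guard the import and raise an actionable error. The following is only a minimal sketch: `require_module` and `WHISPERX_HINT` are hypothetical names for illustration and are not part of this PR.

```python
import importlib


def require_module(name: str, install_hint: str):
    """Import `name`, or raise an error that says how to install it.

    Hypothetical helper for illustration; it is not part of the PR.
    """
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError as exc:
        raise RuntimeError(f"{name} is not installed; {install_hint}") from exc


# The hint mirrors the Dockerfile install step quoted in the comment above.
WHISPERX_HINT = "run: pip install --no-deps whisperx==3.3.1"
```

The alignment path could then call `require_module("whisperx", WHISPERX_HINT)` at the point where it currently imports the package, so a local venv fails with an install instruction rather than a bare `ModuleNotFoundError`.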

chatgpt-codex-connector · 2026-05-07T04:05:02Z

+    data, sample_rate = sf.read(audio_path, dtype="float32", always_2d=True)
+    waveform = torch.from_numpy(data.T.copy())


Avoid loading entire embedding audio into memory

For long recordings, `sf.read()` decodes the whole embedding source before any diarization-turn filtering, and `data.T.copy()` immediately creates another full-size copy. The previous path loaded only each bounded turn with `torchaudio.load(frame_offset, num_frames)`, so large accepted uploads or caller-provided raw embedding sources can now exhaust RAM before embeddings start; consider keeping segmented reads or gating the full-file fast path by duration/size.
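The size gate suggested above could look roughly like the sketch below. It is an assumption-laden illustration, not the PR's code: the function names and the byte budget are hypothetical, and the frame count and channel count are taken as inputs (in practice they might come from the file header, for example via `soundfile.info`).

```python
def estimate_float32_bytes(frames: int, channels: int) -> int:
    """Bytes needed to hold the decoded audio as float32 samples."""
    return frames * channels * 4


def choose_load_strategy(frames: int, channels: int, max_full_load_bytes: int) -> str:
    """Return "full" for the single-pass path, "segmented" for per-turn reads."""
    # data.T.copy() makes a second full-size array, so budget for two copies.
    peak = 2 * estimate_float32_bytes(frames, channels)
    return "full" if peak <= max_full_load_bytes else "segmented"


def turn_to_frames(
    start_s: float, end_s: float, sample_rate: int, total_frames: int
) -> tuple[int, int]:
    """Convert a diarization turn in seconds to a clamped (offset, num_frames)."""
    start = max(0, min(total_frames, int(start_s * sample_rate)))
    stop = max(start, min(total_frames, int(end_s * sample_rate)))
    return start, stop - start
```

With a hypothetical 256 MiB budget, a 10-minute mono 16 kHz file (9,600,000 frames) stays on the full-load path, while a 3-hour stereo 48 kHz file falls back to segmented reads. The `(offset, num_frames)` pair from `turn_to_frames` has the same shape as the `frame_offset` / `num_frames` arguments the previous path passed to `torchaudio.load`.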

Copilot

Pull request overview

Release v0.7.6 updates the runtime to improve health stability during GPU cleanup, make WhisperX alignment more reliable/isolated, and reduce embedding/audio loading overhead, with accompanying unit tests and documentation/changelog updates.

Changes:

Add WHISPERX_ALIGN_DEVICE support and cache/reuse WhisperX alignment models by language/model/device.
Prefer single-pass audio loading for embedding extraction (soundfile → slice per diarization turn), with a torchaudio fallback.
Adjust GPU cleanup to avoid full Python GC during active job phases; update hallucination guard for short “stock outro” segments; bump docs/version/changelog and update dependency/runtime baselines.
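As a rough illustration of the first change in the list above (reusing alignment models keyed by language/model/device), a cache along these lines would give the described behavior. This is a sketch under assumptions: `AlignmentModelCache` and its `loader` callable are hypothetical and do not reproduce the code in `app/providers/diarization/default.py`.

```python
from typing import Any, Callable

AlignKey = tuple[str, str, str]  # (language, model_name, device)


class AlignmentModelCache:
    """Reuse alignment models keyed by (language, model_name, device)."""

    def __init__(self, loader: Callable[[str, str, str], Any]) -> None:
        # `loader` stands in for whatever actually builds the model.
        self._loader = loader
        self._models: dict[AlignKey, Any] = {}

    def get(self, language: str, model_name: str, device: str) -> Any:
        key = (language, model_name, device)
        if key not in self._models:
            # Load once per distinct key; later calls reuse the same object.
            self._models[key] = self._loader(language, model_name, device)
        return self._models[key]

    def clear(self) -> None:
        """Drop all cached models, e.g. when the pipeline unloads."""
        self._models.clear()
```

Including the device in the key means that changing `WHISPERX_ALIGN_DEVICE` between jobs would load a separate model instead of reusing one that lives on the wrong device.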

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 4 comments.

File	Description
tests/unit/test_provider_registry.py	Adds coverage for alignment-device caching/reuse and the embedding single-load path; updates embedding timing test expectations.
tests/unit/test_pipeline_model_lifecycle.py	Extends lifecycle/unload tests to cover alignment cache/device/model state.
tests/unit/test_main_lifespan.py	Updates OpenAPI version assertion to 0.7.6.
tests/unit/test_job_runtime.py	Adds tests for CUDA cache flushing behavior with/without full Python GC by phase.
tests/unit/test_dependency_runtime_baseline.py	Adds regression tests to enforce faster-whisper/ctranslate2 pins and Docker-only whisperx install strategy.
tests/unit/test_asr_repetition_guard.py	Adds tests for the new short single-segment “stock outro” hallucination filter.
doc/security.zh.md	Updates documented “current version” reference to 0.7.6.
doc/security.en.md	Updates documented “current version” reference to 0.7.6.
doc/quickstart.zh.md	Documents `WHISPERX_ALIGN_DEVICE` and clarifies CPU default for zh alignment.
doc/quickstart.en.md	Documents `WHISPERX_ALIGN_DEVICE` and clarifies CPU default for zh alignment.
doc/configuration.zh.md	Bumps version references to 0.7.6 and documents `WHISPERX_ALIGN_DEVICE` + validation wording.
doc/configuration.en.md	Bumps version references to 0.7.6 and documents `WHISPERX_ALIGN_DEVICE` + validation wording.
doc/changelog.zh.md	Adds v0.7.6 release notes (deps/runtime/health/alignment/embedding).
doc/changelog.en.md	Adds v0.7.6 release notes (deps/runtime/health/alignment/embedding).
app/requirements.txt	Moves ASR stack to cuDNN9-compatible faster-whisper/ctranslate2, adds soundfile/nltk; removes whisperx pin (Docker installs it separately).
app/providers/embedding/default.py	Implements soundfile full-load + per-turn slicing path with fallback to torchaudio segment loading and adds load-timing logs.
app/providers/diarization/default.py	Adds `WHISPERX_ALIGN_DEVICE` resolution and alignment model caching/reuse keyed by language/model/device.
app/providers/asr/default.py	Adds a “stock outro” marker-based single-segment hallucination filter.
app/pipeline/orchestrator.py	Tracks and unloads alignment cache/device/model state alongside other pipeline models.
app/infra/job_runtime.py	Avoids full Python GC at active job phases; keeps full GC for idle-unload; factors GC into helper.
app/Dockerfile	Installs whisperx==3.3.1 with `--no-deps` after resolving requirements to avoid dependency replacement.
app/config.py	Bumps APP_VERSION to 0.7.6 and adds `WHISPERX_ALIGN_DEVICE` env config.
.env.example	Documents `WHISPERX_ALIGN_DEVICE` default and recommended usage.
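The `app/infra/job_runtime.py` row above describes skipping the full Python GC during active job phases while keeping it for idle unload. A minimal sketch of that idea follows; the phase names and the `release_memory` helper are hypothetical, and the device flush is injected as a callable (in a real runtime it might be `torch.cuda.empty_cache`).

```python
import gc
from typing import Callable

# Hypothetical phase names; the summary only distinguishes active job
# phases from idle unload.
ACTIVE_PHASES = frozenset({"transcribing", "aligning", "diarizing", "embedding"})


def release_memory(phase: str, empty_device_cache: Callable[[], None]) -> bool:
    """Flush the device cache; run a full Python GC only outside active phases.

    Returns True when a full gc.collect() was performed.
    """
    full_gc = phase not in ACTIVE_PHASES
    if full_gc:
        # A full collection can pause the interpreter, so it is kept off the
        # path that runs while a job is in progress.
        gc.collect()
    empty_device_cache()
    return full_gc
```

Injecting the flush keeps the sketch testable without a GPU: a test can pass a recording stub and assert on the returned flag per phase.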

+# Docker installs whisperx==3.3.1 with --no-deps after this file is resolved.
+# WhisperX 3.3.1 still hard-pins faster-whisper==1.1.0 / ctranslate2<4.5.0,
+# which would replace the cuDNN9-compatible ASR stack above and look for
+# cuDNN8 runtime libraries in the CUDA 12.4 + cuDNN9 base image.


+    except Exception as exc:
+        logger.warning(
+            "Falling back to torchaudio segment loading for embedding audio: %s",
+            exc,
+        )
+        info = torchaudio.info(audio_path)
+        native_sr = info.sample_rate


+def _load_full_waveform(audio_path: str):
+    """Load normalized audio once with libsndfile to avoid per-turn torch decode."""
+
+    load_started = time.perf_counter()
+    data, sample_rate = sf.read(audio_path, dtype="float32", always_2d=True)
+    waveform = torch.from_numpy(data.T.copy())
+    logger.info(
+        "embedding_audio_load_timing backend=soundfile elapsed_s=%.3f sample_rate=%d channels=%d frames=%d",
+        time.perf_counter() - load_started,
+        sample_rate,
+        waveform.shape[0],
+        waveform.shape[1],
+    )


+def _outro_marker_score(normalized_text: str) -> tuple[int, float]:
+    if not normalized_text:
+        return 0, 0.0
+
+    matched = {
+        marker for marker in _OUTRO_HALLUCINATION_MARKERS if marker in normalized_text
+    }
+    marker_chars = sum(len(marker) for marker in matched)
+    return len(matched), marker_chars / len(normalized_text)


github-actions · 2026-05-07T05:01:14Z

❤️ Great PR @MapleEve ❤️

The growth of the project is inseparable from user feedback and contributions. Thanks for your contribution!

Copilot AI review requested due to automatic review settings May 7, 2026 04:02

Copilot started reviewing on behalf of MapleEve May 7, 2026 04:03

chatgpt-codex-connector Bot reviewed May 7, 2026

Copilot AI reviewed May 7, 2026

MapleEve added 10 commits May 7, 2026 12:55

fix: avoid blocking health during GPU cleanup

6e4ce52

fix: keep ASR runtime on cuDNN9-compatible deps

947c42a

fix: install WhisperX without replacing ASR runtime

0c04619

fix: include WhisperX alignment runtime dependency

222ebb2

fix: filter stock outro hallucinations

4d3222e

fix: isolate whisperx alignment runtime

70dba40

fix: widen stock outro hallucination guard

047c7ea

fix: stabilize embedding audio slicing

4839fed

chore: release 0.7.6

64496d5

fix: remove NLTK runtime dependency

439a5d2

MapleEve force-pushed the fix/health-unblock-gpu-cleanup branch from c50fabd to 439a5d2 May 7, 2026 04:56

MapleEve merged commit 6d5242a into main May 7, 2026
8 checks passed

MapleEve deleted the fix/health-unblock-gpu-cleanup branch May 7, 2026 05:15
