feat(processor): add voice-activated detection and adaptive speech RMS minimum by flexiondotorg · Pull Request #53 · linuxmatters/jivetalking

flexiondotorg · 2026-03-14T17:49:52Z

Summary

Add automatic detection of voice-activated recordings (Riverside, Zencastr) and an adaptive speech RMS minimum threshold. A recording is classified as voice-activated when nearly all of its silence candidates are digital silence, and the speech interruption tolerance and NoiseRemove processing are adjusted accordingly. The speech RMS minimum changes from a fixed -40 dBFS to a value derived from the recording's RMS level and noise floor, improving filter adaptation for diverse recording conditions.

Changes

Detect voice-activated recordings from digital silence fraction in candidate selection
Widen speech interruption tolerance for voice-activated mode (10 s instead of the standard 2 s)
Disable NoiseRemove compand expansion for uncalibrated recordings to avoid over-expansion
Implement adaptive speech RMS minimum (speechRMSMinimum = max(RMSLevel - 12, NoiseFloor + 6))
Add voice-activated indicator and compander annotation to analysis display
Update documentation for detection logic and silence candidate filtering
Update README with voice-activated recording capabilities
Improve test coverage for silence detection and analysis display output

Testing

Unit tests verify silence fraction detection for voice-activated recordings
Analysis display tests confirm correct output formatting with voice-activated and compander annotations
Adaptive RMS formula tested with diverse speech profile measurements
Anna's recording now detects 6 silence candidates instead of 1 after enabling voice-activated detection

…ce fraction

Add VoiceActivated field to AudioMeasurements and detectVoiceActivated() helper that classifies recordings based on the fraction of digital-silence-rejected candidates. When >= 95% of silence candidates are digital silence, the recording is classified as voice-activated.

Signed-off-by: Martin Wimpress <code@wimpress.io>
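The classification rule in this commit can be sketched as follows. This is a minimal illustration, not the code from the PR: the function signature, the constant name, and the handling of an empty candidate list are assumptions; only the 95% threshold comes from the commit message.

```go
package main

import "fmt"

// voiceActivatedPercent is a hypothetical name for the 95% threshold
// stated in the commit message.
const voiceActivatedPercent = 95

// detectVoiceActivated reports whether at least 95% of the silence
// candidates were rejected as digital silence. The signature is an
// assumption; the real helper may take different inputs.
func detectVoiceActivated(totalCandidates, digitalSilenceRejected int) bool {
	if totalCandidates <= 0 {
		// Assumption: with no candidates there is no evidence either way.
		return false
	}
	// Integer comparison avoids floating-point rounding at the boundary.
	return digitalSilenceRejected*100 >= totalCandidates*voiceActivatedPercent
}

func main() {
	fmt.Println(detectVoiceActivated(20, 19)) // 95% rejected: true
	fmt.Println(detectVoiceActivated(20, 18)) // 90% rejected: false
}
```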

…ted recordings

Add voiceActivatedSpeechInterruptionToleranceIntervals constant (40 intervals, 10 seconds) to bridge platform-inserted gaps in voice-activated media files.

- Add voiceActivated bool parameter to findSpeechCandidatesFromIntervals
- Implement conditional selection: 8 intervals (2s) for standard recordings, 40 intervals (10s) for voice-activated
- Update analyzer.go call site to pass measurements.VoiceActivated
- Extend analyzer_test with two new cases: 7.5s gap bridges with widened tolerance, 12.5s gap splits even with widened tolerance
- Update all 6 existing test call sites to pass false

Signed-off-by: Martin Wimpress <code@wimpress.io>

- Add NoiseRemoveCompandEnabled field to FilterChainConfig (default true)
- Modify buildNoiseRemoveFilter to return anlmdn-only spec when compand disabled
- Replace blind -55 dBFS fallback with explicit compand disabling when NoiseProfile unavailable
- Update tuneNoiseRemove to disable compand for recordings lacking noise calibration
- Add sub-tests verifying compand enable/disable paths in TestBuildNoiseRemoveFilter

Signed-off-by: Martin Wimpress <code@wimpress.io>
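A rough sketch of the enable/disable path described above. The `FilterChainConfig` here is reduced to the single field named in the commit, and passing the two filter specs in as strings is an assumption made to keep the example self-contained; the real `buildNoiseRemoveFilter` presumably derives them from configuration.

```go
package main

import (
	"fmt"
	"strings"
)

// FilterChainConfig is cut down to the one field relevant here.
type FilterChainConfig struct {
	NoiseRemoveCompandEnabled bool
}

// buildNoiseRemoveFilter joins the anlmdn spec with the compand spec
// only when the compander is enabled. The string parameters are a
// simplification for illustration.
func buildNoiseRemoveFilter(cfg FilterChainConfig, anlmdnSpec, compandSpec string) string {
	parts := []string{anlmdnSpec}
	if cfg.NoiseRemoveCompandEnabled && compandSpec != "" {
		parts = append(parts, compandSpec)
	}
	// FFmpeg separates the filters of a linear chain with commas.
	return strings.Join(parts, ",")
}

func main() {
	enabled := FilterChainConfig{NoiseRemoveCompandEnabled: true}
	disabled := FilterChainConfig{NoiseRemoveCompandEnabled: false}
	fmt.Println(buildNoiseRemoveFilter(enabled, "anlmdn", "compand"))  // anlmdn,compand
	fmt.Println(buildNoiseRemoveFilter(disabled, "anlmdn", "compand")) // anlmdn
}
```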

…annotation

- Display "Voice-activated recording detected" in silence detection section when voice-activated recording is confirmed, covering both candidate selection paths (no elected candidate, no silence detected)
- Annotate filter adaptation section with "NR Compander: disabled" when noise profile unavailable
- Include "Voice-Activated: yes" annotation in detailed log file reports across candidates and none-found sections
- Add comprehensive test coverage with 8 test cases for display output, compander annotation, and diagnostic reporting

Signed-off-by: Martin Wimpress <code@wimpress.io>

Implement dynamic `speechRMSMin` calculation based on recording characteristics:

- Add `speechRMSMinimum`, `speechRMSMinimumOffset` (12.0), and `speechRMSMinimumNoiseMargin` (6.0) constants
- Implement `computeSpeechRMSMinimum()` helper using formula: max(RMSLevel - 12, NoiseFloor + 6)
- Update `speechScore()` to accept adaptive threshold parameter
- Update `findSpeechCandidatesFromIntervals()` to derive threshold from measurements at call site
- Pass RMS and noise floor measurements through analyser pipeline

Fixes speech detection in quiet recordings where fixed -40 dBFS threshold was above most speech intervals.

- Add 6 test cases for `computeSpeechRMSMinimum()` covering edge cases and fallback behaviour
- Update 9 call sites across 8 sub-tests with new parameters

Signed-off-by: Martin Wimpress <code@wimpress.io>
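The formula from this commit, written out as a minimal sketch. The constant names and values are taken from the commit message; the two-argument signature is an assumption, and the fallback behaviour mentioned in the tests is not described in enough detail to reproduce, so it is omitted here.

```go
package main

import (
	"fmt"
	"math"
)

const (
	speechRMSMinimumOffset      = 12.0 // dB below the overall RMS level
	speechRMSMinimumNoiseMargin = 6.0  // dB above the noise floor
)

// computeSpeechRMSMinimum returns max(RMSLevel - 12, NoiseFloor + 6),
// with all values in dBFS. Fallback handling is deliberately left out.
func computeSpeechRMSMinimum(rmsLevel, noiseFloor float64) float64 {
	return math.Max(
		rmsLevel-speechRMSMinimumOffset,
		noiseFloor+speechRMSMinimumNoiseMargin,
	)
}

func main() {
	// Quiet recording: the threshold drops below the old fixed -40 dBFS.
	fmt.Println(computeSpeechRMSMinimum(-50, -70)) // -62
	// Noisy recording: the noise-floor margin dominates.
	fmt.Println(computeSpeechRMSMinimum(-30, -35)) // -29
}
```

In the first call the RMS term wins (-62 versus -64), which is the quiet-recording case the commit says the fixed threshold handled poorly.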

…nd NR compander

- Update Pass 2 filter table to clarify compand residual suppression requires noise profile
- Add voice-activated recording detection (Riverside, Zencastr) to analysis-only output description
- Explain specific consequence of missing silence section (NR compander disabled, spectral denoiser runs alone)

Signed-off-by: Martin Wimpress <code@wimpress.io>

…tection capabilities

- Correct Pass 2 NoiseRemove row: compand disabled when no noise profile exists; note that anlmdn filter always active
- Qualify compand dependency on noise profile in Section 4 capabilities
- Add two newly implemented Speech-Aware Processing capabilities:
  * Digital silence rejection in room tone candidate selection
  * Voice-activated recording detection from silence fraction

Signed-off-by: Martin Wimpress <code@wimpress.io>

cubic-dev-ai

5 issues found across 13 files

Confidence score: 3/5

There is concrete regression risk in internal/processor/adaptive.go: if NoiseRemoveCompandEnabled is not re-enabled on the valid-profile path, one uncalibrated file can keep compand disabled for subsequent files in the same run.
internal/processor/analyzer_candidates.go has user-facing detection risk: voice-activated mode is only inferred from silence candidates >=8s, so shorter digital-silence gaps may never activate widened speech tolerance.
Logging/reporting issues in internal/logging/analysis_display.go and internal/logging/report.go can hide or misstate recording mode/disable reasons, which reduces operator visibility and can make troubleshooting harder even when processing succeeds.
Pay close attention to internal/processor/adaptive.go, internal/processor/analyzer_candidates.go, internal/logging/analysis_display.go, and internal/logging/report.go - processing-state carryover and mode/disable-status reporting are the main risk areas.

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="internal/processor/adaptive.go">

<violation number="1" location="internal/processor/adaptive.go:689">
P1: Re-enable `NoiseRemoveCompandEnabled` on the valid-profile path; otherwise one uncalibrated file disables compand for every later file in the same run.</violation>
</file>

<file name="internal/logging/analysis_display.go">

<violation number="1" location="internal/logging/analysis_display.go:130">
P2: The voice-activated indicator is omitted whenever an elected silence candidate exists, so analysis output hides the detected recording mode in the common success path.</violation>

<violation number="2" location="internal/logging/analysis_display.go:255">
P2: This message hard-codes the wrong disable reason; the compander is also disabled for invalid noise-floor calibration, not just a missing noise profile.</violation>
</file>

<file name="internal/logging/report.go">

<violation number="1" location="internal/logging/report.go:1613">
P2: The voice-activated indicator is only printed in some silence-report branches, so it disappears for reports that have a `NoiseProfile` or only `SilenceRegions`.</violation>
</file>

<file name="internal/processor/analyzer_candidates.go">

<violation number="1" location="internal/processor/analyzer_candidates.go:1030">
P1: This only detects voice-activated mode from >=8s silence candidates, so recordings with shorter digital-silence gaps never enable the widened speech tolerance.</violation>
</file>

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.}

internal/processor/adaptive.go

internal/processor/analyzer_candidates.go

internal/logging/analysis_display.go

internal/logging/report.go

…alysis output

- Track derivation source (astats, RMS estimate, loudness estimate) in NoiseFloor Source field to distinguish between measured and fallback values
- Update analysis display to show the derived noise floor value with appropriate source label instead of displaying -Inf when astats returns infinity
- Add noiseFloorSourceLabel() helper to convert source identifier to user-facing label matching the measurement derivation tier

Fixes analysis output for recordings with no detectable noise floor (e.g. digital silence), ensuring the fallback-derived value is displayed with clarity about its source rather than the unmeaningful -Inf value.

Signed-off-by: Martin Wimpress <code@wimpress.io>
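A helper of the kind this commit describes might look like the sketch below. The identifier strings for the three derivation tiers and all of the label wording are invented for illustration; only `silence_profile` appears verbatim elsewhere in this PR.

```go
package main

import "fmt"

// noiseFloorSourceLabel maps a source identifier to a user-facing label.
// Apart from "silence_profile", the identifiers and every label here
// are hypothetical.
func noiseFloorSourceLabel(source string) string {
	switch source {
	case "astats":
		return "measured by astats"
	case "rms_estimate":
		return "estimated from RMS"
	case "loudness_estimate":
		return "estimated from loudness"
	case "silence_profile":
		return "measured from elected silence"
	default:
		return "unknown source"
	}
}

func main() {
	// Show the derived value with its source label rather than -Inf.
	// The value -62.5 is illustrative only.
	fmt.Printf("Noise Floor: %.1f dBFS (%s)\n", -62.5, noiseFloorSourceLabel("rms_estimate"))
}
```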

cubic-dev-ai

1 issue found across 2 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="internal/processor/analyzer.go">

<violation number="1" location="internal/processor/analyzer.go:207">
P2: `NoiseFloorSource` is not updated when elected silence overrides `NoiseFloor`, so the exported source can disagree with the actual noise-floor value.</violation>
</file>


internal/processor/analyzer.go

…utput

- Update threshold annotation to show room tone estimate explicitly with +1 dB offset
- Remove extraneous blank line between candidate count and voice-activated indicator
- Add blank line before SPEECH DETECTION section for consistent formatting

Signed-off-by: Martin Wimpress <code@wimpress.io>

…loor tracking

- Re-enable NoiseRemove compand when processing files with valid noise profiles after a file without a profile (prevents state leakage between files)
- Display voice-activated indicator in both elected and no-candidate branches for complete diagnostic information
- Change compander disabled message from condition-specific to generic "disabled" as it disables for both nil profile and non-negative noise floor
- Track NoiseFloorSource as "silence_profile" when elected silence region overrides the derived noise floor with its measured value

Added regression test for compand re-enabling on valid-profile path.

Signed-off-by: Martin Wimpress <code@wimpress.io>
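The state-leakage fix can be sketched as setting the flag explicitly on every call instead of only clearing it. The types and the `tuneNoiseRemove` signature below are simplified assumptions; the two disable conditions (nil profile, non-negative noise floor) are the ones named in the commit.

```go
package main

import "fmt"

// NoiseProfile and FilterChainConfig are reduced stand-ins for the
// real types in the project.
type NoiseProfile struct{}

type FilterChainConfig struct {
	NoiseRemoveCompandEnabled bool
}

// tuneNoiseRemove assigns the flag on both paths, so a config reused
// across files cannot carry a disabled compander into a later file.
func tuneNoiseRemove(cfg *FilterChainConfig, profile *NoiseProfile, noiseFloor float64) {
	calibrated := profile != nil && noiseFloor < 0
	cfg.NoiseRemoveCompandEnabled = calibrated
}

func main() {
	cfg := &FilterChainConfig{NoiseRemoveCompandEnabled: true}

	tuneNoiseRemove(cfg, nil, -60) // file without a noise profile
	fmt.Println(cfg.NoiseRemoveCompandEnabled) // false

	tuneNoiseRemove(cfg, &NoiseProfile{}, -60) // next file, valid profile
	fmt.Println(cfg.NoiseRemoveCompandEnabled) // true
}
```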

cubic-dev-ai

0 issues found across 5 files (changes from recent commits).

_{Requires human review: Modifies core audio processing logic, including adaptive thresholding and filter chain configuration, which requires human domain expertise to validate.}

flexiondotorg added 7 commits March 14, 2026 12:39

cubic-dev-ai bot reviewed Mar 14, 2026


internal/processor/analyzer.go (resolved)

flexiondotorg added 2 commits March 14, 2026 19:18

cubic-dev-ai bot reviewed Mar 14, 2026


flexiondotorg merged commit 057c206 into main Mar 14, 2026
7 checks passed

flexiondotorg deleted the detection branch March 14, 2026 18:33

flexiondotorg merged 10 commits into main from detection
