@flexiondotorg flexiondotorg commented Jan 16, 2026

Summary

  • Implement three-stage measurement capture (Input → Filtered → Final) for elected silence and speech regions
  • Add MeasureOutputSilenceRegion and MeasureOutputSpeechRegion functions capturing 20+ metrics per region
  • Expand report tables to display comprehensive three-column comparisons across all measurement fields
  • Fix ebur128 timestamp discontinuity, missing metadata flags, and peak unit conversion issues
  • Add debug logging infrastructure for measurement function tracing

Problem

The processing pipeline had no visibility into how elected silence and speech regions evolved through the four-pass architecture. This made it impossible to:

  • Validate that noise reduction actually improved silence regions
  • Track how compression and limiting affected speech characteristics
  • Diagnose measurement anomalies in region analysis
  • Tune adaptive parameters based on inter-pass comparisons

Additionally, several bugs caused incorrect or missing measurements:

  • ebur128 sliding window calculations failed due to timestamp discontinuities after atrim
  • Speech region ebur128 lacked metadata=1 flag, producing no frame metadata
  • Peak values displayed as linear ratios instead of dB

Solution

Phase 1-2: Upgraded data structures and implemented full-spectrum measurement functions using astats + aspectralstats + ebur128 filter chains.

Phase 3-4: Wired measurement capture into Pass 2 (filtering) and Pass 4 (normalisation) with non-fatal error handling.
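A minimal sketch of what non-fatal measurement capture could look like. The `Region` and `Metrics` types and the `captureRegion` helper are hypothetical stand-ins, not the project's actual API; the point is only that a missing region or a failed measurement produces a warning and a nil result instead of aborting the pass.

```go
package main

import (
	"errors"
	"fmt"
)

// Region and Metrics are hypothetical stand-ins for the project's
// elected-region and measurement types.
type Region struct{ Start, End float64 }
type Metrics struct{ RMSLevel float64 }

// captureRegion measures one elected region. It returns nil, never an error,
// when the region is absent or the measurement fails, so the caller can carry
// on processing and simply leave the report column empty.
func captureRegion(label string, region *Region, measure func(Region) (*Metrics, error)) *Metrics {
	if region == nil {
		return nil // nothing was elected, so there is nothing to measure
	}
	m, err := measure(*region)
	if err != nil {
		fmt.Printf("warning: %s region measurement failed: %v\n", label, err)
		return nil
	}
	return m
}

func main() {
	failing := func(Region) (*Metrics, error) { return nil, errors.New("ffmpeg exited early") }
	working := func(Region) (*Metrics, error) { return &Metrics{RMSLevel: -62.4}, nil }

	elected := &Region{Start: 12.5, End: 14.0}
	fmt.Println(captureRegion("silence", nil, working) == nil)
	fmt.Println(captureRegion("silence", elected, failing) == nil)
	if m := captureRegion("speech", elected, working); m != nil {
		fmt.Printf("speech RMS: %.1f dB\n", m.RMSLevel)
	}
}
```

Returning nil rather than an error keeps the decision local: the report code only has to check for a nil sample when rendering a column.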

Phase 5: Expanded report tables to show Input/Filtered/Final columns for all 20+ metrics including amplitude, spectral, and loudness characteristics.

Bug fixes:

  • Added asetpts=PTS-STARTPTS after atrim to reset timestamps for ebur128
  • Added metadata=1:peak=sample+true to speech region ebur128 filter spec
  • Added linearRatioToDB() conversion for TruePeak and SamplePeak values
  • Improved edge case display (digital silence, infinite reduction, undefined spectral)
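The first two fixes can be sketched as a filter chain builder. The function name `buildRegionFilterSpec` and its parameters are assumptions made for illustration; only `asetpts=PTS-STARTPTS` and `metadata=1:peak=sample+true` are quoted from this PR, and the `astats` and `aspectralstats` options shown are a plausible minimum rather than the project's actual settings.

```go
package main

import (
	"fmt"
	"strings"
)

// buildRegionFilterSpec returns an FFmpeg audio filter chain that isolates
// the region [start, end] (seconds) and measures it. Hypothetical helper:
// the real measurement functions may pass additional options.
func buildRegionFilterSpec(start, end float64) string {
	filters := []string{
		// Cut the elected region out of the stream.
		fmt.Sprintf("atrim=start=%.3f:end=%.3f", start, end),
		// atrim keeps the original timestamps; reset them so the first
		// frame starts at zero before ebur128 sees the stream.
		"asetpts=PTS-STARTPTS",
		// Amplitude statistics, exported as frame metadata.
		"astats=metadata=1",
		// Spectral statistics (defaults assumed here).
		"aspectralstats",
		// Loudness plus sample and true peak, exported as frame metadata.
		"ebur128=metadata=1:peak=sample+true",
	}
	return strings.Join(filters, ",")
}

func main() {
	fmt.Println(buildRegionFilterSpec(12.5, 14.0))
}
```

Placing `asetpts` directly after `atrim` means every downstream measurement filter sees the same zero-based timeline.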

Testing

  • Added comprehensive unit tests for both measurement functions with synthetic audio generation
  • Extended processor tests to verify SilenceSample and SpeechSample population
  • Added table formatting tests for new report helpers
  • Manual verification with real podcast recordings

Summary by cubic

Adds three-stage region measurement (Input → Filtered → Final) for elected silence and speech, and expands reports to compare 20+ metrics per region. Improves accuracy of loudness and peak data and adds debug logging for measurement tracing.

  • New Features

    • Capture full metrics for elected regions via MeasureOutputSilenceRegion and MeasureOutputSpeechRegion (astats + aspectralstats + ebur128), recorded in Pass 2 and Pass 4.
    • Store rich region data in OutputMeasurements (SilenceSample, SpeechSample) with 20+ fields for amplitude, spectral, and loudness.
    • Expand Noise Floor and Speech Region tables to three columns with better formatting (dB/LUFS floors, digital silence, “Character” row).
    • Add formatting helpers and processor.DebugLog; include unit and processor tests for new paths.
  • Bug Fixes

    • Reset timestamps after atrim (asetpts=PTS-STARTPTS) to fix ebur128 sliding window discontinuities.
    • Enable ebur128 metadata and true/sample peak reporting (metadata=1:peak=sample+true).
    • Convert ebur128 peak ratios to dB (linearRatioToDB) and handle digital silence/undefined spectral values in reports.
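For reference, converting a linear amplitude ratio to decibels is the standard `20 * log10(ratio)`. The body below is a sketch of what a helper named `linearRatioToDB` might do; the handling of zero and negative input is an assumption, not taken from the PR.

```go
package main

import (
	"fmt"
	"math"
)

// linearRatioToDB converts a linear amplitude ratio (1.0 = full scale) to dB.
// Assumption: non-positive ratios are treated as digital silence and map to
// negative infinity, leaving it to the report layer to render a floor value.
func linearRatioToDB(ratio float64) float64 {
	if ratio <= 0 {
		return math.Inf(-1)
	}
	return 20 * math.Log10(ratio)
}

func main() {
	fmt.Printf("%.2f dB\n", linearRatioToDB(1.0)) // 0.00 dB
	fmt.Printf("%.2f dB\n", linearRatioToDB(0.5)) // -6.02 dB
}
```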

Written for commit 07c7ae6. Summary will update on new commits.

…trics

- Change the OutputMeasurements.SilenceSample type from SilenceAnalysis
(6 fields) to SilenceCandidateMetrics (20+ fields)
- Add OutputMeasurements.SpeechSample field for future speech region
tracking
- Update MeasureOutputSilenceRegion to return SilenceCandidateMetrics
- Adjust report helpers to use new metric types

Enables comprehensive silence/speech region comparison across processing
passes. Phase 2 will expand the filter graph to capture full spectral and
loudness measurements.

Signed-off-by: Martin Wimpress <martin@wimpress.org>
…tages

Adds MeasureOutputSilenceRegion and MeasureOutputSpeechRegion functions
to capture full audio metrics (amplitude, spectral, loudness) for
elected regions at different pipeline stages. Enables tracking how
silence and speech candidates evolve through the four-pass processing
pipeline for quality validation and future inter-pass adaptive tuning.

- Extend MeasureOutputSilenceRegion to capture all 20+ metrics (RMS,
peak, crest, entropy, spectral characteristics, LUFS, true peak)
- Add MeasureOutputSpeechRegion with identical measurement capabilities
- Add comprehensive unit tests for both functions with synthetic audio
generation
- Use astats + aspectralstats + ebur128 filter chain for complete
analysis

Related to Phase 1 (commit 61ea87d) which established data structures.
Phase 3-4 will wire these functions into the processing pipeline.

Signed-off-by: Martin Wimpress <martin@wimpress.org>
…g stage

Integrates MeasureOutputSilenceRegion and MeasureOutputSpeechRegion into
the Pass 2 pipeline to automatically capture filtered audio metrics for
elected regions. Enables tracking how the adaptive filter chain affects
silence and speech candidates, providing foundation for quality
validation and future inter-pass tuning.

- Call MeasureOutputSilenceRegion after processWithFilters completes
- Add MeasureOutputSpeechRegion call to capture speech region metrics
- Use non-fatal error handling to prevent measurement failures from
  blocking processing
- Add nil-safety checks for NoiseProfile and SpeechProfile before
  extraction
- Update processor tests to verify both SilenceSample and SpeechSample
  population

Signed-off-by: Martin Wimpress <martin@wimpress.org>
…ation stage

Extends Phase 4 measurement recapture to include silence and speech
region metrics after loudnorm processing completes. Both measurements
run after the final file rename so that they measure the actual
normalised output. Enables complete three-stage comparison
(Input → Filtered → Final) for quality validation and future adaptive
tuning across all processing passes.

Signed-off-by: Martin Wimpress <martin@wimpress.org>
Enhances noise floor and speech region tables to display comprehensive
three-column comparisons (Input → Filtered → Final) with all 20+
measurement fields. Enables visual validation of how the complete
processing pipeline affects elected regions across amplitude, spectral,
and loudness characteristics. Completes Phase 5 and the entire
measurement recapture feature.

Signed-off-by: Martin Wimpress <martin@wimpress.org>
Fixes ebur128 timestamp discontinuity, missing metadata flags, and peak
unit conversion issues that caused incorrect or missing measurements in
Pass 2/4 region capture. Improves report display for edge cases.
Bug fixes:
- Add asetpts=PTS-STARTPTS after atrim to reset timestamps for ebur128
  sliding window calculations in both measurement functions
- Add metadata=1 flag to speech region ebur128 filter spec to enable
  frame metadata output
- Convert TruePeak and SamplePeak from linear ratios to dB using
  linearRatioToDB() in both measurement functions
Display improvements:
- Add formatMetricDB(), formatMetricLUFS(), formatMetricSpectral()
  helpers for consistent value formatting
- Show '< -120 dBFS' instead of '-' for digital silence regions
- Show '> N dB' instead of '++Inf' for noise reduction values
- Show 'n/a' for undefined spectral values in silent regions
- Add Character subsection to Speech Region Analysis table
Debug logging:
- Add DebugLog package variable with routing to --debug log file
- Add function entry and summary logging to measurement functions

Signed-off-by: Martin Wimpress <martin@wimpress.org>
@cubic-dev-ai cubic-dev-ai bot left a comment

4 issues found across 10 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="internal/processor/analyzer.go">

<violation number="1" location="internal/processor/analyzer.go:3730">
P3: Comment says "for silence" but this is in the speech measurement function. Appears to be a copy-paste error.</violation>
</file>

<file name="internal/logging/table.go">

<violation number="1" location="internal/logging/table.go:185">
P2: Missing positive infinity check in `formatMetricLUFS`. Unlike the similar `formatMetricDB` function which handles `math.IsInf(value, 1)`, this function would format `+Inf` as a string instead of returning `MissingValue`. Add the infinity check for consistency and defensive programming.</violation>
</file>

<file name="internal/processor/processor_test.go">

<violation number="1" location="internal/processor/processor_test.go:86">
P2: Same nil pointer dereference risk as above. Add a nil guard for `result.Measurements` before accessing `SpeechProfile`.</violation>
</file>

<file name="internal/processor/normalise.go">

<violation number="1" location="internal/processor/normalise.go:679">
P1: Potential nil pointer dereference: `inputMeasurements` is a pointer that could be nil. Check `inputMeasurements != nil` before accessing its fields to prevent a panic.</violation>
</file>

- Add nil guards for inputMeasurements in normalise.go before accessing
  NoiseProfile and SpeechProfile
- Add +Inf check in formatMetricLUFS for consistency with formatMetricDB
- Add nil guards for result.Measurements in processor_test.go (4 locations)
- Fix copy-paste comment error in analyzer.go

Addresses review feedback from PR #5
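A rough sketch of the `+Inf` guard described in the review, under stated assumptions: the `MissingValue` string, the `lufsFloor` constant, and the exact output format are all hypothetical, since the PR names `formatMetricLUFS` and `MissingValue` but does not show their definitions.

```go
package main

import (
	"fmt"
	"math"
)

// MissingValue is the placeholder for values that cannot be shown.
// Assumed to be "-" here; the real constant may differ.
const MissingValue = "-"

// lufsFloor is an assumed display floor. -70 LUFS is the absolute gating
// threshold of EBU R128, which makes it a natural lower bound.
const lufsFloor = -70.0

// formatMetricLUFS renders a loudness value for a report cell.
func formatMetricLUFS(value float64) string {
	switch {
	case math.IsInf(value, 1):
		// The case flagged in review: +Inf is not a meaningful loudness.
		return MissingValue
	case math.IsInf(value, -1) || value < lufsFloor:
		// Digital silence or anything below the floor.
		return fmt.Sprintf("< %.0f LUFS", lufsFloor)
	default:
		return fmt.Sprintf("%.1f LUFS", value)
	}
}

func main() {
	fmt.Println(formatMetricLUFS(-16.04))
	fmt.Println(formatMetricLUFS(math.Inf(1)))
	fmt.Println(formatMetricLUFS(math.Inf(-1)))
}
```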
@flexiondotorg flexiondotorg merged commit 0c539db into main Jan 16, 2026
5 checks passed
@flexiondotorg flexiondotorg deleted the measurements branch January 16, 2026 18:17