v0.0.4

@MagicalTux MagicalTux released this 05 May 06:14
· 79 commits to master since this release
c0a81ed

Other

  • land §5.2.5.2.2 Heuristic Scaling (round 33)
  • env_prev tracking + walker state hoisting (round 32)
  • land §5.2.3-5.2.7 PCM synthesis chain (round 31)
  • land Tables 43-46 bitstream walker (round 30)
  • land Annex C tables + arithmetic decoder core (round 29)

Added

  • Round 33 — §5.2.5.2.2 Heuristic Scaling (Pseudocodes 27/28/29/30)
    (TS 103 190-1 §5.2.5.2.0 selector + §5.2.5.2.2):

    • New map_db_to_lin_q10() (Pseudocode 29) and map_lin_to_db_q10()
      (Pseudocode 30) — Q.10 fixed-point dB↔linear converters using the
      Annex C.14 SLOPES_DB_TO_LIN / OFFSETS_DB_TO_LIN /
      SLOPES_LIN_TO_DB / OFFSETS_LIN_TO_DB LUTs (already shipped in
      ssf_tables). Out-of-range inputs clamp to the spec's 100 << 10
      (dB→lin) and 40 << 10 (lin→dB) ceilings.
    • New heuristic_scaling() (Pseudocode 28) implements the full
      HeuristicScaling(iRfu, env_in, ...) -> int_weights_dB[] chain:
      dynamic-range compression on env_in[] when the spread exceeds
      the 40 Q.10 threshold, sort-descending of env_local[],
      Map_dB_to_Lin per band, weighted sum scaled by iRfu², reverse
      water-filling to find iTCurrLev, and a final per-band
      Map_Lin_to_dB(iTCurrLev) - env_local[band] weight that's clamped
      to [0, 15 << 10].
    • New apply_heuristic_scaling() (Pseudocode 27) wraps Pseudocode 28
      with the env_in = 3 * env_alloc pre-multiply, the LF-boost
      threshold (i_w_dB[0] knocked down by 3), and the
      env_alloc_mod = (env_alloc - i_w_dB).clamp(ENV_MIN, ENV_MAX) +
      f_gain_q = pow(10, 1.5 / 20 * f_w_dB) post-processing. Returns
      (env_alloc_mod[band], f_gain_q[band]).
    • synthesize_granule() now dispatches the §5.2.5.2.0 selector:
      when f_rfu > 0 && !variance_preserving and the SSF bandwidths
      table is available, the heuristic-scaling branch fires and
      inverse_heuristic_scale() consumes the resulting f_gain_q[]
      instead of the previous all-1 stub. A variance_preserving
      block skips the inverse-scale call, as §5.2.5.2.0 step 5
      requires. Before r33 the synth silently bailed on
      f_pred_gain != 0 blocks; now they decode all the way through.
    • 11 new lib unit tests (470 → 481):
      map_db_to_lin_zero_input / map_db_to_lin_out_of_range_clamps /
      map_db_to_lin_monotone_within_table /
      map_lin_to_db_zero_input / map_lin_to_db_out_of_range_clamps /
      heuristic_scaling_zero_envelope_yields_zero_weights /
      heuristic_scaling_clamps_to_max /
      apply_heuristic_scaling_short_circuits_on_empty /
      apply_heuristic_scaling_clamps_env_alloc_mod /
      synthesize_granule_runs_with_heuristic_scaling_branch /
      synthesize_granule_variance_preserving_skips_heuristic.
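The Round 33 post-processing step (the env_alloc_mod clamp and the f_gain_q power law quoted above) can be sketched as follows. This is a minimal illustration, not the crate's apply_heuristic_scaling(): the function name, the ENV_MIN / ENV_MAX values (borrowed from the [-64, 63] clamp mentioned in Round 31), and the way the Q.10 weight is turned into f_w_dB and i_w_dB are all assumptions. The env_in = 3 * env_alloc pre-multiply and the LF-boost adjustment of i_w_dB[0] are not modeled.

```rust
/// Assumed bounds: the notes only name ENV_MIN / ENV_MAX, the values here
/// are borrowed from the [-64, 63] allocation clamp and may differ.
const ENV_MIN: i32 = -64;
const ENV_MAX: i32 = 63;

/// Hypothetical helper: turns per-band Q.10 dB weights into
/// (env_alloc_mod, f_gain_q) using the two formulas from the release notes.
fn apply_post_processing_sketch(
    env_alloc: &[i32],
    weights_q10: &[i32],
) -> (Vec<i32>, Vec<f32>) {
    let mut env_alloc_mod = Vec::with_capacity(env_alloc.len());
    let mut f_gain_q = Vec::with_capacity(env_alloc.len());
    for (&alloc, &w_q10) in env_alloc.iter().zip(weights_q10) {
        // Weights are clamped to [0, 15 << 10] as in the notes.
        let w_q10 = w_q10.clamp(0, 15 << 10);
        // Assumption: f_w_dB is the Q.10 weight as a float,
        // i_w_dB is that value rounded to an integer.
        let f_w_db = w_q10 as f32 / 1024.0;
        let i_w_db = f_w_db.round() as i32;
        // env_alloc_mod = (env_alloc - i_w_dB).clamp(ENV_MIN, ENV_MAX)
        env_alloc_mod.push((alloc - i_w_db).clamp(ENV_MIN, ENV_MAX));
        // f_gain_q = pow(10, 1.5 / 20 * f_w_dB)
        f_gain_q.push(10f32.powf(1.5 / 20.0 * f_w_db));
    }
    (env_alloc_mod, f_gain_q)
}
```

With a zero weight the band passes through unchanged (env_alloc_mod equals env_alloc and the gain is 1), which matches the no-rfu behaviour described for Round 31.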
  • Round 32 — SSF SHORT_STRIDE env_prev tracking + walker state
    hoisting
    (TS 103 190-1 §5.2.3.0 Note 2, §5.2.3.0b Pseudocode 4b,
    §4.3.7.4.2 Pseudocodes 54-57):

    • SsfSynthState gains a new env_prev: Vec<i32> field that
      synthesize_granule() latches at the end of each granule with the
      resolved envelope (post-decode_envelope δ-chain), not the raw
      delta symbols. SHORT_STRIDE P-granules now use this latch as the
      env_prev[] interpolation input when the caller doesn't supply
      one — interpolate_envelope no longer degrades to a flat-zero
      envelope across frame boundaries on real P-frame streams.
      Ac4Decoder::run_ssf_channel drops its zero-vector
      state_idx_env_prev stub and passes an empty slice; the synth
      pulls from state.env_prev automatically.
    • Ac4Decoder adopts Vec<SsfChannelState> (one per channel,
      grown on demand), stored as ssf_walker_state. New
      walk_ac4_substream_stateful() and the matching
      _stateful variants of parse_mono_audio_data_outer,
      parse_stereo_audio_data_outer, parse_stereo_data_body, and
      parse_aspx_acpl1_mdct_body thread an
      Option<&mut [SsfChannelState]> through the SSF body parses so
      the walker's dither / noise RNGs (Pseudocodes 54-57) and
      prev_pred_lag_idx / last_num_bands / env_prev (raw symbol
      snapshot) persist across frames. The original public functions
      keep their pre-r32 signatures and delegate with None so the
      sibling repos / test fixtures stay binary-compatible.
    • 4 new lib unit tests (466 → 470):
      synthesize_granule_latches_env_prev (verifies state.env_prev
      holds the post-decode_envelope chain after each granule),
      short_stride_p_frame_uses_state_env_prev (proves a P-granule
      interpolates against the latched I-granule envelope, not zero),
      synthesize_ssf_data_chains_env_prev_across_granules (two-granule
      end-to-end), and
      walk_ac4_substream_stateful_persists_ssf_walker_state
      (round-trip through the substream walker leaves the channel-0
      state's last_num_bands / last_n_mdct / env_prev populated).
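The Round 32 latch behaviour can be illustrated with a small sketch. It assumes a hypothetical SynthStateSketch type and a plain integer linear blend between env_prev[] and the current envelope; the real interpolate_envelope follows Pseudocode 4b's fixed-point rule, which is not reproduced here. The point is only the fallback: an empty caller-supplied slice makes the synth pull from the state, and the resolved envelope is latched at the end of the granule.

```rust
/// Hypothetical stand-in for the per-channel synth state.
#[derive(Default)]
struct SynthStateSketch {
    /// Resolved envelope of the previous granule (empty before the first one).
    env_prev: Vec<i32>,
}

/// Returns one interpolated envelope per block and latches `env_curr`
/// into the state afterwards.
fn synthesize_granule_sketch(
    state: &mut SynthStateSketch,
    supplied_prev: &[i32],
    env_curr: &[i32],
    num_blocks: usize,
) -> Vec<Vec<i32>> {
    // An empty slice from the caller means "use the latched envelope".
    let prev: Vec<i32> = if supplied_prev.is_empty() {
        state.env_prev.clone()
    } else {
        supplied_prev.to_vec()
    };
    let n = num_blocks as i32;
    let blocks: Vec<Vec<i32>> = (1..=n)
        .map(|b| {
            env_curr
                .iter()
                .enumerate()
                .map(|(band, &cur)| {
                    // Bands missing from the previous envelope count as 0.
                    let p = prev.get(band).copied().unwrap_or(0);
                    // Simplified linear blend, not the spec's fixed-point form.
                    p + (cur - p) * b / n
                })
                .collect()
        })
        .collect();
    // Latch the resolved envelope for the next granule.
    state.env_prev = env_curr.to_vec();
    blocks
}
```

Calling the sketch twice with an empty supplied_prev shows the second granule interpolating against the first granule's envelope rather than against zero.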
  • Round 31 — Speech Spectral Frontend (SSF) PCM synthesis chain (TS
    103 190-1 §5.2.3 / §5.2.4 / §5.2.5 / §5.2.6 / §5.2.7 +
    §5.2.8.1 — Pseudocodes 4a / 4b / 4c / 4d / 4e / 26 / 31 / 32 / 33 /
    34 / 35 / 36 / 37 / 38 / 39):

    • New ssf_synth module turning the per-block indices on
      crate::ssf::SsfData into n_mdct spectral lines per block.
      Functions: decode_envelope (Pseudocode 4a — δ-decode chain over
      env_curr[]), interpolate_envelope (Pseudocode 4b — SHORT_STRIDE
      fixed-point linear interpolation between env_prev[] and the
      current granule's env[]), decode_gains (Pseudocode 4c —
      pow(10, gain_idx * 0.1) per block, LONG_STRIDE clamp to 1.0),
      refine_envelope (Pseudocode 4d — band-≥2 gain application + the
      round(2 * gain_idx / 3) allocation tweak with [-64, 63] clamp).
    • decode_predictor (Pseudocode 4e) reconstructs f_pred_gain from
      PRED_GAIN_QUANT_TAB and f_pred_lag = 640 * 2^((idx - 509)/170),
      with i_prev_pred_lag_idx carried forward on SsfSynthState.
    • compute_helpers (Pseudocode 26 — f_rfu,
      i_alloc_dithering_threshold, adaptive noise gains) +
      build_alloc_table (Pseudocode 31 — no-rfu path:
      env_alloc_mod = env_alloc).
    • inverse_quantize_block (Pseudocode 32) implements all three
      branches: i_alloc == 0 noise-RNG path with the
      variance-preserving band > 1 branch, dithered branch via
      Idx2Reconstruction + POST_GAIN_LUT[i_alloc - 1] + the
      f_post_gain_var_pres = sqrt(post_gain) * f_adaptive_noise_gain_var_pres
      rule, and the no-dither MMSE branch via mmse_laplace
      (Pseudocode 33).
    • inverse_heuristic_scale (Pseudocode 34) — currently a no-op
      because the no-rfu path leaves f_gain_q == 1.
    • build_c_matrix (Pseudocode 39) reconstructs the per-tab_idx
      (2*Rf+1, 65, Rt) prediction-coefficient matrix from the
      quantized bytes in crate::ssf_pred_coeff using the
      1.1787855 * (q - 146) / 128 reconstruction formula and the
      spec's s = (-1)^(k+1) mirror rule for negative-η.
    • SubbandPredictorState::run (Pseudocodes 35 / 36 / 37) maintains
      f_spec_buffer[NUM_SPEC_BUF=5] + f_env_buffer[NUM_ENV_BUF=4]
      histories, runs the model-based extractor (f_period, k_s,
      tab_idx, Z-matrix even-reflection, the per-bin
      Σ_{ν,k} s * C[ν][f][k] * Z[bin+ν][k] summation), then applies
      Pseudocode 37's per-band shaper (f_envelope * f_pred_gain)
      with the I-frame integer_lag = 0 clamp.
    • inverse_flatten (Pseudocode 38) sums f_spec_res + f_spec_pred
      and multiplies by the per-band signal envelope.
    • synthesize_granule() runs the chain across every block in one
      granule; synthesize_ssf_data() runs it across both granules of
      one frame, threading env_prev[] between them.
    • Ac4Decoder adopts Vec<SsfSynthState> (one per channel) and
      consumes tools.ssf_data_primary / tools.ssf_data_secondary
      after the existing ASF/A-CPL pipeline: each granule's
      num_blocks * n_mdct spectrum is split per-block, fed into the
      per-channel KBD-windowed IMDCT + overlap-add, then truncated /
      padded to the frame's sample count. SSF substreams now emit real
      PCM instead of silence.
    • 16 new lib unit tests (450 → 466) covering: empty/empty-tail
      envelope decode, low-band-no-gain refinement, allocation-table
      clamping (min + max), zero-RFU + unit-window helpers, predictor
      presence/absence + delta-lag carry, full C-matrix dimensions
      across all 37 tab_idx values, the negative-η mirror rule, the
      subband-predictor zero-gain pass-through and finite-output smoke
      tests, the per-band envelope-gain inverse-flattening test, and a
      LONG_STRIDE I-granule synthesis end-to-end smoke (synthesize_granule
      on a synthetic granule). Plus one decoder-level integration test
      (ssf_synth_long_stride_iframe_end_to_end) that walks a synthetic
      SSF bitstream through parse_ssf_data then synthesize_ssf_data
      and verifies finiteness + zero-padding past num_bins.
    • Note: §5.2.5.2.2 Pseudocodes 27 / 28 / 29 / 30 (full Heuristic
      Scaling) are deferred — when f_pred_gain == 0 the spec
      short-circuits to env_alloc_mod = env_alloc + f_gain_q = 1,
      which is the no-rfu path landed here. Synthesis of streams that
      enable the predictor across many bands at once with
      variance_preserving == 0 will lose the heuristic envelope
      spreading until a follow-up round.
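Three of the closed-form reconstructions quoted in the Round 31 notes are easy to check in isolation. The sketch below uses hypothetical function names and assumes that the exponent in the lag formula is a real-valued division and that the quantized coefficient is a single byte; the LONG_STRIDE clamp and the PRED_GAIN_QUANT_TAB lookup are left out.

```rust
/// Per-block gain: pow(10, gain_idx * 0.1).
fn decode_gain_sketch(gain_idx: i32) -> f32 {
    10f32.powf(gain_idx as f32 * 0.1)
}

/// Predictor lag: 640 * 2^((idx - 509) / 170), with real-valued division
/// assumed in the exponent.
fn decode_pred_lag_sketch(lag_idx: i32) -> f32 {
    640.0 * 2f32.powf((lag_idx - 509) as f32 / 170.0)
}

/// Prediction-coefficient reconstruction: 1.1787855 * (q - 146) / 128.
fn dequantize_coeff_sketch(q: u8) -> f32 {
    1.1787855 * (q as f32 - 146.0) / 128.0
}
```

Index 509 maps to a lag of 640 and every further 170 index steps double it, while a quantized byte of 146 reconstructs to a zero coefficient.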
  • Round 30 — Speech Spectral Frontend (SSF) bitstream walker (TS 103
    190-1 §4.2.9 / §4.3.7 + §4.3.7.5 + Tables 43-46 / 111-113):

    • New ssf module with the four-table walker family:
      parse_ssf_data (Table 43 — b_ssf_iframe gate plus 1 / 2
      granules per frame_length >= 1536), parse_ssf_granule
      (Table 44 — stride_flag, I-frame num_bands_minus12, per-block
      predictor_presence_flag / delta_flag loop), parse_ssf_st_data
      (Table 45 — env_curr_band0_bits, I-frame
      env_startup_band0_bits, per-block gain_bits /
      predictor_lag(_delta)_bits / variance_preserving_flag /
      alloc_offset_bits), and parse_ssf_ac_data (Table 46 — drives
      decode_envelope_indices Pseudocode 48 + decode_predictor_gain
      Pseudocode 49 + decode_coefficient_indices Pseudocode 50, then
      AcDecodeFinish Pseudocode 47 termination-bit accounting).
    • New SsfFrameConfig derives
      (granule_length, num_granules, max_num_blocks) per Tables 112-113
      with both from_toc(fs_index, frame_rate_index, frame_length) and a
      from_frame_len_base() 48 kHz convenience overload. SHORT_STRIDE
      is rejected when max_num_blocks < 1.
    • New SSF_BANDWIDTHS matrix transcribes Annex C.1 verbatim
      (19 bands × 8 block-length columns); SsfBinLayout::build()
      implements §4.3.7.5 Pseudocode 7 to derive start_bin[] /
      end_bin[] / num_bins from (num_bands, n_mdct).
    • New SsfChannelState carries forward dither / noise RNG state
      (reset per SSF-I-frame per Pseudocode 55), prev_pred_lag_idx
      (§5.2.4.0a), last_num_bands / last_n_mdct (inheritance for
      P-frame granules), and env_prev[] (§5.2.3.0).
    • Wired into asf::walk_ac4_substream for three call sites:
      parse_mono_audio_data_outer (mono SIMPLE / ASPX path —
      spec_frontend == SSF no longer returns Unsupported),
      the split-MDCT stereo parse_stereo_data_body (per-L/R SSF
      selection), and parse_aspx_acpl1_mdct_body split case (per-M/S
      SSF selection on the ACPL_1 residual layer). Parsed payload lands
      on SubstreamTools::ssf_data_primary /
      ssf_data_secondary slots.
    • Bug fix: decode_envelope_indices and decode_predictor_gain
      in ssf_ac previously capped at symbol 31
      (AcDecodeSymbolExtCdf(cdf, 0, 31)); the spec's Pseudocodes
      48 / 49 use (cdf, 0, 32).
      The 33-entry CDF tables (ENVELOPE_CDF_LUT,
      PREDICTOR_GAIN_CDF_LUT) supply 32 symbol slots — the previous
      cap clipped the highest-probability tail symbol.
    • 6 new lib unit tests + 1 new integration test in asf::tests
      (449 → 463 total): stride-flag block count, Annex C.1 anchors,
      SsfBinLayout::build for 48 kHz / 24 fps LongStride, frame-config
      resolution for all five Table 112-113 row classes, end-to-end
      LongStride I-frame walk, end-to-end ShortStride I-frame walk
      (3 live blocks, env_startup populated), and a substream-level
      integration test (mono_ssf_substream_walker_populates_ssf_data)
      that builds a synthetic SSF substream + walks it through the
      public walk_ac4_substream API.
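The off-by-one behind the Round 30 CDF bug fix can be shown with a toy lookup. This is not the arithmetic decoder: symbol_for and uniform_cdf are hypothetical helpers, the table is a made-up uniform CDF, and the upper bound is assumed to be exclusive. It only demonstrates that a 33-entry CDF defines 32 intervals, so a search bounded at 31 can never return the last symbol.

```rust
/// Hypothetical lookup: finds the symbol `s` in `lo..hi` whose CDF interval
/// `[cdf[s], cdf[s + 1])` contains `target`. `hi` is exclusive.
fn symbol_for(cdf: &[u32], target: u32, lo: usize, hi: usize) -> Option<usize> {
    // A table with N + 1 entries describes N symbol slots.
    let slots = cdf.len().saturating_sub(1);
    (lo..hi.min(slots)).find(|&s| cdf[s] <= target && target < cdf[s + 1])
}

/// Made-up uniform CDF with `slots + 1` entries, for illustration only.
fn uniform_cdf(slots: u32, step: u32) -> Vec<u32> {
    (0..=slots).map(|i| i * step).collect()
}
```

With a 33-entry table, a target that falls in the last interval resolves to symbol 31 when the bound is 32 and finds nothing when the bound is 31.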