Release v0.8.1
v0.8.1 — stripwise SSIMULACRA2 + perf
Added
- `compute_ssimulacra2_strip` and `Ssimulacra2Reference::compare_strip`: strip-wise SSIMULACRA2 with bounded peak memory for very large images. Processes the image in horizontal strips (default 32-aligned, halo of 96 rows for IIR Gaussian convergence) and accumulates per-scale SSIM and edge-diff sums across strips. Scores match the full-image path to within ~1e-5 on the 0..100 scale at 1024² and larger; identical-image inputs round-trip to 100 in both modes. Bounds dist-side peak memory to ~`24 * strip_h * width * 4` B instead of ~`24 * height * width * 4` B.
- `Ssimulacra2StripConfig`: configurable halo size and underlying SIMD backend for strip processing. Defaults are tuned for atomic-tolerance parity with the full path.
- `HALO_ROWS_DEFAULT` (96) and `MIN_STRIP_HEIGHT` (8) public constants.
- Hidden `Ssimulacra2Reference::scale_planes` accessor returning a `ScalePlanesView`; required by the strip walker.
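To make the strip-plus-halo idea and the memory estimate above concrete, here is a minimal sketch. It is not the crate's strip walker: `Strip`, `plan_strips`, and `approx_peak_bytes` are hypothetical names, the factor of 24 planes is taken from the estimate quoted above, and multi-scale downsampling and 32-row alignment are ignored.

```rust
/// Approximate number of f32 working planes, taken from the estimate above.
const PLANES: usize = 24;
const BYTES_PER_SAMPLE: usize = 4;

/// One strip (hypothetical type): rows `core_start..core_end` contribute to
/// the accumulated sums, rows `read_start..read_end` are read so the blur
/// has enough context on both sides.
#[derive(Debug, PartialEq)]
struct Strip {
    read_start: usize,
    core_start: usize,
    core_end: usize,
    read_end: usize,
}

/// Split `height` rows into strips of at most `strip_h` core rows,
/// each padded by up to `halo` rows and clamped to the image bounds.
fn plan_strips(height: usize, strip_h: usize, halo: usize) -> Vec<Strip> {
    assert!(strip_h > 0, "strip height must be non-zero");
    let mut strips = Vec::new();
    let mut core_start = 0;
    while core_start < height {
        let core_end = (core_start + strip_h).min(height);
        strips.push(Strip {
            read_start: core_start.saturating_sub(halo),
            core_start,
            core_end,
            read_end: (core_end + halo).min(height),
        });
        core_start = core_end;
    }
    strips
}

/// Rough peak working-set size in bytes for `rows` rows of `width` samples.
/// Returns `None` if the product does not fit in `usize`.
fn approx_peak_bytes(rows: usize, width: usize) -> Option<usize> {
    PLANES
        .checked_mul(rows)?
        .checked_mul(width)?
        .checked_mul(BYTES_PER_SAMPLE)
}
```

With these assumptions, `plan_strips(300, 128, 96)` yields three strips, and only the core rows of each strip are counted once in the accumulated sums; the halo rows exist purely to warm up the blur.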
Added (from prior unreleased work)
- `CompareContext` and `Ssimulacra2Reference::compare_with(&mut ctx, distorted)`: zero-allocation batch-comparison API. Pair with `reference.compare_context()`; subsequent calls reuse the working buffers (`mul`, `mu2`, `sigma2_sq`, `sigma12`, `img2_planar`, blur state) instead of allocating ~13 image-sized `Vec<f32>` planes per call. Measured 1.10–1.25× faster than `compare()` on the precompute benchmark at 256x256 / 512x512 / 1024x1024 / 1920x1080.
- `LinearRgbImage::try_new` fallible constructor returning `LinearRgbImageError` for invalid dimensions or data length.
- `Ssimulacra2Error::ImageTooLarge` variant and public `MAX_IMAGE_PIXELS` constant (16384*16384).
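The buffer-reuse pattern behind a context object like this can be sketched in a few lines. This is an illustration only: `Scratch` and `mean_abs_diff` are hypothetical, and a toy mean-absolute-difference metric stands in for the real SSIMULACRA2 computation. The point is that `Vec::clear` keeps the allocation, so repeated calls on same-sized inputs do not allocate again.

```rust
/// Hypothetical scratch space; a real context would hold many more planes.
struct Scratch {
    diff: Vec<f32>,
}

impl Scratch {
    fn new() -> Self {
        Scratch { diff: Vec::new() }
    }
}

/// Toy metric that writes its intermediate plane into reusable scratch space.
fn mean_abs_diff(reference: &[f32], distorted: &[f32], scratch: &mut Scratch) -> f32 {
    assert_eq!(reference.len(), distorted.len(), "planes must match in size");
    // `clear` drops the old contents but keeps the capacity, so the
    // `extend` below only allocates when the input grows.
    scratch.diff.clear();
    scratch
        .diff
        .extend(reference.iter().zip(distorted).map(|(r, d)| (r - d).abs()));
    if scratch.diff.is_empty() {
        return 0.0;
    }
    scratch.diff.iter().sum::<f32>() / scratch.diff.len() as f32
}
```

A caller creates one `Scratch` per batch and passes it to every comparison, mirroring how the context described above is paired with a single reference and reused across distorted images.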
Changed
- Skip per-channel SSIM and edge-difference work whose final-score weight is zero. Bit-identical to the prior path (the dropped contributions multiplied by zero downstream); reference-parity test passes across the C++ corpus including 64x64 cases where `scales_n < NUM_SCALES` makes `score()`'s linear WEIGHT walk shift in the layout. Lossless variant of Technique 2 from Kanetaka et al. IWAIT 2026, DOI 10.1117/12.3100969.
- Hoist the 6 IIR-state vectors used by the SIMD vertical blur pass out of the per-call inner function and onto `SimdGaussian`, eliminating ~180 small `Vec<f32>` allocations per ssim2 frame.
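The zero-weight skip can be illustrated with a small weighted sum. This is a sketch under stated assumptions, not the crate's `score()` code: both function names are hypothetical, and the equivalence shown holds for finite terms accumulated in the same order (a NaN or infinite term multiplied by zero would not vanish).

```rust
/// Baseline: every term is already computed and multiplied by its weight.
fn weighted_sum_all(terms: &[f64], weights: &[f64]) -> f64 {
    let mut total = 0.0;
    for (t, w) in terms.iter().zip(weights) {
        total += t * w;
    }
    total
}

/// Optimized: terms are computed lazily, and a term whose weight is zero
/// is never computed at all. For finite terms this adds exactly the same
/// non-zero contributions in the same order as the baseline.
fn weighted_sum_skipping<F: FnMut(usize) -> f64>(weights: &[f64], mut compute_term: F) -> f64 {
    let mut total = 0.0;
    for (i, &w) in weights.iter().enumerate() {
        if w == 0.0 {
            continue; // the expensive term would be multiplied by zero anyway
        }
        total += compute_term(i) * w;
    }
    total
}
```

In the real metric the saving comes from `compute_term` being expensive (blurs and per-pixel reductions), so skipping it for zero-weight entries removes work without changing the result.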
Fixed
- `LinearRgbImage::new` now validates dimensions and data length at runtime (was `debug_assert_eq!` only) so release-mode misuse no longer constructs malformed images that panic deep in `From<LinearRgbImage> for yuvxyb::LinearRgb`.
- `SimdGaussian::new` no longer eagerly allocates `max_width * 4096` floats; the temp buffer grows on demand. Also guards against `usize` overflow on 32-bit targets when `width * height` would wrap.
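The kind of runtime validation described above might look roughly like the following. This is a hedged sketch, not the crate's implementation: `Image`, `ImageError`, `MAX_PIXELS`, and `try_new` here are hypothetical stand-ins, and treating a zero dimension as an error is an assumption. It shows `checked_mul` catching a `width * height` product that would wrap, plus a length check that runs in release builds as well.

```rust
/// Hypothetical pixel cap, mirroring the 16384*16384 limit quoted above.
const MAX_PIXELS: usize = 16384 * 16384;

#[derive(Debug, PartialEq)]
enum ImageError {
    ZeroDimension,
    TooLarge,
    LengthMismatch { expected: usize, actual: usize },
}

#[derive(Debug)]
struct Image {
    data: Vec<[f32; 3]>,
    width: usize,
    height: usize,
}

/// Validate dimensions and data length before constructing the image.
fn try_new(data: Vec<[f32; 3]>, width: usize, height: usize) -> Result<Image, ImageError> {
    if width == 0 || height == 0 {
        return Err(ImageError::ZeroDimension);
    }
    // `checked_mul` returns None instead of wrapping on overflow,
    // which matters on targets where `usize` is 32 bits.
    let pixels = width.checked_mul(height).ok_or(ImageError::TooLarge)?;
    if pixels > MAX_PIXELS {
        return Err(ImageError::TooLarge);
    }
    if data.len() != pixels {
        return Err(ImageError::LengthMismatch {
            expected: pixels,
            actual: data.len(),
        });
    }
    Ok(Image { data, width, height })
}
```

Returning a `Result` at construction time surfaces the mistake at the call site, instead of letting a malformed image fail later inside a conversion.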
Full Changelog: v0.8.0...v0.8.1