Claude/optimize sensor resolve 1 xpjc by HanSur94 · Pull Request #38 · HanSur94/FastSense

HanSur94 · 2026-03-19T14:45:23Z

No description provided.

toStepFunction was O(n²) due to repeated cell array growth and array concatenation inside the loop. Replace with single-pass pre-allocated output: vectorized active-segment detection, vectorized gap detection, and direct index writes with a final trim.

Also fix the allChanges concatenation in resolve() Step 1: pre-compute total length and fill via block copy instead of growing with [].

https://claude.ai/code/session_01GgjQM4v4dyk378ZHJbBCTJ
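
The "pre-compute total length and fill via block copy" idea can be sketched in C as follows. This is a minimal illustration, not code from the PR: the function name `concat_blocks` and its signature are hypothetical, and it only shows the general two-pass pattern (sum the lengths, allocate once, copy each block to its final offset).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: concatenate `count` arrays of doubles into one
 * freshly allocated buffer. parts[i] points to lens[i] doubles.
 * Returns NULL when the total is zero or allocation fails; caller frees. */
double *concat_blocks(const double *const *parts, const size_t *lens,
                      size_t count, size_t *total_out)
{
    assert(parts != NULL && lens != NULL && total_out != NULL);

    /* Pass 1: sum the lengths so the output is allocated exactly once. */
    size_t total = 0;
    for (size_t i = 0; i < count; ++i)
        total += lens[i];
    *total_out = total;
    if (total == 0)
        return NULL;

    double *out = malloc(total * sizeof *out);
    if (out == NULL)
        return NULL;

    /* Pass 2: block-copy each part to its final offset. */
    size_t offset = 0;
    for (size_t i = 0; i < count; ++i) {
        if (lens[i] > 0)
            memcpy(out + offset, parts[i], lens[i] * sizeof *out);
        offset += lens[i];
    }
    return out;
}
```

Growing the output inside the loop copies the already-accumulated elements on every append, which is where the quadratic cost comes from; the two-pass version touches each element once.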

New MEX file replaces the MATLAB toStepFunction inner loop with a single-pass C implementation: count active segments, pre-allocate output, fill with gap detection, trim once. Eliminates all MATLAB interpreter overhead for this hot path.

- to_step_function_mex.c: C MEX source with pre-allocated buffers
- build_mex.m: register new MEX + copy to SensorThreshold/private
- mergeResolvedByLabel.m: persistent useMex gate dispatches to MEX when compiled, falls back to pure-MATLAB implementation otherwise

https://claude.ai/code/session_01GgjQM4v4dyk378ZHJbBCTJ
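
A rough sketch of the "count, pre-allocate, fill with gap detection, trim once" shape in plain C is given below. It is not the contents of to_step_function_mex.c. The `StepFn` struct, the function name, the input layout (parallel `starts`/`ends`/`values` arrays with NaN marking an inactive segment), and the choice to emit a NaN separator at the previous segment's end are all assumptions made for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>

/* Hypothetical output container: n points (x[i], y[i]). */
typedef struct { double *x; double *y; size_t n; } StepFn;

/* Assumed semantics: a NaN value marks an inactive segment. Each active
 * segment emits (start, value); when it does not begin where the previous
 * active segment ended, a (prevEnd, NaN) separator is emitted first.
 * Returns 0 on success, -1 on allocation failure. */
int to_step_function(const double *starts, const double *ends,
                     const double *values, size_t n, StepFn *out)
{
    assert(starts != NULL && ends != NULL && values != NULL && out != NULL);
    out->x = NULL; out->y = NULL; out->n = 0;

    /* Step 1: count active segments. */
    size_t active = 0;
    for (size_t i = 0; i < n; ++i)
        active += !isnan(values[i]);
    if (active == 0)
        return 0;

    /* Step 2: pre-allocate the worst case (a separator before each point). */
    size_t cap = 2 * active;
    double *x = malloc(cap * sizeof *x);
    double *y = malloc(cap * sizeof *y);
    if (x == NULL || y == NULL) { free(x); free(y); return -1; }

    /* Step 3: single fill pass with gap detection. */
    size_t k = 0;
    int have_prev = 0;
    double prev_end = 0.0;
    for (size_t i = 0; i < n; ++i) {
        if (isnan(values[i]))
            continue;
        if (have_prev && prev_end != starts[i]) {
            x[k] = prev_end; y[k] = NAN; ++k;   /* gap separator */
        }
        x[k] = starts[i]; y[k] = values[i]; ++k;
        prev_end = ends[i];
        have_prev = 1;
    }

    /* Step 4: trim once (k >= 1 here, so the new size is non-zero). */
    double *tx = realloc(x, k * sizeof *x); if (tx != NULL) x = tx;
    double *ty = realloc(y, k * sizeof *y); if (ty != NULL) y = ty;
    out->x = x; out->y = y; out->n = k;
    return 0;
}
```

The worst-case allocation of two slots per active segment trades a little transient memory for never having to grow the buffers during the fill pass.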

Rewrite with platform-specific SIMD for all hot phases:

- Phase 1: NaN scan uses SIMD self-compare (v==v is false for NaN) with branchless conditional-store index collection and an early-exit skip when all lanes are NaN. AVX2 handles 4 doubles per vector; SSE2 and NEON handle 2.
- Phase 2: segEnds shifted copy via SIMD load/store (simd_copy).
- Phase 3: Gap detection gathers prevEnd/currStart into packed buffers, then uses SIMD compare + movemask (AVX2/SSE2) or lane extract (NEON).
- Phase 5: Final trim-to-size copy via simd_copy.

All four backends are supported: AVX2, SSE2, ARM NEON, and a scalar fallback. Uses simd_utils.h indirectly (same include path) and adds its own intrinsics directly for NaN-specific ops not in simd_utils.h.

https://claude.ai/code/session_01GgjQM4v4dyk378ZHJbBCTJ
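
The Phase 1 idea (self-compare to detect NaN, plus a branchless conditional store to collect indices) can be shown in scalar C without any intrinsics. The sketch below is only an illustration of the technique, roughly what a scalar fallback might look like; the function name `collect_non_nan` is hypothetical and the code is not taken from the PR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar sketch: write the indices of all non-NaN entries of
 * v[0..n) into idx and return how many were found. idx must have room for
 * n entries, because every iteration stores to idx[count]. */
size_t collect_non_nan(const double *v, size_t n, size_t *idx)
{
    assert(n == 0 || (v != NULL && idx != NULL));

    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        /* Unconditional store: overwritten next time if v[i] was NaN. */
        idx[count] = i;
        /* v[i] == v[i] is 0 only for NaN, so count advances only for
         * real values. No branch depends on the data. */
        count += (size_t)(v[i] == v[i]);
    }
    return count;
}
```

This relies on IEEE 754 comparison semantics, so it should not be compiled with options such as `-ffast-math` that allow the compiler to assume NaN never occurs.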

Function-based tests (Octave-compatible):

- All NaN, single active, all contiguous, different values
- NaN gap separator, mixed contiguous+gap, dataEnd edge
- Single boundary, MEX parity check (when compiled)

Class-based MEX parity tests (MATLAB unittest):

- Same edge cases as above, plus:
- 20 randomized small trials with ~40% NaN density
- 100K segment stress test exercising full SIMD paths
- 50K all-active (no gaps) test
- 10K all-NaN large test
- 10K alternating NaN worst-case for gap detection

https://claude.ai/code/session_01GgjQM4v4dyk378ZHJbBCTJ

github-actions

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'FastSense Performance'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.10.

| Benchmark suite | Current: `d3356a5` | Previous: `763306b` | Ratio |
|---|---|---|---|
| `Downsample mean std(1M)` | `0.069` ms | `0.033` ms | `2.09` |
| `Instantiation mean std(1M)` | `1.492` ms | `1.082` ms | `1.38` |
| `Zoom cycle mean (1M)` | `16.405` ms | `14.501` ms | `1.13` |
| `Downsample mean std(5M)` | `0.085` ms | `0.031` ms | `2.74` |
| `Render mean std(5M)` | `15.119` ms | `1.436` ms | `10.53` |
| `Zoom cycle mean (5M)` | `15.82` ms | `13.757` ms | `1.15` |
| `Downsample mean std10M)` | `0.215` ms | `0.096` ms | `2.24` |
| `Instantiation mean std10M)` | `1.618` ms | `1.351` ms | `1.20` |
| `Render mean std10M)` | `4.126` ms | `2.062` ms | `2.00` |
| `Zoom cycle mean (10M)` | `15.5` ms | `13.693` ms | `1.13` |
| `Zoom cycle mean std10M)` | `0.982` ms | `0.707` ms | `1.39` |
| `Downsample mean std50M)` | `1.129` ms | `0.516` ms | `2.19` |
| `Zoom cycle mean (50M)` | `15.681` ms | `13.608` ms | `1.15` |
| `Downsample mean (100M)` | `213.427` ms | `190.334` ms | `1.12` |
| `Downsample mean ( std00M)` | `10.31` ms | `0.463` ms | `22.27` |
| `Zoom cycle mean (100M)` | `15.812` ms | `13.617` ms | `1.16` |
| `Downsample mean ( std00M)` | `33.218` ms | `0.463` ms | `71.75` |
| `Instantiation mean ( std00M)` | `1241.429` ms | `183.504` ms | `6.77` |
| `Render mean (500M)` | `688.837` ms | `440.434` ms | `1.56` |
| `Render mean ( std00M)` | `504.688` ms | `2.383` ms | `211.79` |

This comment was automatically generated by workflow using github-action-benchmark.

CC: @HanSur94
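
For reading the alert above: the Ratio column is consistent with current divided by previous (for example, 0.069 / 0.033 ≈ 2.09), and a row is flagged when that ratio exceeds the stated threshold of 1.10. The small C sketch below spells out that arithmetic; the function names are hypothetical and this is not how github-action-benchmark itself is implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: ratio of the current timing to the previous one.
 * Values above 1.0 mean the current commit is slower. */
double bench_ratio(double current_ms, double previous_ms)
{
    assert(previous_ms > 0.0);
    return current_ms / previous_ms;
}

/* Hypothetical helper: a result counts as a possible regression when the
 * ratio exceeds the alert threshold (1.10 in the comment above). */
bool is_regression(double current_ms, double previous_ms, double threshold)
{
    return bench_ratio(current_ms, previous_ms) > threshold;
}
```

With the first table row, `is_regression(0.069, 0.033, 1.10)` is true. Note that for sub-millisecond entries a large ratio can correspond to a very small absolute difference.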

toStepFunction was a local function inside mergeResolvedByLabel.m, making it invisible outside that file. The Octave test failed because local functions cannot be called from external test files, even when the private directory is on the path. Extracting it to its own .m file in private/ keeps the same encapsulation (only SensorThreshold code can call it) while making it accessible to the test's proxy-directory pattern.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The needs_build check in install.m only probed for binary_search_mex. If older MEX files existed but to_step_function_mex was missing, install() would skip build_mex() entirely. Now probes both binary_search_mex and to_step_function_mex so any missing MEX triggers an incremental rebuild.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Stress-tests the full resolve pipeline with 500M datapoints, 2 state channels (~9K total transitions), and 4 threshold rules with different condition types (single-condition, multi-condition, upper, lower).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

MEX files in private/ directories are invisible to exist() from outside the parent package. Check actual file paths instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Renders the 500M-point sensor with all resolved thresholds and violations after the timing runs complete.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Resolves a tiny 4-point sensor before the timed runs to force MATLAB's JIT compiler to compile all code paths (Sensor.resolve, binary_search, compute_violations, toStepFunction, mergeResolvedByLabel). This way all 3 timed runs measure steady-state performance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Runs a tiny end-to-end workflow (Sensor, StateChannel, resolve, FastSense render) on trivial data during install(). This forces MATLAB's JIT to compile all hot code paths once per session, so the first real call to resolve() or render() has no warmup penalty. Uses a persistent flag so repeated install() calls skip the warmup. Wrapped in try/catch so it never blocks installation on failure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Prevents a visible window flash during install() and avoids display issues on headless CI runners.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

claude added 4 commits March 19, 2026 13:30

github-actions Bot reviewed Mar 19, 2026

HanSur94 and others added 8 commits March 19, 2026 17:53

fix: probe MEX files by path instead of exist() in stress benchmark

ecaba36

bench: add FastSense plot with thresholds to stress benchmark

ac4dbc2

fix: use hidden figure for JIT warmup render

d3356a5

HanSur94 merged commit a1a4ec3 into main Mar 19, 2026
9 checks passed

HanSur94 merged 12 commits into main from claude/optimize-sensor-resolve-1Xpjc