engine: stride-step RuntimeView::flat_offset for sequential iteration #603

@bpowers

Context

PR #599's R4 added a dense fast-path to RuntimeView::flat_offset (skip the per-element sparse_lookup SmallVec when a view has no sparse mappings), taking it from ~18.9% to ~8.6% of the C-LEARN run. The residual ~8.6% is the actual per-element offset arithmetic (offset + Σ idx[i] * strides[i]).

The BeginIter non-contiguous precompute walks indices sequentially (increment_indices) and calls flat_offset per element — recomputing the full sum each step even though consecutive indices differ by a constant stride.
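For reference, the current pattern can be sketched roughly as follows. This is an illustration only, not the engine's code: the free functions `flat_offset`, `increment_indices`, and `precompute_offsets` below are hypothetical stand-ins that borrow the names used in this issue, and they assume unsigned strides and row-major index order (last dimension fastest).

```rust
/// Hypothetical stand-in for the dense path of `RuntimeView::flat_offset`:
/// offset + sum over i of idx[i] * strides[i].
fn flat_offset(offset: usize, strides: &[usize], idx: &[usize]) -> usize {
    offset
        + idx
            .iter()
            .zip(strides.iter())
            .map(|(i, s)| i * s)
            .sum::<usize>()
}

/// Advance `idx` in row-major order (assumed: last dimension fastest).
/// Returns false once the index has wrapped past the final element.
fn increment_indices(idx: &mut [usize], dims: &[usize]) -> bool {
    for d in (0..idx.len()).rev() {
        idx[d] += 1;
        if idx[d] < dims[d] {
            return true;
        }
        idx[d] = 0; // carry into the next-slower dimension
    }
    false
}

/// Baseline precompute: one full multiply-and-sum per element.
fn precompute_offsets(offset: usize, dims: &[usize], strides: &[usize]) -> Vec<usize> {
    let total: usize = dims.iter().product();
    let mut out = Vec::with_capacity(total);
    if total == 0 {
        return out;
    }
    let mut idx = vec![0usize; dims.len()];
    loop {
        out.push(flat_offset(offset, strides, &idx));
        if !increment_indices(&mut idx, dims) {
            break;
        }
    }
    out
}
```

For a 2x3 transposed view (dims `[2, 3]`, strides `[1, 2]`), every element pays for two multiplies and two adds, even though each step within a row only moves the offset by the constant stride 2.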

Idea

For sequential iteration, step the flat offset incrementally (carry-propagated stride add) instead of recomputing Σ idx*strides per element. Scope to the sequential-iteration paths (the BeginIter precompute; contiguous-but-transposed views); the vector-op call sites (VECTOR SELECT / ELM MAP / SORT ORDER) use arbitrary computed indices and are not stride-steppable.
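A minimal sketch of the stepping idea, under the same assumptions (unsigned strides, row-major order, dense view). `stepped_offsets` and `recomputed_offset` are hypothetical names for illustration, not existing engine functions. The running offset is bumped by one stride per step; when a dimension wraps, its accumulated contribution is subtracted and the carry moves to the next-slower dimension.

```rust
/// Reference: full recompute for a single index (offset + sum idx[i] * strides[i]).
fn recomputed_offset(offset: usize, strides: &[usize], idx: &[usize]) -> usize {
    offset
        + idx
            .iter()
            .zip(strides.iter())
            .map(|(i, s)| i * s)
            .sum::<usize>()
}

/// Produce the flat offsets for a sequential walk without recomputing the sum.
fn stepped_offsets(offset: usize, dims: &[usize], strides: &[usize]) -> Vec<usize> {
    let total: usize = dims.iter().product();
    let mut out = Vec::with_capacity(total);
    if total == 0 {
        return out;
    }
    let mut idx = vec![0usize; dims.len()];
    let mut flat = offset;
    'outer: loop {
        out.push(flat);
        for d in (0..dims.len()).rev() {
            // Step one element along dimension d.
            idx[d] += 1;
            flat += strides[d];
            if idx[d] < dims[d] {
                continue 'outer;
            }
            // Dimension d wrapped: remove its dims[d] accumulated steps
            // and carry into the next-slower dimension.
            flat -= dims[d] * strides[d];
            idx[d] = 0;
        }
        // Every dimension wrapped, so the walk is complete.
        break;
    }
    out
}
```

In the common case (no wrap) each element costs one add and one compare. Because the offsets are integers, the stepped sequence should match the recomputed one exactly, which is consistent with the change being bit-preserving.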

Flagged as a follow-up by the R4 implementor. Bit-preserving.

Expected impact

A fraction of the residual ~8.6% (the sequential-iteration portion). Incremental.

Refs

  • src/simlin-engine/src/bytecode.rs: RuntimeView::flat_offset, RuntimeView::offset_for_iter_index.
  • src/simlin-engine/src/vm.rs: the BeginIter flat-offset precompute and increment_indices.

Metadata

Labels: engine, enhancement