Investigate checked vector kernel performance for v0.4.3 #137

@acgetchell

Summary

Investigate whether la-stack can recover more vector-kernel performance in v0.4.3 without weakening the parse-don't-validate invariant model introduced in v0.4.2.

Current State

The v0.4.2 finite-proof changes correctly parse public Matrix and Vector storage into private proof-bearing wrappers before computation. This improved correctness, but benchmarks show some small fixed-size kernels, especially dot product, squared norm, and norm-like operations, remain slower than nalgebra/faer.

The LU regression was mostly recovered by moving non-finite factor checks out of cubic update loops while still validating completed factor storage before factors escape. Vector kernels may have a similar correctness-preserving optimization opportunity: keep public methods checked, but avoid unnecessary duplicate passes or copies once proof-bearing values exist.
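
For reference, a minimal sketch of the pattern described above: parse public storage once into a private proof-bearing wrapper, then run the kernel without per-element checks and validate only the finished result. The `Vector`, `FiniteVector`, and `LaError` definitions here are simplified stand-ins written for this sketch, not la-stack's actual types or signatures.

```rust
/// Hypothetical stand-in for the crate's typed error; the real `LaError` may differ.
#[derive(Debug, PartialEq)]
pub enum LaError {
    /// An input entry was NaN or infinite; `index` locates it.
    NonFinite { index: usize },
    /// Inputs were finite but the result was not.
    Overflow,
}

/// Public, unchecked storage: any `f64` may appear here.
pub struct Vector<const D: usize>(pub [f64; D]);

/// Private proof-bearing wrapper: constructing one proves every entry is finite.
struct FiniteVector<const D: usize>([f64; D]);

impl<const D: usize> FiniteVector<D> {
    /// The single validation pass (the "parse" step).
    fn parse(v: &Vector<D>) -> Result<Self, LaError> {
        match v.0.iter().position(|x| !x.is_finite()) {
            Some(index) => Err(LaError::NonFinite { index }),
            None => Ok(Self(v.0)),
        }
    }

    /// Internal kernel: trusts the proof, so the loop carries no input checks.
    /// Only the completed result is checked before it escapes.
    fn dot(&self, other: &Self) -> Result<f64, LaError> {
        let mut acc = 0.0;
        for i in 0..D {
            acc += self.0[i] * other.0[i];
        }
        if acc.is_finite() {
            Ok(acc)
        } else {
            Err(LaError::Overflow)
        }
    }
}

impl<const D: usize> Vector<D> {
    /// Public method stays checked: parse each operand once, then compute.
    pub fn dot(&self, other: &Self) -> Result<f64, LaError> {
        FiniteVector::parse(self)?.dot(&FiniteVector::parse(other)?)
    }
}
```

In this shape the public `dot` still makes two validation passes and two array copies before the arithmetic pass, which is the kind of cost the investigation is meant to quantify.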

Investigation Targets

  • Measure dot product, squared norm, infinity norm, solve_from_lu, and lu_solve separately against current nalgebra/faer baselines.
  • Identify boundary-check cost versus arithmetic cost for small const-generic dimensions.
  • Check whether Vector::dot and Vector::norm2_sq can combine parsing and computation more tightly without exposing invalid internal state.
  • Add cold-path hints to vector-kernel overflow exits where that matches the existing Matrix/LU/LDLT error-path style.
  • Compare public Vector::dot / Vector::norm2_sq against proof-bearing internal FiniteVector paths so the benchmark separates validation cost from arithmetic cost.
  • Check whether matrix infinity-norm and symmetry paths have similar duplicate validation or avoidable pass costs before adding any helper abstraction.
  • Preserve private proof-bearing wrappers and crate-private unchecked constructors; do not let non-finite storage become observable or propagate silently.
  • Consider additive internal-only fast paths over FiniteVector where proof already exists.
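
One possible shape for the "combine parsing and computation" and cold-path targets above, as a sketch rather than a proposal for the final code. It relies on an IEEE 754 property: if any input is NaN or ±∞, its product is NaN or ±∞ (∞·0 is NaN), and a sum containing such a term can never be finite. A finite result therefore implies finite inputs, so the hot path needs only one check on the accumulator, and the scan that builds error metadata can live in a `#[cold]` function. `DotError`, `dot_checked`, and `classify_failure` are hypothetical names for this sketch.

```rust
/// Hypothetical error type for this sketch; not la-stack's actual `LaError`.
#[derive(Debug, PartialEq)]
pub enum DotError {
    /// `operand` is 0 for the left input, 1 for the right; `index` locates the entry.
    NonFinite { operand: usize, index: usize },
    /// Both inputs were finite but the accumulated result was not.
    Overflow,
}

/// Error classification lives out of line so the hot loop stays small.
/// Only reached when the accumulated result is NaN or infinite.
#[cold]
#[inline(never)]
fn classify_failure<const D: usize>(a: &[f64; D], b: &[f64; D]) -> DotError {
    for (operand, v) in [a, b].into_iter().enumerate() {
        if let Some(index) = v.iter().position(|x| !x.is_finite()) {
            return DotError::NonFinite { operand, index };
        }
    }
    DotError::Overflow
}

/// Single-pass checked dot product over raw storage.
/// A finite accumulator proves every input entry was finite, so the
/// per-element validation pass is only needed on the failure path.
pub fn dot_checked<const D: usize>(a: &[f64; D], b: &[f64; D]) -> Result<f64, DotError> {
    let mut acc = 0.0;
    for i in 0..D {
        acc += a[i] * b[i];
    }
    if acc.is_finite() {
        Ok(acc)
    } else {
        Err(classify_failure(a, b))
    }
}
```

The trade-off is that invalid input costs an extra scan, while valid input avoids the separate validation pass entirely; whether that wins for small const-generic `D` is exactly what the benchmarks would need to show.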

Stretch Targets

  • Investigate whether det_sign_exact can avoid duplicate fast-filter work between det_direct() and det_errbound() for D <= 4.
  • Only pursue determinant fast-filter sharing if Criterion shows it matters; split it into a separate issue if the formula consolidation becomes non-trivial.
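
To illustrate what "sharing" could mean here, a rough sketch for D = 2 only: the determinant and its error bound both need the products `a*d` and `b*c`, so one helper can return both. The function names, the `Option<i8>` return shape, and the `ERR_COEFF` constant are assumptions made for this sketch; the constant is deliberately loose and is not la-stack's actual bound, and underflow is ignored.

```rust
/// Illustrative, deliberately loose error coefficient. This is an assumption
/// of the sketch, not the constant used by la-stack.
const ERR_COEFF: f64 = 4.0 * f64::EPSILON;

/// Computes the 2x2 determinant and an error bound from the same two products,
/// instead of forming `a*d` and `b*c` once per function.
fn det_with_errbound_2x2(m: &[[f64; 2]; 2]) -> (f64, f64) {
    let ad = m[0][0] * m[1][1];
    let bc = m[0][1] * m[1][0];
    (ad - bc, ERR_COEFF * (ad.abs() + bc.abs()))
}

/// Fast filter: `Some(sign)` when the floating-point determinant lies outside
/// the error bound, `None` when the caller should fall back to an exact path.
pub fn det_sign_filtered_2x2(m: &[[f64; 2]; 2]) -> Option<i8> {
    let (det, bound) = det_with_errbound_2x2(m);
    if det > bound {
        Some(1)
    } else if det < -bound {
        Some(-1)
    } else {
        None
    }
}
```

For D = 3 and D = 4 the shared subexpressions are the cofactor products, and consolidating them is where the formula work could become non-trivial.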

Acceptance Criteria

  • Public APIs still reject unchecked NaN/∞ storage with LaError::NonFinite metadata.
  • Internal helpers trust proof-bearing values instead of reparsing the same object repeatedly.
  • Dot product, squared norm, infinity norm, solve_from_lu, and lu_solve are measured separately against current nalgebra/faer baselines.
  • Public Vector::dot / Vector::norm2_sq are compared against proof-bearing internal FiniteVector paths so validation cost and arithmetic cost are visible separately.
  • Vector-kernel overflow exits use cold-path hints where that matches the existing Matrix/LU/LDLT error-path style.
  • Matrix infinity-norm and symmetry paths are checked for duplicate validation or avoidable pass costs; any non-trivial helper abstraction is split out or explicitly justified.
  • det_sign_exact fast-filter sharing is either measured and implemented, measured and rejected as not worth it, or split into a separate follow-up if the formula consolidation is non-trivial.
  • Any performance improvement is validated with bench-vs-linalg or focused Criterion output.
  • Documentation and benchmark tables are updated only after a full benchmark run.
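
A dependency-free sketch of how the validated path and the kernel-only path might be timed side by side so the two costs show up separately. Real measurements should come from Criterion or bench-vs-linalg as stated above; this std-only version just shows the shape of the comparison. `dot_kernel`, `dot_validated`, `time_it`, and `compare_paths` are hypothetical helpers.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Arithmetic only: the caller is assumed to have proven the entries finite.
fn dot_kernel<const D: usize>(a: &[f64; D], b: &[f64; D]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Validation plus arithmetic, standing in for a checked public method.
fn dot_validated<const D: usize>(a: &[f64; D], b: &[f64; D]) -> Option<f64> {
    if a.iter().chain(b).all(|x| x.is_finite()) {
        Some(dot_kernel(a, b))
    } else {
        None
    }
}

/// Times `iters` calls of `f`. `black_box` keeps the optimizer from
/// discarding the result and deleting the work being measured.
fn time_it<T>(iters: u32, mut f: impl FnMut() -> T) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(f());
    }
    start.elapsed()
}

/// Returns (validated, kernel-only) wall-clock totals for a 3-vector dot product.
pub fn compare_paths(iters: u32) -> (Duration, Duration) {
    let (a, b) = ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]);
    let validated = time_it(iters, || dot_validated(black_box(&a), black_box(&b)));
    let kernel = time_it(iters, || dot_kernel(black_box(&a), black_box(&b)));
    (validated, kernel)
}
```

The difference between the two totals approximates the validation cost; at these sizes timer overhead and noise are significant, which is one reason to prefer Criterion's statistics for any number that ends up in the benchmark tables.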

Non-Goals

  • Do not roll back parse-don't-validate correctness.
  • Do not expose FiniteVector, FiniteMatrix, or other proof wrappers publicly without a separate API design decision.
  • Do not trade typed error behavior for benchmark wins.
  • Do not add generic_const_exprs-dependent APIs or dimension-specialized trait machinery for v0.4.3.

Labels: enhancement (New feature or request), performance (Performance related issues), rust (Pull requests that update rust code)
