simd_vector

SIMD vector types for x86-64 in pure stable Rust.

Vec4 — 4×f32, SSE4.1 + FMA3
Vec8 — 8×f32, AVX2 + FMA3

Features

Arithmetic: +, -, *, / (vec×vec, vec×f32, f32×vec)
Methods: splat, abs, neg, sqrt, floor, ceil, round, sum, dot, mul_add (FMA)
Traits: Sum, Index, From/Into array, Clone, Copy, Debug, PartialEq
Transcendentals: sin, cos, exp — ported from SLEEF

Accuracy

Function	ULP error	Notes
`+`, `-`, `*`, `/`	0.0	IEEE 754 correctly rounded
`neg`	0.0	exact bit flip
`abs`	0.0	exact bit mask
`floor`	0.0	exact (SSE4.1 `roundps`)
`ceil`	0.0	exact (SSE4.1 `roundps`)
`round`	0.0	exact (half away from zero, matches `f32::round`)
`splat`	0.0	exact broadcast
`sum`	≤ 0.5	f64 intermediate: exact tree sum, single rounding on f64→f32
`dot`	≤ 0.5	f64 intermediate: exact products + tree sum, single rounding on f64→f32
`mul_add`	≤ 0.5	single FMA instruction
`sqrt`	0.0	IEEE 754 correctly rounded (`sqrtps`)
`sin`	≤ 1.0	SLEEF `xsinf_u1` — Cody-Waite + Payne-Hanek + double-float polynomial
`cos`	≤ 1.0	SLEEF `xcosf_u1` — Cody-Waite + Payne-Hanek + double-float polynomial
`exp`	≤ 1.0	SLEEF `xexpf` — ln(2) range reduction + degree-6 polynomial + ldexp
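
The `sum` and `dot` rows rely on widening to f64 before reducing. The scalar sketch below models that idea for four lanes. The function names are hypothetical and not part of the crate's API, and the crate's actual SIMD code may be organised differently.

```rust
/// Scalar model of a 4-lane dot product with f64 intermediates.
/// Hypothetical helper for illustration; not the crate's implementation.
fn dot4_f64_intermediate(a: [f32; 4], b: [f32; 4]) -> f32 {
    // An f32 significand has 24 bits, so a product of two f32 values needs
    // at most 48 bits and is exactly representable in f64 (53 bits).
    let p: [f64; 4] = [
        a[0] as f64 * b[0] as f64,
        a[1] as f64 * b[1] as f64,
        a[2] as f64 * b[2] as f64,
        a[3] as f64 * b[3] as f64,
    ];
    // Pairwise (tree) reduction, mirroring a SIMD horizontal add.
    let lo = p[0] + p[1];
    let hi = p[2] + p[3];
    // Single rounding from f64 back to f32.
    (lo + hi) as f32
}

/// Naive f32 accumulation, shown only for comparison.
fn dot4_naive_f32(a: [f32; 4], b: [f32; 4]) -> f32 {
    ((a[0] * b[0] + a[1] * b[1]) + a[2] * b[2]) + a[3] * b[3]
}
```

With `a = [1e8, 1.0, -1e8, 1.0]` and `b = [1.0; 4]`, the naive f32 version loses the first `1.0` to rounding and returns `1.0`, while the f64-intermediate version returns `2.0`.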

Transcendental implementation details

Range reduction for sin/cos: Cody-Waite (3-constant) for |x| < 125, Payne-Hanek table-based (rempif) for larger arguments
Double-float arithmetic: FMA-based error-free transformations for high precision in the polynomial evaluation
Edge cases: NaN/Inf propagation, sin(-0) = -0, exp(-inf) = 0, exp(inf) = inf
Polynomial coefficients, constants, and the 416-entry Sleef_rempitabsp table are taken directly from SLEEF
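
As a rough illustration of the error-free transformations mentioned above, the scalar sketch below shows the two standard building blocks of double-float arithmetic: an FMA-based exact product and Knuth's two-sum. The function names are assumptions made for this example. The crate applies the same idea per SIMD lane, but its code is not reproduced here.

```rust
/// Error-free product: returns (hi, lo) such that hi + lo equals a * b
/// exactly, barring overflow and underflow.
/// Hypothetical helper for illustration.
fn two_prod_fma(a: f32, b: f32) -> (f32, f32) {
    let hi = a * b;
    // mul_add is fused: a * b - hi is computed with a single rounding,
    // which recovers the rounding error of the product above.
    let lo = a.mul_add(b, -hi);
    (hi, lo)
}

/// Error-free sum (Knuth two-sum): returns (s, err) such that
/// s + err equals a + b exactly, barring overflow.
fn two_sum(a: f32, b: f32) -> (f32, f32) {
    let s = a + b;
    let bb = s - a;
    let err = (a - (s - bb)) + (b - bb);
    (s, err)
}
```

The `(hi, lo)` pair carries roughly twice the precision of a single f32, which is what lets a polynomial evaluated in this representation keep its final error small.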

Required CPU features

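SSE4.1, AVX2, and FMA3 are required. The repository enables them globally via .cargo/config.toml; Cargo does not read a dependency's config, so downstream projects need the same flags in their own config: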

[build]
rustflags = ["-C", "target-feature=+sse4.1,+avx2,+fma"]
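
Because these flags are fixed at compile time, a binary built this way can fault with an illegal-instruction error on a CPU that lacks them. One possible guard, sketched below, is to check at startup with the standard library's `is_x86_feature_detected!` macro. The function name is hypothetical and the crate is not known to ship such a check.

```rust
/// Returns true when the running CPU reports every feature the crate needs.
/// Hypothetical helper for illustration; not part of the crate's API.
#[cfg(target_arch = "x86_64")]
fn cpu_supports_required_features() -> bool {
    std::arch::is_x86_feature_detected!("sse4.1")
        && std::arch::is_x86_feature_detected!("avx2")
        && std::arch::is_x86_feature_detected!("fma")
}

/// On other architectures the required features are never available.
#[cfg(not(target_arch = "x86_64"))]
fn cpu_supports_required_features() -> bool {
    false
}
```

This is a best-effort check only: with the features enabled globally, the compiler is free to emit those instructions anywhere, so the check should run as early as possible.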

Installation

cargo add simd_vector

Usage

use simd_vector::{Vec4, Vec8};

let a = Vec4([1.0, 2.0, 3.0, 4.0]);
let b = Vec4([5.0, 6.0, 7.0, 8.0]);
let c = a + b;              // [6.0, 8.0, 10.0, 12.0]
let d = a.dot(b);           // 70.0
let s = a.sin();            // [0.8415, 0.9093, 0.1411, -0.7568]
let e = Vec8::splat(1.0).exp(); // [2.71828, 2.71828, ...] (8 lanes)

Tests

345 tests covering all operations, edge cases (NaN, Inf, -0.0, subnormals), sampled ULP sweep verification, trigonometric identities, arithmetic properties, and AVX2 lane independence.

cargo test
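
For readers unfamiliar with ULP sweeps, the sketch below shows one common way to measure the distance in ULPs between a computed f32 and a reference rounded from f64. It is an illustration with hypothetical names, not the crate's test code.

```rust
/// Maps an f32 to an integer so that adjacent floats map to adjacent
/// integers and +0.0 and -0.0 both map to 0.
fn ordered_bits(x: f32) -> i64 {
    let bits = x.to_bits() as i32;
    if bits < 0 {
        // Negative floats sort in reverse by raw bits; mirror them about zero.
        (i32::MIN as i64) - (bits as i64)
    } else {
        bits as i64
    }
}

/// Number of representable f32 steps between a and b, or None if either is NaN.
fn ulp_distance(a: f32, b: f32) -> Option<u64> {
    if a.is_nan() || b.is_nan() {
        return None;
    }
    Some((ordered_bits(a) - ordered_bits(b)).unsigned_abs())
}

/// Samples `f` evenly on [lo, hi] and returns the worst ULP distance from
/// an f64 reference rounded to f32.
fn max_ulp_error(
    f: impl Fn(f32) -> f32,
    reference: impl Fn(f64) -> f64,
    lo: f32,
    hi: f32,
    samples: u32,
) -> u64 {
    let mut worst = 0;
    for i in 0..samples {
        let t = i as f32 / (samples - 1).max(1) as f32;
        let x = lo + (hi - lo) * t;
        let expected = reference(x as f64) as f32;
        if let Some(d) = ulp_distance(f(x), expected) {
            worst = worst.max(d);
        }
    }
    worst
}
```

Note that this integer distance is taken against the correctly rounded reference, so a function accurate to within 1.0 ULP of the true value can legitimately report a distance of 1.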

Coverage (via cargo llvm-cov):

Filename   Regions  Missed  Cover   Functions  Missed  Cover   Lines  Missed  Cover
lib.rs     142      0       100.00% 6          0       100.00% 68     0       100.00%
vec4.rs    990      0       100.00% 51         0       100.00% 414    0       100.00%
vec8.rs    1021     0       100.00% 51         0       100.00% 463    0       100.00%
TOTAL      2153     0       100.00% 108        0       100.00% 945    0       100.00%

License

MIT
