A drop-in matmul operator that delivers up to 106× faster AI inference and up to 99% less energy on the same hardware — with bit-identical output.
Go to rolv.ai, sign in, pick a model from the dropdown, click run. The compute happens on our server via the public benchmark API (see rolvai/benchmark on HuggingFace). A SHA-256-signed result lands in your inbox with per-case speedup, energy reduction, correctness check, and a run hash bound to your hardware fingerprint.
Works on a laptop. Works on a Chromebook. Works on a phone. The receipt is cryptographically tied to your run — it cannot be copied from another benchmark, and cannot be fabricated without actually running.
One-time result email. No newsletter. No follow-up sequence.
NVIDIA B200, BF16, TF32 on, 1,000 iterations:
| Model | Natural sparsity | vs cuBLAS | vs cuSPARSE | Energy reduction |
|---|---|---|---|---|
| Llama-4-Scout | 93.8% | 4.75× | 103× | 79% |
| Mixtral-8×7B | 75.0% | 1.86× | 109× | 46% |
| Qwen3-30B-A3B | 93.8% | 3.43× | 32× | 71% |
| OLMoE-1B-7B (H200) | 87.5% | 2.49× | 43× | 60% |
Intel i7 laptop (4 cores, 68 GB RAM, MKL baseline):
| Model / Layer | Sparsity | vs MKL |
|---|---|---|
| Llama-3.2-1B down_proj | 99% | 106.65× |
| Qwen2.5-7B gate_proj | 95% | 59.70× |
| Mistral-7B q_proj | 95% | 21.45× |
Full per-case data with SHA-256 hashes: rolv.ai/rolv_benchmarks.pdf
Modern AI weight matrices are mostly zero. In a Mixture-of-Experts model like Mixtral or DeepSeek-V3, 75–97% of weights are architecturally inactive for any given token — guaranteed by the router, known before computation starts. Standard libraries compute all of them anyway.
ROLV identifies the non-zero structure at load time and restricts computation to live elements only. The same underlying BLAS / tensor-core primitive runs on a matrix whose size is proportional to the non-zero fraction, operating on a single contiguous submatrix rather than an indexed scatter-gather. Results are placed into the correct positions of the full output tensor, so the final output is bit-identical to the full dense operation.
Inner mechanism is Patent Pending.
Every benchmark case is produced with five independent verification layers:
- Real HuggingFace weights — downloaded from public repositories, no synthetic matrices
- Vendor baseline — Intel MKL on CPU, cuBLAS on GPU, cuSPARSE at high sparsity
- Four SHA-256 hashes per case — input matrix, input vector, vendor output, ROLV output
- Perturbation test — one weight altered by 10⁻³, output hash must change (rules out cached answers)
- Signed run hash — SHA-256 over speedup, timestamp, and hardware fingerprint
ATOL = 0.05 on column-normalised fp64. The correctness check and the speed measurement are the same execution — work cannot be skipped to game the clock without also failing correctness.
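The hashing layers above can be sketched in a few lines. The snippet below is an illustrative reconstruction, not ROLV's verification code: the field names in the run-hash record are hypothetical, and only the general pattern (hash the raw bytes of inputs and outputs, require a perturbed input to change the output hash, sign a record binding speedup to hardware) follows the description.

```python
import hashlib
import numpy as np

def sha256_of(arr: np.ndarray) -> str:
    """Hash an array's raw bytes; any bit-level change alters the digest."""
    return hashlib.sha256(np.ascontiguousarray(arr).tobytes()).hexdigest()

rng = np.random.default_rng(42)
W = rng.standard_normal((4, 4))
x = rng.standard_normal(4)
y = W @ x

# Four hashes per case: input matrix, input vector, vendor output,
# candidate output (here the two outputs are the same computation).
case = {
    "matrix": sha256_of(W),
    "vector": sha256_of(x),
    "vendor_out": sha256_of(y),
    "rolv_out": sha256_of(y),
}

# Perturbation test: alter one weight by 1e-3 and recompute. The output
# hash must change, which rules out a cached or replayed answer.
W_pert = W.copy()
W_pert[0, 0] += 1e-3
assert sha256_of(W_pert @ x) != case["vendor_out"]

# Run hash: SHA-256 over speedup, timestamp, and a hardware fingerprint
# (field names here are illustrative, not ROLV's actual schema).
record = "|".join(["speedup=4.75", "ts=2026-01-01T00:00:00Z", "hw=abc123"])
run_hash = hashlib.sha256(record.encode()).hexdigest()
assert len(run_hash) == 64
```

Hashing raw bytes makes the check exact: two outputs share a SHA-256 digest only if they are bit-identical, which is why the perturbation test is a meaningful anti-caching probe.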
1,684 / 1,684 GPU PASS · 332 / 332 CPU PASS
Independently validated by the University of Miami Frost Institute for Data Science and Computing — bit-identical SHA-256 hashes across CPU, GPU, and TPU. No commercial relationship. Validation letter.
Validated today: NVIDIA (B200, H200, H100, A100, RTX series, T4, V100) · AMD (MI300X, MI250X, RX 7900) · Intel CPU (MKL, AVX-512) · AMD EPYC · ARM Neoverse · Apple Silicon (M1–M4 Pro) · Google TPU (v4, v5)
Framework support: PyTorch · JAX · TensorFlow · ONNX Runtime · TensorRT · vLLM · HuggingFace Transformers
Works with: BF16, FP16, FP32, INT8 checkpoints. No retraining. No re-quantisation.
The public benchmark at rolv.ai proves the output. To run ROLV on your own hardware, against your own models, in your own environment, two NDA-gated tiers are available:
Secure Container — hardware-locked Docker container. RolvKey™-authenticated. Processor fingerprint binding at first run; will not execute on any other machine. Optional Intel SGX hardware encryption for regulated environments. Evaluation licence + NDA required.
Direct Hardware — single authenticated file for bare-metal servers and air-gapped environments where Docker is not permitted. Processor-bound binary with live heartbeat attestation. Evaluation licence + NDA required.
Both tiers return cryptographically signed per-run results bound to your processor fingerprint.
Contact: rolv@rolv.ai
If you use ROLV in research, please cite:
@software{heggenhougen_rolv_2026,
author = {Heggenhougen, Rolv E.},
title = {ROLV Primitive©: A Universal Compute Primitive for Sparse AI Inference},
year = {2026},
publisher = {Zenodo},
doi = {10.5281/zenodo.19221455},
url = {https://rolv.ai}
}

GitHub's "Cite this repository" button (top-right of this page) pulls the same information from CITATION.cff.
- Live benchmark: rolv.ai
- Paper: Zenodo 10.5281/zenodo.19221455
- Independent validation: rolv.ai/validation
- HuggingFace: huggingface.co/rolvai
- Benchmark API Space: huggingface.co/spaces/rolvai/benchmark
- Substack: rolv.substack.com
Contact: rolv@rolv.ai — a real person reads every message.
This repository (documentation, citation, and project metadata) is licensed under Apache 2.0. See LICENSE.
The ROLV Primitive© implementation is Patent Pending and licensed separately under NDA for on-premise evaluation. The Apache licence on this repository applies only to the documentation and project files here — it does not grant any licence to the underlying algorithm or implementation.
ROLV LLC · Fort Lauderdale, FL · Patent Pending · ROLV Primitive© · RSMT™ · ROLVswitch™ · RolvKey™