Benchmarking sparse matrix-vector multiplication (SpMV) using scalar and RISC-V Vector Extension (RVV) implementations. Targets EPAC and Orange Pi RV2 RISC-V hardware, with x86 support for development and testing.
The benchmark iterates SpMV and reports timing, cycle counts, memory footprint, and FLOP counts for each sparse format.
- ./src: C source files (formats, SpMV kernels, driver, main)
- ./build: compiled executables and run scripts
- ./data: test Matrix Market (.mtx) files
- ./tests: Google Test unit tests (C++)
| Format | Description |
|---|---|
| COO | Coordinate format |
| CSR | Compressed Sparse Row |
| ELL | ELLPACK format |
| DIA | Diagonal format |
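To illustrate what a scalar kernel for one of the formats listed above computes, here is a minimal CSR SpMV sketch. The function name `csr_spmv_sketch` and the array names are hypothetical and are not taken from src/spMv.c; the project's own kernels may differ, for example in blocking and in the floating-point type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: y = A * x with A stored in CSR.
 * row_ptr has n_rows + 1 entries; col_idx and val hold one entry
 * per stored non-zero (row_ptr[n_rows] entries in total). */
void csr_spmv_sketch(size_t n_rows,
                     const size_t *row_ptr,
                     const size_t *col_idx,
                     const float *val,
                     const float *x,
                     float *y)
{
    assert(row_ptr && col_idx && val && x && y);
    for (size_t i = 0; i < n_rows; ++i) {
        float sum = 0.0f;
        /* Non-zeros of row i occupy [row_ptr[i], row_ptr[i + 1]). */
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            sum += val[k] * x[col_idx[k]];
        }
        y[i] = sum;
    }
}
```

Each stored non-zero contributes one multiply and one add, which is why SpMV cost is commonly estimated as twice the number of non-zeros.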

| Format | Description |
|---|---|
| vCSRVG | Vectorised CSR, vector-permute |
| vCSRMG | Vectorised CSR, index-load |
| vELLVG | Vectorised ELL, vector-permute |
| vELLRVG | Vectorised ELL, vector-permute (roofline) |
| vELLMG | Vectorised ELL, index-load |
| vDIAVG | Vectorised DIA, vector-slide |
| vDIARVG | Vectorised DIA, vector-slide (roofline) |
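The vectorised ELL variants above rely on a regular, padded layout. As a portable illustration in plain C (no RVV intrinsics), the sketch below assumes a column-major ELL layout in which padded slots store a value of 0 and a valid column index. The name `ell_spmv_sketch`, the layout, and the padding convention are all assumptions and may not match the project's kernels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: y = A * x with A in column-major ELL.
 * Entry j of row i lives at val[j * n_rows + i]. Padded slots hold
 * value 0 and a valid column index, so they add nothing to y. */
void ell_spmv_sketch(size_t n_rows, size_t width,
                     const size_t *col_idx, const float *val,
                     const float *x, float *y)
{
    assert(col_idx && val && x && y);
    for (size_t i = 0; i < n_rows; ++i) {
        y[i] = 0.0f;
    }
    for (size_t j = 0; j < width; ++j) {
        /* Consecutive rows are adjacent in memory: unit-stride access
         * to val and y, indexed (gather) access to x. */
        for (size_t i = 0; i < n_rows; ++i) {
            size_t k = j * n_rows + i;
            y[i] += val[k] * x[col_idx[k]];
        }
    }
}
```

The inner loop is the part a vector kernel would strip-mine: the loads from `val` are contiguous, while the accesses to `x` need a gather, which can be done either with an indexed load or with a register permute.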

| Group | Included formats |
|---|---|
| SCALAR | COO, CSR, ELL, DIA |
| VECTOR | vCSRVG, vCSRMG, vELLVG, vELLMG, vDIAVG |
| VCSR | vCSRVG, vCSRMG |
| VELL | vELLVG, vELLMG |
| VDIA | vDIAVG |
| ROOFLINE | vELLRVG, vDIARVG |
| ALL | All formats |
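One way to picture the group names is as a lookup table from a group to a list of format names. The sketch below encodes the table above in C; the type and function names (`format_group_sketch`, `expand_group_sketch`, `count_formats_sketch`) are hypothetical, the expansion of ALL to the eleven formats listed in this README is an assumption, and the real handling in src/main.c may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_GROUP_FORMATS_SKETCH 12

typedef struct {
    const char *name;
    const char *formats[MAX_GROUP_FORMATS_SKETCH]; /* NULL-terminated */
} format_group_sketch;

/* Contents copied from the group table in this README. */
static const format_group_sketch groups_sketch[] = {
    {"SCALAR",   {"COO", "CSR", "ELL", "DIA", NULL}},
    {"VECTOR",   {"vCSRVG", "vCSRMG", "vELLVG", "vELLMG", "vDIAVG", NULL}},
    {"VCSR",     {"vCSRVG", "vCSRMG", NULL}},
    {"VELL",     {"vELLVG", "vELLMG", NULL}},
    {"VDIA",     {"vDIAVG", NULL}},
    {"ROOFLINE", {"vELLRVG", "vDIARVG", NULL}},
    {"ALL",      {"COO", "CSR", "ELL", "DIA", "vCSRVG", "vCSRMG", "vELLVG",
                  "vELLRVG", "vELLMG", "vDIAVG", "vDIARVG", NULL}},
};

/* Returns the NULL-terminated format list for a group name,
 * or NULL if the name is not a known group. */
const char *const *expand_group_sketch(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof groups_sketch / sizeof groups_sketch[0]; ++i) {
        if (strcmp(groups_sketch[i].name, name) == 0) {
            return groups_sketch[i].formats;
        }
    }
    return NULL;
}

/* Number of formats in a group; 0 for an unknown group. */
size_t count_formats_sketch(const char *name)
{
    const char *const *formats = expand_group_sketch(name);
    size_t n = 0;
    while (formats != NULL && formats[n] != NULL) {
        ++n;
    }
    return n;
}
```

Note that, per the table, VECTOR does not include the two roofline variants; they are only reachable through ROOFLINE, ALL, or their individual names.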
Edit the top of Makefile to configure the build before compiling:
| Variable | Values | Description |
|---|---|---|
| FLOAT_TYPE | USE_FLOAT / USE_DOUBLE | Floating-point precision (default: USE_FLOAT) |
| rvv | 1 / 0 | Enable RVV intrinsics (default: 1) |
| DEBUG | 1 / 0 | Enable debug output (default: 0) |
| ARCH | e.g. rv64gcv | RISC-V march string |
The compiler and flags sections contain pre-configured profiles for:
- gcc / clang — Orange Pi (default active)
- clang — EPAC (comment/uncomment as needed)
- spike simulator (for software emulation)
Floating-point precision can also be switched per-build without editing the Makefile:
$ make riscv64 FLOAT_TYPE=USE_DOUBLE
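A common way for a setting like FLOAT_TYPE to reach C code is as a preprocessor macro that selects a typedef. The sketch below assumes the Makefile passes the value as a define (for example `-DUSE_DOUBLE`); that mechanism, the type name `real_sketch_t`, and the helper functions are assumptions for illustration and are not confirmed by the project sources.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the build defines USE_DOUBLE or USE_FLOAT on the
 * compiler command line; single precision is the fallback. */
#if defined(USE_DOUBLE)
typedef double real_sketch_t;
#else
typedef float real_sketch_t;
#endif

/* Human-readable name of the selected precision. */
const char *precision_name_sketch(void)
{
    return sizeof(real_sketch_t) == sizeof(double) ? "double" : "float";
}

/* Dot product written once against the selected type. */
real_sketch_t dot_sketch(const real_sketch_t *a, const real_sketch_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    real_sketch_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

With this pattern, kernels are written once against the typedef and the precision is fixed at compile time, which matches the per-build switch described above.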
# Build for x86 (development / correctness testing)
$ make x86
# Build for RISC-V with RVV
$ make riscv64
# Build unit tests (requires Google Test)
$ make test
# Remove all build artefacts
$ make clean
Executables are placed in ./build/:
- spmv_x86
- spmv_rvv
./build/spmv_rvv <matrix.mtx> <format> <block_size_row> <block_size_col> <iters> <print_result>
| Argument | Description |
|---|---|
matrix.mtx |
Path to a Matrix Market file |
format |
Format or group name (see tables above) |
block_size_row |
Row block size (use 0 to disable blocking) |
block_size_col |
Column block size |
iters |
Number of SpMV iterations to time |
print_result |
1 to print the output vector, 0 to suppress |
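The six positional arguments above can be read into a small struct. The following is a minimal sketch with hypothetical names (`run_args_sketch`, `parse_args_sketch`); it does not reproduce the parser in src/main.c, and the validation rules (non-negative integers, print_result limited to 0 or 1) are assumptions based on the table.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

typedef struct {
    const char *matrix_path;
    const char *format;        /* format or group name */
    long block_size_row;       /* 0 disables blocking */
    long block_size_col;
    long iters;
    int print_result;          /* 1 prints the output vector */
} run_args_sketch;

/* Parses a non-negative decimal integer; 0 on success, -1 otherwise. */
static int parse_nonneg_sketch(const char *s, long *out)
{
    char *end = NULL;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v < 0) {
        return -1;
    }
    *out = v;
    return 0;
}

/* Returns 0 on success, -1 on a wrong argument count or a bad number. */
int parse_args_sketch(int argc, char **argv, run_args_sketch *out)
{
    long print = 0;
    assert(out != NULL);
    if (argc != 7) {
        return -1;
    }
    out->matrix_path = argv[1];
    out->format = argv[2];
    if (parse_nonneg_sketch(argv[3], &out->block_size_row) != 0 ||
        parse_nonneg_sketch(argv[4], &out->block_size_col) != 0 ||
        parse_nonneg_sketch(argv[5], &out->iters) != 0 ||
        parse_nonneg_sketch(argv[6], &print) != 0 || print > 1) {
        return -1;
    }
    out->print_result = (int)print;
    return 0;
}
```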
Example (all formats, block size 8, 10 iterations):
$ ./build/spmv_rvv ./data/dwt_512.mtx ALL 8 8 10 0
Example (scalar formats only, print the result vector):
$ ./build/spmv_x86 ./data/test.mtx SCALAR 16 16 1 1

| Script | Description |
|---|---|
| ./build/run_x86.sh | Run a quick x86 test on matrix_128k_full.mtx with COO |
| ./build/run_rvv.sh [block_size] | Run vDIAVG on dwt_512.mtx with the given block size |
| ./build/run_epac.sh | Run vELLVG on dwt_512.mtx (EPAC configuration) |
| ./build/run_orangepi.sh [block_size] | Run ALL formats on pwtk.mtx for Orange Pi |
| ./build/run_benchmark.sh | Full benchmark over all .mtx files in ../benchmark_v2/ |
On the first run the program parses the .mtx file and writes a binary cache alongside it (e.g. matrix_64_float.bin). Subsequent runs with the same block size and precision load the binary directly, skipping the text parser.
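The example name matrix_64_float.bin suggests a pattern of the form stem, block size, precision. The sketch below builds such a path from a .mtx path; the exact naming scheme, the "double" suffix, and the function name `cache_path_sketch` are assumptions inferred from that single example, not taken from the sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: derive "<stem>_<block>_<precision>.bin" from
 * "<stem>.mtx". Returns 0 on success, -1 if the input does not end in
 * ".mtx" or the result does not fit in out. */
int cache_path_sketch(const char *mtx_path, int block_size, int use_double,
                      char *out, size_t out_len)
{
    assert(mtx_path != NULL && out != NULL);
    size_t len = strlen(mtx_path);
    if (len < 4 || strcmp(mtx_path + len - 4, ".mtx") != 0) {
        return -1;
    }
    /* %.*s copies the path without its ".mtx" extension. */
    int n = snprintf(out, out_len, "%.*s_%d_%s.bin",
                     (int)(len - 4), mtx_path, block_size,
                     use_double ? "double" : "float");
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Encoding block size and precision in the file name is what lets a cached file be reused only for runs with the same settings, as described above.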
Each benchmark line is printed by print_stats in the following format:
format rows cols blocks total_elem nnz mem(MB) flops iters time(s) cycles result
- format: sparse format name
- rows / cols: matrix dimensions
- blocks: number of matrix blocks
- total_elem: allocated storage elements (including padding)
- nnz: number of non-zero elements
- mem(MB): memory footprint in megabytes
- flops: floating-point operations per SpMV
- iters: iteration count
- time(s): average wall-clock time per iteration
- cycles: hardware cycle count (via rdcycle, RISC-V only)
- result: running mean of the output vector (for correctness cross-checking)
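Two of the fields above lend themselves to a short illustration. The sketch below shows one plausible reading of the result field (an incrementally updated mean over the output vector) and how the flops and time(s) columns can be combined into a throughput figure. Both function names are hypothetical, and print_stats may compute its values differently.

```c
#include <assert.h>
#include <stddef.h>

/* Incrementally updated mean of y[0..n-1]; returns 0 for n == 0.
 * Assumption: "running mean" refers to a mean over the vector entries. */
double running_mean_sketch(const float *y, size_t n)
{
    assert(y != NULL || n == 0);
    double mean = 0.0;
    for (size_t i = 0; i < n; ++i) {
        mean += ((double)y[i] - mean) / (double)(i + 1);
    }
    return mean;
}

/* Throughput in GFLOP/s from the flops and time(s) columns. */
double gflops_sketch(double flops_per_spmv, double seconds_per_iter)
{
    assert(seconds_per_iter > 0.0);
    return flops_per_spmv / seconds_per_iter / 1e9;
}
```

Comparing the result value across formats for the same matrix is a cheap consistency check: formats that compute the same product should agree up to floating-point rounding.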
Unit tests use Google Test. Build and run with:
$ make test
$ ./tests/csr_test
$ ./tests/coo_test
To add a new sparse format, follow the six-step checklist at the top of src/spMv.c:
- Implement the kernel function in src/spMv.c
- Add a dispatch case in spMv() in src/spMv.c
- Add the FLOP count case in get_flops_spMv() in src/spMv.c
- Add the enum value to E_SPMV_FORMAT in src/mtx_sparse.h
- Add the format string to str_spmv_format[] in src/mtx_sparse.c (same position as the enum)
- Update the argument parser and format groups in src/main.c
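The requirement that the format string sits at the same position as the enum value can be illustrated with a small stand-alone sketch. All names here (`format_sketch_t`, `format_names_sketch`, `format_from_name_sketch`) are hypothetical stand-ins, not the project's E_SPMV_FORMAT or str_spmv_format[] definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the format enum; the count entry stays last. */
typedef enum {
    FMT_SKETCH_COO,
    FMT_SKETCH_CSR,
    FMT_SKETCH_NEW,    /* the newly added format */
    FMT_SKETCH_COUNT
} format_sketch_t;

/* Stand-in for the name table; order must mirror the enum. */
static const char *const format_names_sketch[] = {
    "COO",
    "CSR",
    "NEW",
};

/* Catches a missing or extra name at compile time (C11). */
_Static_assert(sizeof format_names_sketch / sizeof format_names_sketch[0]
                   == FMT_SKETCH_COUNT,
               "name table and enum are out of sync");

/* Maps a name back to its enum value; returns -1 if unknown. */
int format_from_name_sketch(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < FMT_SKETCH_COUNT; ++i) {
        if (strcmp(format_names_sketch[i], name) == 0) {
            return i;
        }
    }
    return -1;
}
```

The static assertion only checks that the two lists have the same length; it cannot detect entries that are present but in the wrong order, so the positions still need to be verified by hand.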