Benchmark Details

This page describes the workloads used in each benchmark category. All tests are native Swift and run locally.

CPU Single-Core

Single-threaded CPU performance across mixed workloads.

Integer

What: 64-bit integer arithmetic
How: tight loop with add, multiply, shift, and XOR
Metric: Mops/s (millions of operations per second)
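
As a rough illustration of the loop shape described above, here is a minimal sketch. The seed, the multiplier constant, and the choice to count four operations per round are assumptions made for the example, not values taken from the benchmark source.

```swift
import Foundation

/// Runs `iterations` rounds of add, multiply, shift, and XOR on a 64-bit value.
/// Returns the final accumulator (so the loop has an observable result) and Mops/s.
func integerBenchmark(iterations: Int) -> (checksum: UInt64, mops: Double) {
    var x: UInt64 = 0x9E37_79B9_7F4A_7C15        // arbitrary non-zero seed (assumption)
    let start = ProcessInfo.processInfo.systemUptime
    for i in 0..<iterations {
        x = x &+ UInt64(truncatingIfNeeded: i)   // add (wrapping)
        x = x &* 6364136223846793005             // multiply (wrapping)
        x ^= x >> 29                             // shift, then XOR
    }
    let elapsed = max(ProcessInfo.processInfo.systemUptime - start, 1e-9)
    let operations = Double(iterations) * 4      // four operations per round (assumption)
    return (x, operations / elapsed / 1_000_000)
}
```

Returning the accumulator keeps the optimizer from discarding the loop, and the wrapping operators (`&+`, `&*`) avoid overflow traps.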

Floating Point

What: double-precision math
How: multiply, sqrt, sin/cos operations
Metric: Mops/s

SIMD (Accelerate)

What: vectorized operations using vDSP
How: vector multiply-add, vector add, dot product
Metric: GFLOPS
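
The three vDSP operations listed above could be exercised roughly as follows. This is a sketch that assumes single-precision `Float` data. The FLOP count per pass (5 per element) is one common convention and may not be the one the benchmark uses.

```swift
#if canImport(Accelerate)
import Accelerate

/// Runs one pass of the three vDSP kernels on equal-length Float vectors.
func vdspPass(_ a: [Float], _ b: [Float], _ c: [Float]) -> (fma: [Float], sum: [Float], dot: Float) {
    precondition(a.count == b.count && b.count == c.count)
    let n = vDSP_Length(a.count)
    var fma = [Float](repeating: 0, count: a.count)
    var sum = [Float](repeating: 0, count: a.count)
    var dot: Float = 0
    vDSP_vma(a, 1, b, 1, c, 1, &fma, 1, n)   // fma[i] = a[i] * b[i] + c[i]
    vDSP_vadd(a, 1, b, 1, &sum, 1, n)        // sum[i] = a[i] + b[i]
    vDSP_dotpr(a, 1, b, 1, &dot, n)          // dot = sum of a[i] * b[i]
    return (fma, sum, dot)
}

/// FLOPs per pass, counting 2n (multiply-add) + n (add) + 2n (dot product). Assumed convention.
func flopsPerPass(elementCount n: Int) -> Int { 5 * n }
#endif
```

GFLOPS would then be `flopsPerPass × passes ÷ seconds ÷ 1e9`.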

Cryptography

What: AES-256-GCM encryption + SHA-256 hashing
How: encrypt a data buffer and hash ciphertext
Metric: MB/s throughput
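
A minimal sketch of the encrypt-then-hash step, assuming CryptoKit is the library in use (this page does not name it). The function name `encryptThenHash` is hypothetical.

```swift
#if canImport(CryptoKit)
import CryptoKit
import Foundation

/// Encrypts `plaintext` with AES-GCM (a 256-bit key gives AES-256-GCM),
/// then hashes the resulting ciphertext with SHA-256.
func encryptThenHash(_ plaintext: Data, key: SymmetricKey) throws -> (box: AES.GCM.SealedBox, digest: SHA256Digest) {
    let box = try AES.GCM.seal(plaintext, using: key)   // random nonce chosen by CryptoKit
    let digest = SHA256.hash(data: box.ciphertext)
    return (box, digest)
}
#endif
```

Throughput in MB/s would be the buffer size divided by the time taken for both steps.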

Compression

What: LZFSE compression + decompression
How: compress and decompress a buffer
Metric: MB/s throughput (combined)
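
One way to sketch the LZFSE round trip is with Foundation's `NSData` compression methods. This is an assumption: the benchmark may call the Compression framework directly, and how the two phases are combined into a single MB/s figure is not specified on this page.

```swift
#if canImport(Darwin)
import Foundation

/// Compresses `input` with LZFSE, then decompresses it again.
func lzfseRoundTrip(_ input: Data) throws -> (compressed: Data, restored: Data) {
    let compressed = try (input as NSData).compressed(using: .lzfse) as Data
    let restored = try (compressed as NSData).decompressed(using: .lzfse) as Data
    return (compressed, restored)
}
#endif
```

Comparing `restored` with the input is a cheap correctness check to run alongside the timing.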

CPU Multi-Core

Same tests as single-core, executed in parallel across all available CPU cores.

Uses Swift TaskGroup for parallel execution
Totals throughput across tasks
Score is normalized by core count (see Scoring Methodology)
SIMD (v2.1.6+): Uses 8K-element vectors (~96 KB per core) that fit in L1 cache, so the test stays compute-bound and scales with core count. Earlier versions used 1M-element vectors (12 MB), which caused memory bandwidth contention across cores.
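
The working-set figures quoted for the SIMD test can be reproduced with a short calculation. It assumes three `Float` vectors (4 bytes per element) per core, which is the combination that matches both quoted sizes. The helper name is hypothetical.

```swift
/// Bytes touched per core for `vectors` Float arrays of `elements` each.
func workingSetBytes(elements: Int, vectors: Int = 3) -> Int {
    elements * MemoryLayout<Float>.size * vectors
}

let currentBytes = workingSetBytes(elements: 8 * 1024)       // 98,304 bytes = 96 KB
let previousBytes = workingSetBytes(elements: 1024 * 1024)   // 12,582,912 bytes = 12 MB
```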

Memory

Unified memory subsystem performance.

Sequential Read

Linear read of a 256 MB buffer
Page-aligned allocation with optional mlock
Metric: GB/s

Sequential Write

Linear write of a 256 MB buffer
Page-aligned allocation with optional mlock
Metric: GB/s

Copy

memcpy between two 256 MB buffers
Page-aligned allocation with optional mlock
Metric: GB/s
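
The copy test could look roughly like the sketch below. It assumes GB means 10^9 bytes and treats `mlock` as best-effort. The function name and the pre-touch step are illustrative choices, not details from the benchmark source.

```swift
import Foundation

/// Copies `byteCount` bytes between two page-aligned buffers and returns GB/s.
/// `lock` requests mlock on both buffers; failure is ignored because locking is optional.
func copyBandwidth(byteCount: Int, lock: Bool = false) -> Double {
    let page = Int(getpagesize())
    let src = UnsafeMutableRawPointer.allocate(byteCount: byteCount, alignment: page)
    let dst = UnsafeMutableRawPointer.allocate(byteCount: byteCount, alignment: page)
    defer { src.deallocate(); dst.deallocate() }
    if lock { _ = mlock(src, byteCount); _ = mlock(dst, byteCount) }
    memset(src, 0xA5, byteCount)   // touch pages so they are mapped before timing
    memset(dst, 0, byteCount)

    let start = ProcessInfo.processInfo.systemUptime
    memcpy(dst, src, byteCount)
    let elapsed = max(ProcessInfo.processInfo.systemUptime - start, 1e-9)
    if lock { _ = munlock(src, byteCount); _ = munlock(dst, byteCount) }
    return Double(byteCount) / elapsed / 1e9   // GB/s, with 1 GB = 10^9 bytes (assumption)
}
```

A call such as `copyBandwidth(byteCount: 256 * 1024 * 1024, lock: true)` would mirror the 256 MB buffer size described above.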

Latency

Pointer-chase random access
32 MB working set, 10M accesses
Metric: ns (lower is better)
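
A pointer-chase test needs a chain in which every load depends on the previous one and every slot is visited before the chain repeats. The sketch below builds such a chain with Sattolo's algorithm and a small seeded generator. Both are illustrative choices and are not confirmed to be what the benchmark uses.

```swift
import Foundation

/// Small deterministic generator (SplitMix64) so the chain is reproducible.
struct SplitMix64: RandomNumberGenerator {
    var state: UInt64
    mutating func next() -> UInt64 {
        state &+= 0x9E37_79B9_7F4A_7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58_476D_1CE4_E5B9
        z = (z ^ (z >> 27)) &* 0x94D0_49BB_1331_11EB
        return z ^ (z >> 31)
    }
}

/// Builds a single-cycle permutation (Sattolo's algorithm): chain[i] is the slot visited after i.
func makeChain(count: Int, seed: UInt64 = 1) -> [Int] {
    var rng = SplitMix64(state: seed)
    var chain = Array(0..<count)
    var i = count - 1
    while i > 0 {
        let j = Int.random(in: 0..<i, using: &rng)   // j < i keeps the result one cycle
        chain.swapAt(i, j)
        i -= 1
    }
    return chain
}

/// Follows the chain for `accesses` dependent loads; returns average ns per access.
func chaseLatency(chain: [Int], accesses: Int) -> (ns: Double, last: Int) {
    var index = 0
    let start = ProcessInfo.processInfo.systemUptime
    for _ in 0..<accesses { index = chain[index] }
    let elapsed = ProcessInfo.processInfo.systemUptime - start
    return (elapsed * 1e9 / Double(accesses), index)
}
```

With 8-byte `Int` entries, a 32 MB working set corresponds to 4,194,304 slots, for example `chaseLatency(chain: makeChain(count: 4_194_304), accesses: 10_000_000)`.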

Note: Advanced profiling adds stride throughput and block-size sweeps for empirical cache boundary detection. Those sweeps report throughput vs stride/size, not latency.

Disk

Storage performance with cache bypass. Patterns are NovaBench-compatible for direct comparison.

Why NovaBench-Compatible?

We align with NovaBench patterns to enable meaningful cross-tool comparisons:

Sequential: 4MB blocks (NovaBench uses up to 8 simultaneous)
Random: 4KB blocks, QD1 (one operation at a time)
All metrics in MB/s for consistency

Setup

Creates a unique temp directory per run
Uses O_NOFOLLOW to prevent symlink attacks
Uses F_NOCACHE to bypass the filesystem cache
Uses F_FULLFSYNC to force sync to physical media
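
The setup steps above map onto a handful of POSIX calls. The following is a sketch only: the directory naming scheme, the file name, and the `O_EXCL` flag are assumptions added for the example.

```swift
#if canImport(Darwin)
import Foundation

/// Creates a fresh directory for this run and opens a test file with caching disabled.
func openBenchmarkFile() -> (fd: Int32, directory: URL)? {
    let dir = FileManager.default.temporaryDirectory
        .appendingPathComponent("bench-\(UUID().uuidString)")   // hypothetical naming scheme
    guard (try? FileManager.default.createDirectory(at: dir, withIntermediateDirectories: false)) != nil else {
        return nil
    }
    let path = dir.appendingPathComponent("test.bin").path
    // O_NOFOLLOW refuses to open the path if its final component is a symlink.
    let fd = open(path, O_RDWR | O_CREAT | O_EXCL | O_NOFOLLOW, 0o600)
    guard fd >= 0 else {
        try? FileManager.default.removeItem(at: dir)
        return nil
    }
    _ = fcntl(fd, F_NOCACHE, 1)   // ask the kernel not to cache this file's data (best-effort)
    return (fd, dir)
}

/// Asks the drive to flush its buffers to physical media; returns false if the request fails.
func syncToMedia(_ fd: Int32) -> Bool {
    fcntl(fd, F_FULLFSYNC) != -1
}
#endif
```

`F_FULLFSYNC` goes further than `fsync`, which only hands data to the drive. That difference is why sequential write results are so sensitive to it.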

Sequential Write

4 MB chunks into a 256 MB (quick) or 1 GB (normal) file
Cache bypass + final sync
Metric: MB/s

Sequential Read

4 MB chunks from a 256 MB (quick) or 1 GB (normal) file
File is synced before reading to help ensure a cold read
Metric: MB/s

Random Write

4 KB writes at random offsets in a 512 MB (quick) or 1 GB (normal) sparse file
QD1 pattern (one I/O at a time)
Metric: MB/s (converted from IOPS × 4KB / 1MB)

Random Read

4 KB reads at random offsets in a 512 MB (quick) or 1 GB (normal) file
QD1 pattern (one I/O at a time)
Metric: MB/s (converted from IOPS × 4KB / 1MB)
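
The IOPS-to-MB/s conversion used by both random tests is simple arithmetic. The sketch below assumes binary units (4 KB = 4,096 bytes, 1 MB = 1,048,576 bytes). The function name is hypothetical.

```swift
/// Converts a count of fixed-size random operations into MB/s.
func randomThroughputMBps(operations: Int, seconds: Double, blockBytes: Int = 4096) -> Double {
    let iops = Double(operations) / seconds
    return iops * Double(blockBytes) / 1_048_576   // binary megabytes (assumption)
}
```

Under these assumptions 256 IOPS equals exactly 1 MB/s, so a result of about 43 MB/s corresponds to roughly 11,000 operations per second at QD1.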

Disk Parameters

Parameter	Quick Mode	Normal Mode
Sequential file size	256 MB	1 GB
Sequential chunk size	4 MB	4 MB
Random file size	512 MB	1 GB
Random block size	4 KB	4 KB
Random operations	500	2,000

Disk I/O Volume (per iteration)

Disk tests run repeatedly for the selected duration, so total bytes scale with duration and drive speed. Per iteration of each subtest:

Sequential write/read: writes/reads the sequential file size (256 MB quick / 1 GB normal) in 4 MB blocks, then syncs
Random write: 500/2,000 operations of 4 KB each (about 2 MB / 8 MB total), then final sync
Random read: 500/2,000 operations of 4 KB each (about 2 MB / 8 MB total); each iteration also creates and syncs a 512 MB / 1 GB file to keep reads cold

Cache bypass is best-effort; macOS VM and SSD controller caches can still influence results.

Comparison with NovaBench

Metric	Our Tool (M1)	NovaBench M1	Why Different
Seq Read	~2180 MB/s	3356 MB/s	We use F_NOCACHE
Seq Write	~700 MB/s	3279 MB/s	We use F_FULLFSYNC
Rand Read	~43 MB/s	166 MB/s	1GB file exceeds cache
Rand Write	~17 MB/s	761 MB/s	F_NOCACHE + final sync

Note: Our tool measures actual disk performance (what you'd see in real workloads), not cache throughput. NovaBench numbers include filesystem cache effects, which makes them higher but less representative of sustained I/O performance.

GPU (Metal)

Integrated GPU compute benchmarks using Metal.

Compute (Matrix Multiply)

Dense matrix multiplication
Size: 1024x1024 (quick) or 2048x2048 (normal)
Metric: GFLOPS
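
For reference, a GFLOPS figure for an N×N matrix multiply is usually derived from 2N³ floating-point operations (one multiply and one add per inner-loop step). The sketch below uses that convention. This page does not confirm that the benchmark counts operations the same way.

```swift
/// GFLOPS for one dense N×N matrix multiply, counting 2·N³ operations.
func matmulGFLOPS(n: Int, seconds: Double) -> Double {
    let flops = 2.0 * Double(n) * Double(n) * Double(n)
    return flops / seconds / 1e9
}
```

Under this convention the normal-mode 2048×2048 multiply performs 8× the work of the quick-mode 1024×1024 one.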

Particles Simulation

N-body style simulation
Particles: 100,000 (quick) or 1,000,000 (normal)
Metric: Mparts/s

Gaussian Blur

5x5 convolution on an image
Size: 2048x2048 (quick) or 4096x4096 (normal)
Metric: MP/s

Edge Detection (Sobel)

Sobel filter on the same image size
Metric: MP/s

GPU Parameters

Parameter	Quick Mode	Normal Mode
Matrix size	1024×1024	2048×2048
Particle count	100,000	1,000,000
Image size (blur/edge)	2048×2048	4096×4096

GPU Notes

Metal shaders are compiled at runtime from inline source (SPM compatible)
Texture data is heap-allocated to avoid stack overflow on large images (v2.1.1+)
Uses deterministic patterns for reproducibility
If Metal is unavailable, GPU tests return 0 and are scored as "Failed"
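
Compiling a shader from inline source, with a fallback when Metal is unavailable, might look like the sketch below. The kernel `scale_by_two` and the helper `makePipeline` are hypothetical and stand in for the benchmark's own shaders.

```swift
#if canImport(Metal)
import Metal

let shaderSource = """
#include <metal_stdlib>
using namespace metal;
kernel void scale_by_two(device float *data [[buffer(0)]],
                         uint id [[thread_position_in_grid]]) {
    data[id] = data[id] * 2.0f;
}
"""

/// Returns a compute pipeline, or nil when Metal is unavailable or compilation fails.
func makePipeline() -> MTLComputePipelineState? {
    guard let device = MTLCreateSystemDefaultDevice(),
          let library = try? device.makeLibrary(source: shaderSource, options: nil),
          let function = library.makeFunction(name: "scale_by_two") else { return nil }
    return try? device.makeComputePipelineState(function: function)
}
#endif
```

A caller that receives `nil` can report a result of 0 for the test, matching the "Failed" behaviour described above.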

AI/ML (v2.0.0+)

Neural Engine and machine learning inference benchmarks using CoreML and Accelerate.

Why a Separate AI Score?

The AI/ML score is not included in the Total Score. This mirrors Geekbench AI's approach:

AI workloads have different characteristics than traditional benchmarks
Neural Engine performance varies significantly between chip generations
Users may care about AI performance independently from general compute

CoreML Inference Tests

All CoreML tests use the same model (MobileNetV2 image classification) with different compute units:

CPU Inference

What: CoreML inference using CPU-only
Compute Units: .cpuOnly
Metric: IPS (inferences per second)
Measures: CPU's ability to run ML models without GPU/ANE

GPU Inference

What: CoreML inference using GPU
Compute Units: .cpuAndGPU
Metric: IPS (inferences per second)
Measures: GPU's ability to accelerate ML inference

Neural Engine Inference

What: CoreML inference with Neural Engine
Compute Units: .all (CoreML schedules to ANE when beneficial)
Metric: IPS (inferences per second)
Measures: ANE throughput for supported operations
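
The three inference tests differ only in the compute units passed to CoreML. A minimal sketch of that mapping follows, with the `InferenceTarget` enum being a hypothetical name introduced for the example:

```swift
#if canImport(CoreML)
import CoreML

enum InferenceTarget: CaseIterable { case cpu, gpu, neuralEngine }

/// Builds a model configuration restricted to the compute units for one test.
func configuration(for target: InferenceTarget) -> MLModelConfiguration {
    let config = MLModelConfiguration()
    switch target {
    case .cpu:          config.computeUnits = .cpuOnly
    case .gpu:          config.computeUnits = .cpuAndGPU
    case .neuralEngine: config.computeUnits = .all   // CoreML may still place layers on CPU or GPU
    }
    return config
}
#endif
```

The configuration is then passed when loading the compiled model, for example `try MLModel(contentsOf: compiledModelURL, configuration: configuration(for: .gpu))`, where `compiledModelURL` is a placeholder for the cached `.mlmodelc` location.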

BNNS (Accelerate Framework)

What: Matrix multiplication using vDSP
Size: 512×512 (quick) or 1024×1024 (normal)
Metric: GFLOPS
Measures: CPU vector math performance for ML workloads

AI Model Management

The benchmark downloads a CoreML model from Apple's official ML assets on first run:

Model: MobileNetV2 image classification (~17MB source)
Source: ml-assets.apple.com (Apple's official CoreML model repository)
Compilation: Compiled locally using CoreML framework (no Xcode required)
Cache: ~/Library/Application Support/osx-bench/models/
Offline mode: Use --offline to skip if model not cached
Custom model: Use --model-path for local .mlmodelc models

AI Test Parameters

Parameter	Quick Mode	Normal Mode
Warmup iterations	5	20
Min iterations	5	10
Max iterations	1,000	10,000
BNNS matrix size	512×512	1024×1024
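
One plausible way the warmup, minimum, and maximum iteration counts could interact with the time budget is sketched below. The loop policy (run until the duration elapses, but never fewer than the minimum nor more than the maximum) is an assumption, and the type and function names are hypothetical.

```swift
import Foundation

struct IterationPolicy {
    var warmup: Int
    var minIterations: Int
    var maxIterations: Int
}

/// Runs untimed warmup calls, then timed calls, and returns inferences per second.
func measureIPS(policy: IterationPolicy, duration: TimeInterval,
                _ infer: () -> Void) -> (iterations: Int, ips: Double) {
    for _ in 0..<policy.warmup { infer() }   // warmup runs are not timed
    let start = ProcessInfo.processInfo.systemUptime
    var count = 0
    var elapsed = 0.0
    while count < policy.maxIterations && (count < policy.minIterations || elapsed < duration) {
        infer()
        count += 1
        elapsed = ProcessInfo.processInfo.systemUptime - start
    }
    return (count, Double(count) / max(elapsed, 1e-9))
}
```

With the quick-mode values this would be `IterationPolicy(warmup: 5, minIterations: 5, maxIterations: 1_000)`.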

AI Notes

Neural Engine availability depends on macOS version and chip
CoreML decides layer scheduling; not all operations run on the ANE even with .all
Input is synthetic (deterministic pattern for reproducibility)
Results comparable to Geekbench AI methodology (see Geekbench AI Workloads paper)

Thermal Monitoring

Thermal state is tracked during a run using the macOS public API:

Nominal
Fair
Serious
Critical

If throttling is detected, scores may be lower than peak performance.
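
The four states listed above match Foundation's `ProcessInfo.ThermalState`, so a reading could be taken as in the sketch below. This assumes that is the public API in question, since this page does not name it.

```swift
#if canImport(Darwin)
import Foundation

/// Maps a thermal state to the label used in this page.
func label(for state: ProcessInfo.ThermalState) -> String {
    switch state {
    case .nominal:  return "Nominal"
    case .fair:     return "Fair"
    case .serious:  return "Serious"
    case .critical: return "Critical"
    @unknown default: return "Unknown"
    }
}

let currentState = ProcessInfo.processInfo.thermalState
print("Thermal state: \(label(for: currentState))")
#endif
```

Sampling the state before and after each category is enough to flag a run in which the state rose above Nominal.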

Duration Modes

Mode	Flag	Duration	Use Case
Quick	--quick	~3s per category	Fast iteration
Normal	default	10s per category	Standard runs
Custom	-d N	N seconds	Tuned runs
Stress	--stress	~60s per category	Sustained performance
Repeat	--repeats N	N × full run	Publishable results (median)
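
For repeated runs, the reported value is the median of the per-run scores. A small sketch of that reduction follows. Averaging the two middle values for an even number of runs is an assumption.

```swift
/// Median of a list of scores; nil when the list is empty.
func median(_ values: [Double]) -> Double? {
    guard !values.isEmpty else { return nil }
    let sorted = values.sorted()
    let mid = sorted.count / 2
    return sorted.count % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2
}
```

The median is less sensitive than the mean to a single throttled or interrupted run, which is why it suits publishable results.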
