rule-image

Compile-once, memory-map-many: a Java AOT rule-image compiler and runtime for immutable datasets.

Take a large, read-mostly dataset (rule sets, policy graphs, feature flags, GeoIP ranges, signature databases), compile it ahead of time into a flat binary .rimg image, and memory-map it at runtime at near-zero heap cost (about 8 MB of heap in the showcases below). Built on the Java FFM API (MemorySegment), benchmarked with virtual threads, and designed with a forward path to Valhalla value-class views.

The headline number

Metric	Heap (POJO) baseline	`rule-image` mapped	Delta
Heap after load	853.80 MB	8.07 MB	105× less heap
Load / open time	6,683 ms	145 ms	45× faster startup
Reload / swap time	6,595 ms	278 ms	23× faster reload

GeoIP-style showcase, 5 million synthetic entries. Full report: measurements/geoip-memory-showcase/win-5m-thin/report.md

At 100k entries with inflated metadata payloads:

Metric	Heap baseline	`rule-image` mapped
Heap after load	1.47 GB	7.33 MB
Heap delta	—	1.46 GB freed

Full report: measurements/geoip-memory-showcase/win-100k-fat-23_5k/report.md

What it does

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│ Extract  │───▶│ Compile  │───▶│   Map    │───▶│  Serve   │
│ (Postgres│    │ JSON →   │    │ mmap +   │    │ zero-    │
│  / JSON) │    │  .rimg   │    │ FFM API  │    │ alloc    │
└──────────┘    └──────────┘    └──────────┘    │ lookups  │
                                                └──────────┘

Extract — pull rows from PostgreSQL (or feed normalized JSON directly)
Compile — build a versioned .rimg binary with MPHF index + optional Bloom filter
Map — FileChannel.map() + Arena.ofShared() → MemorySegment
Serve — absolute-offset reads, no heap allocation on the hot path
Swap — atomic hot-swap under load, zero-downtime refresh
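
The Map and Serve steps can be sketched as follows. This is a minimal illustration and not the repo's runtime module: the class name MappedImageSketch and its accessor methods are hypothetical, and it assumes a JDK where the FFM API is final (22+). It maps a file read-only into a shared arena and reads little-endian values at absolute byte offsets, so a lookup allocates nothing on the heap.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: map a file once, then read it at absolute offsets. */
public final class MappedImageSketch implements AutoCloseable {
    // Unaligned little-endian layouts, so reads work at any byte offset of a packed file.
    private static final ValueLayout.OfInt INT_LE =
            ValueLayout.JAVA_INT_UNALIGNED.withOrder(ByteOrder.LITTLE_ENDIAN);
    private static final ValueLayout.OfLong LONG_LE =
            ValueLayout.JAVA_LONG_UNALIGNED.withOrder(ByteOrder.LITTLE_ENDIAN);

    // A shared arena lets any thread (platform or virtual) read the segment.
    private final Arena arena = Arena.ofShared();
    private final MemorySegment image;

    public MappedImageSketch(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed; the arena owns its lifetime.
            image = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size(), arena);
        } catch (IOException e) {
            arena.close();
            throw new UncheckedIOException(e);
        }
    }

    public long size() {
        return image.byteSize();
    }

    /** Absolute-offset read of a 32-bit little-endian value. */
    public int intAt(long offset) {
        return image.get(INT_LE, offset);
    }

    /** Absolute-offset read of a 64-bit little-endian value. */
    public long longAt(long offset) {
        return image.get(LONG_LE, offset);
    }

    /** Unmaps the file; later reads throw IllegalStateException. */
    @Override
    public void close() {
        arena.close();
    }
}
```

Typical use is try (var img = new MappedImageSketch(Path.of("rules.rimg"))) { int v = img.intAt(0); }. What the value at a given offset means depends on the layout in format-spec/SPEC.md, which this sketch does not model.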

Honest benchmarks

Where `rule-image` wins: memory, startup, reload

Workload	Heap	Mapped	Ratio
5M thin GeoIP entries, heap	853 MB	8 MB	105×
100k fat-payload entries, heap	1.47 GB	7.33 MB	200×
5M thin, load time	6.7 s	145 ms	45×
100k fat, reload time	6.6 s	278 ms	23×

Where heap still wins: warm lookup latency

Benchmark	Heap	Mapped	Ratio
JMH single warm lookup	21 ns/op	79 ns/op	3.7× slower
JMH composed (N=10)	400 ns/op	684 ns/op	1.7× slower
JMH composed (N=100)	4.3 µs/op	8.4 µs/op	1.9× slower

Full JMH matrix: measurements/week3-report.md

Versus Netflix Hollow (head-to-head on Linux)

Benchmarked against Netflix Hollow on the same GeoIP-like workload. Reports:

Versus shared-store miss paths (FF4J Redis / JDBC)

Path	Avg latency
FF4J uncached Redis	301 µs
FF4J uncached network JDBC	679 µs
FF4J warm Redis cache	0.85 µs
`rule-image` warm lookup	sub-µs (79 ns/op in the JMH single warm lookup above)

The strongest case for rule-image: replacing repeated remote metadata fetch with a local compiled snapshot.

Hot-swap chaos test

10,000 concurrent virtual-thread readers
Forced swap every 500ms for 5 continuous minutes
Zero segfaults, zero stale reads, zero lost evaluations
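
One way to structure a swap like the one exercised here is to publish the new snapshot through an AtomicReference and only then close the old one. The sketch below is an assumption about the general pattern, not the repo's implementation (see the decision docs for the approach actually chosen): HotSwapSketch and its methods are hypothetical names. It relies on the snapshot throwing IllegalStateException when read after close, which is how a segment from a closed shared Arena behaves.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

/** Hypothetical sketch: atomic publish of a new snapshot, then release of the old one. */
public final class HotSwapSketch<S extends AutoCloseable> {
    private final AtomicReference<S> current;

    public HotSwapSketch(S initial) {
        this.current = new AtomicReference<>(initial);
    }

    /**
     * Runs a lookup against the current snapshot. If the snapshot was closed by a
     * concurrent swap, the lookup is retried against the newly published one.
     */
    public <R> R read(Function<S, R> lookup) {
        while (true) {
            S snapshot = current.get();
            try {
                return lookup.apply(snapshot);
            } catch (IllegalStateException closedUnderReader) {
                // Same snapshot still published: the failure is not a swap race, so rethrow.
                if (snapshot == current.get()) {
                    throw closedUnderReader;
                }
            }
        }
    }

    /** Publishes the next snapshot first, then closes the previous one. */
    public void swap(S next) {
        S old = current.getAndSet(next);
        try {
            old.close();
        } catch (Exception e) {
            throw new IllegalStateException("failed to close previous snapshot", e);
        }
    }
}
```

Because readers always start from current.get(), a lookup that begins after swap returns can only see the new snapshot. A lookup that was already in flight either finishes on the old snapshot or is retried on the new one.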

Quick start

Prerequisites: JDK 26 (Temurin recommended), Gradle wrapper included.

# Clone
git clone https://github.com/AlphaSudo/rimg.git
cd rimg

# Build + test
./gradlew test

# Run the GeoIP memory showcase (100k entries)
./gradlew :geoip-showcase:run --args="--entries 100000 --lookups 10000 --warmup-lookups 2000"

# Compile a custom .rimg from JSON
./gradlew :compiler:run --args="--input fixtures/rules-10000.json --out rules.rimg"

# Inspect a compiled image
./gradlew :inspector:run --args="stats --input rules.rimg"

Windows (PowerShell)

.\gradlew.bat test
powershell -ExecutionPolicy Bypass -File .\scripts\Run-GeoIpMemoryShowcase.ps1 -Entries 100000 -Lookups 10000 -WarmupLookups 2000

Linux benchmarks

./gradlew test :benchmark:jmhJar
./scripts/bench-linux-harness.sh
sudo ./scripts/bench-linux-cold-cache.sh

Architecture

Modules

Module	Purpose
`format-spec`	Binary layout spec, Bloom filter, hash utilities
`compiler`	AOT compiler: JSON → `.rimg` with MPHF index
`runtime`	Mapped reader, header validation, Linux page-fault mitigations
`codegen`	Schema-driven accessor generator
`benchmark`	JMH microbenchmarks (warm, cold, concurrent, composed)
`inspector`	CLI: hexdump, stats, format validation
`service-harness`	Feature-flagged service with `heap\|rimg` + `platform\|virtual` modes
`load-driver`	HTTP load generator for the service harness
`geoip-showcase`	Dedicated heap-vs-mapped memory showcase
`hollow-showcase`	Netflix Hollow comparison harness
`record-showcase`	Real-data heap-vs-mapped benchmark
`postgres-export`	PostgreSQL → normalized JSON exporter

Binary format (v0.4)

┌────────────────────────────┐
│ Header (magic, version,    │
│   CRC32, SHA-256, flags)   │
├────────────────────────────┤
│ MPHF bucket seeds [int[]]  │
│ Slot offset table [long[]] │
│ Bloom filter (optional)    │
├────────────────────────────┤
│ Entry region               │
│  key + ruleId + priority   │
│  + action (packed, LE)     │
└────────────────────────────┘

Full spec: format-spec/SPEC.md
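
To make the header check concrete, here is a small sketch of validating magic, version, and CRC32 before serving from an image. The field offsets, the 12-byte header size, the magic constant, and the choice to checksum everything after the header are all hypothetical and chosen only for illustration. The real v0.4 layout (including the SHA-256 and flags fields) is defined in format-spec/SPEC.md.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteOrder;
import java.util.zip.CRC32;

/** Hypothetical sketch of header validation; offsets and sizes are NOT the real .rimg layout. */
public final class HeaderCheckSketch {
    public static final int MAGIC = 0x474D4952;  // bytes 'R','I','M','G' in little-endian order (assumed)
    public static final long MAGIC_OFFSET = 0;   // assumed
    public static final long VERSION_OFFSET = 4; // assumed
    public static final long CRC_OFFSET = 8;     // assumed
    public static final long HEADER_SIZE = 12;   // assumed

    private static final ValueLayout.OfInt INT_LE =
            ValueLayout.JAVA_INT_UNALIGNED.withOrder(ByteOrder.LITTLE_ENDIAN);

    /** Throws IllegalArgumentException if the image fails any header check. */
    public static void validate(MemorySegment image, int expectedVersion) {
        if (image.byteSize() < HEADER_SIZE) {
            throw new IllegalArgumentException("truncated image");
        }
        if (image.get(INT_LE, MAGIC_OFFSET) != MAGIC) {
            throw new IllegalArgumentException("bad magic");
        }
        int version = image.get(INT_LE, VERSION_OFFSET);
        if (version != expectedVersion) {
            throw new IllegalArgumentException("unsupported version " + version);
        }
        // CRC32 over everything after the header. asByteBuffer() only supports
        // slices up to Integer.MAX_VALUE bytes; a larger image would need chunking.
        CRC32 crc = new CRC32();
        crc.update(image.asSlice(HEADER_SIZE).asByteBuffer());
        int stored = image.get(INT_LE, CRC_OFFSET);
        if ((int) crc.getValue() != stored) {
            throw new IllegalArgumentException("CRC32 mismatch");
        }
    }
}
```

Running a check like this once at open time, before the image is published to readers, keeps the lookup path free of validation work.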

Page-fault mitigations

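Virtual threads + mmap have a real operational subtlety: a hard page fault on a mapped page blocks the carrier thread inside the kernel, so the virtual thread cannot unmount and the carrier is unavailable to other virtual threads until the page is resident. JEP 491 does not fix this, since it only removes pinning caused by synchronized. This repo implements and benchmarks: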

Pre-touch — sequential walk, one byte per 4 KiB page
madvise(WILLNEED) — FFM downcall, best tested cold-path mitigation
mlock — pin pages in RAM for strict tail-latency SLAs
Carrier pool tuning — jdk.virtualThreadScheduler.parallelism sweep

Details: docs/ADR.md §6.1
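
The pre-touch mitigation is the simplest of the four to illustrate. The sketch below walks a segment sequentially and reads one byte per 4 KiB page so that page faults happen up front instead of on the first lookup. PreTouchSketch is a hypothetical name, and the 4096-byte page size is an assumption that does not hold on every platform. For mapped segments the JDK also offers MemorySegment.load(), which makes a best-effort request to bring the mapping into physical memory.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

/** Hypothetical sketch: fault in every page of a segment before serving from it. */
public final class PreTouchSketch {
    static final long PAGE_SIZE = 4096; // assumed page size

    // Written after the walk so the JIT cannot treat the reads as dead code.
    private static volatile long sink;

    /** Reads one byte per page, front to back, and returns the number of pages touched. */
    public static long preTouch(MemorySegment image) {
        long acc = 0;
        long pages = 0;
        for (long offset = 0; offset < image.byteSize(); offset += PAGE_SIZE) {
            acc += image.get(ValueLayout.JAVA_BYTE, offset);
            pages++;
        }
        sink = acc;
        return pages;
    }
}
```

Pre-touch only helps while the pages stay resident. Under memory pressure the kernel can evict them again, which is where mlock comes in for strict tail-latency targets.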

Who this is for

Best-fit workload shapes:

Feature/config evaluation services
Policy and authorization lookup
Pricing, routing, eligibility, decision-table services
Replay/state-transition services with mostly-static dispatch metadata
Large immutable lookup datasets (GeoIP-style range mappings)

Poor fit: CRUD-heavy, write-heavy, or highly dynamic per-request data.

Real-data pipeline

The repo includes a working end-to-end path from a real PostgreSQL database:

Export with postgres-export → normalized JSON
Compile with compiler → .rimg
Benchmark with record-showcase → heap-vs-mapped comparison

Example config: examples/postgres-rule-export-config.example.json

What we do NOT claim

"This is faster than everything" — warm lookup still loses to heap
"Virtual threads + mmap always win" — they don't; see the data
"Production-ready for all workloads" — this is a PoC with real evidence

The honest positioning: rule-image wins on memory shape, startup, reload, and cold/miss-path avoidance. If your current heap path is already warm and cheap, this is not automatically better. The data is in the repo — judge for yourself.

Deeper reading

Document	What it covers
Architecture Decision Record	Full technical rationale, prior art matrix, risk analysis
Architecture guide	Detailed comparison methodology and benchmark interpretation
Blog post draft	"Mmap + Virtual Threads + Panama: A Pattern, Not a Revolution"
Valhalla engagement	Value-class view pattern for `valhalla-dev`
Format specification	Binary layout v0.4
Week 3 report	Full Phase 3 benchmark artifacts
Phase 3 closeout	What was proven, what remains
Decision docs	7 decision records (hot-swap, layout, trust, go/no-go)

Requirements

JDK 26 (Temurin). JDK 22+ minimum (FFM GA). JDK 24+ recommended (JEP 491).
Linux for full feature support (madvise, mlock, perf stat)
Windows works for development and most benchmarks; no madvise/mlock

License

Apache License 2.0
