Skip to content

lovablelabs/maglev-hash

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

maglev-hash

Crates.io Documentation License

Maglev consistent hashing with top-K preference lists for replica-aware routing.

This crate implements the lookup table construction algorithm from the Google Maglev paper, extended with a top-K preference list for replica-aware routing.

Usage

use maglev_hash::MaglevTable;

let nodes = vec!["node-a".to_string(), "node-b".to_string(), "node-c".to_string()];
let table = MaglevTable::new(nodes);

// Primary lookup — O(1)
let primary = table.get(&"my-key").unwrap();
println!("primary node: {primary}");

// Top-2 preference list (first element == primary)
let top2 = table.get_top_k(&"my-key", 2);
assert_eq!(top2[0], primary);
println!("fallback node: {}", top2[1]);

// Check if a specific node is in the top-2 for this key
let in_top2 = table.is_match(&"my-key", &"node-a".to_string(), 2);
println!("node-a in top-2: {in_top2}");

Top-K Preference Lists (crate extension)

Note: top-K preference lists are not part of the original Maglev paper, which defines only a single-node lookup. This crate extends the algorithm with get_top_k() and is_match() for replica-aware routing.

get_top_k(key, k) returns up to K distinct nodes in a deterministic preference order by walking the lookup table from the hashed slot and collecting unique nodes. This guarantees that:

  • The first element always agrees with get().
  • The ordering is deterministic and consistent across calls.
  • When a node is removed, the relative ordering of remaining nodes is largely preserved.

is_match() checks whether a specific node appears in the top-K for a key, useful for determining if a node should accept a request when running with replication.

Tradeoffs vs. suffix hashing

An alternative approach is to hash the key with different suffixes (e.g., get("key-0"), get("key-1")) and deduplicate. Compared to that approach, the table-walk method:

  • Guarantees K distinct nodes in a single pass (no dedup retries).
  • Preserves consistency: get_top_k(key, 1)[0] == get(key).
  • Couples ranks: removing a node at rank 1 can cascade and shift ranks 2+, whereas suffix hashing gives each rank independent disruption.

Performance

The get_top_k() and is_match() implementations use a u64 bitset for deduplication when the node count is ≤ 64 (zero heap allocation, single- instruction set/test on x86_64), falling back to a heap-allocated FixedBitSet for larger rings. The inner scan avoids modulo arithmetic by chaining two range iterators.

Properties

  • O(1) lookups — a single hash indexes into the table.
  • Minimal disruption — when a node is added or removed, only ~1/n of keys are remapped.
  • Uniform distribution — load is spread evenly across nodes.
  • Deterministic — same node set always produces the same table.

Correctness guarantees

This crate is tested against pitfalls found in other consistent-hashing implementations:

  • Deterministic seeds — fixed seeds ensure all processes build identical tables from the same node set. (Other crates use random SipHasher keys, producing different tables on each run.)
  • Independent node hashing — each node's hash is computed in isolation with a fresh hasher. Adding or removing nodes does not change existing nodes' hash values. (Other crates reuse a single hasher across nodes, creating input-order-dependent tables.)
  • Minimal disruption for preference lists — adding a node causes ~1/N disruption at each rank of the preference list, not just the primary. Set membership of the top-K is preserved with high probability.
  • On-the-fly permutation computation — the lookup table is built without pre-allocating the full N×M permutation matrix, using O(M) memory instead of O(N×M).

Development

# Set up pre-push hook (runs fmt, clippy, tests before each push)
git config core.hooksPath .githooks

# Release a new version (bumps Cargo.toml, commits, tags, pushes)
# CI publishes to crates.io automatically on tag push.
cargo release patch  # or: minor, major

Requires cargo-release: cargo install cargo-release

License

Licensed under either of

at your option.

About

Maglev consistent hashing with top-K preference lists for replica-aware routing

Resources

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors