Skip to content

Frozen{Map,Set} for H3 cells, based on finite state transducers

License

Notifications You must be signed in to change notification settings

HydroniumLabs/h3o-ice

Repository files navigation

h3o-ice — Frozen (aka read-only) sets and maps for H3 cell index

Crates.io Docs.rs CI Status Coverage License

Those data structures are built on top of finite state transducers, which allows them to be extremely compact while offering fast lookup.

This is especially for read-only indexes (both in-memory or on-disk).

Installation

Cargo

  • Install the rust toolchain in order to have cargo installed by following this guide.
  • run cargo install h3o-ice

Usage

use h3o::{LatLng, Resolution};
use h3o_ice::FrozenSet;

let coord = LatLng::new(48.872280706 2.332697839).expect("valid coord");
let set = FrozenSet::try_from_iter(
    coord.to_cell(Resolution::Nine)
        .children(Resolution::Ten)
)
.expect("failed to create set");

set.contains(CellIndex::try_from(0x8a1fb4666417fff).expect("valid cell"));

Comparison with other data structures

Frozen{Set,Map} BTree{Set,Map} Hash{Set,Map}
Memory overhead ✅ (negative 1) ❌ (50%2) ❌ (12-125%2)
Search complexity O(key size) O(log #items) O(1)
Range query
Prefix lookup
Insertion/Deletion
Direct lookup 163 ns 55 ns 19 ns
Compacted lookup 67 ns 401 ns 125 ns

About the lookup time:

  • input dataset is France coverage at resolution 10:
    • raw dataset: 44 250 550 cells (333.60M).
    • compacted (i.e. replacing clusters of cells their ancestors): 127 264 cells (0.97M)
  • Lookup of a resolution 10 cell:
    • single lookup in the raw dataset.
    • prefix lookup (or repetitive lookup if not supported) in the compacted dataset.

You can run the provided benchmarks to get measures relevant to your machine.

To sum up:

  • if you can afford the memory usage and don't care about ordering (e.g. range query) go with a HashSet for maximum speed.
  • if you need ordering but don't need to work on compacted set, go with a BTreeSet.
  • if you have a large read-only dataset, wants to optimize memory usage and be able to query compacted data directly then FrozenSet is your friend.

Frozen{Map,Set} are not general purpose data structures. They come with a bunch of extra constraints (read-only, must be build from pre-sorted input, ...) but they really shine in their niche and provides a bunch a goodies (e.g. key compressions, can be mmap'd, prefix lookup, ...).

License

BSD 3-Clause

Footnotes

  1. Frozen{Set,Map} compresses both common prefixes and suffixes of keys, resulting in a smaller size than the input data set (e.g. the 333MB test dataset fits into a 176KB FrozenSet)

  2. Measuring the overhead of HashMaps in Rust 2

About

Frozen{Map,Set} for H3 cells, based on finite state transducers

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages