A DuckDB loadable extension providing native scalar functions for the IGEO7 (Z7) hexagonal discrete global grid system. Wraps the C++ Z7 library for high-performance bit-level index manipulation and neighbour traversal directly inside DuckDB queries.
This is a work-in-progress to explore the possibility of neighbor traversal in the Z7 domain. Z7 is an indexing system for aperature 7 hexagonal discrete global grid systems, as coined in Kmoch et al. (2025). It is based on the Generalized Balanced Ternary (GBT) numeral system described in Lucas, Gibson (1982), van Roessel (1988), and Wikipedia (2025).
Lucas and Gibson (1982) gave examples of GBT with exclusively counter-clockwise (CCW) rotation. The disadvantage of this approach is that the orientation of the hexagons at the different hierarchy levels quickly diverge from each other. White et al. (1992) proposed an alternating rotation direction (CW, CCW, CW, ...) to maintain an alignment of hexagon orientations across hierarchy levels. Kmoch et al. (2025) adopted this alternating rotation pattern for Z7 assigning CW to odd resolutions and CCW to even resolutions.
- Kmoch, A., Sahr, K., Chan, W. T., & Uuemaa, E. (2025). IGEO7: A new hierarchically indexed hexagonal equal-area discrete global grid system. AGILE: GIScience Series, 6(32). https://doi.org/10.5194/agile-giss-6-32-2025
- Lucas, D., & Gibson, L. (1982). Automated Analysis of imagery. Air Force Office of Scientific Research. https://apps.dtic.mil/sti/tr/pdf/ADA125710.pdf. (No. AFOSRTR830055).
- Sahr, K. (2011). Hexagonal discrete global grid systems for geospatial computing. Archives of Photogrammetry Cartography and Remote Sensing, 22, 363–376.
- van Roessel, J. W. (1988). Conversion of Cartesian coordinates from and to Generalized Balanced Ternary addresses. Photogrammetric Engineering & Remote Sensing, 54(11), 1565–1570.
- White, D., Kimerling, J. A., & Overton, S. W. (1992). Cartographic and Geometric Components of a Global Sampling Design for Environmental Monitoring. Cartography and Geographic Information Systems, 19(1), 5-22. https://doi.org/10.1559/152304092783786636
- Wikipedia Contributors. (2025, December 16). Generalized balanced ternary. Wikipedia; Wikimedia Foundation.
Z7 packs every cell into a single UBIGINT:
bits [63:60] base cell (4 bits, values 0–11)
bits [59:57] digit 1 (3 bits, values 0–6; 7 = padding / unused slot)
bits [56:54] digit 2
...
bits [2:0] digit 20
The compact string form (igeo7_to_string) is a 2-digit zero-padded base
cell followed by the digit characters, stopping before the first 7:
0800432 → base 08, digits 0,0,4,3,2, resolution 5.
The verbose debug form (igeo7_decode_str, SQL macro) always shows all 20
digit slots separated by dots: 08-0.0.4.3.2.7.7.7.7.7.7.7.7.7.7.7.7.7.7.7.
| Function | Signature | Description |
|---|---|---|
igeo7_from_string |
(VARCHAR) → UBIGINT |
Parse Z7 string to packed index |
igeo7_to_string |
(UBIGINT) → VARCHAR |
Compact Z7 string (stops before first 7) |
igeo7_encode |
(base, d1…d20) → UBIGINT |
Pack base cell + 20 digit slots; accepts UTINYINT or INTEGER literals |
igeo7_get_resolution |
(UBIGINT) → INTEGER |
Resolution (0–20) of an index |
igeo7_get_base_cell |
(UBIGINT) → UTINYINT |
Base cell ID (0–11) |
igeo7_get_digit |
(UBIGINT, INTEGER) → UTINYINT |
Extract the i-th digit (1–20) |
igeo7_parent |
(UBIGINT) → UBIGINT |
Parent index (one level up) |
igeo7_parent_at |
(UBIGINT, INTEGER) → UBIGINT |
Ancestor at specific resolution |
igeo7_get_neighbours |
(UBIGINT) → UBIGINT[] |
All 6 neighbours (invalid neighbours = UINT64_MAX) |
igeo7_get_neighbour |
(UBIGINT, INTEGER) → UBIGINT |
Single neighbour by direction (1–6) |
igeo7_first_non_zero |
(UBIGINT) → INTEGER |
Position of first non-zero digit slot |
igeo7_is_valid |
(UBIGINT) → BOOLEAN |
FALSE when index equals the invalid sentinel (UINT64_MAX) |
Source this file after loading the extension. These macros add convenience operations that compose or extend the C++ functions above.
| Macro | Signature | Description |
|---|---|---|
igeo7_decode_str |
(UBIGINT) → VARCHAR |
Verbose base-d1.d2…d20 form showing all 20 digit slots |
igeo7_encode_at_resolution |
(base, res, d1…d20) → UBIGINT |
Encode then truncate to resolution (igeo7_encode + igeo7_parent_at) |
igeo7_string_parent |
(VARCHAR) → VARCHAR |
Drop the last character of a compact Z7 string |
igeo7_string_local_pos |
(VARCHAR) → VARCHAR |
Last character of a compact Z7 string (local position digit) |
igeo7_string_is_center |
(VARCHAR) → BOOLEAN |
True when the last digit is '0' (center of parent) |
- CMake ≥ 3.14
- C++17 compiler (GCC ≥ 9, Clang ≥ 10, MSVC ≥ 19.20)
- DuckDB v1.5.1 source tree (headers + built library + platform file)
Clone and build DuckDB at the matching tag. The build produces the headers,
shared library, and duckdb_platform_out file needed for the extension metadata
footer.
git clone --branch v1.5.1 https://github.com/duckdb/duckdb.git
cd duckdb
BUILD_EXTENSIONS="" make release
cd ..Always pass an absolute path to -DDUCKDB_DIR. A relative path is resolved
against the CMake source directory (not the build directory) and can silently
pick up the wrong DuckDB installation.
If DuckDB was cloned inside duckdb_igeo7/ (the layout in this repo):
cd duckdb_igeo7
mkdir build && cd build
cmake .. -DDUCKDB_DIR=$(pwd)/../duckdb \
-DCMAKE_BUILD_TYPE=Release
make -j$(nproc) # Linux / macOS: nproc or sysctl -n hw.logicalcpuFor a DuckDB checkout elsewhere, use its absolute path directly:
cmake .. -DDUCKDB_DIR=/absolute/path/to/duckdb \
-DCMAKE_BUILD_TYPE=ReleaseThis produces build/igeo7.duckdb_extension with the required binary
metadata footer automatically appended post-build.
cmake .. -DBUILD_TESTS=ON -DDUCKDB_DIR=$(pwd)/../duckdb
make igeo7_test
./igeo7_testDuckDB v1.5.1 only loads files ending in .duckdb_extension. Because this
extension is not signed by the DuckDB team, pass -unsigned when starting the
CLI (or run SET allow_unsigned_extensions = true; inside an existing session).
# One-liner from the duckdb_igeo7/ directory
duckdb/build/release/duckdb -unsigned \
-c "LOAD 'build/igeo7.duckdb_extension'; SELECT igeo7_to_string(igeo7_from_string('0800432'));"Inside a session or SQL script:
LOAD '/absolute/path/to/build/igeo7.duckdb_extension';Pipe the LOAD statement and the macro file together:
cat <(echo "LOAD 'build/igeo7.duckdb_extension';") sql/igeo7_macros.sql \
| duckdb/build/release/duckdb -unsignedOr in the interactive shell, source them with .read after loading the extension:
.read sql/igeo7_macros.sql
SELECT igeo7_to_string(igeo7_from_string('0800432'));
-- '0800432'
SELECT igeo7_get_resolution(igeo7_from_string('0800432'));
-- 5
SELECT igeo7_get_base_cell(igeo7_from_string('0800432'));
-- 8igeo7_to_string produces the compact form; igeo7_decode_str (SQL macro)
produces the verbose dot-separated form showing all 20 slots.
-- compact form
SELECT
32023330408103935::UBIGINT AS cell_id,
igeo7_get_base_cell(cell_id) AS base,
igeo7_get_resolution(cell_id) AS res,
igeo7_to_string(cell_id) AS compact;
-- ┌───────────────────┬──────┬─────┬─────────────────┐
-- │ cell_id │ base │ res │ compact │
-- │ uint64 │ uint8│ int │ varchar │
-- ├───────────────────┼──────┼─────┼─────────────────┤
-- │ 32023330408103935 │ 0 │ 13 │ 000161612062413 │
-- └───────────────────┴──────┴─────┴─────────────────┘
-- verbose form (requires sql/igeo7_macros.sql to be sourced first)
SELECT igeo7_decode_str(32023330408103935::UBIGINT);
-- '0-0.1.6.1.6.1.2.0.6.2.4.1.3.7.7.7.7.7.7.7'igeo7_encode is the inverse of igeo7_get_base_cell + igeo7_get_digit.
Integer literals are accepted without explicit casts. Use 7 for every padding
slot beyond the target resolution.
-- Reconstruct '0800432': base=8, resolution=5, digits 0,0,4,3,2
SELECT igeo7_to_string(
igeo7_encode(8, 0,0,4,3,2, 7,7,7,7,7, 7,7,7,7,7, 7,7,7,7,7)
);
-- '0800432'
-- Must exactly equal igeo7_from_string for the same cell
SELECT igeo7_encode(8, 0,0,4,3,2, 7,7,7,7,7, 7,7,7,7,7, 7,7,7,7,7)
= igeo7_from_string('0800432') AS equal;
-- true
-- Known cell round-trip: base=0, resolution=13
SELECT (igeo7_encode(0, 0,1,6,1,6,1,2,0,6,2,4,1,3, 7,7,7,7,7,7,7)
= 32023330408103935::UBIGINT)::BOOLEAN AS same;
-- trueSELECT
igeo7_get_digit(igeo7_from_string('0800432'), 1) AS d1, -- 0
igeo7_get_digit(igeo7_from_string('0800432'), 2) AS d2, -- 0
igeo7_get_digit(igeo7_from_string('0800432'), 3) AS d3, -- 4
igeo7_get_digit(igeo7_from_string('0800432'), 4) AS d4, -- 3
igeo7_get_digit(igeo7_from_string('0800432'), 5) AS d5; -- 2-- One level up
SELECT igeo7_to_string(igeo7_parent(igeo7_from_string('0800432')));
-- '080043'
-- Ancestor at resolution 3 (keeps digits 1–3, fills the rest with 7)
SELECT igeo7_to_string(igeo7_parent_at(igeo7_from_string('0800432'), 3));
-- '08004'-- All 6 neighbours as compact strings
SELECT igeo7_to_string(unnest(igeo7_get_neighbours(igeo7_from_string('0800432'))));
-- 0800433, 0800651, 0800064, 0800436, 0800430, 0800655
-- Single neighbour by direction (1–6); invalid neighbours = UINT64_MAX
SELECT igeo7_to_string(igeo7_get_neighbour(igeo7_from_string('0800432'), 1));
-- '0800433'
-- Pentagon check: one direction is always invalid at resolution 2
SELECT
generate_series AS dir,
igeo7_to_string(igeo7_get_neighbour(igeo7_from_string('0800'), generate_series::INTEGER)) AS neighbour,
igeo7_is_valid(igeo7_get_neighbour(igeo7_from_string('0800'), generate_series::INTEGER)) AS valid
FROM generate_series(1, 6);WITH RECURSIVE hierarchy AS (
SELECT igeo7_from_string('0800432') AS cell, 0 AS step
UNION ALL
SELECT igeo7_parent(cell), step + 1
FROM hierarchy
WHERE igeo7_get_resolution(cell) > 0
)
SELECT step, igeo7_to_string(cell) AS readable
FROM hierarchy;
-- step 0 → '0800432'
-- step 1 → '080043'
-- step 2 → '08004'
-- step 3 → '0800' (resolution 2)
-- ...
-- step 5 → '08' (resolution 0)SELECT cell_id, igeo7_to_string(cell_id) AS readable
FROM my_table
WHERE igeo7_is_valid(cell_id);LOAD '/absolute/path/to/build/igeo7.duckdb_extension';
SELECT
cell_id,
igeo7_get_base_cell(cell_id) AS base,
igeo7_get_resolution(cell_id) AS res,
igeo7_to_string(cell_id) AS readable
FROM read_parquet('data/*.parquet')
WHERE igeo7_is_valid(cell_id)
LIMIT 5;COPY (
SELECT *, igeo7_parent_at(cell_id, 3) AS parent_3
FROM read_parquet('data/*.parquet')
WHERE igeo7_is_valid(cell_id)
)
TO 'output/' (FORMAT PARQUET, PARTITION_BY (parent_3));WITH cells AS (
SELECT cell_id, value
FROM read_parquet('data/*.parquet')
WHERE igeo7_is_valid(cell_id)
),
expanded AS (
SELECT cell_id, value,
unnest(igeo7_get_neighbours(cell_id)) AS neighbour_id
FROM cells
)
SELECT
e.cell_id,
igeo7_to_string(e.cell_id) AS center_readable,
e.value AS center_value,
AVG(c2.value) AS neighbour_avg
FROM expanded e
JOIN cells c2
ON c2.cell_id = e.neighbour_id
AND igeo7_is_valid(e.neighbour_id)
GROUP BY e.cell_id, e.value;duckdb_igeo7/
├── CMakeLists.txt # Build system; appends metadata footer post-build
├── README.md # This file
├── src/
│ ├── igeo7_extension.cpp # Extension entry point and all C++ functions
│ └── z7/ # Z7 C++ core library (upstream, unmodified)
│ ├── library.h
│ ├── library.cpp
│ ├── BitFieldProxy.h
│ └── util.h
├── sql/
│ └── igeo7_macros.sql # Companion SQL macros (source after LOAD)
└── test/
├── test_z7.cpp # Standalone C++ smoke test (no DuckDB needed)
└── test_extension.sql # SQL smoke tests for all extension functions
Unsigned extension rejected on LOAD
DuckDB v1.5.1 rejects extensions without a cryptographic signature by default.
Start the CLI with -unsigned or run SET allow_unsigned_extensions = true;
before loading.
ABI / entrypoint mismatch
The extension must be built and run against the same DuckDB version. This
extension targets v1.5.1 and uses the ExtensionLoader API and
extern "C" DUCKDB_CPP_EXTENSION_ENTRY entry point introduced when
ExtensionUtil was removed in
PR #17772.
"Metadata invalid" on LOAD or INSTALL
DuckDB v1.5.1 validates a binary metadata footer at the end of every
.duckdb_extension file. This footer is appended automatically by the CMake
post-build step when -DDUCKDB_DIR=... is provided. If DUCKDB_DIR is omitted
the footer is skipped and loading will fail with a metadata error.
CMake finds the wrong DuckDB (e.g. a Homebrew install)
Always pass an absolute path to -DDUCKDB_DIR. A relative path like ../duckdb
is resolved from the CMake source directory, not the build directory, and can
silently point elsewhere. Use $(pwd)/../duckdb to force shell expansion before
CMake sees it.
Hex literal errors in SQL
DuckDB does not support 0x hex literals. Use decimal equivalents
(e.g., 15 instead of 0x0F).
Bit-manipulation functions (igeo7_get_resolution, igeo7_get_digit,
igeo7_parent_at, igeo7_encode, etc.) are essentially free — a handful of
shifts and masks per row, fully vectorised inside DuckDB's execution engine.
igeo7_get_neighbours performs GBT table lookups and carry propagation for
each of 6 directions across up to 20 digit slots, all in native C++. Expect
tens of millions of neighbour computations per second on a single core.
SQL macros (igeo7_decode_str, string helpers) expand to DuckDB SQL
expressions at plan time. igeo7_decode_str in particular emits 20 shift/mask
operations and a multi-argument CONCAT per row. For bulk inspection of millions
of cells, prefer igeo7_to_string (C++) and reserve igeo7_decode_str for small
result sets or debugging.
The Z7 core library files retain their original SPDX headers (see individual
source files). The DuckDB extension wrapper (src/igeo7_extension.cpp,
CMakeLists.txt, sql/) is currently AGPL licensed (due to DGGRID reference, could be revisited for later versions)