v1.6.0 — DuckDB Spatial Inside

@imagodata imagodata released this 07 May 12:15
· 144 commits to main since this release

The "DuckDB Spatial Inside" release. A 7-PR cascade (#129 to #135) lands the foundation, the DSL geom function whitelist, granular DML verbs, the declarative validate: block end-to-end (including mode: tag dispatch), and a fix for the long-standing B-08 DELETE predicate gap.

DuckDB spatial moves from "embedded if you opt in" to the universal compute substrate. The Atlas R1 bench against pyogrio justifies the pivot.

Highlights

DuckDB Spatial Inside

  • Lazy install on first DSL geom function use: no pip install gispulse[spatial] extra is needed; the statements INSTALL spatial; LOAD spatial; run once when the first rule needs them.
  • gispulse doctor --install-spatial — pre-installs + probes EPSG roundtrips against pyproj (catches PROJ datum-shift gaps).
  • Engine inference from URI: *.gpkg → gpkg, postgresql://... → postgis, *.shp / *.geojson / *.fgb → duckdb_diff (file-blob CDC).
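
The URI-to-engine mapping in the last bullet can be pictured as a small lookup. The following is only an illustrative sketch: `infer_engine` and `_FILE_ENGINES` are hypothetical names, not the gispulse API, and the real resolver may handle more schemes and edge cases.

```python
from pathlib import PurePosixPath

# Hypothetical table built from the release-note bullet; not gispulse internals.
_FILE_ENGINES = {
    ".gpkg": "gpkg",
    ".shp": "duckdb_diff",
    ".geojson": "duckdb_diff",
    ".fgb": "duckdb_diff",
}


def infer_engine(uri: str) -> str:
    """Guess an engine name from a dataset URI (illustrative only)."""
    if uri.lower().startswith("postgresql://"):
        return "postgis"
    # Fall back to the file extension, compared case-insensitively.
    suffix = PurePosixPath(uri).suffix.lower()
    if suffix not in _FILE_ENGINES:
        raise ValueError(f"cannot infer an engine from {uri!r}")
    return _FILE_ENGINES[suffix]
```

Raising on an unknown extension, rather than silently picking a default, is a choice made for this sketch; the release notes do not say how gispulse treats unrecognised URIs.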

DSL — declarative geom + validation

  • 7 whitelisted geom functions in set_field and validate: rules: geom_area_m2, geom_perimeter_m, geom_length_m, geom_centroid_x/y, geom_npoints, geom_is_valid. Measure functions auto-project to a metric CRS (default EPSG:2154, override per-call).
  • Cross-layer subqueries: geom_within(layer='communes', match='code_insee') and geom_overlaps_any(layer='self', exclude_self=True). The compiler emits EXISTS (SELECT 1 FROM "<layer>" AS _L WHERE …) with strict identifier validation.
  • Safe-by-construction parser: walks Python AST under a strict allowlist; rejects __import__, eval, attribute access, lambdas, comprehensions, …
  • validate: top-level with mode: warn (log + WS event) or mode: tag (auto-creates a status column and writes failed:<rule.id>).
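
To make the "strict allowlist" idea concrete, here is a minimal sketch of an AST walk that accepts only a fixed set of node types and function names. It is an assumption-laden illustration, not the gispulse parser: `check_expression`, `_ALLOWED_NODES` and `_ALLOWED_CALLS` are hypothetical, and `geom_centroid_x/y` is read as two functions, `geom_centroid_x` and `geom_centroid_y`.

```python
import ast

# Node types this sketch permits; every other node type is rejected.
_ALLOWED_NODES = (
    ast.Expression, ast.BoolOp, ast.And, ast.Or, ast.UnaryOp, ast.Not, ast.USub,
    ast.BinOp, ast.Add, ast.Sub, ast.Mult, ast.Div,
    ast.Compare, ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
    ast.Call, ast.keyword, ast.Name, ast.Load, ast.Constant,
)

# Function names taken from the bullets above (assumed spelling for centroid x/y).
_ALLOWED_CALLS = {
    "geom_area_m2", "geom_perimeter_m", "geom_length_m",
    "geom_centroid_x", "geom_centroid_y", "geom_npoints", "geom_is_valid",
    "geom_within", "geom_overlaps_any",
}


def check_expression(source: str) -> ast.Expression:
    """Parse a rule expression and reject anything outside the allowlist."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, _ALLOWED_NODES):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
        if isinstance(node, ast.Call):
            # Only bare, whitelisted names may be called (no attribute calls).
            if not (isinstance(node.func, ast.Name) and node.func.id in _ALLOWED_CALLS):
                raise ValueError("disallowed function call")
    return tree
```

Because attribute access, lambdas and comprehensions have their own node types (`ast.Attribute`, `ast.Lambda`, `ast.ListComp`, ...) that are absent from the allowlist, they fail the first check, while `__import__(...)` and `eval(...)` fail the call-name check.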

DML — granular verbs

when: [INSERT, UPDATE_GEOM, UPDATE_ATTR, DELETE, BULK]

The watcher resolves a coarse UPDATE change-log row to its granular variant via the geom_changed flag — pure attribute edits and geometry edits route to different triggers without inspecting the row.
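
As a rough illustration of that routing, the sketch below maps a coarse change-log operation to its granular verb and checks it against a trigger's when: list. The function names and the shape of the inputs are assumptions for this example; the actual watcher code is not shown in these notes.

```python
def resolve_verb(operation: str, geom_changed: bool) -> str:
    """Map a coarse change-log operation to a granular verb (illustrative)."""
    if operation == "UPDATE":
        # The flag alone decides the variant; row contents are not inspected.
        return "UPDATE_GEOM" if geom_changed else "UPDATE_ATTR"
    return operation


def trigger_matches(when: list[str], operation: str, geom_changed: bool) -> bool:
    """Return True if a trigger's `when:` list covers this change."""
    return resolve_verb(operation, geom_changed) in when
```

With this shape, a trigger declared with when: [UPDATE_GEOM] stays silent for attribute-only edits, which is the behaviour the paragraph above describes.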

Atlas R1 bench — DuckDB COPY GDAL/GPKG vs pyogrio

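Scenario         | pyogrio | DuckDB COPY | Speedup | RSS pyogrio | RSS DuckDB
-----------------|---------|-------------|---------|-------------|-----------
Append +100k     | 8.19 s  | 3.63 s      | 2.26×   | 950 MB      | 273 MB
Update attribute | 6.94 s  | 2.75 s      | 2.52×   | 839 MB      | 255 MB
Update geometry  | 8.87 s  | 2.47 s      | 3.59×   | 843 MB      | 275 MB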

Median of 3 runs on 1M EPSG:2154 polygons. The pyogrio-only write-back doctrine of v1.5.x is officially retired for bulk paths; pyogrio remains the fallback for datasets > 5M rows, GPKGs with custom triggers/views, and append-in-place semantics.
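
The Speedup column is the pyogrio time divided by the DuckDB COPY time. The short script below recomputes it from the published timings and derives a memory ratio the same way; the dictionary layout and function names are just conveniences for this example.

```python
# Timings (seconds) and RSS (MB) copied from the bench table above.
BENCH = {
    "Append +100k": {"pyogrio_s": 8.19, "duckdb_s": 3.63, "pyogrio_mb": 950, "duckdb_mb": 273},
    "Update attribute": {"pyogrio_s": 6.94, "duckdb_s": 2.75, "pyogrio_mb": 839, "duckdb_mb": 255},
    "Update geometry": {"pyogrio_s": 8.87, "duckdb_s": 2.47, "pyogrio_mb": 843, "duckdb_mb": 275},
}


def speedup(row: dict) -> float:
    """Wall-clock ratio pyogrio / DuckDB, rounded like the table."""
    return round(row["pyogrio_s"] / row["duckdb_s"], 2)


def rss_ratio(row: dict) -> float:
    """Memory ratio pyogrio / DuckDB (not a column in the table)."""
    return round(row["pyogrio_mb"] / row["duckdb_mb"], 2)


for name, row in BENCH.items():
    print(f"{name}: {speedup(row)}x faster, {rss_ratio(row)}x less RSS")
```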

B-08 — DELETE predicates finally filter on pre-delete state

The AFTER DELETE SQLite trigger has been writing OLD.* as JSON to _gispulse_change_log.old_values since v1; the changelog reader was simply dropping that column. The fix is one whitelist entry plus ~30 lines of watcher hydration. No GPKG migration is required; the change is fully backward-compatible with every v1+ project.

triggers:
  - name: alert_active_archive
    table: parcels
    when: [DELETE]
    predicate: "status == 'active'"   # now actually fires
    actions:
      - type: webhook
        url: https://ops.example.com/archive-alert
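
The hydration step can be sketched as follows: decode the old_values JSON from the change-log row and evaluate the predicate against that pre-delete state. The row layout and both helper names are hypothetical, and the predicate is reduced to a single equality check to keep the example short; the real watcher evaluates the DSL expression.

```python
import json


def hydrate_deleted_row(change_log_row: dict) -> dict:
    """Rebuild pre-delete attributes from the old_values JSON column."""
    raw = change_log_row.get("old_values")
    # If the column is missing or dropped, there is no state to filter on.
    return json.loads(raw) if raw else {}


def delete_predicate_matches(change_log_row: dict, field: str, expected) -> bool:
    """Stand-in for a predicate such as status == 'active' (illustrative)."""
    row = hydrate_deleted_row(change_log_row)
    return row.get(field) == expected
```

When old_values is dropped before evaluation, the hydrated row is empty and the predicate can never match, which is consistent with the pre-fix behaviour described above.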

ESRI Attribute Rules — drop-in vocabulary

triggers:
  - name: parcels_constraint_min_surface
    kind: constraint   # alias for "validation" — eases ESRI migration

Cosmetic for now. The runtime ignores kind:; the alias keeps your migration diff small. See docs-site/guide/migration-from-esri.md for the full mapping table.

Documentation

Security pins

  • dml.changed broadcast payload stays minimal even on DELETE — row attributes never leak through /ws/events. New regression test pins the contract.
  • validate: rule SQL is never spliced raw — every column / layer / EPSG identifier passes a strict [A-Za-z_][A-Za-z0-9_]{0,62} validator before reaching DuckDB; literals are SQL-quoted.
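
A minimal sketch of the identifier check and literal quoting described in the last bullet, using the regular expression quoted there. The helper names and the example WHERE fragment are assumptions for illustration; they are not taken from the gispulse source.

```python
import re

# Pattern quoted in the release notes: a leading letter or underscore,
# then up to 62 more word characters (63 characters in total).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,62}")


def validate_identifier(name: str) -> str:
    """Return the name unchanged if it is a safe identifier, else raise."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def quote_literal(value: str) -> str:
    """Quote a string literal for SQL by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def equals_filter(column: str, value: str) -> str:
    """Build a hypothetical WHERE fragment from validated parts."""
    return f'"{validate_identifier(column)}" = {quote_literal(value)}'
```

Using `fullmatch` rather than `match` matters here: `match` would accept a string whose prefix is valid and whose tail carries injected SQL.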

Deferred to v1.6.x

  • build_runtime auto-wiring of validate_rules — the runner is plumbed and tested, but the schema needs a product decision on rule-to-table mapping (per-rule table:, first trigger's table, every trigger table). Workaround: callers wire the runner manually using make_gpkg_sql_evaluator + dispatcher injection.
  • #122 cross-source ATTACH: geom_within(layer='communes') against a separate dataset compiles cleanly but executes only when the target layer is part of the current ATTACH.
  • #124 layer_lookup — depends on cross-source ATTACH.

Install

pip install --upgrade gispulse

The QGIS plugin is released in lockstep at the same version (no plugin behaviour change in 1.6.0; bumped to keep the bridge contract in sync).

Full changelog

See CHANGELOG.md.