GISPulse v1.6.2
[1.6.2] - 2026-05-07
The "Format Frontier" release — DuckDB Spatial as the universal CDC substrate. Adds two new engines (spatialite, duckdb_diff), brings DML detection to seven file formats (GPKG, SpatiaLite, GeoJSON, FlatGeobuf, Shapefile, KML, CSV+WKT) — five of which had no native trigger surface — and closes EPIC #139 (DML semantics ADRs + WAL connection safety).
Added
- SpatiaLite engine. New
`persistence.spatialite_engine.SpatiaLiteEngine` shares the SQLite trigger DDL of GPKG but writes through pyogrio's `SQLite + SPATIALITE=YES` driver and queries `geometry_columns` instead of `gpkg_contents`. Auto-routed for `*.sqlite` / `*.db` URIs. No `mod_spatialite` Python extension is required at runtime: pyogrio's OGR linkage handles the catalog. (EPIC #105 slice 1, PR #151)
- `is_spatialite_file(path)` detection helper. Narrow rule: the file must have `geometry_columns` AND must NOT have `gpkg_contents`. Used by future auto-routing code; the URI inference layer maps the suffixes ahead of file inspection. (PR #151)
- `bootstrap_spatialite_project(conn)`. Sibling to `bootstrap_gpkg_project`; installs the same `_gispulse_*` internal tables WITHOUT setting the GPKG `application_id` or creating `gpkg_*` catalog rows (those would corrupt SpatiaLite identity). The refactor extracts a shared `_bootstrap_gispulse_internals(conn)` helper used by both bootstraps. (PR #151)
- `FileBlobChangeDetector`. Reusable mtime + DuckDB `ST_Read` snapshot-diff CDC. The hash is `md5(ST_AsWKB(geom) || json_object(props))`, excluding OGR's synthetic `OGC_FID` so that reordering features in the source file does not produce false DELETE+INSERT noise. The snapshot is persisted as a DuckDB sidecar, `<blob>.gispulse-snapshot.duckdb`. Set-diff semantics: emits INSERT and DELETE only, because UPDATE is undetectable without a stable PK in the file format. (EPIC #105 slice 2, PR #152)
- Companion-file watching. Multi-file formats (Shapefile =
`.shp` / `.dbf` / `.shx` / `.prj` / `.cpg`) are watched via `max(mtime)` across every existing companion so attribute-only edits (which only touch `.dbf`) surface correctly. Single-file formats (GeoJSON, FlatGeobuf, KML, CSV) keep single-file mtime semantics. The new `_COMPANION_EXTENSIONS` map is extensible. (EPIC #105 slice 4, PR #152)
- `DuckDBDiffEngine`. `SpatialEngine` implementation backed by the file-blob detector. Supports GeoJSON, FlatGeobuf and Shapefile, and is zero-code-change-ready for KML / CSV+WKT (those also land in v1.6.2, see PR #153). I/O via pyogrio. The `get_pending_changes` shape matches `GeoPackageEngine` (`id` int, `changed_at` ISO 8601, `geom_changed` 0/1) so `ChangeLogWatcher` iterates uniformly across engines. `mark_changes_processed` is a no-op (the poll is destructive). `execute_sql` raises `NotImplementedError`: this engine is a CDC adapter, not a query engine. For ad-hoc SQL, run `gispulse run` with the standalone DuckDB engine. (EPIC #105 slices 3+5, PR #152)
- Engine factory entries.
`_spatialite_factory` and `_duckdb_diff_factory` registered as built-ins. URI inference (already shipped in v1.6.0 via `gispulse.runtime.engine_inference`) maps `.sqlite` / `.db` to `spatialite` and `.geojson` / `.fgb` / `.shp` / `.kml` / `.csv` / `.tab` / `.dxf` to `duckdb_diff` automatically, so no extra wiring is required to consume the new engines. (PRs #151, #152)
- `persistence.gpkg_connection.connect_gpkg(path, …)`. Single entry point that applies WAL + `busy_timeout=5000` on every GeoPackage `sqlite3.connect`. Migrated 8 scattered call sites (CLI track / triggers / runtime, HTTP datasets routers, `project_io`) so concurrent QGIS edits + watcher polls never raise `SQLITE_BUSY`. Documents the root cause of the historical `test_p02` flake. (#141, PR #145)
- ADR 0001: DuckDB-spatial as the contract SQL dialect. Records the de-facto rule that v1.6.0 already enforces: the DSL geom-fct templates and
`run_sql` strings are written in DuckDB-spatial dialect by default. The `engine:` top-level key remains the documented escape hatch for users running exclusively against PostGIS or SpatiaLite. (#140, PR #147)
- ADR 0002: Trigger cascade is bounded fixed-point with origin-tagging. Documents the existing two-layer cascade design: SQLite
`WHEN` clauses block self-loops at the file format level (B-02, v1.5.3), and `evaluate_cascade` runs a fixed-point loop with `MAX_CASCADE_DEPTH = 3`, raising `CascadeDepthExceeded` beyond that depth. The Community tier is capped at depth 1, Pro at up to 3. (#142, PR #148)
- ADR 0003:
`_gispulse_change_log` is a poll log, not an event store. Promotes the current `id AUTOINCREMENT` + `changed_at` invariants to documented contract; defers replay / sub-second timestamps / row hashing to a future v1.7+ extension table. (#143, PR #150)
- ADR 0004: DDL hooks out of scope; passive schema-drift detection ships. Records that ALTER TABLE / DROP TABLE / CREATE INDEX hooks are intentionally absent. The B-13 schema-drift watchdog (#103, v1.5.3) covers ALTER TABLE ADD COLUMN passively: the runtime rebuilds triggers within one watchdog tick and surfaces the new column in subsequent
`new_values` payloads. (#144, PR #150)
- KML CDC. Auto-routed for
`*.kml` files. Single-file mtime watch + DuckDB `ST_Read` snapshot diff: a zero-code-change pass-through of the `DuckDBDiffEngine` shipped in #152. (EPIC #106 slice 1, PR #153)
- CSV+WKT CDC. Auto-routed for
`*.csv` files. Pyogrio writes the geometry as a WKT column when invoked with `GEOMETRY=AS_WKT`; DuckDB `ST_Read` decodes it transparently for the diff. (EPIC #106 slice 1, PR #153)
- MapInfo TAB companion files. New
`_COMPANION_EXTENSIONS[".tab"]` entry watches the four-file MapInfo set (`.tab` / `.dat` / `.map` / `.id`, plus `.ind` if present) so attribute-only edits (which only touch `.dat`) surface correctly. (EPIC #106 slice 1, PR #153)
- MapInfo TAB read via pyogrio fallback. DuckDB's bundled GDAL wheel does not include the MapInfo driver, so
`ST_Read('places.tab')` hangs. New `_PYOGRIO_FALLBACK_SUFFIXES = {".tab"}` routing in `FileBlobChangeDetector` reads `.tab` through `geopandas.read_file` while keeping the hash contract identical (`md5(geom.wkb || json_object(props))`). A future DuckDB build that ships the driver promotes the format back to the fast path with no observable change in event identity. Adding a format to the fallback set is the cheapest path to coverage when DuckDB lags the system OGR. (EPIC #106 slice 2, PR #154)
- Multi-engine
`POST /datasets/{id}/enable_tracking`. The HTTP route is no longer hardcoded to `GeoPackageEngine`. New `_resolve_engine_kind_for_tracking(ds, path)` helper picks the engine via `gispulse.runtime.engine_inference.infer_engine` on the URI suffix, with a short-circuit for `ds.format == "gpkg"` (the upload path stamps this from pyogrio inspection, which is more reliable than the URI suffix on demos). The route branches: SQLite-family (`gpkg` / `spatialite`) installs AFTER triggers per layer; `duckdb_diff` skips the install entirely (the detector creates its sidecar snapshot on first poll) and uses the file stem as the tracked layer name. PostGIS URIs and unknown extensions return 400 `tracking_unsupported_format`. `WatcherRegistry.register()` now takes an `engine_kind` kwarg (default `"gpkg"` for back-compat) and dispatches to the right engine class. Demo SaaS users uploading `.geojson` / `.fgb` / `.shp` to the portal can now enable tracking through the HTTP API and receive `dml.changed` events on `/ws/events`. (#157, PR #158)
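
The set-diff semantics described for `FileBlobChangeDetector` can be illustrated with a minimal standard-library sketch. It is not the shipped implementation, which hashes inside DuckDB. `feature_hash` and `diff_snapshots` are hypothetical names, the geometry bytes are placeholders rather than real WKB, and using a plain set assumes no two features are byte-for-byte identical.

```python
import hashlib
import json


def feature_hash(wkb: bytes, props: dict) -> str:
    """Hash one feature from its geometry bytes and attributes only."""
    # sort_keys keeps the digest independent of attribute order; the row's
    # position in the file (the synthetic FID) never enters the hash.
    payload = wkb + json.dumps(props, sort_keys=True).encode("utf-8")
    return hashlib.md5(payload).hexdigest()


def diff_snapshots(previous: set, current: set) -> dict:
    """Set difference between two polls: INSERT and DELETE only."""
    return {
        "INSERT": sorted(current - previous),
        "DELETE": sorted(previous - current),
    }


# Placeholder geometry bytes, not valid WKB.
a = feature_hash(b"\x01\x02", {"name": "cafe"})
b = feature_hash(b"\x03\x04", {"name": "park"})
b_edited = feature_hash(b"\x03\x04", {"name": "garden"})

# Reordering features leaves the snapshot unchanged, so nothing is emitted.
reordered = diff_snapshots({a, b}, {b, a})

# An attribute edit shows up as DELETE of the old hash plus INSERT of the new.
edited = diff_snapshots({a, b}, {a, b_edited})
```

Because the snapshot holds only hashes, nothing links an edited feature's old state to its new one, which is why the entry above reports INSERT and DELETE but never UPDATE.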
Changed
- `bootstrap_gpkg_project` extracts a shared internal helper. New `_bootstrap_gispulse_internals(conn)` runs migrations + creates `_gispulse_*` tables without GPKG-specific identity work. `bootstrap_gpkg_project` and the new `bootstrap_spatialite_project` both layer their format-specific setup on top. Behaviour for existing GPKG callers is identical: a regression test asserts the GPKG path still produces a valid GeoPackage with `application_id = 0x47504B47` and `gpkg_contents`. (PR #151)
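
A minimal sketch of that layering, using only `sqlite3`. The function names and table definitions are simplified stand-ins, not the real GISPulse or GeoPackage schemas; only the `application_id` value and the table names are taken from the entry above.

```python
import sqlite3

GPKG_APPLICATION_ID = 0x47504B47  # "GPKG" in ASCII


def bootstrap_internals(conn: sqlite3.Connection) -> None:
    """Shared step: internal tables only, no format identity work."""
    # Hypothetical, reduced schema for illustration.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS _gispulse_change_log ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, changed_at TEXT NOT NULL)"
    )


def bootstrap_gpkg(conn: sqlite3.Connection) -> None:
    """GPKG path: stamp the identity, add the catalog, then the shared step."""
    conn.execute(f"PRAGMA application_id = {GPKG_APPLICATION_ID}")
    # The real gpkg_contents table has more columns than this stand-in.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS gpkg_contents (table_name TEXT PRIMARY KEY)"
    )
    bootstrap_internals(conn)


def bootstrap_spatialite(conn: sqlite3.Connection) -> None:
    """SpatiaLite path: shared step only, identity left untouched."""
    bootstrap_internals(conn)


def application_id(conn: sqlite3.Connection) -> int:
    return conn.execute("PRAGMA application_id").fetchone()[0]


def table_names(conn: sqlite3.Connection) -> set:
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return {row[0] for row in rows}


gpkg = sqlite3.connect(":memory:")
bootstrap_gpkg(gpkg)

spatialite = sqlite3.connect(":memory:")
bootstrap_spatialite(spatialite)
```

Keeping the identity work out of the shared helper is what lets the SpatiaLite path reuse it without picking up the GPKG `application_id` or catalog rows.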
Documentation
- `docs/adr/0001-dsl-sql-dialect.md` through `docs/adr/0004-ddl-hooks-out-of-scope.md`. Four ADRs introducing a `docs/adr/` directory; cross-linked from `docs-site/guide/architecture.md` under a new "Décisions de scope (ADRs)" sub-section.
- `docs-site/guide/dsl-sql-dialect.md`. User-facing reference of the DSL SQL dialect contract, with the portable `ST_*` surface, the `ST_Transform` arity gotcha, and the `engine:` override. Cross-linked from `engines.md`, `dsl-geom-functions.md`, `dsl-validation.md`. (PR #147)
- `docs-site/guide/rules.md`. Cascade tip block expanded into a proper "Cascade behaviour of triggers" sub-section with the tier table, the two-layer explanation, a JSON example showing `cascade_depth: 2`, and a link to ADR 0002. (PR #148)
- `docs-site/guide/formats.md`. SpatiaLite, GeoJSON, FlatGeobuf, Shapefile, KML and CSV rows bumped with their CDC support note. New "CDC file-blob" section explains the mechanism, formats covered, multi-file companion-watching rule, and known limitations (set-diff = INSERT/DELETE only, polling not inotify, single layer per file). MapInfo TAB row mentions the pyogrio fallback path. (PRs #151, #152, #153, #154)
- `docs-site/guide/walkthroughs/geojson-cdc.md` (FR + EN). Fourth walkthrough showcases the file-blob CDC path end-to-end: 30-second setup creating a `places.geojson`, two edit demos (Python script append + QGIS edit), the exact webhook payload shape, a "how it works" diagram, an honest limitations section, a variants table for the 8 supported formats, and cross-links to ADR 0001 + `formats.md`. The EN translation mirrors FR 1:1 so the demo URL has 4 walkthroughs in both languages. (PRs #155, #156)
Decision log
- EPIC #139 (DML semantics) closed same-day. Five sub-issues actioned in four PRs (#145 WAL fix code; #147/#148/#150 four ADRs). Out-of-scope topics are documented rather than implemented so v1.6.x ships without scope creep: replay event sourcing (#143), DDL hooks (#144), and the `run_sql` PostGIS-only construct scanner (#146 follow-up). The investigation surfaced one important course correction: the cascade design that ships is bounded fixed-point, not single-pass as the issue body initially proposed.
- EPIC #105 (Format Frontier T1) closed same-day in five slices. SpatiaLite (PR #151) + GeoJSON / FlatGeobuf / Shapefile / watcher-wiring (PR #152) were all delivered before v1.6.2 release prep. KML and CSV+WKT are zero-code-change-ready through the existing `DuckDBDiffEngine` and will be promoted to T2 (#106) with test-only PRs.
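
To make the "bounded fixed-point, not single-pass" distinction concrete, here is a small sketch. It assumes depth counts evaluation passes, which the notes above do not spell out. `run_bounded_cascade` and `react` are hypothetical names, and `CascadeDepthExceeded` is redefined locally for illustration rather than imported from GISPulse.

```python
MAX_CASCADE_DEPTH = 3  # value quoted in the ADR 0002 entry


class CascadeDepthExceeded(RuntimeError):
    """Local stand-in for the exception named in the ADR 0002 entry."""


def run_bounded_cascade(initial, react, max_depth=MAX_CASCADE_DEPTH):
    """Repeat passes until no new changes appear or the depth cap is hit.

    `react` maps one pass of changes to the changes they trigger.
    Returns every change applied and the number of passes used.
    """
    applied = []
    pending = list(initial)
    depth = 0
    while pending:
        if depth == max_depth:
            raise CascadeDepthExceeded(
                f"changes still pending after {max_depth} passes"
            )
        depth += 1
        applied.extend(pending)
        pending = react(pending)
    return applied, depth


# Toy rule graph: a change to "a" triggers "b", which triggers "c".
TRIGGERS = {"a": ["b"], "b": ["c"]}


def react(changes):
    return [target for change in changes for target in TRIGGERS.get(change, [])]


result = run_bounded_cascade(["a"], react)
```

A single-pass design would stop after the first pass and never apply "b" or "c". The bounded loop follows the chain to its fixed point, while a rule that keeps re-triggering itself raises instead of looping forever.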