Skip to content

Releases: imagodata/gispulse

GISPulse v2.3.0

14 Jun 18:10
ed5adb7

Choose a tag to compare

[2.3.0] - 2026-06-14

Added

  • Manifest v3 orchestration suite (#440). End-to-end pipeline runtime:
    PipelineRun entity + lifecycle events on the EventHub (#442), run-completion
    as a trigger source with scenario wiring (#445), run + validate manifest v3
    pipelines over HTTP (#447), a step-kind registry with external subprocess
    steps (#450), a run control surface — cancel / resume / partial execution
    (#452), and non-capability steps + selectless models in manifest v3 (#454).
  • Saved-map CRUD API (#405, #446). Persist and reload maps — layers, styles,
    view and filters — over the HTTP API.
  • Consolidate-networks capabilities (cn_* family, #465). Faithful
    pure-shapely port of the QGIS Consolidate Networks plugin
    (github.com/sducournau/consolidate_networks), bringing eight line-network
    topology cleaners to every GISPulse surface without QGIS:
    cn_calculate_dbscan, cn_consolidate_with_dbscan,
    cn_make_intersections_vertexes, cn_endpoints_trim_extend,
    cn_endpoints_snapping, cn_hub_snapping, cn_snap_hubs_to_layer,
    cn_snap_endpoints_to_layer. All work in crs_meters, never mutate input,
    support explode_and_gather / entity_identification_fields; the two
    *_to_layer capabilities take a reference layer via ref_layerref_gdf.
  • measure_spatial_impact capability (#436). Clip + overlap measurement for
    feature × parcel impact checks.
  • H3 multi-metric aggregation (#457). Several metrics in a single H3 pass.
  • New open-data source plugins. src-dpe (energy performance, #422),
    src-ocsge (land cover, #423), src-sitadel (building permits, #424), public
    OSM + GRB sources with in-zip member reading via /vsizip (#459), and OSM PBF
    road-tag extraction + materialize_pbf local download (#463).
  • Geo commons primitives (#438). Reusable HTTP / WFS / GeoJSON building
    blocks and generic geo models.
  • write_pmtiles_pyramid (#435). Multi-LOD layers in a single PMTiles
    archive.
  • Universal loader for non-geo tabular sources (#449). CSV/tabular inputs
    flow through the same loader path.
  • PG-direct building blocks (#432). ST_Subdivide materializer + batched
    DuckDB → PostGIS loader.
  • Source readiness probe engine (#431) with extensible probe kinds, and a
    bounded vector tiler (#430) — bbox-tiled parallel ingest to parquet.
  • Manifest-gated PostGIS column shed (#429, dry-run by default), opt-in
    DuckDB resource limits (memory/temp/threads, #427)
    , a unified source CLI
    (#421)
    , stream_vector_to_parquet (#420), CSV encoding detection with
    fallback (#426)
    , and BulkIngestRunner skip-if-staged resume (#433) plus
    dept scope-stamp / bulk_access_for / GeoJSON aliases (#419).

Fixed

  • Security audit 2026-06-09 (#418). SQL injection, HTTP-router auth, zip-slip
    (7z), SSRF and DoS hardening, plus CSV/XLSX write and COG export gaps.
  • CDC v2.3.0 hardening (#441). pg_notify DELETE/PK handling, composite
    primary keys, and documentation drift.
  • Scheduled pipelines actually execute their pipeline_config (#439).
  • Manifest path fixes. Selectless models run in declaration order (#458);
    cancel, heartbeat and timeout elevation on the manifest path (#456).
  • Storage: make the Garage backend explicit (#462).

Changed

  • CI / supply-chain (#400, #401, #444). SHA-pinned GitHub Actions,
    least-privilege DCO, and pyjwt / urllib3 bumps.
  • Python baseline raised to 3.12+. Dropped Python 3.11 support and aligned
    the plugin template and ruff target accordingly.

GISPulse v2.2.3

09 Jun 10:55
7e1851e

Choose a tag to compare

[2.2.3]

Added

  • PMTiles tiling (gispulse.tiling.write_pmtiles). GeoParquet → static
    PMTiles writer backed by DuckDB ST_AsMVT, with the new tiling extra
    (pmtiles, pyarrow). Ports the last milou-branch capability into mainline.
  • VectorFileFetcher / AccessProtocol.LOCAL_FILE. Core protocol roster now
    reads local vector files (KML/KMZ, GeoPackage, GeoJSON, Shapefile,
    FlatGeobuf, ...) through gispulse.persistence.io.read_vector() and returns a
    SourceResult carrying the materialized GeoDataFrame and CRS. This keeps
    MILOU-style local KMZ ingestion on the official imagodata/gispulse package.

Fixed

  • Line-volume memory corruption in tiling. Tile encoding ran one ST_AsMVT
    query per coverage tile on a single DuckDB connection; past a few hundred line
    features the spatial extension corrupted memory → non-deterministic segfault or
    ST_AsMVTGeom: tile width and height must be positive. Rewritten as a single
    grouped query (features × tiles spatial join → GROUP BY tile). Robust and much
    faster.
  • Unsupported MVT property types. A DATE/TIMESTAMP (or other non-numeric)
    column made tiling fail (ST_AsMVT accepts only VARCHAR/FLOAT/DOUBLE/INTEGER/
    BIGINT/BOOLEAN). Properties are now coerced (wide ints → BIGINT, decimals →
    DOUBLE, everything else → VARCHAR).

GISPulse v2.2.2

07 Jun 20:30
6a50ca2

Choose a tag to compare

[2.2.2]

Changed

  • snap_points_to_lines hardening. ref_id_col is now required: a
    missing/absent column raises a clear ValueError instead of silently
    falling back to a positional index. Ties between equidistant lines now break
    deterministically on the smallest edge_id, so the same inputs always
    produce the same outputs. The unsnapped-row contract is unchanged (a point
    beyond max_distance_m keeps snapped=False, null edge_id/measure and
    its original geometry, with only offset_distance reported) — downstream
    consumers (e.g. MILOU's build_site_network_candidates) need no changes.

GISPulse v2.2.1

07 Jun 12:53
1265827

Choose a tag to compare

[2.2.1]

Added

  • DuckDB-file datamarts (DP2b). A datamart can now be backed by a single
    DuckDB database file (kind="duckdb") instead of one Parquet file per
    table: datamart://<mart>/<table> attaches the .duckdb file read-only
    and selects the table, with bbox push-down via ST_Intersects.
    GISPULSE_DATAMARTS accepts "kind": "duckdb".

GISPulse v2.2.0

07 Jun 12:02
a7e34f6

Choose a tag to compare

[2.2.0]

Feature release adding a coherent set of generic GIS capabilities for
network analysis, linear referencing and clustering (the "A–K" plan), plus a
unified multi-source data provider. Fully backwards-compatible.

Added

  • Unified data loader & providers (#374). gispulse.load(source, …) /
    app.load() resolve files (incl. GeoParquet), remote URIs (s3/http),
    wfs:// / stac:// / ogc-features://, datamarts (datamart://, curated
    Parquet) and GeoNode instances (geonode://, read + publish() write) to a
    GeoDataFrame or a lazy DuckDB scan.
  • SpatialIndex (K) & NetworkGraph (F). Reusable core infrastructure: a
    thin STRtree wrapper (build once / query many) and a persistent routing
    handle that builds the NetworkX graph once and snaps points to nodes in
    O(log n). All network capabilities (shortest_path, isochrone,
    od_matrix, mst, network_allocation, connectivity_check) now reuse the
    handle instead of rebuilding the graph and scanning nodes linearly.
  • build_network_graph (A). Turns a line layer into a tagged node/edge
    GeoDataFrame with stable ids, degrees and metric lengths.
  • snap_points_to_lines (B) and split_lines_at_points (C). Snap a
    point layer onto a line network; cut lines at a reference point layer.
  • planarize (D) and connected_components (E). Attribute-preserving
    planar noding; label network islands.
  • steiner_tree (G). Approximate minimum Steiner tree connecting a subset
    of terminal points at near-minimum total cost.
  • cluster_network_dbscan (H). DBSCAN using shortest-path distance along a
    network instead of straight-line distance.
  • community_detection (I). Partition a network into communities (Louvain
    / greedy modularity / label propagation); tags each edge with a
    community_id.
  • cluster_st_dbscan (J). Spatio-temporal DBSCAN combining a spatial
    (eps_m) and a temporal (eps_time) threshold.

GISPulse v2.1.0

05 Jun 12:12
bf0e879

Choose a tag to compare

[2.1.0]

Feature release consolidating everything that landed on main since 2.0.0:
a wave of new declarative source plugins, the ELT push-down / manifest-v3
pipeline, S3 bulk materialisation for tabular sources, and an env-driven
Garage object store. Fully backwards-compatible — the _compat meta-path
shim and the PluginHub = ExtensionHub alias stay in place.

Added

  • TABLE_FILE bulk materialisation to S3 (#358). The AccessProtocol.TABLE_FILE
    fetcher can write a parsed DuckDB table scan to S3/Garage as Parquet
    (s3_uri / s3_keyCOPY … TO 's3://…' (FORMAT PARQUET), MATERIALIZE
    mode), enabling national-scale tabular ingestion.
  • RestTableFetcher / REST_TABLE (#337). Paginated tabular-JSON REST
    adapter for declarative sources.
  • New declarative source plugins: Géorisques over REST_TABLE (#338),
    protected areas — Natura 2000 + ZNIEFF (#341), SUP servitudes incl. ABF /
    PPR over WFS (#342), INSEE IRIS statistical units over WFS (#343), Cadastre
    Etalab bulk per département (#353), BDNB (#360), BODACC (#361), RNB (#362),
    loyers (#363), BAN (#364). Belgium: business parks (#367), Statbel
    statistical sectors + demographics (#368), GIPOD planned works / public
    domain — Flanders (#369).
  • milou client (#366). S3/R2-backed DuckDB client with KMZ/XLSX ingestion
    in Lambert-72.
  • ELT push-down pipeline. SQL push-down for the 12 attribute capabilities
    (#264), aggregating / two-layer / CRS geometry caps (#296), dissolve +
    spatial_join (#297), nearest_neighbor + overlay (#298), temporal_filter +
    temporal_join (#299); manifest-v3 schema + loader + compiler + gispulse migrate (#300), load-time cycle / ref validation (#301), manifest runner +
    view/table materialisation (#302), gispulse explain DAG inspection (#303),
    per-model data-quality asserts (#304).
  • Garage object store (#348). Env-driven S3 landing-zone service in compose.
  • Changelog coverage gate (#335). CI guards that a version bump ships a
    matching changelog entry.

Fixed

  • _compat class identity (#334). The legacy meta-path finder now prepends
    to sys.meta_path and aliases legacy modules to the same object, so
    isinstance / pytest.raises hold across the persistence.*
    gispulse.persistence.* boundary (#333).
  • WfsFetcher registration (#355). AccessProtocol.WFS now resolves from the
    core roster.
  • src-dvf (#354). Re-sourced from the live geo-DVF CSV after the upstream
    parquet mirror was removed.

Changed

  • Dependency bumps: starlette 1.0.1 (#365), redis >=5.0,<9.0 (#357),
    actions/checkout 6 (#351), actions/setup-python 6 (#350),
    actions/github-script 9 (#352).
  • Docs: Phase 2 feature pages (#323), changelog backfill + 2.0 migration guide
    (#322), manifest-v3 migration guide (#305), absolute repo links (#325).

GISPulse v2.0.0

20 May 18:36
087ad30

Choose a tag to compare

[2.0.0]

The first major release. Numerically it's a jump from 1.6.2, but in
practice the API surface is the same as 1.7.0 / 1.8.0 / 1.9.0
features that accumulated on main without ever being published to
PyPI. We promote the whole stack in one tag and reset the public version
to match the product story.

Three threads converge here:

  1. Foundations (was tagged internally v1.8.0) — gispulse.*
    mono-package, ExtensionHub replacing PluginHub, GISPulseApp
    façade, full MCP server, data-pack regime, CLI / HTTP / template
    routers.
  2. Worldwide aggregator (was tagged internally v1.9.0) — lazy
    DuckDB-backed fetcher network covering 4 protocol families
    (GeoParquetS3, OGCFeatures, STAC, HttpFile) and a curated
    worldwide_catalog.yml listing France / EU / world data sources.
  3. Data-pack rails — the first third-party data-packs can now ship
    on PyPI: a discovery channel via the gispulse.data_packs entry-point,
    an Ed25519 signature gate on EXTERNAL manifests, and a shared licence
    payload format that also covers the future SaaS tenant licence.

The full migration path is in MIGRATION-2.0.md. The
short version: no application code change is strictly required — the
_compat.py meta-path shim absorbs the import-path move and the
PluginHub rename keeps a working alias until 2.1.

Added

  • Data-pack regime — PyPI discovery channel (T5). A third channel
    alongside the bundled OSS manifests and GISPULSE_DATA_PACKS_DIR: a
    Python entry-point group gispulse.data_packs lets a third-party
    package register its manifests at install time. One bad pack never
    locks out the others (every failure path is isolated and logged).
    (#269)
  • Data-pack regime — Ed25519 signature gate (G1a). DataPackManifest
    gains an optional signature field. An EXTERNAL manifest carrying a
    signature is verified against the public key in
    GISPULSE_DATA_PACK_PUBLIC_KEY before being registered; tampered or
    foreign-signed manifests are dropped with explicit log events
    (data_pack_signature_invalid,
    data_pack_signature_no_public_key). INTERNAL (bundled) manifests
    are exempt — the OSS tree is the source of truth. Unsigned EXTERNAL
    packs are admitted by default (rollout-friendly); set
    GISPULSE_DATA_PACK_REQUIRE_SIGNATURE=true to refuse them. (#271)
  • Unified Ed25519 licence payload format (L0). New
    gispulse.core.licence_format defines the single payload schema
    shared by the per-machine licence key (Mode A), the future SaaS
    tenant licence (Mode B) and the data-pack manifest signature.
    Versioned via schema_version, forward-compat (unknown fields land
    in LicencePayload.extra, never crash an older verifier),
    canonicalised JSON (sort-keys, compact, UTF-8) so bytes-to-sign are
    stable across runs and Python versions. (#266)
  • High-level OGC client for data packs (T1). New
    gispulse.core.fetchers.ogc_client.fetch_features(...) — a one-liner
    any data-pack can use without building an AccessSpec. Thin layer
    over the consolidated transport: argument normalisation, WFS vs OGC
    API Features dispatch, typed network-error surface
    (OGCEndpointUnreachable, OGCClientError) so callers don't depend
    on httpx classes. (#267)
  • Declarative ZoningElement normaliser (T2). New
    gispulse.core.zoning_normalizer maps heterogeneous source records
    into a common 8-field schema (geometry, zone_code, zone_label, hilucs_class, plan_id, plan_date, regulation_ref, source_country)
    inspired by INSPIRE PlannedLandUse. ZoningMapping is a flat
    declarative {target -> source_key} table plus an optional
    best-effort HILUCS lookup. CRS is mandatory and must be explicit
    (EPSG:XXXX). HILUCS misses leave the column None, never crash
    the batch. (#268)
  • regulatory-zoning data-pack content type (T3). New value in
    DATA_PACK_CONTENTS so a manifest of that type is recognised by the
    discovery and signature gate. New RegulatoryZoningEntry dataclass
    • from_dict() validator: required-field set, no unknown fields,
      ISO-3166-1 alpha-2 country, known protocol, explicit EPSG: CRS,
      bbox 4-numbers. DataPackManifest.entries stays a list of opaque
      dicts — entries are validated on demand so the pack format can
      evolve without changing the manifest loader. (#270)

Changed

  • PluginHub renamed to ExtensionHub. The class lives in the
    same module (gispulse.core.plugin_hub); a PluginHub = ExtensionHub
    alias preserves existing imports. The alias is scheduled for removal
    in 2.1.0. The data-pack regime (_discover_data_packs) is wired
    into both bundled discovery and the new PyPI channel.
  • gispulse.core.plugin_contracts public surface frozen via
    __all__.
    The 8 symbols actually exported by the 1.6.2 wheel are
    gelled with an explicit __all__ and an anti-regression test. The
    types that moved to plugin_model.py during the consolidation
    (Tier, PluginManifest, DataPackManifest, …) were never in
    plugin_contracts in 1.6.2 — no compat shim is needed.
  • _compat.py deprecation horizon corrected. The docstring and
    DeprecationWarning message had carried over a stale "removed in
    1.9.0" line; both now correctly point at 2.1.0, in step with
    the release plan.

Fixed

  • security-audit job — silence two disputed upstream advisories.
    pip-audit was failing on joblib PYSEC-2024-277 (disputed by
    upstream, only triggered when loading untrusted cache content) and
    pyjwt PYSEC-2025-183 (disputed by upstream, the key length is
    chosen by the application, not the library). Both are added to the
    --ignore-vuln allowlist with a re-evaluation note. No code change.

Migration

See MIGRATION-2.0.md. TL;DR:

  • imports under top-level core.*, capabilities.*, rules.*,
    orchestration.*, persistence.*, catalog.* continue to work via
    the _compat.py meta-path shim with a one-time DeprecationWarning;
  • PluginHub continues to work via the PluginHub = ExtensionHub
    alias;
  • both will be removed in 2.1.0 — migrate to gispulse.* /
    ExtensionHub at your leisure.

GISPulse v1.6.2

07 May 20:49

Choose a tag to compare

[1.6.2] - 2026-05-07

The "Format Frontier" release — DuckDB Spatial as the universal CDC substrate. Adds two new engines (spatialite, duckdb_diff), brings DML detection to seven file formats (GPKG, SpatiaLite, GeoJSON, FlatGeobuf, Shapefile, KML, CSV+WKT) — five of which had no native trigger surface — and closes EPIC #139 (DML semantics ADRs + WAL connection safety).

Added

  • SpatiaLite engine. New persistence.spatialite_engine.SpatiaLiteEngine shares the SQLite trigger DDL of GPKG but writes through pyogrio's SQLite + SPATIALITE=YES driver and queries geometry_columns instead of gpkg_contents. Auto-routed for *.sqlite / *.db URIs. No mod_spatialite Python extension required at runtime — pyogrio's OGR linkage handles the catalog. (EPIC #105 slice 1, PR #151)
  • is_spatialite_file(path) detection helper. Narrow rule: file must have geometry_columns AND must NOT have gpkg_contents. Used by future auto-routing code; the URI inference layer maps the suffixes ahead of file inspection. (PR #151)
  • bootstrap_spatialite_project(conn). Sibling to bootstrap_gpkg_project; installs the same _gispulse_* internal tables WITHOUT setting the GPKG application_id or creating gpkg_* catalog rows (those would corrupt SpatiaLite identity). Refactor extracts a shared _bootstrap_gispulse_internals(conn) helper used by both bootstraps. (PR #151)
  • FileBlobChangeDetector. Reusable mtime + DuckDB ST_Read snapshot diff CDC. Hash is md5(ST_AsWKB(geom) || json_object(props)) excluding OGR's synthetic OGC_FID so reordering features in the source file does not produce false DELETE+INSERT noise. Snapshot persisted as a DuckDB sidecar <blob>.gispulse-snapshot.duckdb. Set-diff semantics: emits INSERT and DELETE only — UPDATE is undetectable without a stable PK in the file format. (EPIC #105 slice 2, PR #152)
  • Companion-file watching. Multi-file formats (Shapefile = .shp / .dbf / .shx / .prj / .cpg) are watched via max(mtime) across every existing companion so attribute-only edits (which only touch .dbf) surface correctly. Single-file formats (GeoJSON, FlatGeobuf, KML, CSV) keep single-file mtime semantics. New _COMPANION_EXTENSIONS map is extensible. (EPIC #105 slice 4, PR #152)
  • DuckDBDiffEngine. SpatialEngine implementation backed by the file-blob detector. Supports GeoJSON, FlatGeobuf, Shapefile (and zero-code-change-ready for KML / CSV+WKT — those land in v1.6.2). I/O via pyogrio. get_pending_changes shape matches GeoPackageEngine (id int, changed_at ISO 8601, geom_changed 0/1) so ChangeLogWatcher iterates uniformly across engines. mark_changes_processed is a no-op (poll is destructive). execute_sql raises NotImplementedError — this engine is a CDC adapter, not a query engine; for ad-hoc SQL run gispulse run with the standalone DuckDB engine. (EPIC #105 slices 3+5, PR #152)
  • Engine factory entries. _spatialite_factory and _duckdb_diff_factory registered as built-ins. URI inference (already shipped in v1.6.0 via gispulse.runtime.engine_inference) maps .sqlite / .db to spatialite and .geojson / .fgb / .shp / .kml / .csv / .tab / .dxf to duckdb_diff automatically — no extra wiring required to consume the new engines. (PRs #151, #152)
  • persistence.gpkg_connection.connect_gpkg(path, …). Single entry point that applies WAL + busy_timeout=5000 on every GeoPackage sqlite3.connect. Migrated 8 scattered call sites (CLI track / triggers / runtime, HTTP datasets routers, project_io) so concurrent QGIS edits + watcher polls never raise SQLITE_BUSY. Documents the historical test_p02 flake's root cause. (#141, PR #145)
  • ADR 0001 — DuckDB-spatial as the contract SQL dialect. Records the de-facto rule that v1.6.0 already enforces: the DSL geom-fct templates and run_sql strings are written in DuckDB-spatial dialect by default. The engine: top-level key remains the documented escape hatch for users running exclusively against PostGIS or SpatiaLite. (#140, PR #147)
  • ADR 0002 — Trigger cascade is bounded fixed-point with origin-tagging. Documents the existing two-layer cascade design: SQLite WHEN clauses block self-loops at the file format level (B-02, v1.5.3), and evaluate_cascade runs a fixed-point loop with MAX_CASCADE_DEPTH = 3 raising CascadeDepthExceeded beyond. Community tier capped at depth 1, Pro up to 3. (#142, PR #148)
  • ADR 0003 — _gispulse_change_log is a poll log, not an event store. Promotes the current id AUTOINCREMENT + changed_at invariants to documented contract; defers replay / sub-second timestamps / row hashing to a future v1.7+ extension table. (#143, PR #150)
  • ADR 0004 — DDL hooks out of scope; passive schema-drift detection ships. Records that ALTER TABLE / DROP TABLE / CREATE INDEX hooks are intentionally absent. The B-13 schema-drift watchdog (#103, v1.5.3) covers ALTER TABLE ADD COLUMN passively — the runtime rebuilds triggers within one watchdog tick and surfaces the new column in subsequent new_values payloads. (#144, PR #150)
  • KML CDC. Auto-routed for *.kml files. Single-file mtime watch + DuckDB ST_Read snapshot diff — zero-code-change pass-through of the DuckDBDiffEngine shipped in #152. (EPIC #106 slice 1, PR #153)
  • CSV+WKT CDC. Auto-routed for *.csv files. Pyogrio writes the geometry as a WKT column when invoked with GEOMETRY=AS_WKT; DuckDB ST_Read decodes it transparently for the diff. (EPIC #106 slice 1, PR #153)
  • MapInfo TAB companion files. New _COMPANION_EXTENSIONS[".tab"] entry watches the four-file MapInfo set (.tab / .dat / .map / .id, plus .ind if present) so attribute-only edits (which only touch .dat) surface correctly. (EPIC #106 slice 1, PR #153)
  • MapInfo TAB read via pyogrio fallback. DuckDB's bundled GDAL wheel does not include the MapInfo driver, so ST_Read('places.tab') hangs. New _PYOGRIO_FALLBACK_SUFFIXES = {".tab"} routing in FileBlobChangeDetector reads .tab through geopandas.read_file while keeping the hash contract identical (md5(geom.wkb || json_object(props))). A future DuckDB build that ships the driver promotes the format back to the fast path with no observable change in event identity. Adding a format to the fallback set is the cheapest path to coverage when DuckDB lags the system OGR. (EPIC #106 slice 2, PR #154)
  • Multi-engine POST /datasets/{id}/enable_tracking. The HTTP route is no longer hardcoded to GeoPackageEngine. New _resolve_engine_kind_for_tracking(ds, path) helper picks the engine via gispulse.runtime.engine_inference.infer_engine on the URI suffix, with a short-circuit for ds.format == "gpkg" (the upload path stamps this from pyogrio inspection — more reliable than URI suffix on demos). The route branches: SQLite-family (gpkg / spatialite) installs AFTER triggers per layer; duckdb_diff skips the install entirely (the detector creates its sidecar snapshot on first poll) and uses the file stem as the tracked layer name. PostGIS URIs and unknown extensions return 400 tracking_unsupported_format. WatcherRegistry.register() now takes an engine_kind kwarg (default "gpkg" for back-compat) and dispatches to the right engine class. Demo SaaS users uploading .geojson / .fgb / .shp to the portal can now enable tracking through the HTTP API and receive dml.changed events on /ws/events. (#157, PR #158)

Changed

  • bootstrap_gpkg_project extracts a shared internal helper. New _bootstrap_gispulse_internals(conn) runs migrations + creates _gispulse_* tables without GPKG-specific identity work. bootstrap_gpkg_project and the new bootstrap_spatialite_project both layer their format-specific setup on top. Behaviour for existing GPKG callers is identical — regression test asserts the GPKG path still produces a valid GeoPackage with application_id = 0x47504B47 and gpkg_contents. (PR #151)

Documentation

  • docs/adr/0001-dsl-sql-dialect.md through docs/adr/0004-ddl-hooks-out-of-scope.md. Four ADRs introducing a docs/adr/ directory; cross-linked from docs-site/guide/architecture.md under a new "Décisions de scope (ADRs)" sub-section.
  • docs-site/guide/dsl-sql-dialect.md. User-facing reference of the DSL SQL dialect contract, with the portable ST_* surface, ST_Transform arity gotcha, and engine: override. Cross-linked from engines.md, dsl-geom-functions.md, dsl-validation.md. (PR #147)
  • docs-site/guide/rules.md. Cascade tip block expanded into a proper "Cascade behaviour of triggers" sub-section with the tier table, the two-layer explanation, a JSON example showing cascade_depth: 2, and a link to ADR 0002. (PR #148)
  • docs-site/guide/formats.md. SpatiaLite, GeoJSON, FlatGeobuf, Shapefile, KML and CSV rows bumped with their CDC support note. New "CDC file-blob" section explains the mechanism, formats covered, multi-file companion-watching rule, and known limitations (set-diff = INSERT/DELETE only, polling not inotify, single-layer per file). MapInfo TAB row mentions the pyogrio fallback path. (PRs #151, #152, #153, #154)
  • docs-site/guide/walkthroughs/geojson-cdc.md (FR + EN). Fourth walkthrough showcases the file-blob CDC path end-to-end: 30-second setup creating a places.geojson, two edit demos (Python script append + QGIS edit), the exact webhook payload shape, "how it works" diagram, honest limitations section, variants table for the 8 supported formats, cross-links to ADR 0001 + formats.md. EN translation mirrors FR 1:1 so the demo URL has 4 walkthroughs in both languages. (PRs #155, #156)

Decision log

  • EPIC #139 (DML semantics) closed same-day. Five sub-issues actioned in five PRs (#145 WAL fix code; #147/#148/#150 four ADRs). Out-of-scope topics — replay event sourcing (#143), DDL hooks (#144), run_sql PostGIS-only construct scanner (#146 follow-up) — are documented rather than implemented so v1.6.x ships without scope c...
Read more

GISPulse v1.6.1

07 May 17:15
42bfc47

Choose a tag to compare

[1.6.1] - 2026-05-07

Same-day follow-up to v1.6.0. Closes the 3 deferred items from the v1.6.0 sprint kickoff in a single PR (#138) so the v1.6.x line ships its full promised surface — cross-source push-down, scalar lookup, and zero-config validate auto-wire — instead of trickling them across point releases.

Added

  • layer_lookup(layer, match, take, layer_geom) DSL fct. Scalar attribute lookup against a (cross-source) layer with three match modes: spatial_within (default), spatial_intersects, or any column identifier as attribute-equality shorthand (consistent with geom_within(match='code_insee')). Compiles to (SELECT _L."<take>" FROM "<layer>" AS _L WHERE <pred> LIMIT 1). (#124, PR #138)
  • Cross-source layer registry. New gispulse.runtime.layer_registry.LayerRegistry ATTACHes external GeoPackage / Parquet / PostgreSQL sources read-only and creates a DuckDB view per declared layer in the in-memory catalog. The DSL emits bare-name FROM "communes"; DuckDB's optimiser pushes spatial and attribute predicates down to the underlying scanner — no SQL rewriting downstream. (#122, PR #138)
  • Top-level layers: block in triggers.yaml. Declarative cross-source layer references via LayerSourceConfigModel. Duplicate-name guard at config-load time. (#122, PR #138)
  • build_runtime validate auto-wire. New validate_rules, default_table, layer_sources, source_epsg kwargs wire a ValidationRunner directly onto the change-log watcher. The DuckDB session ATTACHes the project GPKG read-only and mirrors each user table as a view in the in-memory catalog so bare-name SQL resolves while cross-source CREATE VIEW statements remain legal. (PR #138)
  • Per-rule table: and top-level default_table:. ValidateRuleConfigModel.table lets each validate: rule pin its target table; GISPulseConfig.default_table provides a config-level fallback. (PR #138)

Changed

  • compile_validate_rules accepts a table_resolver callable. The signature now supports per-rule resolution via a rule -> table callable. The legacy table= parameter is preserved for v1.6.0 callers (single-table use). (PR #138)

Decision log

  • "Quelle table" question for validate: rules — closed. build_runtime resolves the target table per rule in priority order: rule.table (operator pin) > default_table (config fallback) > GPKG single-table autodetect > ValidationTableResolutionError listing the candidate tables. Single-table GPKGs (the dominant case) get zero-config UX; multi-table GPKGs surface a clear actionable error. (PR #138)

v1.6.0 — DuckDB Spatial Inside

07 May 12:15

Choose a tag to compare

The "DuckDB Spatial Inside" release. A 7-PR cascade (#129#135) lands the foundation, the DSL geom function whitelist, granular DML verbs, the declarative validate: block end-to-end (incl. mode: tag dispatch), and the long-standing B-08 DELETE predicate gap.

DuckDB spatial moves from "embedded if you opt in" to the universal compute substrate. The Atlas R1 bench against pyogrio justifies the pivot.

Highlights

DuckDB Spatial Inside

  • Lazy install on first DSL geom fct usage — no pip install gispulse[spatial] extra; INSTALL spatial; LOAD spatial; runs once when the first rule needs it.
  • gispulse doctor --install-spatial — pre-installs + probes EPSG roundtrips against pyproj (catches PROJ datum-shift gaps).
  • Engine inference from URI*.gpkggpkg, postgresql://...postgis, *.shp / *.geojson / *.fgbduckdb_diff (file-blob CDC).

DSL — declarative geom + validation

  • 7 whitelisted geom functions in set_field and validate: rules: geom_area_m2, geom_perimeter_m, geom_length_m, geom_centroid_x/y, geom_npoints, geom_is_valid. Measure functions auto-project to a metric CRS (default EPSG:2154, override per-call).
  • Cross-layer subqueries: geom_within(layer='communes', match='code_insee') and geom_overlaps_any(layer='self', exclude_self=True). The compiler emits EXISTS (SELECT 1 FROM "<layer>" AS _L WHERE …) with strict identifier validation.
  • Safe-by-construction parser: walks Python AST under a strict allowlist; rejects __import__, eval, attribute access, lambdas, comprehensions, …
  • validate: top-level with mode: warn (log + WS event) or mode: tag (auto-creates a status column and writes failed:<rule.id>).

DML — granular verbs

when: [INSERT, UPDATE_GEOM, UPDATE_ATTR, DELETE, BULK]

The watcher resolves a coarse UPDATE change-log row to its granular variant via the geom_changed flag — pure attribute edits and geometry edits route to different triggers without inspecting the row.

Atlas R1 bench — DuckDB COPY GDAL/GPKG vs pyogrio

Scenario pyogrio DuckDB COPY Speedup RSS pyogrio RSS DuckDB
Append +100k 8.19 s 3.63 s 2.26× 950 MB 273 MB
Update attribute 6.94 s 2.75 s 2.52× 839 MB 255 MB
Update geometry 8.87 s 2.47 s 3.59× 843 MB 275 MB

Median of 3 runs on 1M EPSG:2154 polygons. The pyogrio-only write-back doctrine of v1.5.x is officially retired for bulk paths; pyogrio remains the fallback for datasets > 5M rows, GPKGs with custom triggers/views, and append-in-place semantics.

B-08 — DELETE predicates finally filter on pre-delete state

The AFTER DELETE SQLite trigger has been writing OLD.* JSON to _gispulse_change_log.old_values since v1; the changelog reader was just dropping the column. Fix is one whitelist entry + ~30 lines of watcher hydration. No GPKG migration required — fully backward-compatible with every v1+ project.

triggers:
  - name: alert_active_archive
    table: parcels
    when: [DELETE]
    predicate: "status == 'active'"   # now actually fires
    actions:
      - type: webhook
        url: https://ops.example.com/archive-alert

ESRI Attribute Rules — drop-in vocabulary

triggers:
  - name: parcels_constraint_min_surface
    kind: constraint   # alias for "validation" — eases ESRI migration

Cosmetic for now. The runtime ignores kind:; the alias keeps your migration diff small. See docs-site/guide/migration-from-esri.md for the full mapping table.

Documentation

Security pins

  • dml.changed broadcast payload stays minimal even on DELETE — row attributes never leak through /ws/events. New regression test pins the contract.
  • validate: rule SQL is never spliced raw — every column / layer / EPSG identifier passes a strict [A-Za-z_][A-Za-z0-9_]{0,62} validator before reaching DuckDB; literals are SQL-quoted.

Deferred to v1.6.x

  • build_runtime auto-wiring of validate_rules — the runner is plumbed and tested, but the schema needs a product decision on rule-to-table mapping (per-rule table:, first trigger's table, every trigger table). Workaround: callers wire the runner manually using make_gpkg_sql_evaluator + dispatcher injection.
  • #122 cross-source ATTACHgeom_within(layer='communes') against a separate dataset compiles cleanly but executes only when the target layer is part of the current ATTACH.
  • #124 layer_lookup — depends on cross-source ATTACH.

Install

pip install --upgrade gispulse

QGIS plugin lockstep at the same version (no plugin behaviour change in 1.6.0; bumped to keep the bridge contract in sync).

Full changelog

See CHANGELOG.md.