Skip to content

Fix worker-thread and linkage crashes; complete extended-type MFJSON input#187

Closed
estebanzimanyi wants to merge 6 commits into
feat/trgeometry-port-corefrom
fix/mduck-crashes-and-mfjson
Closed

Fix worker-thread and linkage crashes; complete extended-type MFJSON input#187
estebanzimanyi wants to merge 6 commits into
feat/trgeometry-port-corefrom
fix/mduck-crashes-and-mfjson

Conversation

@estebanzimanyi
Copy link
Copy Markdown
Member

Scalar and cast bodies run MEOS on DuckDB TaskScheduler worker threads, which each need meos_initialize() before first use; a thread-local guard in the scalar exec wrapper and a cast trampoline cover every entry point so timestamp formatting no longer dereferences a NULL session_timezone. Per-type DuckDB callbacks and blob helpers have internal linkage and share one canonical blob round-trip (TemporalToBlob / BlobToTemporal) declared once in temporal/temporal_blob.hpp, so each function-pointer registration binds to its own body; Tpoint_make_simple builds its list through UnifiedVectorFormat and scalar results take their vector type from the executor. tcbuffer, tnpoint and trgeometry expose FromMFJSON (tcbuffer through tcbuffer_from_mfjson, tnpoint and trgeometry through temporal_from_mfjson with their temporal type), and MEOS is built with the circular-buffer and network-point MF-JSON input/output. The full SQL test suite passes 16/16 with no crashes.

Enable the CBUFFER, NPOINT, POSE and RGEO MEOS modules in the vcpkg port so the
extended temporal types (tcbuffer, tnpoint, tpose, trgeometry) can be ported
into MobilityDuck. RGEO depends on POSE.
Bring the temporal circular buffer (tcbuffer) type into MobilityDuck:
construction, text/EWKT/WKB I/O, casts and accessors, gated by the CBUFFER
config flag that mirrors the MEOS CBUFFER build module. Disabling CBUFFER drops
the tcbuffer sources and its registration in LoadInternal. MFJSON input binds to
MEOS tcbuffer_from_mfjson, which the pinned MEOS does not yet expose, so it is
omitted here.
Brings the temporal network point (tnpoint) type into the accumulate:
construction, text/EWKT/MFJSON I/O, casts, and accessors, registered in
LoadInternal after the tcbuffer surface. The network model resolves route
geometries from a ways CSV; MEOS hardcodes its path
(/usr/local/share/ways1000.csv) with no setter, so the canonical ways1000.csv
is embedded (ways_csv.inc) and materialized there on load (best-effort).

Fixes applied on top of the #150 branch:
- getValue/startValue/endValue value executors renamed to tnpoint-unique names
  (Tnpoint_{get,start,end}_value). Generic names (Tinstant_value /
  Temporal_start_value / Temporal_end_value) are also defined at namespace
  scope in the other geo .cpp — an ODR violation that would otherwise dispatch
  these to the surviving (geometry) definition and render the npoint as a
  geometry ("Unknown geometry type: 0").
- Refresh the asEWKT test expecteds with the ways-network SRID prefix
  (SRID=5676;), matching MobilityDB's asEWKT(npoint) reference; asText omits it.

Full sqllogictest suite green: 1799 assertions across 78 test cases.

(cherry picked from commit 94e01ad)
Brings the temporal pose (tpose) type into the accumulate: construction,
text/EWKT/MFJSON/WKB I/O, casts, and accessors, registered in LoadInternal
after the tnpoint surface.

Fixes applied on top of the #151 branch:
- getValue/startValue/endValue value executors renamed to tpose-unique names
  (Tpose_{get,start,end}_value) and rendered via pose_as_text (WKT form
  "Pose(POINT(x y),r)"). Generic names collide with the other geo .cpp (ODR
  violation -> geometry renderer -> "Unknown geometry type: 0"); pose_out
  emits the HexWKB/uppercase "POSE (...)" form, not what asText/getValue want.
- Close the output-serialization gap the #151 test exercises but the port
  omitted: register asMFJSON (via temporal_as_mfjson) and asBinary / asEWKB /
  asHexWKB / asHexEWKB (via temporal_as_wkb / temporal_as_hexwkb), with
  tpose-unique exec names. MobilityDB exposes all of these for tpose.
- Fix the setInterp-to-discrete test: a multi-instant continuous sequence
  cannot be discretized (MEOS errors); lift an instant instead, matching
  MobilityDB's 102_tpose reference.

Full sqllogictest suite green: 1830 assertions across 79 test cases.

(cherry picked from commit ec74d58)
Bring the temporal rigid geometry (trgeometry) type into MobilityDuck:
construction, text/EWKT/WKB I/O, casts and accessors, gated by the RGEO config
flag that mirrors the MEOS RGEO build module. Disabling RGEO drops the
trgeometry sources and its registration in LoadInternal. The MEOS calls use the
trgeo_* names exported by the pinned MEOS; MFJSON input binds to a MEOS symbol
the pinned MEOS does not yet expose, so it is omitted here.
@estebanzimanyi estebanzimanyi force-pushed the feat/trgeometry-port-core branch from 2a6634e to cb39d26 Compare May 29, 2026 21:33
…input

Scalar and cast bodies run MEOS on DuckDB TaskScheduler worker threads, which
need a per-thread meos_initialize(); a thread-local guard in the scalar exec
wrapper and a cast trampoline (RegisterMeosCastFunction) cover every entry
point, so timestamp formatting no longer dereferences a NULL session_timezone.

Per-type DuckDB callbacks and blob helpers have internal linkage and a single
canonical blob round-trip (TemporalToBlob / BlobToTemporal) declared once in
temporal/temporal_blob.hpp, so each function-pointer registration binds to its
own body. Tpoint_make_simple builds its list via UnifiedVectorFormat. Scalar
results take their vector type from the executor.

tcbuffer, tnpoint and trgeometry expose FromMFJSON: tcbuffer through
tcbuffer_from_mfjson, tnpoint and trgeometry through temporal_from_mfjson with
their temporal type. MEOS is built with the circular-buffer and network-point
MF-JSON input/output.
@estebanzimanyi estebanzimanyi force-pushed the fix/mduck-crashes-and-mfjson branch from 1faca71 to 60855a4 Compare May 29, 2026 21:35
@estebanzimanyi estebanzimanyi force-pushed the feat/trgeometry-port-core branch from cb39d26 to 744b0df Compare May 29, 2026 22:56
@estebanzimanyi
Copy link
Copy Markdown
Member Author

The worker-thread and linkage (ODR) crash fix lives in #190; each extended type's canonical blob round-trip and MFJSON I/O ship in that type's own port PR (#148 tcbuffer, #150 tnpoint, #151 tpose, #153 trgeometry), so every port PR is self-contained and green on its own.

estebanzimanyi added a commit that referenced this pull request May 30, 2026
Wraps 36+ individual commits across the addressable-temporal+geo
surface into a single feature commit, conforming to the
ecosystem-wide "1 feature = 1 commit" PR-compaction policy.

- `bearing(geometry,geometry|tgeompoint,*)` for tgeompoint and tgeogpoint.
- `eCovers` / `aCovers` (BOOLEAN) and `tCovers` (tbool) for tgeometry /
  tgeography / tgeompoint.
- `stboxX` / `stboxZ` / `stboxT` / `stboxXT` / `stboxZT` dimensional
  constructors plus `geodstbox*` geographic variants.
- `<type>SeqSetGaps(LIST<type>, interval [, interp])` for every
  base temporal type (tbool/tint/tfloat/ttext/tgeometry/tgeography/
  tgeompoint/tgeogpoint) — closes the long-standing MobilityDB #187
  parity gap.
- `stboxes`, `splitNStboxes`, `splitEachNStboxes` multi-entry bbox
  emitters for tspatial / geo (Temporal* + GSERIALIZED* inputs).
- `geomsetFromBinary / EWKB / HexWKB / Text / EWKT` plus matching
  `geogset*` parsers, `asBinary` / `asHexWKB` round-trip.
- `temporal_hash` for the four base temporal types,
  `stbox_hash` / `stbox_hash_extended`, `stboxFromHexWKB`,
  `asHexWKB(stbox)`.
- `atElevation` / `minusElevation` for tgeompoint.
- `time_distance` (tstzspanset × {timestamptz / tstzspan / tstzspanset}).
- SRID accessor for stbox and the audit-driven OOS-name discipline.

- TZ: drop ICU autoload, initialize MEOS timezone to UTC at extension
  load, regenerate every TIMESTAMPTZ expected output to `+00`, reset
  MEOS errno before every cast / aggregate to avoid stale-state
  SIGSEGVs.
- MEOS API drift: re-pin port to MobilityDB PR #949
  (`feat/meos-thread-safe-complete`, tip `742c1fb5`) which
  combines the latest master with the thread-safety work
  (per-thread GEOS context in MEOS + liblwgeom, reentrant GEOS
  API across every spatial helper, JVM-safe error handler).  Sync
  template signatures (`meosType` → `MeosType`) and dropped /
  renamed exports.
- VARCHAR-cast SIGSEGV cascade: rewrite ~15 test files
  (`031_aggregates_skiplist`, `040_tgeometry_parity`,
  `041_tgeography_parity`, `042_temporal_waggfuncs`,
  `042_tgeogpoint_parity`, `056b_bearing`, `070b_covers`,
  `015_span_aggfuncs`, `030_aggregates_extent`, `076b_*`, `026b_*`,
  `parquet/temporal_parquet`, `025_temporal_tile`, `tgeompoint`) to
  `CREATE TEMP TABLE … (… <type>); INSERT INTO … VALUES ('…'::type)`
  shape plus accessor-coverage assertions, working around the
  upstream binding bug documented in
  `project_mobilityduck_cast_segv.md`.  The MEOS thread-safety
  pin (above) eliminates the underlying state-leak that drove
  many of these reroutes.

- macOS: wrap `bigint_to_set` in an `int64_t` forwarder (LP64 `int64`
  ≠ `int64_t` on Apple's clang).
- Wasm: exclude `wasm_*` triples (MEOS port incompatible with
  emscripten's `pg_bitutils.h` integer-cascade).
- Audit: pick up `TableFunction fn("name", ...)` variable form,
  per-subtype dynamic constructors, OOS-name discipline (range,
  multirange, box2d/3d, unnest, transform_gk, create_trip).

Active addressable parity: 943/943 (100%) for the temporal+geo
surface.

Squash of 37 commits on `feat/parity-additions-batch`; full
history preserved on `backup/parity-additions-batch-orig`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant