docs: parity-status snapshot + regeneration script #66

Merged
nhungoc1508 merged 2 commits into main from parity-status-audit on May 4, 2026

@estebanzimanyi (Member)

A planning artefact, not a code change. Adds:

  • `scripts/parity-audit.py` — walks MobilityDB's `CREATE FUNCTION` surface and MobilityDuck's `RegisterFunction` calls, writes a section-by-section coverage table.
  • `docs/parity-status.md` — the current snapshot.

Match is by function name only, case-insensitive. A name registered in MobilityDuck is treated as covering all its overloads, so per-overload signature parity (e.g. `before(tstzspan, temporal)` vs `before(temporal, temporal)`) is not verifiable at this granularity.
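The matching rule can be illustrated with a minimal sketch. It is not taken from `scripts/parity-audit.py`: the helper name `covered_declarations` and the `(name, signature)` pair shape are assumptions made for illustration only.

```python
def covered_declarations(mdb_declarations, mduck_names):
    """Return the MobilityDB declarations whose name is registered in MobilityDuck.

    mdb_declarations: iterable of (name, signature) pairs (hypothetical shape).
    mduck_names: iterable of registered function names.
    """
    # Case-insensitive, name-only comparison: signatures are never inspected.
    registered = {name.casefold() for name in mduck_names}
    return [(n, sig) for n, sig in mdb_declarations if n.casefold() in registered]


# Two overloads of `before`; the name is registered once on the MobilityDuck side.
declarations = [
    ("before", "(tstzspan, temporal)"),
    ("before", "(temporal, temporal)"),
    ("overbefore", "(tstzspan, temporal)"),
]
covered = covered_declarations(declarations, ["Before"])
print(len(covered))  # both `before` overloads are counted as covered
```

Because only names are compared, a single registration marks every overload as covered, which is why the report cannot distinguish the two `before` signatures.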

Top-line numbers (snapshot 2026-04-30)

  • 104 MobilityDB SQL section files, 2054 function declarations across them, 1032 unique names.
  • MobilityDuck registers 359 unique names, 1653 total registrations.
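  • Name-level coverage: 1017 / 2054 declarations (49.5%). A declaration counts as covered when its name is registered in MobilityDuck, so overloads of a registered name are each counted.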
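As a quick check of the arithmetic, the headline percentage can be reproduced from the figures above. The variable names are illustrative; only the numbers come from the snapshot.

```python
# Figures quoted in the snapshot above.
total_declarations = 2054
covered_declarations = 1017
unique_mduck_names = 359

coverage_pct = 100 * covered_declarations / total_declarations
print(f"{coverage_pct:.1f}%")

# More declarations are covered than there are registered names, so the
# count is per declaration (overloads included), not per unique name.
print(covered_declarations > unique_mduck_names)
```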

Where the gaps cluster

| Bucket | Sections at 0% | Notes |
|---|---|---|
| Aggregates | `temporal/040_temporal_aggfuncs`, `temporal/042_temporal_waggfuncs`, `temporal/015_span_aggfuncs`, all `*_aggfuncs` per type | Aggregate framework partially being landed in PR #21. Each section has 7-40 transition / final / combine functions. |
| Index support | All `_gist`, `_spgist`, `*_indexes` sections | Internal helpers, never user-facing SQL. Will likely stay 0% until DuckDB grows comparable index extension hooks. |
| Comp / pos / topo operators | `_compops`, `_posops`, most `*_topops` | DuckDB rejects multi-char operator tokens; named-function equivalents land per-PR (`before`, `overbefore`, etc.). PR #64 catalogues the structural divergence in `docs/DuckDB-Parity-Gaps.md`. |
| Tile / split | `temporal/025_temporal_tile`, `geo/058_t{geo,point}_tile` | Manifest test landed in PR #64. Implementation pending. |
| npoint base type | `npoint/081_npoint` (15%), `npoint/091_*_routeops` | Base type only partially registered. Many npoint-network functions still unbound. |
| Analytics | `temporal/046_temporal_analytics`, much of `geo/076_*_analytics` | DouglasPeucker / simplify / minDistSimplify family. Easy adds; small section. |

Sections at 100% coverage

`temporal/002_set_ops`, `temporal/005_span_ops`, `geo/072_tpoint_tempspatialrels`.

Sections close to done (roughly 80% or more)

`temporal/021_tbox` (80%), `temporal/009_spanset_ops` (86%), `geo/070_tpoint_spatialrels` (92%), `geo/056_tpoint_spatialfuncs` (79%), `cbuffer/162_tcbuffer_spatialrels` (85%), `pose/105_tpose_spatialfuncs` (88%), `npoint/087_tnpoint_spatialfuncs` (92%), `rgeo/125_trgeo_spatialfuncs` (88%).

Suggested triage order

  1. Spatial operator-named-function batch: PR #64 (feat(bindings): tgeompoint distance + temporal-position predicates batch) already wired the temporal-position predicates (`before`/`after`/`overbefore`/`overafter` on (tstzspan, temporal)). The same pattern needs to extend to `temporal_left/right/below/above` on tnumber × {numspan, tbox} and to all `*_posops` sections per spatiotemporal type.
  2. Analytics: DouglasPeucker / simplify family — small surface (4-5 names), self-contained.
  3. Aggregates: large, but scoped to the track of PR #21 (test(parity): 015_span_aggfuncs.test - aggregate surface manifest). Expanding aggregate infra is a multi-PR effort.
  4. Composed transform overloads: `rotate(tgeompoint, double, geometry)` and `scale(tgeompoint, double, double[, double])` are stragglers from PR #65 (feat(bindings): tgeompoint affine and derived spatial transforms); small follow-up.
  5. tgeometry parity for whatever lands on tgeompoint — same overloads, different type tag.

This document is the single source of truth for "what's left." Regenerate it after every batch lands so the percentage stays current.

Test plan

  • `python3 scripts/parity-audit.py --mdb ../MobilityDB --mduck . --out docs/parity-status.md` succeeds and writes the file.
  • Output is reproducible (same inputs → identical file).
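One way to get the reproducibility property is to render in sorted order and compare digests of two runs. The sketch below is an assumption about how such a check could look, not the script's actual code. The helpers `render_table` and `digest`, the table columns, and the `demo/...` section names and counts are all hypothetical.

```python
import hashlib


def render_table(section_stats):
    """Render {section: (covered, total)} as Markdown rows in sorted order."""
    lines = ["| Section | Covered | Total | % |", "|---|---|---|---|"]
    for section in sorted(section_stats):  # sorting removes any dependence on input order
        covered, total = section_stats[section]
        pct = 100 * covered / total if total else 0.0
        lines.append(f"| `{section}` | {covered} | {total} | {pct:.0f}% |")
    return "\n".join(lines) + "\n"


def digest(text):
    """SHA-256 of the rendered report, used to compare two runs byte for byte."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Made-up sections and counts, supplied in two different insertion orders.
stats_a = {"demo/002_beta": (3, 4), "demo/001_alpha": (5, 5)}
stats_b = dict(reversed(list(stats_a.items())))

same = digest(render_table(stats_a)) == digest(render_table(stats_b))
print(same)
```

Comparing digests rather than eyeballing the file makes the "same inputs, identical file" claim easy to check in CI.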

scripts/parity-audit.py walks MobilityDB's CREATE FUNCTION surface
across mobilitydb/sql/**/*.in.sql and MobilityDuck's RegisterFunction
calls in src/**/*.cpp, then writes a per-section coverage table and a
list of unregistered function names.

Match is by name only — overload-level parity is not verified.
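
A minimal sketch of the extraction step, under stated assumptions: the two regular expressions are guesses at what a name-only scan could use, and the sample SQL and C++ snippets are invented for illustration. In particular, the C++ pattern assumes the function name is the first double-quoted literal after `RegisterFunction` in the same statement; the real script may parse these files differently.

```python
import re

# Assumed patterns, not copied from scripts/parity-audit.py.
CREATE_FN = re.compile(
    r"CREATE\s+(?:OR\s+REPLACE\s+)?FUNCTION\s+([A-Za-z_]\w*)\s*\(",
    re.IGNORECASE,
)
REGISTER_FN = re.compile(r'RegisterFunction\b[^;"]*"([^"]+)"')


def sql_function_names(sql_text):
    """One lower-cased name per CREATE FUNCTION declaration (overloads repeat)."""
    return [m.group(1).lower() for m in CREATE_FN.finditer(sql_text)]


def registered_names(cpp_text):
    """Set of lower-cased names found in RegisterFunction statements."""
    return {m.group(1).lower() for m in REGISTER_FN.finditer(cpp_text)}


# Invented inputs for illustration only.
sample_sql = """
CREATE FUNCTION before(tstzspan, temporal) RETURNS boolean;
CREATE OR REPLACE FUNCTION Before(temporal, temporal) RETURNS boolean;
CREATE FUNCTION overafter(tstzspan, temporal) RETURNS boolean;
"""
sample_cpp = 'ExtensionUtil::RegisterFunction(db, ScalarFunction("before", {}, LogicalType::BOOLEAN, Fn));'

declared = sql_function_names(sample_sql)
registered = registered_names(sample_cpp)
unregistered = sorted(set(declared) - registered)
print(declared, registered, unregistered)
```

This sketch does not handle schema-qualified names or registrations that pass the name through a variable; both would need extra handling in a real scan.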

docs/parity-status.md is the latest snapshot. Regenerate with:

  python3 scripts/parity-audit.py --mdb ../MobilityDB --mduck . \
    --out docs/parity-status.md

Snapshot 2026-04-30: 1017/2054 names covered (49.5%). High-level
gaps include all aggregate sections, all gist/spgist support, the
*_compops / *_posops operator surfaces (where DuckDB-rejected
multi-char operators are partially covered by named-function
equivalents documented in docs/DuckDB-Parity-Gaps.md once PR #64
lands), and the npoint base type itself.

Per scoping discussion: the active sweep covers temporal + geo only.
npoint, cbuffer, pose, rgeo families are deferred until the active
surface stabilises (then th3index + tbigint becomes the next phase).

Changes:

- scripts/parity-audit.py: introduces DEFERRED_FAMILIES at top of file.
  The report now splits into "Active-scope coverage" (drives the headline
  percentage) and a "Deferred families" appendix table. Re-include a
  family by removing it from DEFERRED_FAMILIES.
- docs/parity-status.md: regenerated. Active scope is 597/1261 names
  covered (47.3%) across 57 sections. Deferred scope (793 names across
  44 sections) is itemised but not in the headline number.
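
The active/deferred split might look like the following sketch. Only the name `DEFERRED_FAMILIES` and the four family names come from this commit message. Treating the first path component of a section as its family, along with the helper names and the sample counts, is an assumption for illustration.

```python
# Families listed as deferred in the commit message above.
DEFERRED_FAMILIES = {"npoint", "cbuffer", "pose", "rgeo"}


def split_scope(section_stats):
    """Split {section: (covered, total)} into (active, deferred) by family prefix."""
    active, deferred = {}, {}
    for section, counts in section_stats.items():
        family = section.split("/", 1)[0]  # assumed: family is the first path component
        target = deferred if family in DEFERRED_FAMILIES else active
        target[section] = counts
    return active, deferred


def headline(stats):
    """Return (covered, total, percentage) summed over the given sections."""
    covered = sum(c for c, _ in stats.values())
    total = sum(t for _, t in stats.values())
    return covered, total, (100 * covered / total if total else 0.0)


# Invented counts; only the section name style mirrors the report.
stats = {
    "temporal/021_tbox": (4, 5),
    "geo/070_tpoint_spatialrels": (6, 10),
    "npoint/081_npoint": (1, 8),
}
active, deferred = split_scope(stats)
print(headline(active), sorted(deferred))

# The published active-scope figure follows the same arithmetic.
print(f"{100 * 597 / 1261:.1f}%")
```

Removing a family from `DEFERRED_FAMILIES` moves its sections back into the active dictionary, and therefore back into the headline percentage.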

JSON types are tracked as a future scope addition once temporal+geo
parity stabilises.
This was referenced Apr 30, 2026
@estebanzimanyi estebanzimanyi marked this pull request as ready for review May 1, 2026 19:13
@nhungoc1508 nhungoc1508 merged commit cef65e6 into main May 4, 2026
17 checks passed