diff --git a/profile/README.md b/profile/README.md
index fcda079..5fe8714 100644
--- a/profile/README.md
+++ b/profile/README.md
@@ -22,6 +22,13 @@ The companion datasets and reproducible scripts live in [MobilityDataScienceBook
 Mobility Data Science Book
+## Parity & benchmarks
+
+Evidence that the ecosystem is complete, correct, and ready for real-world deployment — **measured, not asserted**. Two published assessments, one per platform group, each covering function-surface coverage, BerlinMOD query-result parity (identical results across engines), and BerlinMOD performance benchmarks:
+
+- **[Streaming — parity & benchmarks](streaming-parity.md)** — MobilityFlink / MobilityKafka / MobilityNebula.
+- **[Databases — parity & benchmarks](database-parity.md)** — MobilityDB / MobilityDuck / MobilitySpark.
+
 ## Repository map
 MobilityDB Ecosystem — MEOS C core as the foundation; MobilityDB/MobilityDuck/MobilitySpark SQL layers, MobilityNebula/MobilityKafka/MobilityFlink stream layers, MobilityAPI plus the MEOS-API OpenAPI/MCP/runtime contracts as the HTTP/API layer, and six language bindings as peer surfaces; MEOS-API as a side codegen catalog (IDL JSON + shape catalog); application, visualization and cloud as the top layer; a portable-data band (Arrow C Data Interface, TemporalParquet, Temporal Data Lake) beneath MEOS
diff --git a/profile/database-parity.md b/profile/database-parity.md
new file mode 100644
index 0000000..f3d0710
--- /dev/null
+++ b/profile/database-parity.md
@@ -0,0 +1,122 @@
+MobilityDB Databases — Parity & Benchmarks
+==========================================
+
+The ecosystem exposes the same MEOS temporal and spatiotemporal operations through SQL on
+three engines — [MobilityDB](https://github.com/MobilityDB/MobilityDB) (the reference
+implementation), [MobilityDuck](https://github.com/MobilityDB/MobilityDuck), and
+[MobilitySpark](https://github.com/MobilityDB/MobilitySpark). This page reports,
+**measured rather than estimated**, how much of the MobilityDB SQL surface each engine
+covers, how that coverage is verified, and how the engines perform on the BerlinMOD
+benchmark. Together these answer the question a prospective adopter asks first — *is the
+database layer complete, does it return correct results, and is it fast enough to deploy
+on real workloads?* — with reproducible evidence rather than assertion.
+
+> Companion: **[Streaming — parity & benchmarks](streaming-parity.md)** reports the same
+> for the stream processors (MobilityFlink / MobilityKafka / MobilityNebula).
+
+## Results at a glance
+
+Coverage of the **MobilityDB SQL surface** — each engine over its active-addressable scope:
+
+MobilityDB SQL surface coverage: MobilityDB 100% (reference), MobilityDuck 100.0% (943/943 active scope), MobilitySpark 99.6% (1571/1577) plus 29/29 portable dialect
+
+- **Correctness** — BerlinMOD reference queries return **identical results** across all three engines ✓
+- **Benchmark** — cross-platform timings ▸ [report #29](https://github.com/MobilityDB/MobilityDB-BerlinMOD/pull/29) *(dated runs publishing)*
+
+**▶ Full results & raw data**
+- MobilityDuck — [`parity-status.md`](https://github.com/MobilityDB/MobilityDuck/blob/main/docs/parity-status.md) · [`PARITY.md`](https://github.com/MobilityDB/MobilityDuck/blob/main/docs/PARITY.md)
+- MobilitySpark — [`parity-100.md`](https://github.com/MobilityDB/MobilitySpark/blob/main/docs/parity-100.md)
+- MobilityDB cross-type — [methodology #1002](https://github.com/MobilityDB/MobilityDB/pull/1002) · [audit harness #1110](https://github.com/MobilityDB/MobilityDB/pull/1110)
+
+## The surface
+
+The reference is the **public SQL API of MobilityDB** — every `CREATE FUNCTION` it
+defines. MobilityDB *is* the reference, so its surface is the parity target;
+MobilityDuck and MobilitySpark are each measured against the portion applicable to them.
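As a rough illustration of what "every `CREATE FUNCTION`" means as a countable surface, the sketch below pulls function names out of SQL text with a regular expression. The sample SQL, the pattern, and the helper name `sql_surface` are all hypothetical; the audits referenced on this page work from the engines' catalogs, and nothing here is taken from their scripts.

```python
import re

# Matches "CREATE [OR REPLACE] FUNCTION name(" and captures the
# (optionally schema-qualified) function name. Assumed pattern, for illustration only.
CREATE_FN = re.compile(
    r"CREATE\s+(?:OR\s+REPLACE\s+)?FUNCTION\s+([A-Za-z_][\w.]*)\s*\(",
    re.IGNORECASE,
)


def sql_surface(sql_text):
    """Return the sorted list of distinct (lower-cased) function names in sql_text."""
    return sorted({name.lower() for name in CREATE_FN.findall(sql_text)})


# Hypothetical sample, not copied from the MobilityDB sources.
SAMPLE_SQL = """
CREATE FUNCTION tgeompoint_in(cstring, oid, integer) RETURNS tgeompoint AS 'x' LANGUAGE C;
CREATE OR REPLACE FUNCTION startValue(tgeompoint) RETURNS geometry AS 'x' LANGUAGE C;
CREATE FUNCTION startValue(tgeogpoint) RETURNS geography AS 'x' LANGUAGE C;
"""

names = sql_surface(SAMPLE_SQL)
print(names)
```

This sketch collapses overloads to a single name. Whether the published counts are per name or per signature is not stated here, so treat the output as a shape of the idea, not a way to reproduce the 943 or 1577 figures.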
+
+Functions outside an engine's scope are excluded **by reason, not by omission**, and are
+never counted as gaps — chiefly PostgreSQL-only plumbing (`*_in`/`*_out`/`*_recv`/`*_send`,
+aggregate transition/combine/finalize internals, selectivity/support functions, GiST/SPGiST
+operator classes) for which no DuckDB or Spark equivalent exists, plus a small set of
+operations reserved by [semantic exclusion](#cross-type-coverage) because they are
+formally meaningless on a given type.
+
+## Coverage
+
+Three layers, increasing in strength:
+
+- **L1 — exported.** The operation is present in the engine's library (MobilityDB `nm -D`).
+- **L2 — registered.** It is a callable SQL function in the engine's catalog
+  (`pg_proc` / DuckDB scalar registry / Spark UDF registry).
+- **L3 — tested.** A regression test mirrored from MobilityDB exercises it and the result
+  is checked.
+
+| Engine | L2 registered | L3 tested | of MobilityDB SQL surface |
+|---|---|---|---|
+| **MobilityDB** | reference | full regression suite | **100 % (reference)** |
+| **MobilityDuck** | 943 / 943 active-addressable | ported MobilityDB `*.test.sql` mirrors | **100 %** active scope · extended families in progress |
+| **MobilitySpark** | 1571 / 1577 active-addressable + 29/29 portable dialect | 907 SQL-parity + BerlinMOD tests | **99.6 %** · all six type families |
+
+Each engine's coverage is audited from its own repository — MobilityDuck via
+[`docs/parity-status.md`](https://github.com/MobilityDB/MobilityDuck/blob/main/docs/parity-status.md)
+and [`PARITY.md`](https://github.com/MobilityDB/MobilityDuck/blob/main/docs/PARITY.md);
+MobilitySpark via [`docs/parity-100.md`](https://github.com/MobilityDB/MobilitySpark/blob/main/docs/parity-100.md).
+The figures advance as the accumulated-PR builds of the three engines land; this page
+tracks the measured state.
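The percentages in the coverage table follow directly from the registered/scope counts. The sketch below re-derives them; the helper name `coverage` and the one-decimal rounding are assumptions made for illustration, not the audit scripts' actual code.

```python
def coverage(covered, scope):
    """Return (percentage rounded to one decimal, number of functions still outstanding)."""
    if scope <= 0:
        raise ValueError("scope must be positive")
    return round(100.0 * covered / scope, 1), scope - covered


# Counts quoted in the coverage table above.
figures = {"MobilityDuck": (943, 943), "MobilitySpark": (1571, 1577)}

for engine, (covered, scope) in figures.items():
    pct, missing = coverage(covered, scope)
    print(f"{engine}: {pct:.1f}% ({covered}/{scope}), {missing} outstanding")
```

Reporting the outstanding count next to the percentage keeps a figure such as 99.6 % tied to a concrete number of functions (six here) instead of a rounded ratio alone.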
+
+## Cross-type coverage
+
+MobilityDB's parity also has a *cross-type* axis: every temporal spatial type is held to
+the same function surface as its reference family — `tgeompoint` is the reference for the
+`Point` family (`tgeogpoint`, `tnpoint`), `tgeometry` for the extended-shape family
+(`trgeometry`, `tcbuffer`, `tpose`). The methodology and the audit harness are in
+MobilityDB ([cross-type parity methodology](https://github.com/MobilityDB/MobilityDB/pull/1002),
+[audit harness](https://github.com/MobilityDB/MobilityDB/pull/1110),
+[RFC #868](https://github.com/MobilityDB/MobilityDB/discussions/868)).
+
+A handful of operations are **reason-marked as formally meaningless** on specific types
+and are deliberately never implemented — they are not gaps:
+
+- `convexhull` on `tgeogpoint` — a point's continuous form collapses to `trajectory()` / `stbox()`.
+- `affine` / `rotate` / `scale` / `translate` on `tnpoint`, `tcbuffer`, `tpose`, `trgeometry` — an affine transform bypasses the type invariant (rigid pose / centre+radius / route+fraction).
+- `atGeometry` / `minusGeometry` on `tnpoint` — a network point is constrained to its 1-D edge; use route filtering.
+
+## Query-result parity (BerlinMOD)
+
+The three engines run the **BerlinMOD** reference queries and must return **identical
+results** — the cross-engine portability contract
+([discussion #861](https://github.com/MobilityDB/MobilityDB/discussions/861), conformance
+suite [MobilityDB-BerlinMOD](https://github.com/MobilityDB/MobilityDB-BerlinMOD)). This
+batch result is also the anchor for the streaming **snapshot** form, linking database
+parity to [streaming parity](streaming-parity.md).
+
+## Benchmarks (BerlinMOD)
+
+Coverage and correctness establish that the queries *run* and *agree*; the BerlinMOD
+benchmark establishes that they run *fast enough to deploy*. The cross-platform timing
+report is organised by **what each measurement licenses to claim**
+([MobilityDB-BerlinMOD #29](https://github.com/MobilityDB/MobilityDB-BerlinMOD/pull/29)):
+the `th3index` matrix is the cross-platform-comparable axis, native-index results are
+intra-platform only, and their composition is reported only where it beats both alone.
+Results are published here as each dated run is finalised; the benchmark harness lives in
+[MobilityDB-BerlinMOD](https://github.com/MobilityDB/MobilityDB-BerlinMOD) with the
+Spark driver in [MobilitySpark `berlinmod/bench`](https://github.com/MobilityDB/MobilitySpark/tree/main/berlinmod/bench).
+
+## Reproduce it
+
+The coverage numbers come from per-repo audit scripts, and the cross-type figures from a
+config-driven harness:
+
+```sh
+# MobilityDuck
+python3 scripts/parity-audit.py  # -> docs/parity-status.md (943/943 active)
+# MobilitySpark
+python3 scripts/parity-audit.py  # -> docs/parity-status.md (1571/1577)
+python3 scripts/portable_parity.py  # 29/29 portable bare names
+# MobilityDB cross-type
+tools/parity_audit/  # nm -D / pg_proc / test — 3-condition gate
+```
+
+*Status as of 2026-05-23. MobilityDB is the reference; MobilityDuck and MobilitySpark
+coverage and the benchmark tables advance as the accumulated-PR builds and dated runs
+land.*
diff --git a/profile/images/coverage_charts.py b/profile/images/coverage_charts.py
new file mode 100644
index 0000000..686968c
--- /dev/null
+++ b/profile/images/coverage_charts.py
@@ -0,0 +1,74 @@
+#!/usr/bin/env python3
+"""Coverage bar charts for the profile parity & benchmark pages.
+
+Reproducible generator (mirrors MobilitySpark/berlinmod/bench/chart.py in spirit:
+matplotlib -> committed PNG, embedded via an img tag). Emits two horizontal-bar charts
+from the measured coverage numbers:
+
+  streaming-coverage.png — Flink / Kafka / Nebula vs the 1,945 streamable surface
+  database-coverage.png — MobilityDB / MobilityDuck / MobilitySpark vs the SQL surface
+
+Solid bar = confirmed (L3 proven / reference); lighter bar = wired but not yet
+proven. Re-run after the numbers change: python3 coverage_charts.py
+"""
+from pathlib import Path
+import matplotlib
+matplotlib.use("Agg")
+import matplotlib.pyplot as plt
+
+HERE = Path(__file__).parent
+TRACK = "#ececec"  # the 0->100% track behind each bar
+
+
+def hbar(path, title, rows, subtitle):
+    """rows: list of (label, pct, value_text, color, proven_bool)."""
+    fig, ax = plt.subplots(figsize=(7.6, 2.5), dpi=150)
+    ys = list(range(len(rows)))[::-1]  # top-to-bottom
+    for y, (label, pct, vtext, color, proven) in zip(ys, rows):
+        ax.barh(y, 100, color=TRACK, height=0.55, zorder=1)  # track
+        ax.barh(y, pct, color=color, height=0.55, zorder=2,
+                alpha=1.0 if proven else 0.55,
+                hatch=None if proven else "////", edgecolor=color)
+        ax.text(pct + 1.5, y, vtext, va="center", ha="left",
+                fontsize=10, color="#222", zorder=3)
+    ax.set_yticks(ys)
+    ax.set_yticklabels([r[0] for r in rows], fontsize=11)
+    ax.set_xlim(0, 128)
+    ax.set_xticks([0, 25, 50, 75, 100])
+    ax.set_xticklabels(["0", "25", "50", "75", "100%"], fontsize=9, color="#666")
+    ax.set_title(title, fontsize=12, fontweight="bold", loc="left", pad=14)
+    ax.text(0, 1.04, subtitle, transform=ax.transAxes, fontsize=9, color="#666")
+    for s in ("top", "right", "left"):
+        ax.spines[s].set_visible(False)
+    ax.tick_params(left=False)
+    ax.set_axisbelow(True)
+    ax.xaxis.grid(True, color="#f0f0f0", zorder=0)
+    fig.tight_layout()
+    fig.savefig(path, bbox_inches="tight", facecolor="white")
+    plt.close(fig)
+    print(f"wrote {path}")
+
+
+# Streaming — solid = proven callable, hatched/light = wired (not yet proven)
+hbar(
+    HERE / "streaming-coverage.png",
+    "Streaming — MEOS function coverage",
+    [
+        ("MobilityFlink", 100.0, "100.0% · 1945 / 1945 · proven", "#1a9988", True),
+        ("MobilityKafka", 100.0, "100.0% · 1945 / 1945 · proven", "#1a9988", True),
+        ("MobilityNebula", 20.9, "20.9% · 406 / 1945 · wired", "#1a9988", False),
+    ],
+    "of 1,945 streamable MEOS functions (solid = proven callable · hatched = wired)",
+)
+
+# Database — per-platform colour family from chart.py; MobilityDB = reference
+hbar(
+    HERE / "database-coverage.png",
+    "Databases — MobilityDB SQL surface coverage",
+    [
+        ("MobilityDB", 100.0, "100% · reference", "#0f4ec9", True),
+        ("MobilityDuck", 100.0, "100.0% · 943 / 943 active", "#c47000", True),
+        ("MobilitySpark", 99.6, "99.6% · 1571 / 1577", "#1f7a23", True),
+    ],
+    "of the MobilityDB public SQL API (each engine over its active-addressable scope)",
+)
diff --git a/profile/images/database-coverage.png b/profile/images/database-coverage.png
new file mode 100644
index 0000000..c7a20cf
Binary files /dev/null and b/profile/images/database-coverage.png differ
diff --git a/profile/images/streaming-coverage.png b/profile/images/streaming-coverage.png
new file mode 100644
index 0000000..017df63
Binary files /dev/null and b/profile/images/streaming-coverage.png differ
diff --git a/profile/streaming-parity.md b/profile/streaming-parity.md
new file mode 100644
index 0000000..a9e60ef
--- /dev/null
+++ b/profile/streaming-parity.md
@@ -0,0 +1,107 @@
+MobilityDB Streaming — Parity & Benchmarks
+==========================================
+
+The ecosystem exposes the same MEOS temporal and spatiotemporal operations on three
+stream processors — [MobilityFlink](https://github.com/MobilityDB/MobilityFlink),
+[MobilityKafka](https://github.com/MobilityDB/MobilityKafka), and
+[MobilityNebula](https://github.com/MobilityDB/MobilityNebula). This page reports,
+**measured rather than estimated**, how much of the streamable MEOS surface each one
+covers, how that coverage is verified, and how the platforms perform on the BerlinMOD
+benchmark. Together these answer the question a prospective adopter asks first — *is the
+streaming layer complete, does it return correct results, and is it fast enough to
+deploy on real workloads?* — with reproducible evidence rather than assertion.
+
+> Companion: **[Databases — parity & benchmarks](database-parity.md)** reports the same
+> for the SQL engines (MobilityDB / MobilityDuck / MobilitySpark).
+
+## Results at a glance
+
+Coverage of the **1,945 streamable MEOS functions** — confirmed callable on a real `libmeos`:
+
+Streaming MEOS function coverage of 1,945 streamable functions: MobilityFlink 100.0% (1945/1945, proven callable), MobilityKafka 100.0% (1945/1945, proven callable), MobilityNebula 20.9% (406/1945, wired)
+
+- **Correctness** — all **9 BerlinMOD queries × 3 streaming forms = 27/27 cells** reproduce the batch result on every platform; the Flink snapshot output is **byte-identical** to the batch oracle.
+
+**▶ Full results & raw data**
+- Per-function coverage — committed feeds `flink-kafka.feed.tsv` + `nebula.feed.tsv`, reproduced by `ci_gate.py` (MobilityNebula `tools/streaming_parity/`).
+- Methodology — `doc/methodology/streaming_parity_assessment.md` (MobilityNebula).
+
+## The surface
+
+The reference is the **1,945 streamable MEOS public functions** — every exported MEOS
+function that can run inside a streaming dataflow, across four tiers: `stateless`,
+`bounded-state`, `windowed`, and `cross-stream`.
+
+Functions outside that surface are excluded **by reason, not by omission**, and are
+never counted as gaps:
+
+| reason | count | why it is not streamable |
+|---|--:|---|
+| internal | 1,308 | not part of the public API |
+| io-meta | 218 | parsing / output / catalog plumbing (`*_in`, `*_out`, …) |
+| ambiguous | 59 | reserved pending a semantic decision |
+| sequence-only | 14 | need a fully materialized sequence; no per-event form |
+
+## Coverage
+
+Three layers, increasing in strength:
+
+- **L1 — exported.** The symbol is present in `libmeos` (`nm -D`).
+- **L2 — wired.** A binding operator or facade method calls it.
+- **L3 — proven.** It is actually invoked on a real `libmeos`: confirmed callable
+  (JVM tools) or exercised by a passing systest (NebulaStream).
+
+| Platform | L3 proven | L2 wired | of 1,945 |
+|---|--:|--:|--:|
+| **MobilityFlink** | **1,945 callable** | 1,945 | **100.0 %** |
+| **MobilityKafka** | **1,945 callable** | 1,945 | **100.0 %** |
+| **MobilityNebula** | 61 systest-confirmed | 406 wired · compile-verified | 20.9 % wired |
+
+Flink and Kafka share one generated JNR-FFI facade, so their callability is identical;
+it is confirmed by a type-aware per-method harness that invokes **every** facade method
+on a real `libmeos` (a returned value or a caught MEOS semantic error counts as
+callable — only a linkage or marshalling failure does not). NebulaStream operators are
+generated C++ physical operators; each is **compile-verified** against the build's
+`libmeos` in the NebulaStream development image, and a systest suite confirms
+callability operator by operator.
+
+## Query-result parity (BerlinMOD)
+
+The three platforms run the **9 BerlinMOD reference queries in three streaming forms** —
+continuous, windowed, and snapshot — **27 of 27 cells per platform**, with the snapshot
+form anchored to the batch (SQL-engine) result at the same scale factor. Every cell
+reproduces the batch result, and the Flink snapshot output is **byte-identical** to the
+batch oracle. This links streaming parity to [database parity](database-parity.md).
+
+## Benchmarks (BerlinMOD)
+
+Coverage and correctness establish that the queries *run* and *agree*; the BerlinMOD
+benchmark establishes that they run *fast enough to deploy*. The same reference queries
+are timed on each platform across scale factors and the three streaming forms
+(continuous / windowed / snapshot), reporting per-query throughput and latency. Because
+the workload definition is shared with the SQL engines
+([MobilityDB-BerlinMOD](https://github.com/MobilityDB/MobilityDB-BerlinMOD)), streaming
+and batch timings are directly comparable at the same scale factor.
+
+The timing methodology is the one defined for the SQL engines in
+[MobilityDB-BerlinMOD #29](https://github.com/MobilityDB/MobilityDB-BerlinMOD/pull/29)
+(measurements reported by what each is licensed to claim); the workload and harness are
+in place, and the timing tables join this page as they are measured. SQL-engine timings
+appear in the companion [databases — parity & benchmarks](database-parity.md) page.
+
+## Reproduce it
+
+The numbers come from committed feeds and need no toolchain:
+
+```sh
+# MobilityNebula: tools/streaming_parity/
+python3 ci_gate.py  # Flink/Kafka: 1945/1945 = 100.0%, no over-claim
+cat feeds/nebula.feed.tsv  # NebulaStream: wired / proven, per function
+```
+
+The feeds (`flink-kafka.feed.tsv`, `nebula.feed.tsv`) are regenerated by the
+per-platform adapters and the callability harness in `tools/streaming_parity/`; the full
+methodology is in `doc/methodology/streaming_parity_assessment.md` (MobilityNebula).
+
+*Measured against the `accumulate/parity-1.4` MEOS build, 2026-05-23. Flink and Kafka are
+complete at 100%; the NebulaStream figure is the operator surface wired and compile-verified.*
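The callability rule stated in the streaming Coverage section (a returned value or a caught MEOS semantic error counts as callable; a linkage or marshalling failure does not) can be sketched as a small classifier. The exception classes and probe functions below are hypothetical stand-ins invented for this example; the real harness runs on the JVM against the generated facade and is not reproduced here.

```python
class MeosSemanticError(Exception):
    """Hypothetical stand-in for an error raised by MEOS itself (e.g. invalid input)."""


class LinkageFailure(Exception):
    """Hypothetical stand-in for a missing symbol or a marshalling failure."""


def is_callable(invoke):
    """Apply the rule: a result or a semantic error shows the call reached libmeos."""
    try:
        invoke()
    except MeosSemanticError:
        return True   # the function ran and rejected its arguments
    except LinkageFailure:
        return False  # the call never reached a working function
    return True       # the function returned a value


# Three hypothetical probes, one per outcome.
def returns_value():
    return 1


def rejects_input():
    raise MeosSemanticError("bad input")


def not_linked():
    raise LinkageFailure("symbol not found")


results = {f.__name__: is_callable(f) for f in (returns_value, rejects_input, not_linked)}
proven = sum(results.values())
print(results, f"{100.0 * proven / len(results):.1f}% callable")
```

Any other exception type is deliberately left to propagate in this sketch, so an unexpected failure stops the run and is never counted on either side.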