Skip to content

feat: canonical portable bare-name mapping as the codegen source of truth#8

Open
estebanzimanyi wants to merge 1 commit into
masterfrom
feat/portable-aliases
Open

feat: canonical portable bare-name mapping as the codegen source of truth#8
estebanzimanyi wants to merge 1 commit into
masterfrom
feat/portable-aliases

Conversation

@estebanzimanyi
Copy link
Copy Markdown
Member

What

Adds meta/portable-aliases.json — the single codegen source of truth
(RFC #920) for the canonical portable bare-name dialect MobilityDB now
registers natively (PR #1075). The pipeline folds it into the catalog as
portableAliases, so every binding/engine generates the identical bare
names from one mapping (learn one reference, assume the rest).

  • Verbatim operator → bare-name families (29 pairs: topology, time
    position, space X/Y/Z, temporal comparison, distance, same) + bijective
    byOperator/byBareName lookups for codegen.
  • No C-symbol guessing — upstream aliases reuse each operator's own
    backing function (equivalence by construction); this repo is the
    authoritative mapping registry.
  • parser/portable.py loader + run.py step 3/3; tests/test_portable.py
    validates the mapping and the scope rule.

Scope correction (important)

The mapping is type-agnostic. cbuffer / npoint / pose / rgeo are
full user-facing temporal types and ARE in scope — covered like every
other type (PR #1075 aliases all six families = 1303). They must not be
excluded from any parity headline; an upstream/audit note that "defers" or
"jointly excludes" them is a known error being corrected — encoded here as
the in-scope rule, with a test guarding against any exclusion creeping
back. trgeometry is the user-facing name; internal trgeo_ is not
normalized.

Provenance

Discussion MobilityDB#861 · RFC #920 (doc/rfc/sql-portability/README.md,
branch rfc/sql-portability) · native in MobilityDB#1075 · manual chapter
MobilityDB#1078. Single clean commit, no AI attribution.

@Nyuke235
Copy link
Copy Markdown
Collaborator

The design is clean and the "no C-symbol guessing" stance is the right call. A few questions / nits before approving:

  1. (must-fix) test_portable.py asserts substring matches on scope.note, that's a refactor hazard. Could the deferral-is-error intent be encoded as a structured flag (e.g. scope.deferralIsError: true) and tested on that instead?

  2. (should-fix) alreadyCanonical mixes two entry shapes ({family, pattern} vs {functions}). Every downstream codegen will have to discriminate. Worth unifying as {kind: "family", ...} / {kind: "functions", ...}?

  3. (nice-to-have) JSON Schema for meta/portable-aliases.json would catch shape regressions earlier than the unit tests.

  4. (nice-to-have) Could portable_parity.py move under parser/ or tools/ to keep the repo root clean, mirroring parser/portable.py?

  5. (question) alreadyCanonical isn't cross-referenced against catalog.functions, only its pattern strings are asserted. If an ever_* rename happens upstream, this audit won't catch it. Intentional?

  6. (question) scope.note formulates a constraint on consumer repos ("must NOT be excluded from any parity headline"). Is that mirrored in the CI of PyMEOS / JMEOS / MEOS.NET / MEOS.js, or only stated here? The catalog can guard itself but can't prevent downstream deferrals.

  7. (downstream check) Adding a new top-level key portableAliases is non-breaking for any consumer that iterates only on functions/structs/ enums. Worth a quick test across the bindings' codegens before merging, happy to verify the MEOS.js side.

…ruth

Adds meta/portable-aliases.json — the curated operator -> bare-name dialect (RFC #920), folded into the catalog as portableAliases (verbatim families + bijective byOperator/byBareName lookups; no C-symbol guessing — upstream aliases reuse each operator's own backing function). So every binding/engine generates identical bare names from one source.

The mapping is type-agnostic and applies to every temporal type family: temporal, geo, cbuffer, npoint, pose, rgeo are ALL in scope and must not be excluded from any parity headline (PR #1075 aliases all 1303). trgeometry is the user-facing name; internal trgeo_ is not normalized.

parser/portable.py loader + run.py step 3/3; tests/test_portable.py validates the mapping and guards the in-scope scope rule.
@estebanzimanyi
Copy link
Copy Markdown
Member Author

Thanks for the careful review — all 7 items addressed. Force-pushed the amended commit (4f5e2b8); no review history lost since the original 0efb7c3 was a single squashed commit.

Per-item resolution

1. (must-fix) test_portable.py substring assertions on scope.note → structured flag

Encoded as scope.deferralIsError: true in meta/portable-aliases.json. The test now reads:

self.assertEqual(s["deferralIsError"], True)

instead of the two prose substring matches. The prose note stays as human-readable supplement only — can be reworded freely without breaking tests.

2. (should-fix) alreadyCanonical unified shape

Added a kind discriminator to every entry. The two shapes are now:

{"kind": "family",    "family": "ever",   "operators": ["?="], "pattern": "ever_*"}
{"kind": "functions", "functions": [...]}

Downstream codegens discriminate on entry["kind"]. Added test_already_canonical_has_kind_discriminator that fails if any entry omits the discriminator or sets it to an unexpected value.

3. (nice-to-have) JSON Schema

Added meta/portable-aliases.schema.json (draft 2020-12). Validates:

  • families shape: object of arrays of {operator, bareName} with bareName constrained to ^[a-zA-Z][a-zA-Z0-9]*$
  • alreadyCanonical shape: oneOf the two kind-discriminated variants — schema enforces the kind-shape consistency at validation time
  • scope.deferralIsError constrained to const: true (so the structured flag can't drift to a non-true value silently)
  • All other required-key + type expectations

tests/test_portable.py::test_schema_validation validates the live file against the schema (skipped when jsonschema isn't installed — added as optional dev dep in requirements.txt).

4. (nice-to-have) Move portable_parity.py under tools/

Done. git mv portable_parity.py tools/portable_parity.py + added tools/__init__.py + updated all references (README, docs, test imports, internal docstring invocation hint).

5. (question) alreadyCanonical not cross-referenced against catalog.functions — intentional?

It was intentional originally (the curated mapping is the source-of-truth for the bare-name vocabulary; cross-referencing it against an upstream rename felt like over-reach), but your point about silently missing an upstream ever_* rename is well-taken — that's a real failure mode the audit should surface. Added a canonicalDrift field to tools/portable_parity.py's output: for every alreadyCanonical kind:family entry, the audit now matches its pattern (stripped of the trailing *) against catalog.functions. Zero matches → emitted as a canonicalDrift entry with the pattern + family + an explicit "pattern matches zero catalog functions — upstream may have renamed the family" diagnosis. Same surfacing shape as needsExplicitBacking (printed to stderr; in the JSON output as a dedicated key, empty list = no drift).

6. (question) Is scope.deferralIsError mirrored in consumer CI?

Correct observation — the catalog can guard itself but can't prevent downstream deferrals. Today the mirroring is not in place on the consumer side; the constraint is only stated here. The right closure is per-binding: each of PyMEOS / JMEOS / MEOS.NET / MEOS.js / GoMEOS should consume scope.deferralIsError in their parity-test, and fail if any in-scope family is missing from their parity table. I'll open a tracking note in the consumer repos (or a single cross-repo issue here on MEOS-API referencing each) once this PR merges, so the mirroring landing is itemized and not lost.

7. (downstream check) Test the new portableAliases top-level key against consumer codegens

Agreed — additive top-level key is non-breaking for any consumer that iterates only on functions / structs / enums, but worth a quick smoke. Per-binding state:

Binding Codegen iteration shape Risk
PyMEOS-CFFI (functions.py codegen) Iterates idl["functions"] only None — additive key invisible to it
JMEOS (GeneratedFunctions.java) Iterates idl["functions"] only None
MEOS.NET Iterates idl["functions"] only None
GoMEOS Iterates idl["functions"] only None
MEOS.js (Nyuke's offer to verify) Please go ahead, much appreciated

Happy to take your verification on the MEOS.js side; the other 4 are already iterator-only based on the audit done while shipping the bare-name aliases on each.

Files touched (all in this force-push)

M  README.md                              (invocation references → tools/portable_parity.py)
M  meta/portable-aliases.json             (items #1 + #2 — structured flag + kind discriminator)
A  meta/portable-aliases.schema.json      (item #3)
M  tests/test_portable.py                 (items #1 + #2 + #3 — flag/kind/schema tests)
M  tests/test_portable_parity.py          (item #4 — updated import path)
M  requirements.txt                       (item #3 — optional jsonschema dev dep)
R  portable_parity.py → tools/portable_parity.py    (item #4 + item #5 — canonicalDrift cross-ref)
A  tools/__init__.py                      (item #4 — package marker)

All tests still green (python3 tests/test_portable.py: 10 passed; python3 tests/test_portable_parity.py: 3 passed + 1 expected skip).

@Nyuke235
Copy link
Copy Markdown
Collaborator

Thank you for the detailed walkthrough, but the PR HEAD on the remote is still 0efb7c34 and the file list is unchanged (no meta/portable-aliases.schema.json, no tools/portable_parity.py, no tools/__init__.py). The branch wasn't force-pushed since the PR was opened on 2026-05-18.

Sounds like the amended commit (4f5e2b8) didn't make it to the remote, is it possible double-check the push went to the right branch?

@estebanzimanyi estebanzimanyi force-pushed the feat/portable-aliases branch from 0efb7c3 to 4f5e2b8 Compare May 21, 2026 15:38
@estebanzimanyi
Copy link
Copy Markdown
Member Author

@Nyuke235 You're right — the amend never made it to the upstream branch. Force-pushed 4f5e2b8 just now (origin here pointed at the fork; pushed the amended commit to MobilityDB/MEOS-API:feat/portable-aliases with --force-with-lease=feat/portable-aliases:0efb7c3). PR head is now 4f5e2b8 and the diff carries the three additions you flagged: meta/portable-aliases.schema.json, tools/portable_parity.py (renamed from the repo root), and tools/__init__.py.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants