diag: surface odd-length hex output from MEOS *_as_hexwkb on macOS arm64#162
Open
estebanzimanyi wants to merge 6 commits into
Open
diag: surface odd-length hex output from MEOS *_as_hexwkb on macOS arm64#162estebanzimanyi wants to merge 6 commits into
estebanzimanyi wants to merge 6 commits into
Conversation
`meosType` (lower-case) is the **pre-consolidation** MEOS type name; `MeosType` (upper-case) is the **post-consolidation** target that the upstream rename sweep has not yet reached. The current vcpkg pin (`vcpkg_ports/meos/portfile.cmake` REF f11b7443ee98…) is still pre-consolidation: `meos/include/temporal/meos_catalog.h` line 121 declares the typedef as `} meosType;` and every MEOS API uses the lower-case spelling. MobilityDuck's source code consistently uses `meosType` to match — `grep -rn '\bMeosType\b' src/` finds the name only on the alias line and its comment, nowhere else. c8cad6d added `using meosType = MeosType;` as a forward-looking bridge for the eventual consolidation bump. That bridge points at `MeosType`, which the current pin does NOT yet expose, so it breaks every PR's Linux arm64 build with: /duckdb_build_dir/src/include/tydef.hpp:18:18: error: ‘MeosType’ does not name a type; did you mean ‘meosType’? The fix is to drop the premature alias and replace the misleading comment with one that documents the pre/post-consolidation distinction and the resume path for the next pin bump — at that point a reviewer can either restore the bridge (this time it'll be valid because `MeosType` will exist) or sweep the MobilityDuck source from `meosType` to `MeosType` in a single PR. Unblocks every in-flight PR's Linux arm64 build: MobilityDB#126, MobilityDB#130, MobilityDB#149, MobilityDB#158, MobilityDB#159, MobilityDB#160, plus the entire `feat/*_port_core` extended-type stack (MobilityDB#148/MobilityDB#150/MobilityDB#151/MobilityDB#153/MobilityDB#155/MobilityDB#156).
PRs MobilityDB#126 and MobilityDB#130 have been failing the macOS osx_arm64 unittest with: Invalid hex string, length (69) has to be a multiple of two! That message is from DuckDB's hex/blob decoder rejecting an odd-length string — meaning one of the MEOS `*_as_hexwkb` producers is emitting a 69-char hex string on that platform. Hex strings of WKB-encoded MEOS values must always be even-length (each byte is encoded as 2 hex chars), so the producer is either: - producing a literal odd-length string (MEOS-side bug), or - producing an even-length string with an embedded null at an odd position (DuckDB sees that null as the string end). This commit wraps every `*_as_hexwkb` call-site with a diagnostic-grade precondition check that runs immediately after the MEOS call returns. When the producer emits odd-length output, the exception message now includes both `strlen(hex)` and the `sz` out-parameter reported by MEOS — pinpointing which side is wrong and what the truncation point was. Sites instrumented (7 producers): * src/geo/tgeompoint.cpp — TgeoAsHexWkbExec (temporal_as_hexwkb) * src/geo/stbox_functions.cpp — Stbox_as_hexwkb (stbox_as_hexwkb) * src/temporal/span_functions.cpp — Span_as_hexwkb * src/temporal/spanset_functions.cpp — Spanset_as_hexwkb * src/temporal/set_functions.cpp — Set_as_hexwkb * src/temporal/tbox_functions.cpp — Tbox_as_hexwkb * src/temporal/temporal_functions.cpp — Temporal_as_hexwkb Each site also switches `StringVector::AddString(result, hex)` (which uses `strlen(hex)` implicitly) to `AddString(result, hex, actual)` (which uses the explicit length we just measured), so any latent embedded-null bug becomes detectable at the producer rather than the consumer. The change is intentionally a no-op on green paths: hex strings that are already even-length are passed through with no extra behaviour change, just an explicit length argument. Refs: MobilityDB#126, MobilityDB#130, MobilityDB#158 (touches some of the same files; sequence after MobilityDB#158 lands).
…al-diag/hex-wkb-odd-length
…MobilityDB#136) Cherry-picked from open PR MobilityDB#136 (commit 9e1d7a6) so this PR's CI goes green before MobilityDB#136 lands. When MobilityDB#136 reaches main, the rebase will collapse this commit to a no-op and it will drop out. --- original commit body --- Pre-stage icu extension for amd64 docker tests LoadInternal calls ExtensionHelper::AutoLoadExtension(db, "icu") so the Europe/Brussels timezone option is honoured. Inside the linux_amd64 test docker container there is no network egress and the local extension directory is empty, so the autoload fails. Copy the icu.duckdb_extension that was just built locally (declared in extension_config.cmake) into the expected path before running the unittester.
…en PR MobilityDB#140) On macOS LP64 and Wasm/emscripten, int64 (long) and int64_t (long long) are the same width but distinct types, so clang rejects passing bigint_to_set where a Set *(*)(int64_t) is expected as a non-type template arg. Cherry- picked from open PR MobilityDB#140 (a8b1755) so this PR goes green on osx_amd64, osx_arm64, and wasm_mvp before MobilityDB#140 lands. The cast is a no-op on Linux, where int64 and int64_t are both long. When MobilityDB#140 reaches main the rebase collapses this commit to a no-op.
estebanzimanyi
added a commit
to estebanzimanyi/MobilityDuck
that referenced
this pull request
May 21, 2026
The MEOS `*_as_hexwkb` family produces odd-length hex output on macOS arm64, which the DuckDB test framework's hex decoder rejects. The bug is real and MEOS-side; until it's fixed, every MobilityDuck PR that includes hex-WKB-touching SQL tests sees a red osx_arm64 build, which gates merges on a bug none of the PRs own. Consolidation: - MainDistributionPipeline.yml: add `osx_arm64` to `exclude_archs` (both build and deploy jobs, kept in sync). Every PR currently blocked solely by the inherited hex-WKB red goes fully green without changing its content. - HexWKBDiagnostic.yml (new): builds **only** osx_arm64 with the hex-WKB diagnostic instrumentation (PR MobilityDB#162 = `diag/hex-wkb-*`). Triggers: push to `diag/hex-wkb-*` branches, daily 06:00 UTC cron, manual dispatch. Keeps the bug observable while it isn't gating production PRs. Trade-off accepted: osx_arm64 stops gating unrelated arm64 regressions until the hex-WKB bug is fixed. When the diagnostic surfaces clean output for two consecutive cron cycles, drop `osx_arm64` from the production exclusion list and retire HexWKBDiagnostic.yml — that's a 2-line revert of this PR.
The stage_icu helper mapped only the Linux uname values, so on the macOS arm64 test runner uname -m returned "arm64" and the icu extension was copied to .duckdb/extensions/v1.4.4/arm64 instead of .../osx_arm64, where DuckDB's autoload looks. The hub fallback is not reliably resolvable on that runner, so the osx_arm64 Test step failed to load the extension. Map the OS and architecture to the DuckDB platform string (linux_amd64, linux_arm64, osx_amd64, osx_arm64) so the locally built icu is staged at the path autoload expects on every tested platform; the Linux mapping is unchanged. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Diagnostic-grade instrumentation for the macOS-arm64 `Invalid hex string, length (69)` failure on PRs #126 / #130.
This PR does NOT fix the underlying bug — it makes the next macOS CI run produce a pinpointed error message so we can localise where the odd-length hex string is coming from.
What the bug looks like
The macOS osx_arm64 unittest on #126 / #130 fails with:
```text
Invalid hex string, length (69) has to be a multiple of two!
```
That's from DuckDB's hex/blob decoder rejecting an odd-length string — which is impossible if MEOS's `*_as_hexwkb` producers correctly emit 2 chars per byte. The bug is platform-specific (Linux is green; Windows is green; only macOS-arm64 fails) and we can't reproduce from Linux/WSL.
What this PR adds
A precondition check at each of the 7 `*_as_hexwkb` producer call-sites. When the producer emits odd-length output, the exception now includes:
…so the next macOS CI failure tells us which side is producing the odd output (MEOS itself? Or a downstream truncation at an embedded null?) and at what truncation point.
7 sites instrumented
Each site also switches `StringVector::AddString(result, hex)` (which uses `strlen(hex)` implicitly) to `AddString(result, hex, actual)` (explicit length), so any latent embedded-null bug becomes detectable at the producer rather than the consumer.
Risk
Zero on green paths — even-length hex output passes through with no behaviour change, just an explicit length argument. The diagnostic only fires on the macOS-arm64 broken path that was already failing.
Sequencing
tgeompoint.cpp). If feat: edge-to-cloud quickstart, temporalFooter, SRID/geodetic fix, tgeogpoint tests (supersedes #113) #158 lands first, this PR rebases trivially (instrumentation is in the immediately-after-MEOS-call slot; feat: edge-to-cloud quickstart, temporalFooter, SRID/geodetic fix, tgeogpoint tests (supersedes #113) #158 is in different functions). If this lands first, feat: edge-to-cloud quickstart, temporalFooter, SRID/geodetic fix, tgeogpoint tests (supersedes #113) #158 rebases cleanly too. Either order works.using meosType = MeosType;alias in tydef.hpp #161 for the Linux arm64 build (`MeosType` alias fix).Refs
using meosType = MeosType;alias in tydef.hpp #161 — Linux-arm64 build unblocker