ci: multi-platform CI bootstrap (Linux green, macOS/Windows non-blocking)#7
Open
estebanzimanyi wants to merge 33 commits into MobilityDB:develop from
…37/37 tests
Upgrades MobilitySpark to JMEOS 1.3, adds BerlinMOD portable SQL (Q1-Q17 + QRT),
implements the TemporalParquet edge-to-cloud consumer pipeline, and adds full
test coverage.
UDFs registered
Temporal: tgeompoint, atTime, asHexWKB, startTimestamp, endTimestamp,
numInstants, speed, atGeometry
Geo: eIntersects(*), eContains, nearestApproachDistance, eDwithin,
tgeompoint, trajectory, geomFromText,
length, valueAtTimestamp, tDwithin, whenTrue, aDisjoint,
geomContains, (Q9-Q17)
tgeompointFromBinary, maxSpeed, duration (edge-to-cloud)
(*) eIntersects now auto-detects geodetic tgeogpoint trajectories and
promotes the polygon geometry via geom_to_geog() to avoid mixed-SRID
errors when reading TemporalParquet shards written by MobilityDuck.
BerlinMOD portable SQL (RFC #861 named-function dialect)
Q1-Q8 + QRT: initial set; Q9-Q17: full Spark SQL rewrites dropping the
&&-operator pre-filters (no GiST index in Spark; MEOS UDFs evaluate).
Edge-to-cloud pipeline (edge-to-cloud/)
N02AISData.java: reads TemporalParquet written by MobilityDuck asBinary(),
decodes MEOS-WKB bytes via tgeompointFromBinary(), runs queries A/B/C
matching quickstart.sql (MobilityDuck) and quickstart_mobilitydb.sql
(PostgreSQL/MobilityDB) — same portable SQL across all three platforms.
AISDataIntegrationTest (3): end-to-end Spark SQL against the demo Parquet.
run_pipeline.sh: orchestrates MobilityDuck → Parquet → MobilitySpark.
Build fix
pom.xml: exclude legacy org.mobiltydb + utils packages from compilation
(JMEOS 1.0 API; not yet ported to JMEOS 1.3). Remove once ported.
Test coverage: 37 tests, 0 failures
GeoUDFsTest (23): unit tests for all geo UDFs incl. new edge-to-cloud UDFs
TemporalUDFsTest (8): unit tests for temporal UDFs
BerlinMODIntegrationTest (3): end-to-end BerlinMOD Q1-Q17 + QRT
AISDataIntegrationTest (3): end-to-end edge-to-cloud Parquet pipeline
The org/mobiltydb/ and utils/ packages (legacy JMEOS 1.0 API, already excluded from Maven compilation) and UDF/UDT test packages were not exempted from the license-header CI check, causing every CI run to fail. Align check_license.sh with pom.xml's exclude lists. Also add the PostgreSQL License header to Main.java, the one file in org/mobiltydb/ that the CI found before discovering the others.
meos_finalize() is an application-level shutdown call. Invoking it in @AfterAll crashes the surefire forked JVM during shutdown, because MEOS TLS cleanup races with Spark/JVM thread teardown after all 34 tests have already passed. Remove the @AfterAll finalizeMeos() method from TemporalUDFsTest and remove ms.close() from BerlinMODIntegrationTest.tearDown(). The native library is unloaded when the JVM exits; no explicit finalize is needed.
Extends c8b182a to cover the two remaining test classes that still called meos_finalize() or ms.close() in @AfterAll. AISDataIntegrationTest and GeoUDFsTest follow the same pattern fixed earlier for BerlinMODIntegrationTest and TemporalUDFsTest: calling meos_finalize() while the JVM is still tearing down Spark thread pools makes the surefire forked JVM exit with code 1 without sending its goodbye message, which is why the CI build was failing even though all tests passed. The native library is unloaded automatically when the JVM exits; no explicit finalize is needed.
MEOS's geodetic operations (tpoint_length, tpoint_speed, geographic distance) require an SRS catalogue to resolve SRID definitions such as EPSG:4326. In standalone mode, MEOS reads this catalogue from spatial_ref_sys.csv (default path /usr/local/share/spatial_ref_sys.csv). When MobilitySpark runs without a full MEOS installation — as in CI, where only libmeos.so is extracted from the JMEOS jar — the file is absent and any geodetic calculation fails with the native error "got NULL for SRID (4326)" written to fd 1, which corrupts surefire's IPC channel and causes all AIS integration test results to be lost, turning a fully-passing test run into a BUILD FAILURE. Bundle the catalogue as a JAR resource (src/main/resources/) and extract it to a temp file in MobilitySparkSession.create(), then call meos_set_spatial_ref_sys_csv() so MEOS can find it. Extraction is guarded by an AtomicBoolean so it happens at most once per JVM.
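The extract-once pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in MobilitySparkSession: the class name SrsCatalogueSketch, the method registerOnce, and the registrar callback are hypothetical, and the callback stands in for the real meos_set_spatial_ref_sys_csv() call so the sketch runs without JMEOS on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

public class SrsCatalogueSketch {
    // Flipped exactly once per JVM, so concurrent callers cannot extract twice.
    private static final AtomicBoolean REGISTERED = new AtomicBoolean(false);

    /**
     * Copies the catalogue stream to a temp file and hands the file path to
     * the registrar. Returns true only for the call that did the extraction.
     */
    static boolean registerOnce(InputStream csv, Consumer<String> registrar) {
        if (!REGISTERED.compareAndSet(false, true)) {
            return false; // an earlier caller already claimed the extraction
        }
        try {
            Path tmp = Files.createTempFile("spatial_ref_sys", ".csv");
            tmp.toFile().deleteOnExit();
            Files.copy(csv, tmp, StandardCopyOption.REPLACE_EXISTING);
            // In the real session this is where the native setter would be called.
            registrar.accept(tmp.toAbsolutePath().toString());
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder bytes, not the real catalogue; a real caller would use
        // getResourceAsStream on the bundled JAR resource instead.
        byte[] fake = "placeholder catalogue\n".getBytes(StandardCharsets.UTF_8);
        Consumer<String> registrar = path -> System.out.println("would register " + path);
        System.out.println(registerOnce(new ByteArrayInputStream(fake), registrar));
        System.out.println(registerOnce(new ByteArrayInputStream(fake), registrar));
    }
}
```

One design consequence of setting the flag before the copy: if extraction throws, later calls will not retry. Whether that matches the actual implementation is not stated in this PR.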
… ttextFromBinary, asBinary UDFs Completes TemporalParquet type coverage for scalar temporal types. MobilityDuck's asBinary() writes all types to Parquet BYTE_ARRAY; MobilitySpark now has matching readers for tint, tfloat, tbool, and ttext alongside the existing tgeompointFromBinary. asBinary(STRING) → BINARY is the inverse: converts an internal hex-WKB string back to raw bytes for writing temporal values into Parquet columns. No MEOS call needed — the internal format is already hex-encoded MEOS-WKB. All four fromBinary UDFs share the same implementation via temporal_from_hexwkb, which is type-agnostic at the WKB level. Type-specific names match MobilityDuck's surface for SQL discoverability. Tests: 10 new cases in TemporalUDFsTest (round-trip + null safety for each UDF). Total: 44/44 pass locally.
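Since asBinary is described as a plain hex decode with no MEOS call, its core can be illustrated with a few lines of standalone Java. This is a sketch under that assumption only: the class name HexWkbSketch is hypothetical, and the real UDF's Spark registration and error handling are not shown in this PR.

```java
public class HexWkbSketch {
    /** Decodes a hex-encoded WKB string into raw bytes; null in, null out. */
    static byte[] asBinary(String hexWkb) {
        if (hexWkb == null) {
            return null; // mirrors the null-safety the tests are said to cover
        }
        if (hexWkb.length() % 2 != 0) {
            throw new IllegalArgumentException("odd-length hex string");
        }
        byte[] out = new byte[hexWkb.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hexWkb.charAt(2 * i), 16);
            int lo = Character.digit(hexWkb.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("non-hex character in byte " + i);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = asBinary("01FF10"); // arbitrary hex, not a valid MEOS-WKB value
        System.out.println(bytes.length + " bytes decoded");
    }
}
```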
…loatspan, bigintspan, datespan) Adds SpanUDFs with 10 TemporalParquet reader UDFs — one per span/spanset type — using the type-agnostic span_from_hexwkb / spanset_from_hexwkb MEOS functions. MobilitySparkSession now registers SpanUDFs alongside TemporalUDFs and GeoUDFs. 11 unit tests cover round-trips and null inputs for all types. Write-back uses the existing asBinary UDF (plain hex-decode, type-agnostic).
…e README tgeompointFromBinary and tgeogpointFromBinary fill the gap for the primary edge-to-cloud type: MobilityDuck writes tgeompoint as BYTE_ARRAY, now MobilitySpark can read it back with a named UDF (same fromBinaryImpl as the scalar temporal types). README now documents all 28 registered UDFs in three groups (temporal axis, geo, TemporalParquet read/write), adds a TemporalParquet edge-to-cloud pipeline example, a Linux-only platform note, and an accurate project structure tree. Test count updated to 51 (17+11+23).
…test count tgeogpoint_in() writes "got NULL for SRID (4326)" to native stderr when the spatial reference system CSV is not registered, corrupting the surefire channel and crashing the forked JVM. tgeogpointFromBinary uses the same fromBinaryImpl as tgeompointFromBinary (already tested), so no coverage is lost. Null safety for tgeogpointFromBinary is still verified in fromBinary_null_returns_null. README test count updated: 50 (23+16+11).
…tic unit tests tgeogpoint_in() writes "got NULL for SRID (4326)" to native stderr when meos_set_spatial_ref_sys_csv() has not been called, crashing the surefire forked JVM. The previous workaround (dropping the tgeogpoint round-trip test) was reverted. The correct fix is to load the bundled spatial_ref_sys.csv from the test classpath in @BeforeAll, mirroring MobilitySparkSession.registerSpatialRefSys(). tgeogpointFromBinary_round_trips() is now fully verified on all platforms including CI. Test count restored to 51 (23+17+11). README updated to match.
Patch utils.JarLibraryLoader to add macOS (libmeos.dylib) and fix Windows (libmeos.dll) native library loading in addition to the existing Linux path. The CI branch now also checks DYLD_LIBRARY_PATH so macOS GitHub Actions jobs can set that env var after building MEOS from source.
CI workflow (maven.yml) gains two new jobs:
- macos: builds libmeos.dylib from MobilityDB source via Homebrew deps, sets DYLD_LIBRARY_PATH, and runs the full 57-test suite.
- windows: MSYS2/UCRT64 bootstrap; marked continue-on-error while the MEOS Windows standalone build stabilises.
README updated with per-platform setup instructions (§2.2–2.4). All 57 Linux tests remain green.
Install mingw-w64-ucrt-x86_64-tzdata in the MSYS2 UCRT64 environment and resolve the IANA timezone data directory to a Windows-native path (cygpath -m $MSYSTEM_PREFIX/share/zoneinfo). Inject SYSTEMTZDIR into the MEOS cmake build via CMAKE_C_FLAGS as a bridge until MobilityDB issue #513 (meos-windows-bootstrap) merges to master. Also removes the per-step continue-on-error flags (the copy of libmeos.dll is no longer needed since cmake install puts it in bin/). Job remains non-blocking until CI confirms green end-to-end.
On Apple Silicon, Homebrew installs libraries to /opt/homebrew/lib, not /usr/local/lib. libmeos.dylib's dependencies (libgeos, libproj, libgsl, libjson-c) are in that prefix, so the dynamic linker could not find them even though libmeos.dylib itself was installed to /usr/local/lib. Set DYLD_LIBRARY_PATH=/usr/local/lib:$(brew --prefix)/lib so both the library itself and its transitive dependencies are on the search path.
jnr-ffi 2.2.17 fixes MethodTooLargeException on ARM64 macOS with Java 21 when the MEOS functions interface exceeds the JVM 64 KB class-initializer limit that older JNR-FFI versions triggered via JDK dynamic proxy generation. Windows CI: replace CMAKE_C_FLAGS quoting workaround with a direct -DMEOS_TZDATA_DIR cmake variable, which the meos-windows-bootstrap branch of MobilityDB supports cleanly. Switch the MobilityDB checkout to estebanzimanyi/MobilityDB@meos-windows-bootstrap until MobilityDB #513 merges to upstream master.
…tall time JarLibraryLoader passes DYLD_LIBRARY_PATH as a single string to JNR-FFI's .search(), which does new File(path, "libmeos.dylib"). A colon-separated value like "/usr/local/lib:/opt/homebrew/lib" is treated as one directory name, so the file lookup fails. Root cause of dependency failures: cmake strips build RPATH on install by default, so libmeos.dylib has no RPATH pointing to Homebrew's lib directory (/opt/homebrew/lib on Apple Silicon). The dynamic linker cannot find libgeos, libproj, libgsl, and libjson-c when loading the installed dylib. Fix: add -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON so cmake embeds the actual link-time library paths in the installed dylib's RPATH. Revert DYLD_LIBRARY_PATH to a plain single directory so JNR-FFI's file search resolves correctly.
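The lookup failure described above comes from treating a colon-separated value as one directory. The sketch below shows the difference between that and splitting the value first. Note that the commit itself takes a different route (it reverts DYLD_LIBRARY_PATH to a single directory and relies on RPATH); splitting is shown here only as an illustrative alternative, and the class and method names are hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class SearchPathSketch {
    /** Builds one candidate file per directory in a separator-delimited search path. */
    static List<File> candidates(String searchPath, String separator, String libFile) {
        List<File> out = new ArrayList<>();
        if (searchPath == null || searchPath.isEmpty()) {
            return out;
        }
        for (String dir : searchPath.split(Pattern.quote(separator))) {
            if (!dir.isEmpty()) {
                out.add(new File(dir, libFile));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String value = "/usr/local/lib:/opt/homebrew/lib";
        // Unsplit: the whole value becomes a single (non-existent) parent directory.
        System.out.println(new File(value, "libmeos.dylib"));
        // Split: each directory is probed on its own.
        for (File f : candidates(value, ":", "libmeos.dylib")) {
            System.out.println(f + " exists=" + f.isFile());
        }
    }
}
```

The separator is passed explicitly rather than taken from File.pathSeparator so the sketch behaves the same on every platform.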
Adds otool -L, RPATH inspection, dependency existence check, and python3 ctypes load test before the Maven test run so the actual dlopen error is visible in CI logs. To be removed once macOS loading is green.
Multiline python3 -c "..." inside a YAML block scalar fails when the Python code has less indentation than the block level — YAML ends the scalar early. Use a single-line python3 call instead.
…build
JMEOS-1.4's MeosLibrary declares geog_from_binary and tfloat_avg_value as non-optional symbols. The current MEOS master is missing both:
- tfloat_avg_value was renamed to tnumber_avg_value
- geog_from_binary is declared in meos_geo.h but never implemented standalone
JNR-FFI fails the entire library load (not just individual methods) when non-optional symbols are absent, which triggers createErrorProxy and the secondary MethodTooLargeException from JDK's dynamic proxy generator. Fix: append two backward-compat stubs to the relevant MEOS source files before the cmake build so both symbols are exported in libmeos.dylib.
… ARM64 JMEOS-1.4 exposes 1683 native methods via MeosLibrary. JNR-FFI's ASM bytecode generator packs all method stubs into a single <clinit>()V, which exceeds the JVM's 64KB limit on Apple Silicon (ARM64 stubs are larger than x86_64). Result: createErrorProxy fires even when libmeos.dylib loads successfully. Pass -Djnr.ffi.asm.enabled=false to surefire so JNR-FFI falls back to reflection-based stubs, which have no bytecode-size constraint. Also add nm symbol-export check to the diagnostic step to confirm tfloat_avg_value and geog_from_binary are exported from the built dylib.
The previous approach used -DargLine on the mvn command line, which completely replaces <argLine> in pom.xml and silently drops all the --add-opens flags needed by Spark internals. Move jnr.ffi.asm.enabled=false into <systemPropertyVariables> instead, which is independent of <argLine>. Both the JVM opens and the JNR-FFI reflection mode are now active on all platforms. Also gate the macOS libmeos.dylib diagnostic step on if: failure() so it does not add noise to every green run.
JarLibraryLoader reads LD_LIBRARY_PATH first in CI mode (GITHUB_WORKFLOW set), then falls back to DYLD_LIBRARY_PATH. On macOS, the JVM's hardened runtime strips DYLD_* environment variables, making the fallback invisible to System.getenv(). As a result, libraryPath was null, JNR-FFI searched only the default dyld paths, failed to find libmeos.dylib, and fell back to createErrorProxy, which hit the JVM 64KB method limit for the 1683-method MeosLibrary interface. Fix: export LD_LIBRARY_PATH=/usr/local/lib alongside DYLD_LIBRARY_PATH on macOS so JarLibraryLoader's CI-mode path is populated regardless of which env var the JVM strips. Revert jnr.ffi.asm.enabled=false from pom.xml: reflection mode also hits the 64KB limit (for the actual load proxy, not just the error proxy), so it breaks Linux CI which was green.
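The lookup order this commit message describes (LD_LIBRARY_PATH first, DYLD_LIBRARY_PATH as fallback, only when GITHUB_WORKFLOW is set) can be modelled in a few lines. This is a sketch of the described behaviour, not JarLibraryLoader's source: the class name is hypothetical, the environment is passed in as a map so the logic can be exercised without touching the real process environment, and returning null outside CI mode is an assumption made for the sketch.

```java
import java.util.Map;

public class LibraryPathSketch {
    /** Returns the directory to search for libmeos, or null if none is configured. */
    static String resolve(Map<String, String> env) {
        if (!env.containsKey("GITHUB_WORKFLOW")) {
            return null; // non-CI behaviour is not modelled here
        }
        String ld = env.get("LD_LIBRARY_PATH");
        if (ld != null && !ld.isEmpty()) {
            return ld;
        }
        String dyld = env.get("DYLD_LIBRARY_PATH");
        return (dyld != null && !dyld.isEmpty()) ? dyld : null;
    }

    public static void main(String[] args) {
        // What the loader sees on macOS once DYLD_* has been stripped.
        System.out.println(resolve(Map.of("GITHUB_WORKFLOW", "ci")));
        // What it sees after the fix also exports LD_LIBRARY_PATH.
        System.out.println(resolve(Map.of("GITHUB_WORKFLOW", "ci", "LD_LIBRARY_PATH", "/usr/local/lib")));
    }
}
```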
Two changes: 1. After cmake install, ad-hoc codesign the dylib so the JVM hardened runtime's library validation accepts it. Unsigned CMake-built dylibs can be rejected by processes that require library validation. 2. Add a diagnostic step (after compile so jnr-ffi is in .m2) that checks JVM entitlements, libmeos signature, and ctypes load with RTLD_LOCAL mode. This will tell us definitively whether library validation is the root cause.
JFFI's native library extraction fails in surefire-forked JVMs on macOS ARM64 (Apple Silicon), causing UnsatisfiedLinkError → createErrorProxy → MethodTooLargeException for the 1683-method MeosLibrary interface. Run tests in the Maven JVM itself (forkCount=0) to avoid the fork. MEOS global state is safe: meos_initialize is idempotent, meos_finalize is intentionally absent from teardown per the no-finalize-in-tests policy.
…lure JNR-FFI uses dlopen with RTLD_NOW|RTLD_GLOBAL; ctypes defaults to RTLD_LAZY. With RTLD_NOW all symbols must resolve immediately — if libmeos.dylib has any unresolved symbol the load fails. Add tests for all four mode combinations and print undefined symbols from libmeos.dylib to confirm the root cause. Also revert forkCount=0 test (confirmed not the issue — same error in Maven JVM).
…ylib The previous diagnostic step embedded Python code at column 0 inside a YAML block scalar, terminating the block prematurely and causing a YAML parse error that prevented all macOS CI steps from running. Fix: emit the Python via printf so every line stays at the required 10-space YAML indent (the script content becomes arguments to printf, not raw YAML content). Root cause of the underlying UnsatisfiedLinkError: the JVM's hardened runtime strips DYLD_LIBRARY_PATH, so when JNR-FFI's JFFI calls dlopen(libmeos.dylib, RTLD_NOW|RTLD_GLOBAL), the transitive Homebrew dependencies (libgeos, libproj, libgsl, libjson-c) installed under $(brew --prefix)/lib cannot be resolved, and dlopen fails immediately. Setting DYLD_LIBRARY_PATH=/usr/local/lib was correct for finding libmeos.dylib itself, but did not help its deps after DYLD stripping. Fix: add -DCMAKE_INSTALL_RPATH="$BREW_PREFIX/lib" to the cmake configure step so the installed libmeos.dylib carries an embedded LC_RPATH entry pointing at the Homebrew prefix. The dynamic linker then resolves deps via RPATH even without DYLD_LIBRARY_PATH. Also widen the DYLD_LIBRARY_PATH env export to include $BREW_PREFIX/lib for processes (like python3) whose hardened-runtime entitlements do allow DYLD vars.
…64 limit On macOS ARM64 (Apple Silicon) with Java 21, JNR-FFI 2.2.17's ASM-based proxy generator produces a <clinit>()V exceeding the JVM 64 KB method limit for the 1683-method JMEOS MeosLibrary interface. When that generation fails the fallback createErrorProxy() also fails with MethodTooLargeException (the error visible in CI logs). With jnr.ffi.asm.enabled=false JNR-FFI falls back to reflection mode, which builds the proxy via java.lang.reflect.InvocationHandler. The resulting <clinit> only stores Method references (~30 KB total) rather than JFFI dispatch stubs, so it stays under the JVM limit. Setting via MAVEN_OPTS (not surefire argLine) ensures the property reaches the Maven JVM itself, where tests execute when forkCount=0.
… issue)
Windows: meos-windows-bootstrap now exposes SIZEOF_LONG_LONG in the generated pg_config.h. On MSYS2/UCRT64 with GCC 16, sizeof(long)==4 so the SIZEOF_LONG==8 branch in pg_bitutils.h is not taken; the fallback to SIZEOF_LONG_LONG==8 requires the macro to be defined. ConfigurePgConfig.cmake already detects it (as SIZEOF_LONG_LONG_INT); the fix aliases it and writes it to pg_config.h.in.
macOS: JNR-FFI cannot generate a proxy for the 1683-method JMEOS functions interface on ARM64 Java 21: the JDK proxy generator's <clinit>()V exceeds the JVM 64 KB bytecode limit. libmeos.dylib itself loads correctly (Python ctypes confirms all RTLD modes succeed). The failure is in JNR-FFI's proxy generation and requires JMEOS to split its functions interface. Mark macOS continue-on-error until then.
…h steps Setting PATH from within the MSYS2 shell overwrites the Windows PATH in GITHUB_ENV with a POSIX-style path, causing Maven to be unfindable in subsequent PowerShell steps. Save only the DLL directory to a separate MEOS_DLL_DIR variable (using cygpath -w for a native Windows path) and prepend it in PowerShell explicitly.
JarLibraryLoader (JMEOS) in CI mode checks LD_LIBRARY_PATH (then DYLD_LIBRARY_PATH) and passes the value to jnr.ffi.LibraryLoader.search(); PATH is never consulted. On Windows neither env var was set, causing an ExceptionInInitializerError before any test ran. Fix: set LD_LIBRARY_PATH to the Windows-native meos-install/bin path so JNR-FFI can locate libmeos.dll. Also record UCRT64_BIN (the MSYS2 UCRT64 bin directory) and prepend it to PATH in the Unit tests PowerShell step so Windows can resolve libmeos.dll's transitive runtime dependencies (libgeos, libproj, libjson-c, libgsl).
Both Windows x86_64 and macOS ARM64 hit the same upstream JMEOS issue: JNR-FFI cannot generate a JDK proxy for the 1683-method functions interface (generated <clinit>()V exceeds the JVM 64 KB limit). Only Linux x86_64 is unaffected. Update comments to reflect this.
Both JNR-FFI ASM mode and JDK reflection (java.lang.reflect.Proxy) mode hit the JVM 64 KB <clinit>()V limit for the 1683-method JMEOS functions interface. The jnr.ffi.asm.enabled=false + forkCount=0 approach was a failed attempt. macOS and Windows remain non-blocking (continue-on-error) until JMEOS splits its interface. Also correct the pom.xml jnr-ffi comment.
JMEOS 1.5 splits the monolithic MeosLibrary JNR-FFI interface (1486+14 methods) into four ≤ 400-method private sub-interfaces so each proxy <clinit>()V stays well under the JVM 64 KB bytecode limit. This fixes MethodTooLargeException on macOS (ARM64) and Windows (x86_64), both previously marked continue-on-error.
Changes:
- libs/JMEOS-1.5.jar: rebuilt from MobilityDB/JMEOS with interface split + 14 MEOS 1.3 additions (geo_from_text, tpoint_trajectory/2-arg, meos_initialize/0-arg, meos_set_spatial_ref_sys_csv, geom_to_geog, tspatial_to_stbox, eintersects_tgeo_geo, nad_tgeo_tgeo, edwithin_tgeo_tgeo, econtains_geo_tgeo, tdwithin_tgeo_tgeo, adisjoint_tgeo_tgeo, geom_contains, tgeo_at_geom) + macOS/Windows JarLibraryLoader support
- pom.xml: reference JMEOS-1.5.jar
- .github/workflows/maven.yml:
  * macOS/Windows: remove continue-on-error (now fixed)
  * all platforms: build libmeos from MobilityDB v1.3.0 source
  * Linux: switch from bundled .so extraction to source build
  * macOS: remove JMEOS-1.4 compatibility patches
  * Windows: update bootstrap-branch comment (v1.3.0 base, tzdata patch)
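The sizing argument behind the split can be checked with back-of-the-envelope arithmetic. The sketch below uses the rough figure of ~50 bytes of initializer bytecode per method quoted in this PR's description; that figure is an estimate, evidently varies by platform (Linux x86_64 is reported as unaffected), and the class and method names are hypothetical. The only hard number is the JVM's limit of 65,535 bytes of code per method.

```java
public class ClinitBudgetSketch {
    // Maximum code length of a single JVM method, <clinit> included.
    static final int JVM_METHOD_LIMIT_BYTES = 65_535;

    /** Rough size of a static initializer that sets up every method of one interface. */
    static long estimatedBytes(int methods, int bytesPerMethod) {
        return (long) methods * bytesPerMethod;
    }

    static boolean fits(int methods, int bytesPerMethod) {
        return estimatedBytes(methods, bytesPerMethod) <= JVM_METHOD_LIMIT_BYTES;
    }

    /** Number of sub-interfaces needed to keep each one at or below the cap. */
    static int subInterfacesNeeded(int methods, int maxPerInterface) {
        return (methods + maxPerInterface - 1) / maxPerInterface; // ceiling division
    }

    public static void main(String[] args) {
        int perMethod = 50; // assumed estimate, not a measured value
        System.out.println("1683 methods: " + estimatedBytes(1683, perMethod)
                + " bytes, fits=" + fits(1683, perMethod));
        System.out.println("400 methods: " + estimatedBytes(400, perMethod)
                + " bytes, fits=" + fits(400, perMethod));
        System.out.println("sub-interfaces for 1500 methods at 400 each: "
                + subInterfacesNeeded(1486 + 14, 400));
    }
}
```

With a 400-method cap, the 1486+14 methods named in this commit work out to four sub-interfaces, which agrees with the split described above.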
estebanzimanyi added a commit to estebanzimanyi/MobilitySpark that referenced this pull request May 9, 2026
…rface
- Reflect new dependency chain: JMEOS #9 (JashanReel multi-module) → fix/multimodule-with-split-interface (split JNR-FFI + cleanup) → MobilitySpark MobilityDB#7
- Mark JMEOS #8 as recommended for closure (subsumed by #9, comment posted)
- Mark JMEOS #11 as superseded by the new multimodule integration branch
- Add integration branch table row (awaiting gh pr create on MobilityDB/JMEOS)
2 tasks
Summary
This PR wires up the three-platform Maven CI workflow and brings it to a stable, mergeable state:
- macOS: continue-on-error: true (non-blocking). JNR-FFI cannot generate a proxy class for JMEOS's 1683-method functions interface on Java 21: the generated <clinit>()V exceeds the JVM 64 KB bytecode limit (MethodTooLargeException). libmeos.dylib loads correctly (verified via Python ctypes); the failure is purely in JNR-FFI proxy generation. Fix requires JMEOS to split its functions interface (see discussion below).
- Windows: continue-on-error: true, same upstream JMEOS issue as macOS. The Windows job now builds libmeos.dll from source (via MSYS2 UCRT64 + CMake/Ninja), compiles MobilitySpark, and reaches the unit-test stage before hitting the same MethodTooLargeException.
Key fixes on this branch
- libmeos.dylib: LD_LIBRARY_PATH + DYLD_LIBRARY_PATH set for JarLibraryLoader
- tfloat_avg_value and geog_from_binary stubs injected before build for JMEOS-1.4 compatibility
- libmeos.dll build: estebanzimanyi/MobilityDB:meos-windows-bootstrap branch (adds -DMEOS_TZDATA_DIR support); SIZEOF_LONG_LONG alias added to pg_config.h so pg_bitutils.h compiles on GCC/LLP64
- PATH (POSIX-style) must not overwrite the Windows PATH (which contains Maven); fix uses MEOS_DLL_DIR=$(cygpath -w …) + explicit prepend in PowerShell steps
- JarLibraryLoader in CI mode requires LD_LIBRARY_PATH; set to the Windows-native meos-install\bin path so JNR-FFI's search() locates libmeos.dll; UCRT64_BIN added to PATH for transitive runtime DLLs (libgeos, libproj, libjson-c, libgsl)
- continue-on-error: true with accurate attribution comment; the root cause is architectural (upstream JMEOS), not a CI configuration issue
Upstream issue: JMEOS functions interface
JMEOS's functions interface has 1683 method declarations. JNR-FFI's ASM code generator places all dispatch initializers in a single <clinit>()V; at ~50 bytes per method this produces ~80–130 KB of bytecode, exceeding the JVM's hard 64 KB method limit. The JDK Proxy.newProxyInstance fallback (jnr.ffi.asm.enabled=false) hits the same limit. The fix requires JMEOS to split functions into sub-interfaces of ≤ ~400 methods each (e.g. one per MEOS module: temporal, geo, span, npoint, cbuffer). Linux x86_64 is unaffected with the current JNR-FFI 2.2.17.
Test plan
mobilityspark-spark.jar) downloads and contains MEOS + JNR-FFI