
fix: ensure extensions link correctly with libneug on Linux#5

Merged: shirly121 merged 2 commits into main from fix-gcov-extension on Mar 8, 2026


@longbinlai (Collaborator) commented Mar 7, 2026

Summary

Fixes the undefined symbol error when loading the JSON extension via dlopen at runtime.

Root cause

When Python imports neug_py_bind, it loads libneug.so with RTLD_LOCAL by default, keeping all neug symbols private to the importing module. When the JSON extension is later loaded via dlopen() (triggered by LOAD JSON), the dynamic linker cannot resolve neug symbols (e.g., InvalidArgumentException), resulting in undefined symbol errors.

Changes

  1. tools/python_bind/neug/__init__.py: Set RTLD_GLOBAL before importing neug_py_bind so that libneug.so symbols are globally visible to subsequent dlopen calls. This is a common pattern among Python packages that load plugins dynamically, such as PyTorch and TensorFlow.

  2. extension/CMakeLists.txt: Add --coverage compile/link flags to extension builds when ENABLE_GCOV is active, ensuring consistent instrumentation with the core library (fixes the push-to-main GCOV failure).

  3. extension/json/CMakeLists.txt: Remove redundant ${ARROW_LIB} from the json extension link. Arrow is already statically embedded in libneug.so and propagated transitively. Duplicating it causes ODR violations.
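
As an illustration of change 1, the following is a minimal sketch of the "set RTLD_GLOBAL before import" pattern, not the actual contents of tools/python_bind/neug/__init__.py. The helper names `global_dlopen_flags` and `import_with_global_symbols` are hypothetical; only `sys.getdlopenflags`, `sys.setdlopenflags`, and `os.RTLD_GLOBAL` are standard Python APIs (Unix only).

```python
import importlib
import os
import sys
from contextlib import contextmanager


@contextmanager
def global_dlopen_flags():
    """Temporarily add RTLD_GLOBAL to the flags Python passes to dlopen()."""
    saved = sys.getdlopenflags()
    sys.setdlopenflags(saved | os.RTLD_GLOBAL)
    try:
        yield
    finally:
        # Restore the previous flags so unrelated imports keep local scope.
        sys.setdlopenflags(saved)


def import_with_global_symbols(module_name):
    """Import a module while RTLD_GLOBAL is in effect (hypothetical helper)."""
    with global_dlopen_flags():
        return importlib.import_module(module_name)


# Assumed usage inside a package __init__.py:
# neug_py_bind = import_with_global_symbols("neug_py_bind")
```

Restoring the saved flags after the import limits the global-visibility side effect to the one native module that needs it.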

Test plan

  • PR CI passes: extension Python test resolves all symbols via RTLD_GLOBAL
  • Push-to-main CI passes: coverage-instrumented extension links correctly

When ENABLE_GCOV=ON (push to main), src/ targets are compiled with
--coverage but extension targets are not. This mismatch causes the
extension shared library to fail resolving symbols from the
coverage-instrumented libneug.so at runtime via dlopen.

Also remove the redundant ${ARROW_LIB} link from the json extension,
since it is already transitively provided through neug's PUBLIC
dependency.

Made-with: Cursor
@longbinlai longbinlai changed the title from "fix: add --coverage flags to extension builds when GCOV is enabled" to "fix: ensure extensions link correctly with libneug on Linux" on Mar 7, 2026
@longbinlai longbinlai force-pushed the fix-gcov-extension branch 3 times, most recently from ec810d2 to 0a1ad92, on March 7, 2026 15:10
When the host process (e.g. Python) loads libneug.so with RTLD_LOCAL,
neug symbols stay in a local scope and are invisible to extensions
loaded via dlopen, even though the extensions list libneug.so in
DT_NEEDED. This causes "undefined symbol" errors for neug symbols
like InvalidArgumentException.

Fix by re-opening libneug.so with RTLD_NOLOAD | RTLD_GLOBAL in the
load_extension() path. This promotes the already-loaded instance to
global visibility without reloading, allowing extensions to resolve
neug symbols. The promotion is done once, only when extensions are
actually loaded.
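
The promotion step can be sketched in Python with ctypes. This is an illustrative analogue only: the fix described above lives in the load_extension() path, and the helper name `promote_to_global` is hypothetical. It assumes a Linux platform where `os.RTLD_NOLOAD` is defined.

```python
import ctypes
import os


def promote_to_global(lib_name):
    """Re-open an already-loaded shared library with RTLD_GLOBAL.

    RTLD_NOLOAD makes dlopen() fail instead of loading the library when
    it is not already resident, so this never loads anything new.
    Returns True if the library was resident, False otherwise.
    """
    try:
        ctypes.CDLL(lib_name, mode=os.RTLD_NOLOAD | os.RTLD_GLOBAL)
    except OSError:
        # dlopen() returned NULL: the library is not loaded in this process.
        return False
    return True


# Assumed usage before loading an extension:
# promote_to_global("libneug.so")
```

Because RTLD_NOLOAD never maps a new library, calling the helper is safe even when the library was already global; the flags of an already-loaded object can only be widened, not narrowed.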

Also:
- Add --coverage flags for extension builds when ENABLE_GCOV is active
- Remove redundant ${ARROW_LIB} from json extension (avoids ODR with
  statically-linked Arrow already in libneug.so)
- Add diagnostic CI step for extension linkage debugging

Made-with: Cursor
@longbinlai longbinlai force-pushed the fix-gcov-extension branch from 0a1ad92 to 9f701ae on March 7, 2026 15:18
@longbinlai longbinlai requested a review from shirly121 March 8, 2026 03:58
@shirly121 shirly121 merged commit fedb5c5 into main Mar 8, 2026
15 of 16 checks passed
Louyk14 pushed a commit that referenced this pull request Mar 12, 2026
Introduce pybind and nexg_python_bind
@longbinlai longbinlai deleted the fix-gcov-extension branch March 17, 2026 09:54
BingqingLyu added a commit to BingqingLyu/neug that referenced this pull request Apr 8, 2026
- Fix empty endpoint override being ignored (comment #2)
  - Modified getOptionWithEnv to respect explicitly set empty values
  - Allows ENDPOINT_OVERRIDE="" to force default AWS S3 endpoint

- Fix HTTP ReadAt returning incorrect byte count (comment alibaba#3)
  - Changed ReadRange to return actual bytes read (Result<int64_t>)
  - Updated ReadAt to return actual bytes instead of requested nbytes
  - Handle zero-length reads and HTTP 416 responses
  - Use arrow::SliceBuffer for correct buffer sizing

- Fix HTTP timeout parsing uncaught exceptions (comment alibaba#4)
  - Added try-catch around std::stoi for timeout values
  - Added std::exception handler in OpenInputFile

- Fix s3_extension_test missing curl link library (comment alibaba#5)
  - Added CURL_LIBRARIES and CURL_INCLUDE_DIRS to test target