build: Filter rtapi module version scripts to defined symbols only#3925
build: Filter rtapi module version scripts to defined symbols only#3925grandixximo wants to merge 2 commits intoLinuxCNC:masterfrom
Conversation
b0bc902 to
2ece847
Compare
|
@BsAtHome please see if this lines up with your build direction. |
|
Tested on Gentoo with both GCC+BFD and Clang+LLD, cooling.hal and axis load perfectly. |
That quote is the real error. All components are supposed to export both rtapi_app_main() and rtapi_app_exit(). IMO, you should fix the components instead of the linker script. Just make an empty and add it like: void rtapi_app_exit() {} |
rtapi_app.h unconditionally declares EXPORT_SYMBOL for both
rtapi_app_main and rtapi_app_exit, so every module is expected to
provide both entry points. enum.c only implemented rtapi_app_main,
which slipped past GNU ld but trips lld 17+'s default
--no-undefined-version (LLVM D135402) when the generated version
script references the missing symbol:
ld.lld: error: version script assignment of 'global' to symbol
'rtapi_app_exit' failed: symbol not defined
Provide an empty rtapi_app_exit so enum.c satisfies the contract.
Builds clean with clang 19 / ld.lld 19 for a uspace configuration.
Fixes LinuxCNC#3191.
2ece847 to
387ccce
Compare
|
Confirmed, working with Clang+LLD. @andypugh I'd like to get this merged into |
|
I think the real problem here has been that the uspace run does not complain when rtapi_app_exit is missing. In rtapi/uspace_rtapi_app.cc:do_load_cmd() it is not checked that both rtapi_app_main and rtapi_app_exit. Only main is extracted and run. At unload in do_unload_cmd() it is ignored if the exit function is missing. I think it is necessary to fail hard in do_load_cmd() when the exit function is not present and refuse to load the component with the appropriate error message. |
Agreed, my thinking with the first attempt was misguided, I was implementing a work around to compile non-compliant components, the correct approach is refusing to load and force the component to be compliant. |
|
Right! Now, to be a bit pedantic... This: DLSYM<int(*)(void)>(module, "rtapi_app_exit")should probably read: DLSYM<void(*)(void)>(module, "rtapi_app_exit")(int -> void) It is also wrong in the original code (probably copy-paste). Not that it does matter (much) because the return value is not used anyway. However, the function rtapi_app_exit() is declared void and does not return a value. Therefore, indicating that it would is pure misdirection. Using the int return is correct for rtapi_app_main() because it has a return value. |
536145b to
d0224f4
Compare
@BsAtHome pedantic but entirely correct, the best kind. Fixed both casts... |
rtapi_app.h mandates that every component exports both rtapi_app_main and rtapi_app_exit, but do_load_cmd() only checked for main and do_unload_cmd() silently skipped the exit call when it was absent. That let enum.c slip into the tree without rtapi_app_exit for years (Fixes LinuxCNC#3191), visible only once lld 17+'s default --no-undefined-version made it a hard link error. Check for rtapi_app_exit in do_load_cmd() alongside rtapi_app_main and refuse to load the component otherwise, so future omissions surface immediately at load time with a clear error message. While here, also correct the dlsym cast for rtapi_app_exit in both do_load_cmd() and do_unload_cmd() from int(*)(void) to void(*)(void) to match the component's actual declaration. The int version was copy-pasted from the rtapi_app_main lookup and while harmless in practice (the return value was never read) it misrepresents the function signature.
d0224f4 to
d32fedc
Compare
rtapi_app.h unconditionally declares EXPORT_SYMBOL(rtapi_app_main) and EXPORT_SYMBOL(rtapi_app_exit), causing every module to list both names in its generated linker version script. Modules that only implement rtapi_app_main (e.g. hal/components/enum.c) then triggered lld's --no-undefined-version (default since LLD 17, see LLVM D135402):
Intersect the exported-name list with the actual defined symbols from nm -g --defined-only so the version script only mentions symbols that are really present in the linked object. Behavior under GNU ld is unchanged (the new list is a strict subset of the previous one), and LLD now accepts the build without needing --undefined-version.
Fixes #3191.