Skip to content

build: Filter rtapi module version scripts to defined symbols only#3925

Open
grandixximo wants to merge 2 commits intoLinuxCNC:masterfrom
grandixximo:fix-lld-version-script-undefined-symbols
Open

build: Filter rtapi module version scripts to defined symbols only#3925
grandixximo wants to merge 2 commits intoLinuxCNC:masterfrom
grandixximo:fix-lld-version-script-undefined-symbols

Conversation

@grandixximo
Copy link
Copy Markdown

rtapi_app.h unconditionally declares EXPORT_SYMBOL(rtapi_app_main) and EXPORT_SYMBOL(rtapi_app_exit), causing every module to list both names in its generated linker version script. Modules that only implement rtapi_app_main (e.g. hal/components/enum.c) then triggered lld's --no-undefined-version (default since LLD 17, see LLVM D135402):

ld.lld: error: version script assignment of 'global' to symbol
'rtapi_app_exit' failed: symbol not defined

Intersect the exported-name list with the actual defined symbols from nm -g --defined-only so the version script only mentions symbols that are really present in the linked object. Behavior under GNU ld is unchanged (the new list is a strict subset of the previous one), and LLD now accepts the build without needing --undefined-version.

Fixes #3191.

@grandixximo grandixximo force-pushed the fix-lld-version-script-undefined-symbols branch from b0bc902 to 2ece847 Compare April 11, 2026 06:18
@grandixximo
Copy link
Copy Markdown
Author

@BsAtHome please see if this lines up with your build direction.

@NTULINUX NTULINUX mentioned this pull request Apr 11, 2026
@NTULINUX
Copy link
Copy Markdown
Contributor

Tested on Gentoo with both GCC+BFD and Clang+LLD, cooling.hal and axis load perfectly.

@BsAtHome
Copy link
Copy Markdown
Contributor

Modules that only implement rtapi_app_main

That quote is the real error.

All components are supposed to export both rtapi_app_main() and rtapi_app_exit(). IMO, you should fix the components instead of the linker script. Just make an empty and add it like:

void rtapi_app_exit() {}

rtapi_app.h unconditionally declares EXPORT_SYMBOL for both
rtapi_app_main and rtapi_app_exit, so every module is expected to
provide both entry points. enum.c only implemented rtapi_app_main,
which slipped past GNU ld but trips lld 17+'s default
--no-undefined-version (LLVM D135402) when the generated version
script references the missing symbol:

    ld.lld: error: version script assignment of 'global' to symbol
    'rtapi_app_exit' failed: symbol not defined

Provide an empty rtapi_app_exit so enum.c satisfies the contract.
Builds clean with clang 19 / ld.lld 19 for a uspace configuration.

Fixes LinuxCNC#3191.
@grandixximo grandixximo force-pushed the fix-lld-version-script-undefined-symbols branch from 2ece847 to 387ccce Compare April 11, 2026 08:07
@grandixximo
Copy link
Copy Markdown
Author

@BsAtHome you're right, this is the better fix. enum.tmp is the only object with rtapi_app_main but no rtapi_app_exit I was able to find.

@NTULINUX could you pull this and confirm it builds and runs on your Gentoo setup?

@NTULINUX
Copy link
Copy Markdown
Contributor

NTULINUX commented Apr 11, 2026

Confirmed, working with Clang+LLD. @andypugh I'd like to get this merged into 2.9 as well since it impossible to build that branch with a full, recent LLVM toolchain.

@BsAtHome
Copy link
Copy Markdown
Contributor

I think the real problem here has been that the uspace run does not complain when rtapi_app_exit is missing.

In rtapi/uspace_rtapi_app.cc:do_load_cmd() it is not checked that both rtapi_app_main and rtapi_app_exit. Only main is extracted and run. At unload in do_unload_cmd() it is ignored if the exit function is missing.

I think it is necessary to fail hard in do_load_cmd() when the exit function is not present and refuse to load the component with the appropriate error message.

@grandixximo
Copy link
Copy Markdown
Author

grandixximo commented Apr 11, 2026

I think the real problem here has been that the uspace run does not complain when rtapi_app_exit is missing.

In rtapi/uspace_rtapi_app.cc:do_load_cmd() it is not checked that both rtapi_app_main and rtapi_app_exit. Only main is extracted and run. At unload in do_unload_cmd() it is ignored if the exit function is missing.

I think it is necessary to fail hard in do_load_cmd() when the exit function is not present and refuse to load the component with the appropriate error message.

Agreed, my thinking with the first attempt was misguided, I was implementing a work around to compile non-compliant components, the correct approach is refusing to load and force the component to be compliant.

@BsAtHome
Copy link
Copy Markdown
Contributor

Right! Now, to be a bit pedantic...

This:

DLSYM<int(*)(void)>(module, "rtapi_app_exit")

should probably read:

DLSYM<void(*)(void)>(module, "rtapi_app_exit")

(int -> void) It is also wrong in the original code (probably copy-paste).

Not that it does matter (much) because the return value is not used anyway. However, the function rtapi_app_exit() is declared void and does not return a value. Therefore, indicating that it would is pure misdirection. Using the int return is correct for rtapi_app_main() because it has a return value.

@grandixximo grandixximo force-pushed the fix-lld-version-script-undefined-symbols branch from 536145b to d0224f4 Compare April 11, 2026 10:47
@grandixximo
Copy link
Copy Markdown
Author

Right! Now, to be a bit pedantic...

@BsAtHome pedantic but entirely correct, the best kind.

Fixed both casts...

rtapi_app.h mandates that every component exports both rtapi_app_main
and rtapi_app_exit, but do_load_cmd() only checked for main and
do_unload_cmd() silently skipped the exit call when it was absent.
That let enum.c slip into the tree without rtapi_app_exit for years
(Fixes LinuxCNC#3191), visible only once lld 17+'s default
--no-undefined-version made it a hard link error.

Check for rtapi_app_exit in do_load_cmd() alongside rtapi_app_main
and refuse to load the component otherwise, so future omissions
surface immediately at load time with a clear error message.

While here, also correct the dlsym cast for rtapi_app_exit in both
do_load_cmd() and do_unload_cmd() from int(*)(void) to void(*)(void)
to match the component's actual declaration. The int version was
copy-pasted from the rtapi_app_main lookup and while harmless in
practice (the return value was never read) it misrepresents the
function signature.
@grandixximo grandixximo force-pushed the fix-lld-version-script-undefined-symbols branch from d0224f4 to d32fedc Compare April 11, 2026 11:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ld.lld: error: version script assignment of 'global' to symbol 'rtapi_app_exit' failed: symbol not defined

3 participants