Offloading with GCC fails as not all .gnu.offload_funcs sections are merged #1196
Comments
The assembled result of file1.c doesn't seem to contain Here is the full output of assembly: https://gist.github.com/rui314/43d92d355b419fb9d2310df05302500b |
Looks as if I mixed up the file names; try file-2.c. (It is the file that contains '#pragma target', as that is where the host code needs to invoke the device code, i.e. the code inside the block following the pragma is run on the device while the arguments to the pragma are processed on the host. The function name is that of the containing function (here "main" followed by
I pasted my .s file as a comment to your gist: https://gist.github.com/rui314/43d92d355b419fb9d2310df05302500b#gistcomment-4918366
Can you try again with the above commit?
Yes, I can confirm that it now works.
Followup to #1190 / #1188.
Summary: Device side now okay (thanks!) but required host-side data missing.
The single entry in the '.gnu.offload_funcs' section of 'file-1.c' (see #1190 for the testcase) is ignored, which leaves too few entries in the table and causes a run-time failure.
Namely:
Background: For offloading, the run-time library needs to map a host function to a device function. It does so by creating an array of host-function pointers on one side and an array of device functions on the other side. When the n-th host function is used, the run-time library calls the n-th device function.
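The index-based pairing described above can be sketched in plain C. This is only an illustration under stated assumptions: every `demo_*` name is hypothetical, the "device" functions are ordinary host functions standing in for device code, and the real run-time library works on opaque addresses rather than typed function pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature shared by all demo functions. */
typedef int (*demo_fn)(int);

/* Host versions. */
int demo_host_square(int x) { return x * x; }
int demo_host_negate(int x) { return -x; }

/* Stand-ins for device code; the 1000 offset only marks "ran on the device". */
int demo_device_square(int x) { return 1000 + x * x; }
int demo_device_negate(int x) { return 1000 - x; }

/* Two parallel tables: entry n on the host side pairs with entry n on the device side. */
const demo_fn demo_host_table[] = { demo_host_square, demo_host_negate };
const demo_fn demo_device_table[] = { demo_device_square, demo_device_negate };

/* Search the first n_host entries of the host table for host_fn and return the
   device function at the same index, or NULL if no such entry exists. */
demo_fn demo_lookup_device(demo_fn host_fn,
                           const demo_fn *host, size_t n_host,
                           const demo_fn *device, size_t n_device)
{
    assert(host != NULL && device != NULL);
    for (size_t i = 0; i < n_host; i++) {
        if (host[i] == host_fn)
            return i < n_device ? device[i] : NULL;
    }
    return NULL;
}
```

Calling `demo_lookup_device(demo_host_negate, demo_host_table, 2, demo_device_table, 2)` yields `demo_device_negate`. Passing `n_host` as 0, which models an empty host table, makes every lookup return NULL.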
The device-side code generation is handled via the LTO plugin as described in #1190 (now working) or at https://gcc.gnu.org/wiki/Offloading#Compilation_process – the latter also describes the following:
As the host side might be processed without LTO, the symbols are collected by writing them into special sections. GCC 14 has:
.gnu.offload_funcs (host → device function mapping)
.gnu.offload_vars (host → device global variable mapping)
.gnu.offload_ind_funcs (new since GCC 14: for device → host function mapping for some functions; used to permit passing host function pointers to the device and calling the device function there)
Those are constructed as follows:
(A) The compiler links crtoffloadbegin.o and crtoffloadend.o, which are created from https://github.com/gcc-mirror/gcc/blob/master/libgcc/offloadstuff.c. The files contain the following (only one section / array is shown here).
The first file contains (note the explicit setting of the section name):
The last file contains:
Additionally, there is crtoffloadtable.o, which shows how it is used.
(B) file-1.c (see #1190 for the example used) contains the following:
The OFFLOAD_TABLE is passed (for each offload target) as an argument to the run-time library, together with information about the device side. The number of entries is given by ((uintptr_t)&__offload_funcs_end - (uintptr_t)&__offload_func_table)/sizeof(void*).
Current run-time result with mold:
(The '2' instead of '1' is due to a separate loaded entry, i.e. '2' is correct but misleading.)
Thus, on the host side the '.gnu.offload_funcs' array is empty – missing the 'main._omp_fn.0' entry from 'file-1.s' – i.e. there is a problem with section merging.
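To make the effect of a dropped input section concrete, here is a small simulation in plain C. It is a sketch, not linker code: all `demo_*` names are hypothetical, the "sections" are ordinary arrays, and the merge step only imitates what a linker does when it concatenates same-named input sections in link order between the begin and end markers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One input file's contribution to a section: a run of pointer-sized entries. */
struct demo_contribution {
    const void *const *entries;
    size_t count;
};

/* Concatenate all contributions in link order into out (at most cap entries)
   and return how many entries were written. */
size_t demo_merge(const struct demo_contribution *inputs, size_t n_inputs,
                  const void **out, size_t cap)
{
    assert(out != NULL);
    size_t used = 0;
    for (size_t i = 0; i < n_inputs; i++)
        for (size_t j = 0; j < inputs[i].count && used < cap; j++)
            out[used++] = inputs[i].entries[j];
    return used;
}

/* Entry count computed from the begin and end addresses of the merged table,
   mirroring the (end - begin) / sizeof(void*) formula from the report. */
size_t demo_entry_count(const void *const *begin, const void *const *end)
{
    assert((uintptr_t)end >= (uintptr_t)begin);
    return ((uintptr_t)end - (uintptr_t)begin) / sizeof(void *);
}
```

Merging three contributions (an empty begin marker, one entry from the file with the offloaded function, an empty end marker) gives a count of 1. If the middle contribution is left out, the begin and end addresses coincide and the count is 0, which matches the empty host-side array described here.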