hermiticity issues with ARCH=um #1715

Open
nickdesaulniers opened this issue Sep 21, 2022 · 4 comments
Labels: [BUG] linux (A bug that should be fixed in the mainline kernel.), hermetic builds (has implications for doing gcc/binutils-free builds)

Comments

@nickdesaulniers
Member

In https://lore.kernel.org/lkml/20220921064855.2841607-1-davidgow@google.com/ there's a comment:

Note that this still doesn't seem to be working properly with make
LLVM=1.

I checked our CI and sure enough I can see warnings from /usr/bin/ld.
https://github.com/ClangBuiltLinux/continuous-integration2/actions/runs/3095779516/jobs/5012260390
That's unexpected for LLVM=1 builds.

nickdesaulniers added the [BUG] linux label Sep 21, 2022
@nathanchance
Member

This is because UML links the final kernel and vDSO with $(CC) instead of $(LD), so the compiler driver picks the platform's default linker, which is ld.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/link-vmlinux.sh#n78
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/um/vdso/Makefile#n63

We should be able to stick -fuse-ld=lld somewhere, in a similar vein to #774. Untangling the $(CC) linking might be a little difficult because UML is linked against user space libraries, so having the driver be the compiler makes sense.

@nickdesaulniers
Member Author

nickdesaulniers commented Sep 21, 2022

because UML is linked against user space libraries, so having the driver be the compiler makes sense.

That shouldn't be a problem; you can pass -### to the compiler to see precisely how it invokes the linker, and so how the various compiler flags translate to linker flags.
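
As an illustration of the -### suggestion, here is a minimal sketch of a shell helper that prints only the last command the driver would run, which for a link step should be the linker invocation. The function name show_link_cmd and the clang default for CC are assumptions of this sketch, and it relies on clang's habit of printing each command to stderr on a line that starts with a space and a double quote.

```shell
# Print the final command the compiler driver would run for a link step.
# Assumption: the driver follows clang's -### format, i.e. commands go to
# stderr, one per line, each starting with a space and a double quote.
# CC defaults to clang here; that default is a choice made for this sketch.
show_link_cmd() {
    "${CC:-clang}" -### "$@" 2>&1 | grep '^ "' | tail -n 1
}
```

A call such as `show_link_cmd -fuse-ld=lld -no-pie foo.o -o foo` (where foo.o stands for any existing object file) can then be compared against the same call without -fuse-ld=lld to see which linker binary and flags the driver selects.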

@sulix
Member

sulix commented Sep 22, 2022

I tried this with -fuse-ld=lld and am getting a bunch of nasty errors from UML, largely to do with relocations and PIE. (This is not dissimilar to some of the issues we're seeing with Rust as well: Rust-for-Linux#881.)

So it looks like the hermeticity issues with $(LD) are only part of the problem here; lld is presumably a bit stricter than ld about what it accepts.

Details below:

Patch
diff --git a/arch/um/Makefile b/arch/um/Makefile
index f1d4d67157be..01d9eae736be 100644
--- a/arch/um/Makefile
+++ b/arch/um/Makefile
@@ -139,6 +139,9 @@ ifeq ($(CONFIG_LD_IS_BFD),y)
 LDFLAGS_EXECSTACK += $(call ld-option,--no-warn-rwx-segments)
 endif
 
+# Since we're using CC as the driver, we need to force LLD if it is requested.
+LINK-$(CONFIG_LD_IS_LLD) += $(call cc-option,--ld-path=$(LD),-fuse-ld=lld)
+
 LD_FLAGS_CMDLINE = $(foreach opt,$(KBUILD_LDFLAGS),-Wl,$(opt))
 
 # Used by link-vmlinux.sh which has special support for um link
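
The cc-option call in the patch tries --ld-path=$(LD) first and falls back to -fuse-ld=lld when the compiler rejects it. A rough shell sketch of that probe-then-fallback pattern follows; the function name cc_option and the clang default for CC are assumptions of this sketch, and the real Kbuild helper also passes the kernel's own compiler flags during the probe.

```shell
# Rough shell equivalent of a cc-option style probe: print the first flag
# if the compiler accepts it, otherwise print the fallback.
# Assumption: CC names the compiler to probe and defaults to clang.
cc_option() {
    tmp=$(mktemp) || return 1
    # Compile an empty C input with the candidate flag; -Werror turns
    # "unknown option" warnings into failures so the probe is reliable.
    if "${CC:-clang}" -Werror "$1" -c -x c /dev/null -o "$tmp" >/dev/null 2>&1; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$2"
    fi
    rm -f "$tmp"
}
```

For example, `cc_option "--ld-path=$(command -v ld.lld)" -fuse-ld=lld` would print whichever of the two flags the probed compiler accepts.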
Errors (CONFIG_STATIC_LINK=n)
ERROR:root:ld.lld: error: relocation R_X86_64_64 cannot be used against symbol '__init_begin'; recompile with -fPIC
>>> defined in ./arch/um/kernel/vmlinux.lds:170
>>> referenced by main.c
>>>               main.o:(.text+0x2) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against symbol '__init_end'; recompile with -fPIC
>>> defined in ./arch/um/kernel/vmlinux.lds:172
>>> referenced by main.c
>>>               main.o:(.text+0xC) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(.text+0x16) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(run_init_process) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(run_init_process) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(run_init_process) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(run_init_process) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(run_init_process) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(run_init_process) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(run_init_process) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(warn_bootconfig) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(init_setup) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(init_setup) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(rdinit_setup) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(rdinit_setup) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(parse_early_options) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(parse_early_options) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against symbol '__setup_end'; recompile with -fPIC
>>> referenced by main.c
>>>               main.o:(do_early_param) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against symbol '__setup_start'; recompile with -fPIC
>>> referenced by main.c
>>>               main.o:(do_early_param) in archive init/built-in.a

ld.lld: error: relocation R_X86_64_64 cannot be used against local symbol; recompile with -fPIC
>>> defined in init/built-in.a(main.o)
>>> referenced by main.c
>>>               main.o:(do_early_param) in archive init/built-in.a

ld.lld: error: too many errors emitted, stopping now (use -error-limit=0 to see all errors)
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Errors (CONFIG_STATIC_LINK=y)
ERROR:root:ld.lld: error: section: .got is not contiguous with other relro sections
ld.lld: error: section: .tdata is not contiguous with other relro sections
clang: error: linker command failed with exit code 1 (use -v to see invocation)
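
Because the relocation errors in the CONFIG_STATIC_LINK=n log are highly repetitive, it can help to tally them by the function or section that holds the offending reference. The following is a minimal sketch, assuming the log keeps the ">>> object:(location) in archive ..." layout shown above; the function name count_reloc_refs is hypothetical.

```shell
# Count lld relocation errors per referencing function or section.
# Reads a link log on stdin and prints "count location" lines.
# Assumption: reference lines look like
#   >>>   main.o:(run_init_process) in archive init/built-in.a
count_reloc_refs() {
    sed -n 's/^>>> *[^:(]*:(\([^)]*\)).*/\1/p' | sort | uniq -c | awk '{print $1, $2}'
}
```

Feeding it a saved link log (for example `count_reloc_refs < link.log`, where link.log is a hypothetical file name) gives a quick view of which functions account for most of the errors.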

@nathanchance
Member

With the -no-pie change in #1982, plus linking with -Wl,-z,norelro to avoid the "section: ... is not contiguous with other relro sections" error, I can successfully link a kernel.

diff --git a/arch/um/Makefile b/arch/um/Makefile
index 34957dcb88b9..dd42248dead8 100644
--- a/arch/um/Makefile
+++ b/arch/um/Makefile
@@ -120,6 +120,9 @@ LINK-$(call gcc-min-version, 60100)$(CONFIG_CC_IS_CLANG) += -no-pie
 endif
 LINK-$(CONFIG_LD_SCRIPT_DYN_RPATH) += -Wl,-rpath,/lib

+# Since we're using CC as the driver, we need to force LLD if it is requested.
+LINK-$(CONFIG_LD_IS_LLD) += $(call cc-option,--ld-path=$(LD),-fuse-ld=lld) -Wl,-z,norelro
+
 CFLAGS_NO_HARDENING := $(call cc-option, -fno-PIC,) $(call cc-option, -fno-pic,) \
        -fno-stack-protector $(call cc-option, -fno-stack-protector-all)
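
The -Wl, prefix is what lets a linker-only option such as -z norelro travel through the compiler driver. Below is a minimal sketch of the per-flag wrapping that arch/um/Makefile applies to KBUILD_LDFLAGS via $(foreach ...) for LD_FLAGS_CMDLINE, written as a shell function; the name wrap_ldflags is hypothetical and not part of Kbuild.

```shell
# Prefix each linker flag with -Wl, so that a compiler driver forwards it
# to the linker instead of interpreting it itself.
wrap_ldflags() {
    out=
    for opt in "$@"; do
        # Append with a separating space only when out is already non-empty.
        out="${out:+$out }-Wl,$opt"
    done
    printf '%s\n' "$out"
}
```

Calling `wrap_ldflags -z norelro` yields `-Wl,-z -Wl,norelro`, which the driver should hand to the linker the same way as the single comma-separated `-Wl,-z,norelro` used in the patch.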

Unfortunately, that kernel does not boot for me, at least with clang-17 and clang-18:

$ boot-uml.py -k .
...
<5>Linux version 6.7.0-00001-g3e57d05b0b07-dirty (nathan@dev-arch.thelio-3990X) (ClangBuiltLinux clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18), ClangBuiltLinux LLD 17.0.6) #1 Wed Jan 17 17:36:03 MST 2024
...
<5>Sorting __ex_table...
<6>Built 1 zonelists, mobility grouping on.  Total pages: 17182
<6>mem auto-init: stack:all(zero), heap alloc:off, heap free:off
<6>Memory: 57636K/69684K available (3949K kernel code, 1098K rwdata, 1192K rodata, 162K init, 225K bss, 12048K reserved, 0K cma-reserved)
<6>SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
<6>NR_IRQS: 64
<6>clocksource: timer: mask: 0xffffffffffffffff max_cycles: 0x1cd42e205, max_idle_ns: 881590404426 ns
<6>Calibrating delay loop... 8293.58 BogoMIPS (lpj=41467904)
<6>Checking that host ptys support output SIGIO...Yes
<6>pid_max: default: 32768 minimum: 301
<4>
<4>Modules linked in:
<6>Pid: 0, comm: swapper Not tainted 6.7.0-00001-g3e57d05b0b07-dirty
<6>RIP: 0033:net_ns_init+0x38/0x1ae
<6>RSP: 000000006053bf30  EFLAGS: 00010202
<6>RAX: 0000000060c02c00 RBX: 0000000060653c9c RCX: 00000000604d2038
<6>RDX: 0000000060c02ba0 RSI: 0000000000000000 RDI: 0000000060567ec0
<6>RBP: 000000006053bf40 R08: 00000000000001bc R09: 0000000000000000
<6>R10: 00000000643046f0 R11: 00000000000001c4 R12: 0000000000000000
<6>R13: 000000006005eb9b R14: 0000000060645f95 R15: 00000000603f095e
<0>Kernel panic - not syncing: Segfault with no mm
<4>CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-00001-g3e57d05b0b07-dirty #1
<4>Stack:
<4> 600191f4 60653c9c 6053bf80 600016a1
<4> 60531000 6053bfb0 00000000 00000000
<4> 00000000 ffffffffffffc000 6053bfa0 600040e5
<4>Call Trace:
<4> [<600191f4>] ? net_ns_init+0x0/0x1ae
<4> [<600016a1>] start_kernel+0x3c4/0x4c5
<4> [<600040e5>] start_kernel_proc+0x50/0x58
<4> [<600238b6>] new_thread_handler+0x9a/0xd1
<4> [<60026822>] uml_finishsetup+0x55/0x5a
...

3 participants