
Commit 778666d

jpoimboe authored and hca committed
s390: compile relocatable kernel without -fPIE
On s390, currently kernel uses the '-fPIE' compiler flag for compiling
vmlinux. This has a few problems:

  - It uses dynamic symbols (.dynsym), for which the linker refuses to
    allow more than 64k sections. This can break features which use
    '-ffunction-sections' and '-fdata-sections', including kpatch-build
    [1] and Function Granular KASLR.

  - It unnecessarily uses GOT relocations, adding an extra layer of
    indirection for many memory accesses.

Instead of using '-fPIE', resolve all the relocations at link time and
then manually adjust any absolute relocations (R_390_64) during boot.

This is done by first telling the linker to preserve all relocations
during the vmlinux link. (Note this is harmless: they are later
stripped in the vmlinux.bin link.)

Then use the 'relocs' tool to find all absolute relocations (R_390_64)
which apply to allocatable sections. The offsets of those relocations
are saved in a special section which is then used to adjust the
relocations during boot.

(Note: For some reason, Clang occasionally creates a GOT reference, even
without '-fPIE'. So Clang-compiled kernels have a GOT, which needs to
be adjusted.)

On my mostly-defconfig kernel, this reduces kernel text size by ~1.3%.

[1] dynup/kpatch#1284
[2] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622872.html
[3] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625986.html

Compiler consideration:

Gcc recently implemented an optimization [2] for loading symbols without
explicit alignment, aligning with the IBM Z ELF ABI. This ABI mandates
symbols to reside on a 2-byte boundary, enabling the use of the larl
instruction. However, kernel linker scripts may still generate unaligned
symbols. To address this, a new -munaligned-symbols option has been
introduced [3] in recent gcc versions. This option has to be used with
future gcc versions.

Older Clang lacks support for handling unaligned symbols generated
by kernel linker scripts when the kernel is built without -fPIE. However,
future versions of Clang will include support for the -munaligned-symbols
option. When the support is unavailable, compile the kernel with -fPIE
to maintain the existing behavior.

In addition to it:
move vmlinux.relocs to safe relocation

When the kernel is built with CONFIG_KERNEL_UNCOMPRESSED, the entire
uncompressed vmlinux.bin is positioned in the bzImage decompressor
image at the default kernel LMA of 0x100000, enabling it to be executed
in-place. However, the size of .vmlinux.relocs could be large enough to
cause an overlap with the uncompressed kernel at the address 0x100000.
To address this issue, .vmlinux.relocs is positioned after the
.rodata.compressed in the bzImage. Nevertheless, in this configuration,
vmlinux.relocs will overlap with the .bss section of vmlinux.bin. To
overcome that, move vmlinux.relocs to a safe location before clearing
.bss and handling relocs.

Compile warning fix from Sumanth Korikkar:

When kernel is built with CONFIG_LD_ORPHAN_WARN and -fno-PIE, there are
several warnings:

ld: warning: orphan section `.rela.iplt' from
`arch/s390/kernel/head64.o' being placed in section `.rela.dyn'
ld: warning: orphan section `.rela.head.text' from
`arch/s390/kernel/head64.o' being placed in section `.rela.dyn'
ld: warning: orphan section `.rela.init.text' from
`arch/s390/kernel/head64.o' being placed in section `.rela.dyn'
ld: warning: orphan section `.rela.rodata.cst8' from
`arch/s390/kernel/head64.o' being placed in section `.rela.dyn'

Orphan sections are sections that exist in an object file but don't have
a corresponding output section in the final executable. ld raises a
warning when it identifies such sections.

Eliminate the warning by placing all .rela orphan sections in .rela.dyn
and raise an error when size of .rela.dyn is greater than zero. i.e.
Don't just neglect orphan sections.

This is similar to adjustment performed in x86, where kernel is built
with -fno-PIE.
commit 5354e84 ("x86/build: Add asserts for unwanted sections")

[sumanthk@linux.ibm.com: rebased Josh Poimboeuf patches and move
 vmlinux.relocs to safe location]
[hca@linux.ibm.com: merged compile warning fix from Sumanth]
Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Link: https://lore.kernel.org/r/20240219132734.22881-4-sumanthk@linux.ibm.com
Link: https://lore.kernel.org/r/20240219132734.22881-5-sumanthk@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
1 parent 55dc65b commit 778666d
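
As a rough illustration of the boot-time fix-up described in the commit message, the following userspace sketch walks a zero-terminated table of 32-bit offsets and adds a load offset to each 64-bit slot it names. The function name `apply_abs64_relocs`, the buffer-relative offsets, and the `-1` error return are assumptions made for this example. The kernel code in this commit works on link-time addresses and calls `error()` instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical userspace model: 'image' stands in for the loaded kernel and
 * 'relocs' for the table of 32-bit offsets. Each offset names a 64-bit slot
 * inside the image that holds an absolute address.
 *
 * Returns the number of slots patched, or -1 as soon as an entry points
 * outside the image (slots patched before that stay patched).
 */
static int apply_abs64_relocs(unsigned char *image, size_t image_size,
			      const uint32_t *relocs, size_t nrelocs,
			      uint64_t delta)
{
	int patched = 0;

	assert(image != NULL && relocs != NULL);

	/* A zero entry terminates the table early, as in the boot loop. */
	for (size_t i = 0; i < nrelocs && relocs[i] != 0; i++) {
		uint64_t value;

		if ((size_t)relocs[i] + sizeof(value) > image_size)
			return -1;

		/* memcpy avoids unaligned 64-bit accesses in this model. */
		memcpy(&value, image + relocs[i], sizeof(value));
		value += delta;
		memcpy(image + relocs[i], &value, sizeof(value));
		patched++;
	}
	return patched;
}
```

With `delta` set to the difference between the randomized and the default load address, every absolute pointer stored in the image ends up pointing into the relocated copy.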

9 files changed: +145 -13 lines changed

arch/s390/Kconfig

Lines changed: 12 additions & 3 deletions
@@ -583,14 +583,23 @@ config RELOCATABLE
 	help
 	  This builds a kernel image that retains relocation information
 	  so it can be loaded at an arbitrary address.
-	  The kernel is linked as a position-independent executable (PIE)
-	  and contains dynamic relocations which are processed early in the
-	  bootup process.
 	  The relocations make the kernel image about 15% larger (compressed
 	  10%), but are discarded at runtime.
 	  Note: this option exists only for documentation purposes, please do
 	  not remove it.
 
+config PIE_BUILD
+	def_bool CC_IS_CLANG && !$(cc-option,-munaligned-symbols)
+	help
+	  If the compiler is unable to generate code that can manage unaligned
+	  symbols, the kernel is linked as a position-independent executable
+	  (PIE) and includes dynamic relocations that are processed early
+	  during bootup.
+
+	  For kpatch functionality, it is recommended to build the kernel
+	  without the PIE_BUILD option. PIE_BUILD is only enabled when the
+	  compiler lacks proper support for handling unaligned symbols.
+
 config RANDOMIZE_BASE
 	bool "Randomize the address of the kernel image (KASLR)"
 	default y

arch/s390/Makefile

Lines changed: 7 additions & 1 deletion
@@ -14,8 +14,14 @@ KBUILD_AFLAGS_MODULE += -fPIC
 KBUILD_CFLAGS_MODULE += -fPIC
 KBUILD_AFLAGS += -m64
 KBUILD_CFLAGS += -m64
+ifdef CONFIG_PIE_BUILD
 KBUILD_CFLAGS += -fPIE
 LDFLAGS_vmlinux := -pie -z notext
+else
+KBUILD_CFLAGS += $(call cc-option,-munaligned-symbols,)
+LDFLAGS_vmlinux := --emit-relocs --discard-none
+extra_tools := relocs
+endif
 aflags_dwarf := -Wa,-gdwarf-2
 KBUILD_AFLAGS_DECOMPRESSOR := $(CLANG_FLAGS) -m64 -D__ASSEMBLY__
 ifndef CONFIG_AS_IS_LLVM
@@ -143,7 +149,7 @@ archheaders:
 
 archprepare:
 	$(Q)$(MAKE) $(build)=$(syscalls) kapi
-	$(Q)$(MAKE) $(build)=$(tools) kapi
+	$(Q)$(MAKE) $(build)=$(tools) kapi $(extra_tools)
 ifeq ($(KBUILD_EXTMOD),)
 # We need to generate vdso-offsets.h before compiling certain files in kernel/.
 # In order to do that, we should use the archprepare target, but we can't since

arch/s390/boot/.gitignore

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 image
 bzImage
+relocs.S
 section_cmp.*
 vmlinux
 vmlinux.lds

arch/s390/boot/Makefile

Lines changed: 13 additions & 1 deletion
@@ -37,7 +37,8 @@ CFLAGS_sclp_early_core.o += -I$(srctree)/drivers/s390/char
 
 obj-y := head.o als.o startup.o physmem_info.o ipl_parm.o ipl_report.o vmem.o
 obj-y += string.o ebcdic.o sclp_early_core.o mem.o ipl_vmparm.o cmdline.o
-obj-y += version.o pgm_check_info.o ctype.o ipl_data.o machine_kexec_reloc.o
+obj-y += version.o pgm_check_info.o ctype.o ipl_data.o
+obj-y += $(if $(CONFIG_PIE_BUILD),machine_kexec_reloc.o,relocs.o)
 obj-$(findstring y, $(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) $(CONFIG_PGSTE)) += uv.o
 obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
 obj-y += $(if $(CONFIG_KERNEL_UNCOMPRESSED),,decompressor.o) info.o
@@ -48,6 +49,9 @@ targets := bzImage section_cmp.boot.data section_cmp.boot.preserved.data $(obj-y
 targets += vmlinux.lds vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2
 targets += vmlinux.bin.xz vmlinux.bin.lzma vmlinux.bin.lzo vmlinux.bin.lz4
 targets += vmlinux.bin.zst info.bin syms.bin vmlinux.syms $(obj-all)
+ifndef CONFIG_PIE_BUILD
+targets += relocs.S
+endif
 
 OBJECTS := $(addprefix $(obj)/,$(obj-y))
 OBJECTS_ALL := $(addprefix $(obj)/,$(obj-all))
@@ -106,6 +110,14 @@ OBJCOPYFLAGS_vmlinux.bin := -O binary --remove-section=.comment --remove-section
 $(obj)/vmlinux.bin: vmlinux FORCE
 	$(call if_changed,objcopy)
 
+ifndef CONFIG_PIE_BUILD
+CMD_RELOCS=arch/s390/tools/relocs
+quiet_cmd_relocs = RELOCS $@
+cmd_relocs = $(CMD_RELOCS) $< > $@
+$(obj)/relocs.S: vmlinux FORCE
+	$(call if_changed,relocs)
+endif
+
 suffix-$(CONFIG_KERNEL_GZIP) := .gz
 suffix-$(CONFIG_KERNEL_BZIP2) := .bz2
 suffix-$(CONFIG_KERNEL_LZ4) := .lz4

arch/s390/boot/boot.h

Lines changed: 6 additions & 0 deletions
@@ -25,9 +25,14 @@ struct vmlinux_info {
 	unsigned long bootdata_size;
 	unsigned long bootdata_preserved_off;
 	unsigned long bootdata_preserved_size;
+#ifdef CONFIG_PIE_BUILD
 	unsigned long dynsym_start;
 	unsigned long rela_dyn_start;
 	unsigned long rela_dyn_end;
+#else
+	unsigned long got_off;
+	unsigned long got_size;
+#endif
 	unsigned long amode31_size;
 	unsigned long init_mm_off;
 	unsigned long swapper_pg_dir_off;
@@ -83,6 +88,7 @@ extern unsigned long vmalloc_size;
 extern int vmalloc_size_set;
 extern char __boot_data_start[], __boot_data_end[];
 extern char __boot_data_preserved_start[], __boot_data_preserved_end[];
+extern int __vmlinux_relocs_64_start[], __vmlinux_relocs_64_end[];
 extern char _decompressor_syms_start[], _decompressor_syms_end[];
 extern char _stack_start[], _stack_end[];
 extern char _end[], _decompressor_end[];

arch/s390/boot/startup.c

Lines changed: 72 additions & 8 deletions
@@ -141,7 +141,8 @@ static void copy_bootdata(void)
 	memcpy((void *)vmlinux.bootdata_preserved_off, __boot_data_preserved_start, vmlinux.bootdata_preserved_size);
 }
 
-static void handle_relocs(unsigned long offset)
+#ifdef CONFIG_PIE_BUILD
+static void kaslr_adjust_relocs(unsigned long min_addr, unsigned long offset)
 {
 	Elf64_Rela *rela_start, *rela_end, *rela;
 	int r_type, r_sym, rc;
@@ -172,6 +173,62 @@ static void handle_relocs(unsigned long offset)
 	}
 }
 
+static void kaslr_adjust_got(unsigned long offset) {}
+static void rescue_relocs(void) {}
+static void free_relocs(void) {}
+#else
+int *vmlinux_relocs_64_start;
+int *vmlinux_relocs_64_end;
+
+static void rescue_relocs(void)
+{
+	unsigned long size, nrelocs;
+
+	nrelocs = __vmlinux_relocs_64_end - __vmlinux_relocs_64_start;
+	size = nrelocs * sizeof(uint32_t);
+	vmlinux_relocs_64_start = (void *)physmem_alloc_top_down(RR_RELOC, size, 0);
+	memmove(vmlinux_relocs_64_start, (void *)__vmlinux_relocs_64_start, size);
+	vmlinux_relocs_64_end = vmlinux_relocs_64_start + nrelocs;
+}
+
+static void free_relocs(void)
+{
+	physmem_free(RR_RELOC);
+}
+
+static void kaslr_adjust_relocs(unsigned long min_addr, unsigned long offset)
+{
+	int *reloc;
+	unsigned long max_addr = min_addr + vmlinux.image_size;
+	long loc;
+
+	/* Adjust R_390_64 relocations */
+	for (reloc = vmlinux_relocs_64_start;
+	     reloc < vmlinux_relocs_64_end && *reloc;
+	     reloc++) {
+		loc = (long)*reloc + offset;
+		if (loc < min_addr || loc > max_addr)
+			error("64-bit relocation outside of kernel!\n");
+		*(u64 *)loc += offset;
+	}
+}
+
+static void kaslr_adjust_got(unsigned long offset)
+{
+	u64 *entry;
+
+	/*
+	 * Even without -fPIE, Clang still uses a global offset table for some
+	 * reason. Adjust the GOT entries.
+	 */
+	for (entry = (u64 *)vmlinux.got_off;
+	     entry < (u64 *)(vmlinux.got_off + vmlinux.got_size);
+	     entry++) {
+		*entry += offset;
+	}
+}
+#endif
+
 /*
  * Merge information from several sources into a single ident_map_size value.
  * "ident_map_size" represents the upper limit of physical memory we may ever
@@ -299,14 +356,18 @@ static void setup_vmalloc_size(void)
 	vmalloc_size = max(size, vmalloc_size);
 }
 
-static void offset_vmlinux_info(unsigned long offset)
+static void kaslr_adjust_vmlinux_info(unsigned long offset)
 {
 	*(unsigned long *)(&vmlinux.entry) += offset;
 	vmlinux.bootdata_off += offset;
 	vmlinux.bootdata_preserved_off += offset;
+#ifdef CONFIG_PIE_BUILD
 	vmlinux.rela_dyn_start += offset;
 	vmlinux.rela_dyn_end += offset;
 	vmlinux.dynsym_start += offset;
+#else
+	vmlinux.got_off += offset;
+#endif
 	vmlinux.init_mm_off += offset;
 	vmlinux.swapper_pg_dir_off += offset;
 	vmlinux.invalid_pg_dir_off += offset;
@@ -361,14 +422,15 @@ void startup_kernel(void)
 	detect_physmem_online_ranges(max_physmem_end);
 	save_ipl_cert_comp_list();
 	rescue_initrd(safe_addr, ident_map_size);
+	rescue_relocs();
 
 	if (kaslr_enabled()) {
 		vmlinux_lma = randomize_within_range(vmlinux.image_size + vmlinux.bss_size,
 						     THREAD_SIZE, vmlinux.default_lma,
 						     ident_map_size);
 		if (vmlinux_lma) {
 			__kaslr_offset = vmlinux_lma - vmlinux.default_lma;
-			offset_vmlinux_info(__kaslr_offset);
+			kaslr_adjust_vmlinux_info(__kaslr_offset);
 		}
 	}
 	vmlinux_lma = vmlinux_lma ?: vmlinux.default_lma;
@@ -393,18 +455,20 @@ void startup_kernel(void)
 	/*
 	 * The order of the following operations is important:
 	 *
-	 * - handle_relocs() must follow clear_bss_section() to establish static
+	 * - kaslr_adjust_relocs() must follow clear_bss_section() to establish static
 	 *   memory references to data in .bss to be used by setup_vmem()
 	 *   (i.e init_mm.pgd)
 	 *
-	 * - setup_vmem() must follow handle_relocs() to be able using
+	 * - setup_vmem() must follow kaslr_adjust_relocs() to be able using
 	 *   static memory references to data in .bss (i.e init_mm.pgd)
 	 *
-	 * - copy_bootdata() must follow setup_vmem() to propagate changes to
-	 *   bootdata made by setup_vmem()
+	 * - copy_bootdata() must follow setup_vmem() to propagate changes
+	 *   to bootdata made by setup_vmem()
 	 */
 	clear_bss_section(vmlinux_lma);
-	handle_relocs(__kaslr_offset);
+	kaslr_adjust_relocs(vmlinux_lma, __kaslr_offset);
+	kaslr_adjust_got(__kaslr_offset);
+	free_relocs();
 	setup_vmem(asce_limit);
 	copy_bootdata();
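
The ordering in this file (copy the relocation table away, clear .bss, apply the relocations, release the copy) might be modelled in userspace roughly as below. This is only a sketch: `malloc()` and `free()` stand in for `physmem_alloc_top_down()` and `physmem_free()`, and `struct reloc_copy`, `rescue_table()` and `free_table()` are hypothetical names that do not appear in the commit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for the rescued copy of the relocation table. */
struct reloc_copy {
	uint32_t *start;
	uint32_t *end;
};

/*
 * Copy the table [src_start, src_end) into freshly allocated memory so that
 * the original storage may be overwritten afterwards (for example by a
 * .bss clear). Returns 0 on success, -1 if the allocation fails.
 */
static int rescue_table(struct reloc_copy *copy,
			const uint32_t *src_start, const uint32_t *src_end)
{
	size_t nrelocs, size;

	assert(src_end >= src_start);
	nrelocs = (size_t)(src_end - src_start);
	size = nrelocs * sizeof(uint32_t);

	/* Stand-in for the boot allocator; never request zero bytes. */
	copy->start = malloc(size ? size : 1);
	if (!copy->start)
		return -1;

	memmove(copy->start, src_start, size);
	copy->end = copy->start + nrelocs;
	return 0;
}

/* Release the copy once all relocations have been applied. */
static void free_table(struct reloc_copy *copy)
{
	free(copy->start);
	copy->start = NULL;
	copy->end = NULL;
}
```

A caller would invoke `rescue_table()` before zeroing the region that holds the original table, iterate from `copy.start` to `copy.end`, and call `free_table()` afterwards, mirroring the `rescue_relocs()` / `clear_bss_section()` / `kaslr_adjust_relocs()` / `free_relocs()` sequence.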

arch/s390/boot/vmlinux.lds.S

Lines changed: 18 additions & 0 deletions
@@ -110,6 +110,24 @@ SECTIONS
 		_compressed_end = .;
 	}
 
+#ifndef CONFIG_PIE_BUILD
+	/*
+	 * When the kernel is built with CONFIG_KERNEL_UNCOMPRESSED, the entire
+	 * uncompressed vmlinux.bin is positioned in the bzImage decompressor
+	 * image at the default kernel LMA of 0x100000, enabling it to be
+	 * executed in-place. However, the size of .vmlinux.relocs could be
+	 * large enough to cause an overlap with the uncompressed kernel at the
+	 * address 0x100000. To address this issue, .vmlinux.relocs is
+	 * positioned after the .rodata.compressed.
+	 */
+	. = ALIGN(4);
+	.vmlinux.relocs : {
+		__vmlinux_relocs_64_start = .;
+		*(.vmlinux.relocs_64)
+		__vmlinux_relocs_64_end = .;
+	}
+#endif
+
 #define SB_TRAILER_SIZE 32
 	/* Trailer needed for Secure Boot */
 	. += SB_TRAILER_SIZE; /* make sure .sb.trailer does not overwrite the previous section */
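
The placement reasoning in the linker-script comment comes down to an interval-overlap check. The sketch below uses a hypothetical helper, `ranges_overlap()`; only the default LMA of 0x100000 is taken from the source, and any sizes passed to it are illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEFAULT_KERNEL_LMA 0x100000ULL	/* value quoted in the comment above */

/*
 * Return true if the half-open ranges [a, a + a_size) and
 * [b, b + b_size) share at least one byte.
 */
static bool ranges_overlap(uint64_t a, uint64_t a_size,
			   uint64_t b, uint64_t b_size)
{
	/* The model assumes neither range wraps around the address space. */
	assert(a + a_size >= a && b + b_size >= b);
	return a < b + b_size && b < a + a_size;
}
```

For instance, a relocation table assumed to start at 0xF0000 with a size of 0x20000 would run past `DEFAULT_KERNEL_LMA` and collide with an in-place kernel, which is the situation the section placement after `.rodata.compressed` is meant to avoid.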

arch/s390/include/asm/physmem_info.h

Lines changed: 1 addition & 0 deletions
@@ -22,6 +22,7 @@ enum reserved_range_type {
 	RR_DECOMPRESSOR,
 	RR_INITRD,
 	RR_VMLINUX,
+	RR_RELOC,
 	RR_AMODE31,
 	RR_IPLREPORT,
 	RR_CERT_COMP_LIST,

arch/s390/kernel/vmlinux.lds.S

Lines changed: 15 additions & 0 deletions
@@ -63,7 +63,9 @@ SECTIONS
 		*(.data.rel.ro .data.rel.ro.*)
 	}
 	.got : {
+		__got_start = .;
 		*(.got)
+		__got_end = .;
 	}
 
 	. = ALIGN(PAGE_SIZE);
@@ -190,6 +192,7 @@ SECTIONS
 
 	PERCPU_SECTION(0x100)
 
+#ifdef CONFIG_PIE_BUILD
 	.dynsym ALIGN(8) : {
 		__dynsym_start = .;
 		*(.dynsym)
@@ -206,6 +209,7 @@ SECTIONS
 	.dynstr ALIGN(8) : {
 		*(.dynstr)
 	}
+#endif
 	.hash ALIGN(8) : {
 		*(.hash)
 	}
@@ -235,9 +239,14 @@ SECTIONS
 		QUAD(__boot_data_preserved_start) /* bootdata_preserved_off */
 		QUAD(__boot_data_preserved_end -
 		     __boot_data_preserved_start) /* bootdata_preserved_size */
+#ifdef CONFIG_PIE_BUILD
 		QUAD(__dynsym_start) /* dynsym_start */
 		QUAD(__rela_dyn_start) /* rela_dyn_start */
 		QUAD(__rela_dyn_end) /* rela_dyn_end */
+#else
+		QUAD(__got_start) /* got_off */
+		QUAD(__got_end - __got_start) /* got_size */
+#endif
 		QUAD(_eamode31 - _samode31) /* amode31_size */
 		QUAD(init_mm)
 		QUAD(swapper_pg_dir)
@@ -268,6 +277,12 @@ SECTIONS
 		*(.plt) *(.plt.*) *(.iplt) *(.igot .igot.plt)
 	}
 	ASSERT(SIZEOF(.plt) == 0, "Unexpected run-time procedure linkages detected!")
+#ifndef CONFIG_PIE_BUILD
+	.rela.dyn : {
+		*(.rela.*) *(.rela_*)
+	}
+	ASSERT(SIZEOF(.rela.dyn) == 0, "Unexpected run-time relocations (.rela) detected!")
+#endif
 
 	/* Sections to be discarded */
 	DISCARDS
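
The `got_off` / `got_size` pair emitted through `QUAD()` gives the boot code a byte range of 64-bit entries to walk. A userspace sketch of that walk, using a hypothetical `adjust_got()` helper that operates on an ordinary array rather than on kernel memory:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Add 'delta' to every 64-bit entry of a table that is 'size_bytes' long.
 * Returns the number of entries adjusted; an empty table is a no-op.
 */
static size_t adjust_got(uint64_t *got, size_t size_bytes, uint64_t delta)
{
	size_t nentries = size_bytes / sizeof(uint64_t);

	/* The model assumes the table holds whole 64-bit entries only. */
	assert(size_bytes % sizeof(uint64_t) == 0);

	for (size_t i = 0; i < nentries; i++)
		got[i] += delta;
	return nentries;
}
```

Because the size is derived from `__got_end - __got_start`, a kernel whose `.got` section is empty yields a size of zero and the loop body never runs.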
