Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Binutils] Generate invalid zero words in function #41

Closed
jiegec opened this issue Feb 3, 2024 · 15 comments
Closed

[Binutils] Generate invalid zero words in function #41

jiegec opened this issue Feb 3, 2024 · 15 comments

Comments

@jiegec
Copy link

jiegec commented Feb 3, 2024

Compile and link the following code:

#include <stdio.h>
__attribute__((noinline)) int c() {
        printf("in c\n");
}
__attribute__((noinline)) int d() {
        printf("in d\n");
}
int test(int a, int b) {
        if (a == 0 && b > 0)
                c();
        printf("hello\n");
        d();
}

int main() {
        int a, b;
        scanf("%d%d", &a, &b);
        return test(a, b);
}

Using binutils 2.42 and gcc 13.2.0 from AOSC OS:

gcc -O2 test.c -o test

Generates invalid 0x00000000 instruction in test:

0000000000000980 <test>:
 980:   02ffc063        addi.d          $sp, $sp, -16
 984:   29c02061        st.d            $ra, $sp, 8
 988:   44000c80        bnez            $a0, 12 # 994 <test+0x14>
 98c:   60001c05        bgtz            $a1, 28 # 9a8 <test+0x28>
 990:   00000000        .word           0x00000000
 994:   18000224        pcaddi          $a0, 17
 998:   57fdabff        bl              -600    # 740 <puts@plt>
 99c:   28c02061        ld.d            $ra, $sp, 8
 9a0:   02c04063        addi.d          $sp, $sp, 16
 9a4:   53ffbfff        b               -68     # 960 <d>
 9a8:   57ff9bff        bl              -104    # 940 <c>
 9ac:   18000164        pcaddi          $a0, 11
 9b0:   57fd93ff        bl              -624    # 740 <puts@plt>
 9b4:   28c02061        ld.d            $ra, $sp, 8
 9b8:   02c04063        addi.d          $sp, $sp, 16
 9bc:   53ffa7ff        b               -92     # 960 <d>

Which can lead to SIGILL:

$ ./test
0 0
fish: Job 1, './test' terminated by signal SIGILL (Illegal instruction)

CC @xry111

@xry111
Copy link
Member

xry111 commented Feb 3, 2024

I cannot reproduce it. Note that AOSC is applying a bunch of patches onto GCC 13 (and maybe Binutils, I've not checked their Binutils repo).

With vanilla GCC 13.2.0 and -O2:

0000000000000910 <test>:
 910:	02ffc063 	addi.d      	$sp, $sp, -16
 914:	29c02061 	st.d        	$ra, $sp, 8
 918:	44000880 	bnez        	$a0, 8	# 920 <test+0x10>
 91c:	60001c05 	bgtz        	$a1, 28	# 938 <test+0x28>
 920:	1a000024 	pcalau12i   	$a0, 1
 924:	02e5c084 	addi.d      	$a0, $a0, -1680
 928:	57fdabff 	bl          	-600	# 6d0 <puts@plt>
 92c:	28c02061 	ld.d        	$ra, $sp, 8
 930:	02c04063 	addi.d      	$sp, $sp, 16
 934:	53ffd3ff 	b           	-48	# 904 <d>
 938:	57ffc3ff 	bl          	-64	# 8f8 <c>
 93c:	1a000024 	pcalau12i   	$a0, 1
 940:	02e5c084 	addi.d      	$a0, $a0, -1680
 944:	57fd8fff 	bl          	-628	# 6d0 <puts@plt>
 948:	28c02061 	ld.d        	$ra, $sp, 8
 94c:	02c04063 	addi.d      	$sp, $sp, 16
 950:	53ffb7ff 	b           	-76	# 904 <d>

With vanilla GCC 13.2.0 and -O2 -mno-explicit-relocs:

 908:	02ffc063 	addi.d      	$sp, $sp, -16
 90c:	29c02061 	st.d        	$ra, $sp, 8
 910:	44000880 	bnez        	$a0, 8	# 918 <test+0x10>
 914:	60001805 	bgtz        	$a1, 24	# 92c <test+0x24>
 918:	18000244 	pcaddi      	$a0, 18
 91c:	57fdb7ff 	bl          	-588	# 6d0 <puts@plt>
 920:	28c02061 	ld.d        	$ra, $sp, 8
 924:	02c04063 	addi.d      	$sp, $sp, 16
 928:	53ffdbff 	b           	-40	# 900 <d>
 92c:	57ffcfff 	bl          	-52	# 8f8 <c>
 930:	18000184 	pcaddi      	$a0, 12
 934:	57fd9fff 	bl          	-612	# 6d0 <puts@plt>
 938:	28c02061 	ld.d        	$ra, $sp, 8
 93c:	02c04063 	addi.d      	$sp, $sp, 16
 940:	53ffc3ff 	b           	-64	# 900 <d>

With GCC 14 (r14-8422 + some patches I've mentioned at loongson-community/areweloongyet#16 (comment)):

 960:	02ffc063 	addi.d      	$sp, $sp, -16
 964:	29c02061 	st.d        	$ra, $sp, 8
 968:	44000880 	bnez        	$a0, 8	# 970 <test+0x10>
 96c:	60002405 	bgtz        	$a1, 36	# 990 <test+0x30>
 970:	18000284 	pcaddi      	$a0, 20
 974:	57fd5fff 	bl          	-676	# 6d0 <puts@plt>
 978:	28c02061 	ld.d        	$ra, $sp, 8
 97c:	02c04063 	addi.d      	$sp, $sp, 16
 980:	53ffc3ff 	b           	-64	# 940 <d>
 984:	03400000 	nop
 988:	03400000 	nop
 98c:	03400000 	nop
 990:	57ff93ff 	bl          	-112	# 920 <c>
 994:	18000164 	pcaddi      	$a0, 11
 998:	57fd3bff 	bl          	-712	# 6d0 <puts@plt>
 99c:	28c02061 	ld.d        	$ra, $sp, 8
 9a0:	02c04063 	addi.d      	$sp, $sp, 16
 9a4:	53ff9fff 	b           	-100	# 940 <d>

Is there an assembly output from AOSC GCC?

@xen0n
Copy link
Member

xen0n commented Feb 3, 2024

Cannot reproduce with almost-up-to-date snapshot of gcc and binutils-2.42 on Gentoo:

$ gcc --version
gcc (Gentoo 14.0.1_pre20240128 p17) 14.0.1 20240128 (experimental)
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ ld --version
GNU ld (Gentoo 2.42 p1) 2.42.0
Copyright (C) 2024 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.
$ gcc -O2 issue41.c -o issue41
issue41.c: In function ‘main’:
issue41.c:17:9: warning: ignoring return value of ‘scanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
   17 |         scanf("%d%d", &a, &b);
      |         ^~~~~~~~~~~~~~~~~~~~~
$ echo 1 2 | ./issue41 
hello
in d
$ echo $?
5
$ gcc -O2 issue41.c -c -o issue41.o
issue41.c: In function ‘main’:
issue41.c:17:9: warning: ignoring return value of ‘scanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
   17 |         scanf("%d%d", &a, &b);
      |         ^~~~~~~~~~~~~~~~~~~~~
$ objdump -dwr issue41.o

issue41.o:     file format elf64-loongarch


Disassembly of section .text:

0000000000000000 <.Lla-relax-align>:
   0:	03400000 	nop	0: R_LARCH_ALIGN	.Lla-relax-align+0x5
   4:	03400000 	nop
   8:	03400000 	nop
   c:	03400000 	nop
  10:	03400000 	nop
  14:	03400000 	nop
  18:	03400000 	nop

000000000000001c <c>:
  1c:	1a000004 	pcalau12i   	$a0, 0	1c: R_LARCH_PCALA_HI20	.LC0
	1c: R_LARCH_RELAX	*ABS*
  20:	02c00084 	addi.d      	$a0, $a0, 0	20: R_LARCH_PCALA_LO12	.LC0
	20: R_LARCH_RELAX	*ABS*
  24:	50000000 	b           	0	# 24 <c+0x8>	24: R_LARCH_B26	puts

0000000000000028 <L0^A>:
  28:	03400000 	nop	28: R_LARCH_ALIGN	.Lla-relax-align+0x5
  2c:	03400000 	nop
  30:	03400000 	nop
  34:	03400000 	nop
  38:	03400000 	nop
  3c:	03400000 	nop
  40:	03400000 	nop

0000000000000044 <d>:
  44:	1a000004 	pcalau12i   	$a0, 0	44: R_LARCH_PCALA_HI20	.LC1
	44: R_LARCH_RELAX	*ABS*
  48:	02c00084 	addi.d      	$a0, $a0, 0	48: R_LARCH_PCALA_LO12	.LC1
	48: R_LARCH_RELAX	*ABS*
  4c:	50000000 	b           	0	# 4c <d+0x8>	4c: R_LARCH_B26	puts

0000000000000050 <L0^A>:
  50:	03400000 	nop	50: R_LARCH_ALIGN	.Lla-relax-align+0x5
  54:	03400000 	nop
  58:	03400000 	nop
  5c:	03400000 	nop
  60:	03400000 	nop
  64:	03400000 	nop
  68:	03400000 	nop

000000000000006c <test>:
  6c:	02ffc063 	addi.d      	$sp, $sp, -16
  70:	29c02061 	st.d        	$ra, $sp, 8

0000000000000074 <L0^A>:
  74:	44001480 	bnez        	$a0, 20	# 88 <.L5>	74: R_LARCH_B21	.L5
  78:	60003405 	bgtz        	$a1, 52	# ac <L0^A>	78: R_LARCH_B16	.L7
  7c:	03400000 	nop	7c: R_LARCH_ALIGN	.Lla-relax-align+0x4
  80:	03400000 	nop
  84:	03400000 	nop

0000000000000088 <.L5>:
  88:	1a000004 	pcalau12i   	$a0, 0	88: R_LARCH_PCALA_HI20	.LC2
	88: R_LARCH_RELAX	*ABS*
  8c:	02c00084 	addi.d      	$a0, $a0, 0	8c: R_LARCH_PCALA_LO12	.LC2
	8c: R_LARCH_RELAX	*ABS*
  90:	54000000 	bl          	0	# 90 <.L5+0x8>	90: R_LARCH_B26	puts
  94:	28c02061 	ld.d        	$ra, $sp, 8

0000000000000098 <L0^A>:
  98:	02c04063 	addi.d      	$sp, $sp, 16

000000000000009c <L0^A>:
  9c:	50000000 	b           	0	# 9c <L0^A>	9c: R_LARCH_B26	d
  a0:	03400000 	nop	a0: R_LARCH_ALIGN	.Lla-relax-align+0x4
  a4:	03400000 	nop
  a8:	03400000 	nop

00000000000000ac <L0^A>:
  ac:	54000000 	bl          	0	# ac <L0^A>	ac: R_LARCH_B26	c
  b0:	1a000004 	pcalau12i   	$a0, 0	b0: R_LARCH_PCALA_HI20	.LC2
	b0: R_LARCH_RELAX	*ABS*
  b4:	02c00084 	addi.d      	$a0, $a0, 0	b4: R_LARCH_PCALA_LO12	.LC2
	b4: R_LARCH_RELAX	*ABS*
  b8:	54000000 	bl          	0	# b8 <L0^A+0xc>	b8: R_LARCH_B26	puts
  bc:	28c02061 	ld.d        	$ra, $sp, 8

00000000000000c0 <L0^A>:
  c0:	02c04063 	addi.d      	$sp, $sp, 16
  c4:	50000000 	b           	0	# c4 <L0^A+0x4>	c4: R_LARCH_B26	d

Disassembly of section .text.startup:

0000000000000000 <main-0x1c>:
   0:	03400000 	nop	0: R_LARCH_ALIGN	.Lla-relax-align+0x5
   4:	03400000 	nop
   8:	03400000 	nop
   c:	03400000 	nop
  10:	03400000 	nop
  14:	03400000 	nop
  18:	03400000 	nop

000000000000001c <main>:
  1c:	02ff8063 	addi.d      	$sp, $sp, -32
  20:	29c04077 	st.d        	$s0, $sp, 16

0000000000000024 <L0^A>:
  24:	1a000017 	pcalau12i   	$s0, 0	24: R_LARCH_GOT_PC_HI20	__stack_chk_guard
	24: R_LARCH_RELAX	*ABS*
  28:	28c002f7 	ld.d        	$s0, $s0, 0	28: R_LARCH_GOT_PC_LO12	__stack_chk_guard
	28: R_LARCH_RELAX	*ABS*
  2c:	260002ec 	ldptr.d     	$t0, $s0, 0
  30:	02c01066 	addi.d      	$a2, $sp, 4
  34:	00150065 	move        	$a1, $sp
  38:	1a000004 	pcalau12i   	$a0, 0	38: R_LARCH_PCALA_HI20	.LC3
	38: R_LARCH_RELAX	*ABS*
  3c:	02c00084 	addi.d      	$a0, $a0, 0	3c: R_LARCH_PCALA_LO12	.LC3
	3c: R_LARCH_RELAX	*ABS*
  40:	29c0206c 	st.d        	$t0, $sp, 8
  44:	29c06061 	st.d        	$ra, $sp, 24

0000000000000048 <L0^A>:
  48:	54000000 	bl          	0	# 48 <L0^A>	48: R_LARCH_B26	__isoc99_scanf
  4c:	24000465 	ldptr.w     	$a1, $sp, 4
  50:	24000064 	ldptr.w     	$a0, $sp, 0
  54:	54000000 	bl          	0	# 54 <L0^A+0xc>	54: R_LARCH_B26	test
  58:	28c0206d 	ld.d        	$t1, $sp, 8
  5c:	260002ec 	ldptr.d     	$t0, $s0, 0
  60:	5c0021ac 	bne         	$t1, $t0, 32	# 80 <L0^A>	60: R_LARCH_B16	.L11
  64:	28c06061 	ld.d        	$ra, $sp, 24

0000000000000068 <L0^A>:
  68:	28c04077 	ld.d        	$s0, $sp, 16
  6c:	02c08063 	addi.d      	$sp, $sp, 32

0000000000000070 <L0^A>:
  70:	4c000020 	ret
  74:	03400000 	nop	74: R_LARCH_ALIGN	.Lla-relax-align+0x4
  78:	03400000 	nop
  7c:	03400000 	nop

0000000000000080 <L0^A>:
  80:	54000000 	bl          	0	# 80 <L0^A>	80: R_LARCH_B26	__stack_chk_fail

@jiegec
Copy link
Author

jiegec commented Feb 4, 2024

Assembly and compiled object can be found at test.tar.gz

Assembly of test:

test:
.LFB13 = .
	.cfi_startproc
	addi.d	$r3,$r3,-16
	.cfi_def_cfa_offset 16
	st.d	$r1,$r3,8
	.cfi_offset 1, -8
	bnez	$r4,.L5
	bgt	$r5,$r0,.L13
	.align	4,54525952,4
.L5:
	la.local	$r4,.LC2
	bl	%plt(puts)
	ld.d	$r1,$r3,8
	.cfi_remember_state
	.cfi_restore 1
	addi.d	$r3,$r3,16
	.cfi_def_cfa_offset 0
	b	d
	.align	4,54525952,4
.L13:
	.cfi_restore_state
	bl	c
	la.local	$r4,.LC2
	bl	%plt(puts)
	ld.d	$r1,$r3,8
	.cfi_restore 1
	addi.d	$r3,$r3,16
	.cfi_def_cfa_offset 0
	b	d
	.cfi_endproc
.LFE13:
	.size	test, .-test
	.section	.rodata.str1.8
	.align	3
.LC3:
	.ascii	"%d%d\000"
$ objdump -dwr test.o
000000000000006c <test>:
  6c:   02ffc063        addi.d          $sp, $sp, -16
  70:   29c02061        st.d            $ra, $sp, 8

0000000000000074 <L0^A>:
  74:   44000c80        bnez            $a0, 12 # 80 <.L5>      74: R_LARCH_B21 .L5
  78:   60002005        bgtz            $a1, 32 # 98 <L0^A>     78: R_LARCH_B16 .L13
  7c:   00000000        .word           0x00000000

0000000000000080 <.L5>:
  80:   1a000004        pcalau12i       $a0, 0  80: R_LARCH_PCALA_HI20  .LC2
        80: R_LARCH_RELAX       *ABS*
  84:   02c00084        addi.d          $a0, $a0, 0     84: R_LARCH_PCALA_LO12  .LC2
        84: R_LARCH_RELAX       *ABS*
  88:   54000000        bl              0       # 88 <.L5+0x8>  88: R_LARCH_B26 puts
  8c:   28c02061        ld.d            $ra, $sp, 8

0000000000000090 <L0^A>:
  90:   02c04063        addi.d          $sp, $sp, 16

0000000000000094 <L0^A>:
  94:   50000000        b               0       # 94 <L0^A>     94: R_LARCH_B26 d

0000000000000098 <L0^A>:
  98:   54000000        bl              0       # 98 <L0^A>     98: R_LARCH_B26 c
  9c:   1a000004        pcalau12i       $a0, 0  9c: R_LARCH_PCALA_HI20  .LC2
        9c: R_LARCH_RELAX       *ABS*
  a0:   02c00084        addi.d          $a0, $a0, 0     a0: R_LARCH_PCALA_LO12  .LC2
        a0: R_LARCH_RELAX       *ABS*
  a4:   54000000        bl              0       # a4 <L0^A+0xc> a4: R_LARCH_B26 puts
  a8:   28c02061        ld.d            $ra, $sp, 8

00000000000000ac <L0^A>:
  ac:   02c04063        addi.d          $sp, $sp, 16
  b0:   50000000        b               0       # b0 <L0^A+0x4> b0: R_LARCH_B26 d

@jiegec
Copy link
Author

jiegec commented Feb 4, 2024

AOSC patches for GCC and Binutils.

@jiegec
Copy link
Author

jiegec commented Feb 4, 2024

The problem still persists using binutils master (commit 44c91f530fc9acc3a535953696d49633e5a7e24a)

@jiegec
Copy link
Author

jiegec commented Feb 4, 2024

Minimal reproducer:

        .section        .text,"ax",@progbits
        .align  2
        .align  5
        .globl  a
        .type   a, @function
a:
        .align  4,54525952,4
Disassembly of section .text:

0000000000000000 <.Lla-relax-align>:
   0:   03400000        nop             0: R_LARCH_ALIGN        .Lla-relax-align+0x5
   4:   03400000        nop
   8:   03400000        nop
   c:   03400000        nop
  10:   03400000        nop
  14:   03400000        nop
  18:   03400000        nop

000000000000001c <a>:
  1c:   00000000        .word           0x00000000

@jiegec
Copy link
Author

jiegec commented Feb 4, 2024

I see: in binutils 2.42, for .align 4,54525952,4, binutils only takes the lowest byte (arg = 0, fill_len = 1) from 54525952, which is zero:

      if (arg >= 0)
	fill_len = 1;
      else
	fill_len = -arg;

      if (fill_len <= 1)
	{
	  char fill_char = 0;

	  fill_char = fill;
	  do_align (align, &fill_char, fill_len, max);
	}

However, in binutils 2.41, arg=-4 and fill_len=4. The behavior was changed in bminor/binutils-gdb@27a750d.

The number was written by gcc, see gcc-mirror/gcc@b20c7ee.

@jiegec
Copy link
Author

jiegec commented Feb 4, 2024

So the answer is: binutils 2.42 assumes gcc with gcc-mirror/gcc@b20c7ee applied.

@jiegec jiegec closed this as completed Feb 4, 2024
@xry111
Copy link
Member

xry111 commented Feb 4, 2024

So it now seems making sense to backport b20c7ee066cb7d952fa193972e8bc6362c6e4063 to GCC 12 and 13.

And still Binutils should reject .align 4,54525952,4 if it cannot be supported with -mrelax instead of silently producing wrong code.

@xen0n
Copy link
Member

xen0n commented Feb 4, 2024

So it now seems making sense to backport b20c7ee066cb7d952fa193972e8bc6362c6e4063 to GCC 12 and 13.

Seems fine to me.

And still Binutils should reject .align 4,54525952,4 if it cannot be supported with -mrelax instead of silently producing wrong code.

Given the form of the now-problematic .align's is very distinguishable, and with unambiguous semantics, can we add detection of such usage to make it behave, while preserving new semantics for other usages? Or is such an approach too hacky in your mind?

@xry111
Copy link
Member

xry111 commented Feb 4, 2024

So it now seems making sense to backport b20c7ee066cb7d952fa193972e8bc6362c6e4063 to GCC 12 and 13.

Seems fine to me.

And still Binutils should reject .align 4,54525952,4 if it cannot be supported with -mrelax instead of silently producing wrong code.

Given the form of the now-problematic .align's is very distinguishable, and with unambiguous semantics, can we add detection of such usage to make it behave, while preserving new semantics for other usages? Or is such an approach too hacky in your mind?

We may special case the 2nd operand of .align and handle it properly if it's the NOP encoding.

But why not just support 3-operand .align unconditionally (with any 2nd operand)? I don't think it's always nonsense if the user wants to fill the alignment padding with a different instruction: at least a different "do nothing" instruction like or $r4,$r4,$r0 should be fine.

And for things like -falign-functions and -falign-jumps we can actually fill break or non-exist instructions instead of NOP: the padding added by -falign-functions and -falign-jumps should be unreachable in any normal execution, so a break or non-exist instruction in these cases can actually serve as a cheap hardening (reducing ROP gadgets) and maybe a hint to the uarch for branch prediction.

However doing that will need a revision of ABI. Current (2.30) wording indeed only allows NOP:

Alignment statement. If the symbol index is 0, the addend indicates the number of bytes occupied by nop instructions at the relocation offset. The alignment boundary is specified by the addend rounded up to the next power of two. If the symbol index is not 0, the addend indicates the first and third expressions of .align. The lowest 8 bits are used to represent the first expression, other bits are used to represent the third expression.

@xen0n
Copy link
Member

xen0n commented Feb 4, 2024

So it now seems making sense to backport b20c7ee066cb7d952fa193972e8bc6362c6e4063 to GCC 12 and 13.

Seems fine to me.

And still Binutils should reject .align 4,54525952,4 if it cannot be supported with -mrelax instead of silently producing wrong code.

Given the form of the now-problematic .align's is very distinguishable, and with unambiguous semantics, can we add detection of such usage to make it behave, while preserving new semantics for other usages? Or is such an approach too hacky in your mind?

We may special case the 2nd operand of .align and handle it properly if it's the NOP encoding.

But why not just support 3-operand .align unconditionally (with any 2nd operand)? I don't think it's always nonsense if the user wants to fill the alignment padding with a different instruction: at least a different "do nothing" instruction like or $r4,$r4,$r0 should be fine.

And for things like -falign-functions and -falign-jumps we can actually fill break or non-exist instructions instead of NOP: the padding added by -falign-functions and -falign-jumps should be unreachable in any normal execution, so a break or non-exist instruction in these cases can actually serve as a cheap hardening (reducing ROP gadgets) and maybe a hint to the uarch for branch prediction.

However doing that will need a revision of ABI. Current (2.30) wording indeed only allows NOP:

Alignment statement. If the symbol index is 0, the addend indicates the number of bytes occupied by nop instructions at the relocation offset. The alignment boundary is specified by the addend rounded up to the next power of two. If the symbol index is not 0, the addend indicates the first and third expressions of .align. The lowest 8 bits are used to represent the first expression, other bits are used to represent the third expression.

Aww. Indeed this would need attention from the Loongson teams if such an improvement is to be made; but in order to additionally stuff a 4-byte second expression into the reloc addend, the expressible range of the third expression must be reduced. I don't think this would create any serious problem though, because I expect large "maximum skip"'s (larger than e.g. 16 bits) to be practically non-existent.

(See GAS doc on .balign and .p2align, the padding is at most 4 bytes long; .align is meant to emulate the host and possibly non-GNU assembler's behavior, and doesn't have its 2nd expr's semantics fully specified, but as LoongArch's canonical toolchain is GNU, we are free to just adopt the GAS standard behavior for our .align too.)

@xen0n
Copy link
Member

xen0n commented Feb 4, 2024

Given we're little-endian throughout, if we just assume that very large maximum-skip values do not exist, the following change would be backwards-compatible (read: no real-world breakage expected):

-The lowest 8 bits are used to represent the first expression, other bits are used to represent the third expression.
+The lowest 8 bits are used to represent the first expression, the bits 8 to 31 inclusive the third expression, and the higher 32 bits the second expression.
+ If the bitfield for the second expression reads as zero, it is to be treated as a NOP (0x03400000).

This is all assuming ELF64. Elf32_Rela's addend field is only 32 bits wide, there's no way to stuff a max-32-bit value into the field while also having to make the field contain the other expressions as well. Which means we might have to introduce another reloc record like R_LARCH_ALIGN_PAD for expressing the second expression after all, but things start looking really fragile, and I definitely need more sleep or I'll most probably design bad ABI that would make me regret for the rest of my life.

@xry111
Copy link
Member

xry111 commented Feb 5, 2024

No, op2 is not needed to be encoded in the reloc. The instructions are already in the binary section and R_LARCH_ALIGN just delete some (or all) of them.

We just need to make the assembler stuff min((1 << op1) - 1, op3) / 4 instructions and an R_LARCH_ALIGN, and make the linker delete min((pc + (1 << op1) - 1) & ~((1 << op1) - 1), op3) / 4 instructions once we see an R_LARCH_ALIGN.

@xry111
Copy link
Member

xry111 commented Feb 6, 2024

GCC change backporting proposal: https://gcc.gnu.org/pipermail/gcc-patches/2024-February/645067.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants