[X86] fdata-sections not work #133066

Closed
LukeSTM opened this issue Mar 26, 2025 · 9 comments
Labels: backend:X86; mc (Machine (object) code); question (A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead!)

LukeSTM commented Mar 26, 2025

https://godbolt.org/z/357ro6MK3

demo:

```c
#include <stdio.h>   /* printf */
#include <string.h>  /* memcpy, strlen */

struct ss_t {
	int s_id;
	char msg[];
};

struct ss_t *mst_1;

void do_some_test(void)
{
	struct ss_t *mst_2;  /* unused in this demo */
	char buf[100];
	int fill_num=4;
	char tmp_str[20]="test string!";
	memcpy(buf,&fill_num,sizeof(int));
	memcpy(buf+sizeof(int), tmp_str, strlen(tmp_str) + 1);
	mst_1=(struct ss_t *)buf;

	if (mst_1->s_id==4)
		printf("mst_1->s_id=%d,mst_1->msg=%s\n",mst_1->s_id,mst_1->msg);
}
```

Here is a use case where the -fdata-sections option does not appear to work for x86 when using clang, although it works fine for aarch64. GCC also works fine for the x86 architecture.

gcc readelf:
Relocation section '.rela.text.do_some_test' at offset 0x908 contains 4 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000012  000000050000000a R_X86_64_32            0000000000000000 .rodata.do_some_test.str1.1 + 0
000000000000001a  0000001100000002 R_X86_64_PC32          0000000000000000 .LC1 - 4
000000000000003d  0000001400000002 R_X86_64_PC32          0000000000000008 mst_1 - 4
000000000000005f  0000001500000002 R_X86_64_PC32          0000000000000000 printf - 4

llvm aarch64 readelf:
Relocation section '.rela.text.do_some_test' at offset 0x6d0 contains 7 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
000000000000000c  0000000400000113 R_AARCH64_ADR_PREL_PG_HI21 0000000000000000 .rodata..L__const.do_some_test.tmp_str + 0
0000000000000010  0000000400000115 R_AARCH64_ADD_ABS_LO12_NC 0000000000000000 .rodata..L__const.do_some_test.tmp_str + 0
000000000000001c  0000001500000113 R_AARCH64_ADR_PREL_PG_HI21 0000000000000000 mst_1 + 0
0000000000000028  0000000700000113 R_AARCH64_ADR_PREL_PG_HI21 0000000000000000 .rodata.str1.1 + 0
000000000000002c  0000000700000115 R_AARCH64_ADD_ABS_LO12_NC 0000000000000000 .rodata.str1.1 + 0
0000000000000044  000000150000011e R_AARCH64_LDST64_ABS_LO12_NC 0000000000000000 mst_1 + 0
0000000000000048  000000160000011b R_AARCH64_CALL26       0000000000000000 printf + 0

llvm x86 readelf:
Relocation section '.rela.text.do_some_test' at offset 0x600 contains 3 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000034  0000000900000002 R_X86_64_PC32          0000000000000000 mst_1 - 4
000000000000003b  0000000300000002 R_X86_64_PC32          0000000000000000 .L.str - 4
0000000000000047  0000000a00000004 R_X86_64_PLT32         0000000000000000 printf - 4

LukeSTM commented Mar 26, 2025

CC @topperc @pranavk @compnerd

dtcxzyw added the backend:X86 and mc (Machine (object) code) labels and removed the new issue label Mar 26, 2025


topperc commented Mar 26, 2025

I'm not sure what I'm supposed to be seeing in the readelf dumps? Can you explain what you're not seeing that you expect?

Zhenhang1213 commented

> I'm not sure what I'm supposed to be seeing in the readelf dumps? Can you explain what you're not seeing that you expect?

I think he means the difference in the .rodata sections.

compnerd commented

I think that @topperc is right - the relocation list doesn't really give much. The symbol list is going to be more useful to see what is being emitted where.


LukeSTM commented Mar 27, 2025

Compile with -O2 -ffunction-sections -fdata-sections. I think the llvm x86 readelf dump should contain a symbol .rodata..L__const.do_some_test.tmp_str, as the llvm aarch64 readelf dump does.

llvm x86
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS do_something.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 3 .text.do_some_test
3: 0000000000000000 30 OBJECT LOCAL DEFAULT 7 .L.str
4: 0000000000000000 80 FUNC GLOBAL DEFAULT 3 do_some_test
5: 0000000000000000 8 OBJECT GLOBAL DEFAULT 6 mst_1
6: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND printf

llvm aarch64
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS do_something.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 3 .text.do_some_test
3: 0000000000000000 0 NOTYPE LOCAL DEFAULT 3 $x.0
4: 0000000000000000 0 SECTION LOCAL DEFAULT 5 .rodata..L__const.do_some_test.tmp_str
5: 0000000000000000 0 NOTYPE LOCAL DEFAULT 5 $d.1
6: 0000000000000000 0 NOTYPE LOCAL DEFAULT 6 $d.2
7: 0000000000000000 0 SECTION LOCAL DEFAULT 7 .rodata.str1.1
8: 0000000000000000 0 NOTYPE LOCAL DEFAULT 7 $d.3
9: 0000000000000000 0 NOTYPE LOCAL DEFAULT 8 $d.4
10: 0000000000000000 0 NOTYPE LOCAL DEFAULT 10 $d.5
11: 0000000000000000 88 FUNC GLOBAL DEFAULT 3 do_some_test
12: 0000000000000000 8 OBJECT GLOBAL DEFAULT 6 mst_1
13: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND printf

gcc x86
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS do_something.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 .text
3: 0000000000000000 0 SECTION LOCAL DEFAULT 2 .data
4: 0000000000000000 0 SECTION LOCAL DEFAULT 3 .bss
5: 0000000000000000 0 SECTION LOCAL DEFAULT 4 .rodata.do_some_test.str1.1
6: 0000000000000000 0 SECTION LOCAL DEFAULT 5 .text.do_some_test
7: 0000000000000000 0 SECTION LOCAL DEFAULT 7 .rodata.cst16
8: 0000000000000000 0 SECTION LOCAL DEFAULT 9 .note.GNU-stack
9: 0000000000000000 0 SECTION LOCAL DEFAULT 11 .eh_frame
10: 0000000000000000 0 NOTYPE LOCAL DEFAULT 7 .LC1
11: 0000000000000000 0 SECTION LOCAL DEFAULT 8 .comment
12: 0000000000000000 0 SECTION LOCAL DEFAULT 10 .note.gnu.property
13: 0000000000000000 109 FUNC GLOBAL DEFAULT 5 do_some_test
14: 0000000000000008 8 OBJECT GLOBAL DEFAULT COM mst_1
15: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND printf


MaskRay commented Mar 27, 2025

The question is why the declaration char tmp_str[20] = "test string!"; doesn't result in a .rodata section in x86. The constant string is relatively short, yet in x86, it gets expanded into movabsq instructions (as seen in SelectionDAG::getMemcpy). On AArch64, however, this expansion doesn't occur, and the string is placed in a .rodata section instead.
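
To make the expansion concrete, here is a minimal sketch in C of what it amounts to: the 20-byte initializer is rebuilt from integer immediates instead of being copied from a `.rodata` object. The helper names `pack_le64`, `store_le` and `init_tmp_str_inline` are hypothetical, and the 8/8/4 byte split is an assumption chosen for illustration; the stores clang actually emits may be grouped differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: pack up to 8 bytes of src, starting at off, into a
   little-endian 64-bit value, the kind of immediate a movabsq can carry. */
static uint64_t pack_le64(const char *src, size_t len, size_t off)
{
	uint64_t imm = 0;
	for (size_t i = 0; i < 8 && off + i < len; i++)
		imm |= (uint64_t)(unsigned char)src[off + i] << (8 * i);
	return imm;
}

/* Hypothetical helper: write the low n bytes of imm to dst, lowest byte
   first, mimicking an n-byte store on a little-endian target (x86-64). */
static void store_le(char *dst, uint64_t imm, size_t n)
{
	for (size_t i = 0; i < n; i++)
		((unsigned char *)dst)[i] = (unsigned char)((imm >> (8 * i)) & 0xff);
}

/* Initialize a 20-byte buffer to "test string!" (zero padded) using only
   integer constants; no read-only copy of the string is referenced. */
static void init_tmp_str_inline(char tmp_str[20])
{
	store_le(tmp_str,      UINT64_C(0x7274732074736574), 8); /* "test str" */
	store_le(tmp_str + 8,  UINT64_C(0x0000000021676e69), 8); /* "ing!" + 4 NULs */
	store_le(tmp_str + 16, 0, 4);                            /* remaining padding */
}
```

Because the bytes travel inside the instruction stream, no data object for `tmp_str` remains for -fdata-sections to place, which is consistent with the x86 symbol table above having no `.rodata..L__const.do_some_test.tmp_str` entry.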

MaskRay closed this as completed Mar 27, 2025
MaskRay added the question label (A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead!) Mar 27, 2025

LukeSTM commented Mar 28, 2025

I have a question. Since constant strings can be materialized by instructions, why is the .rodata section for constant strings still retained when using the -fdata-sections option?


LukeSTM commented Mar 28, 2025

CC @MaskRay
