-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lld] Mark target section as code section when merging code sections into a data section. #72030
Conversation
@llvm/pr-subscribers-lld-coff @llvm/pr-subscribers-platform-windows Author: Jacek Caban (cjacek) ChangesI noticed this MS link.exe behavior while experimenting with #69101. This seems reasonable, esp. in case of uninitialized data and will be useful for a test case in #69101. Full diff: https://github.com/llvm/llvm-project/pull/72030.diff 2 Files Affected:
diff --git a/lld/COFF/Writer.cpp b/lld/COFF/Writer.cpp
index 960328d686852a3..80edda03a0a47e7 100644
--- a/lld/COFF/Writer.cpp
+++ b/lld/COFF/Writer.cpp
@@ -346,6 +346,13 @@ void OutputSection::merge(OutputSection *other) {
contribSections.insert(contribSections.end(), other->contribSections.begin(),
other->contribSections.end());
other->contribSections.clear();
+
+ // MS link.exe compatibility: when merging a code section into a data section,
+ // mark the target section as a code section.
+ if (other->header.Characteristics & IMAGE_SCN_CNT_CODE) {
+ header.Characteristics |= IMAGE_SCN_CNT_CODE;
+ header.Characteristics &= ~IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_CNT_UNINITIALIZED_DATA;
+ }
}
// Write the section header to a given buffer.
diff --git a/lld/test/COFF/code-merge.s b/lld/test/COFF/code-merge.s
new file mode 100644
index 000000000000000..5728fbc93ff9b9d
--- /dev/null
+++ b/lld/test/COFF/code-merge.s
@@ -0,0 +1,56 @@
+# REQUIRES: x86
+
+# Test that merging code section into a data section changes its characteristics.
+
+# RUN: llvm-mc -triple x86_64-windows-msvc %s -filetype=obj -o %t.obj
+# RUN: lld-link -machine:amd64 -dll -noentry -out:%t.dll %t.obj -merge:.testx=.testd -merge:.testx2=.testbss
+# RUN: llvm-readobj --sections %t.dll | FileCheck %s
+
+# CHECK: Sections [
+# CHECK-NEXT: Section {
+# CHECK-NEXT: Number: 1
+# CHECK-NEXT: Name: .testbss (2E 74 65 73 74 62 73 73)
+# CHECK-NEXT: VirtualSize: 0xA
+# CHECK-NEXT: VirtualAddress: 0x1000
+# CHECK-NEXT: RawDataSize: 512
+# CHECK-NEXT: PointerToRawData: 0x400
+# CHECK-NEXT: PointerToRelocations: 0x0
+# CHECK-NEXT: PointerToLineNumbers: 0x0
+# CHECK-NEXT: RelocationCount: 0
+# CHECK-NEXT: LineNumberCount: 0
+# CHECK-NEXT: Characteristics [ (0x40000020)
+# CHECK-NEXT: IMAGE_SCN_CNT_CODE (0x20)
+# CHECK-NEXT: IMAGE_SCN_MEM_READ (0x40000000)
+# CHECK-NEXT: ]
+# CHECK-NEXT: }
+# CHECK-NEXT: Section {
+# CHECK-NEXT: Number: 2
+# CHECK-NEXT: Name: .testd (2E 74 65 73 74 64 00 00)
+# CHECK-NEXT: VirtualSize: 0xA
+# CHECK-NEXT: VirtualAddress: 0x2000
+# CHECK-NEXT: RawDataSize: 512
+# CHECK-NEXT: PointerToRawData: 0x600
+# CHECK-NEXT: PointerToRelocations: 0x0
+# CHECK-NEXT: PointerToLineNumbers: 0x0
+# CHECK-NEXT: RelocationCount: 0
+# CHECK-NEXT: LineNumberCount: 0
+# CHECK-NEXT: Characteristics [ (0x40000020)
+# CHECK-NEXT: IMAGE_SCN_CNT_CODE (0x20)
+# CHECK-NEXT: IMAGE_SCN_MEM_READ (0x40000000)
+# CHECK-NEXT: ]
+# CHECK-NEXT: }
+# CHECK-NEXT: ]
+
+ .section .testx, "xr"
+ movq $1, %rax
+ retq
+
+ .section .testx2, "xr"
+ movq $2, %rax
+ retq
+
+ .section .testd, "dr"
+ .word 1
+
+ .section .testbss, "br"
+ .word 0
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
285156c
to
889c45b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do remember something along these lines when I was playing with BFD and emulating link behaviour.
Thanks for review! I realized that the test for uninitialized data was not testing what I intended, it should be fixed in the fixup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
lld/test/COFF/code-merge.s
Outdated
# Test that merging code section into a data section changes its characteristics. | ||
|
||
# RUN: llvm-mc -triple x86_64-windows-msvc %s -filetype=obj -o %t.obj | ||
# RUN: lld-link -machine:amd64 -dll -noentry -out:%t.dll %t.obj -merge:.testx=.testd -merge:.testx2=.testbss |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add more text/data sections and test the other merge direction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed more tests.
I noticed one more problem. To fill gaps between chunks, we memset the whole section to 0xcc. When merging uninitialized chunks into code section, it means that those chunks will not be zeroed. This is a preexisting problem, not specific to this PR, but this PR makes it a bit more widespread by marking more sections as code sections. I will take a look at that.
3fcd424
to
fe622d0
Compare
I created #72138 fixing the problem with gap fills filling uninitialized data chunks and rebased this PR on top of that. |
…s into a data sections.
fe622d0
to
ac9cc24
Compare
…into a data section. (llvm#72030)
I noticed this MS link.exe behavior while experimenting with #69101. This seems reasonable, esp. in case of uninitialized data and will be useful for a test case in #69101.