Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lld] Mark target section as code section when merging code sections into a data section. #72030

Merged
merged 1 commit into from
Nov 14, 2023

Conversation

cjacek
Copy link
Contributor

@cjacek cjacek commented Nov 11, 2023

I noticed this MS link.exe behavior while experimenting with #69101. This seems reasonable, esp. in case of uninitialized data and will be useful for a test case in #69101.

@llvmbot
Copy link
Collaborator

llvmbot commented Nov 11, 2023

@llvm/pr-subscribers-lld-coff
@llvm/pr-subscribers-lld

@llvm/pr-subscribers-platform-windows

Author: Jacek Caban (cjacek)

Changes

I noticed this MS link.exe behavior while experimenting with #69101. This seems reasonable, esp. in case of uninitialized data and will be useful for a test case in #69101.


Full diff: https://github.com/llvm/llvm-project/pull/72030.diff

2 Files Affected:

  • (modified) lld/COFF/Writer.cpp (+7)
  • (added) lld/test/COFF/code-merge.s (+56)
diff --git a/lld/COFF/Writer.cpp b/lld/COFF/Writer.cpp
index 960328d686852a3..80edda03a0a47e7 100644
--- a/lld/COFF/Writer.cpp
+++ b/lld/COFF/Writer.cpp
@@ -346,6 +346,13 @@ void OutputSection::merge(OutputSection *other) {
   contribSections.insert(contribSections.end(), other->contribSections.begin(),
                          other->contribSections.end());
   other->contribSections.clear();
+
+  // MS link.exe compatibility: when merging a code section into a data section,
+  // mark the target section as a code section.
+  if (other->header.Characteristics & IMAGE_SCN_CNT_CODE) {
+    header.Characteristics |= IMAGE_SCN_CNT_CODE;
+    header.Characteristics &= ~IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_CNT_UNINITIALIZED_DATA;
+  }
 }
 
 // Write the section header to a given buffer.
diff --git a/lld/test/COFF/code-merge.s b/lld/test/COFF/code-merge.s
new file mode 100644
index 000000000000000..5728fbc93ff9b9d
--- /dev/null
+++ b/lld/test/COFF/code-merge.s
@@ -0,0 +1,56 @@
+# REQUIRES: x86
+
+# Test that merging code section into a data section changes its characteristics.
+
+# RUN: llvm-mc -triple x86_64-windows-msvc %s -filetype=obj -o %t.obj
+# RUN: lld-link -machine:amd64 -dll -noentry -out:%t.dll %t.obj -merge:.testx=.testd -merge:.testx2=.testbss
+# RUN: llvm-readobj --sections %t.dll | FileCheck %s
+
+# CHECK:      Sections [
+# CHECK-NEXT:   Section {
+# CHECK-NEXT:     Number: 1
+# CHECK-NEXT:     Name: .testbss (2E 74 65 73 74 62 73 73)
+# CHECK-NEXT:     VirtualSize: 0xA
+# CHECK-NEXT:     VirtualAddress: 0x1000
+# CHECK-NEXT:     RawDataSize: 512
+# CHECK-NEXT:     PointerToRawData: 0x400
+# CHECK-NEXT:     PointerToRelocations: 0x0
+# CHECK-NEXT:     PointerToLineNumbers: 0x0
+# CHECK-NEXT:     RelocationCount: 0
+# CHECK-NEXT:     LineNumberCount: 0
+# CHECK-NEXT:     Characteristics [ (0x40000020)
+# CHECK-NEXT:       IMAGE_SCN_CNT_CODE (0x20)
+# CHECK-NEXT:       IMAGE_SCN_MEM_READ (0x40000000)
+# CHECK-NEXT:     ]
+# CHECK-NEXT:   }
+# CHECK-NEXT:   Section {
+# CHECK-NEXT:     Number: 2
+# CHECK-NEXT:     Name: .testd (2E 74 65 73 74 64 00 00)
+# CHECK-NEXT:     VirtualSize: 0xA
+# CHECK-NEXT:     VirtualAddress: 0x2000
+# CHECK-NEXT:     RawDataSize: 512
+# CHECK-NEXT:     PointerToRawData: 0x600
+# CHECK-NEXT:     PointerToRelocations: 0x0
+# CHECK-NEXT:     PointerToLineNumbers: 0x0
+# CHECK-NEXT:     RelocationCount: 0
+# CHECK-NEXT:     LineNumberCount: 0
+# CHECK-NEXT:     Characteristics [ (0x40000020)
+# CHECK-NEXT:       IMAGE_SCN_CNT_CODE (0x20)
+# CHECK-NEXT:       IMAGE_SCN_MEM_READ (0x40000000)
+# CHECK-NEXT:     ]
+# CHECK-NEXT:   }
+# CHECK-NEXT: ]
+
+        .section .testx, "xr"
+        movq $1, %rax
+        retq
+
+        .section .testx2, "xr"
+        movq $2, %rax
+        retq
+
+        .section .testd, "dr"
+        .word 1
+
+        .section .testbss, "br"
+        .word 0

Copy link

github-actions bot commented Nov 11, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Member

@compnerd compnerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do remember something along these lines when I was playing with BFD and emulating link behaviour.

@cjacek
Copy link
Contributor Author

cjacek commented Nov 11, 2023

Thanks for review! I realized that the test for uninitialized data was not testing what I intended, it should be fixed in the fixup.

Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

# Test that merging code section into a data section changes its characteristics.

# RUN: llvm-mc -triple x86_64-windows-msvc %s -filetype=obj -o %t.obj
# RUN: lld-link -machine:amd64 -dll -noentry -out:%t.dll %t.obj -merge:.testx=.testd -merge:.testx2=.testbss
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add more text/data sections and test the other merge direction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed more tests.

I noticed one more problem. To fill gaps between chunks, we memset the whole section to 0xcc. When merging uninitialized chunks into code section, it means that those chunks will not be zeroed. This is a preexisting problem, not specific to this PR, but this PR makes it a bit more widespread by marking more sections as code sections. I will take a look at that.

@cjacek
Copy link
Contributor Author

cjacek commented Nov 13, 2023

I created #72138 fixing the problem with gap fills filling uninitialized data chunks and rebased this PR on top of that.

@cjacek cjacek merged commit c425db2 into llvm:main Nov 14, 2023
2 of 3 checks passed
@cjacek cjacek deleted the lld-code-merge branch November 14, 2023 22:02
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants