-
Notifications
You must be signed in to change notification settings - Fork 12k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LTO] [LLD] Don't alias the __imp_func and func symbol resolutions #71376
Conversation
@llvm/pr-subscribers-lld @llvm/pr-subscribers-platform-windows Author: Martin Storsjö (mstorsjo) ChangesCommit b963c0b fixed LTO compilation of cases where one translation unit is calling a function with the dllimport attribute, and another translation unit provides this function locally within the same linked module (i.e. not actually dllimported); see This was fixed by aliasing their GlobalResolution structs, for the I believe this fix to be wrong. This patch reverts that fix, and fixes the same issue differently, within LLD instead. The fix assumed that one can treat the However this fix only works if both translation units are compiled as LTO; if the caller is compiled as a regular object file and the callee is compiled as LTO, the fix fails, as the LTO compilation doesn't know that the unprefixed symbol is needed. The only level that knows of the potential relationship between the Therefore, revert the original fix from Then fix the issue differently - when concluding that we can fulfill an undefined symbol starting with This change also fixes another issue; an object file can provide both unprefixed and prefixed versions of the same symbol, like this:
That allows the function to be called both with and without dllimport markings. (The concept of automatically resolving a reference to Previously, the aliasing of global resolutions at the LTO level would trigger a failed assert with "Multiple prevailing defs are not allowed" for this case, as both This change (together with previous change in Full diff: https://github.com/llvm/llvm-project/pull/71376.diff 6 Files Affected:
diff --git a/lld/COFF/SymbolTable.cpp b/lld/COFF/SymbolTable.cpp
index d9673e159769fbd..1285dee41303c42 100644
--- a/lld/COFF/SymbolTable.cpp
+++ b/lld/COFF/SymbolTable.cpp
@@ -458,8 +458,10 @@ void SymbolTable::reportUnresolvable() {
StringRef name = undef->getName();
if (name.starts_with("__imp_")) {
Symbol *imp = find(name.substr(strlen("__imp_")));
- if (imp && isa<Defined>(imp))
+ if (Defined *def = dyn_cast_or_null<Defined>(imp)) {
+ def->isUsedInRegularObj = true;
continue;
+ }
}
if (name.contains("_PchSym_"))
continue;
diff --git a/lld/test/COFF/lto-dllimport.ll b/lld/test/COFF/lto-dllimport.ll
new file mode 100644
index 000000000000000..257bfa8a04a585a
--- /dev/null
+++ b/lld/test/COFF/lto-dllimport.ll
@@ -0,0 +1,54 @@
+; REQUIRES: x86
+; RUN: split-file %s %t.dir
+
+; RUN: llvm-as %t.dir/main.ll -o %t.main.bc
+; RUN: llvm-as %t.dir/other.ll -o %t.other.bc
+; RUN: llc %t.dir/main.ll -o %t.main.obj --filetype=obj
+; RUN: llc %t.dir/other.ll -o %t.other.obj --filetype=obj
+
+; RUN: lld-link %t.main.obj %t.other.obj -entry:main -out:%t.exe
+; RUN: lld-link %t.main.bc %t.other.bc -entry:main -out:%t.exe
+; RUN: lld-link %t.main.obj %t.other.bc -entry:main -out:%t.exe
+; RUN: lld-link %t.main.bc %t.other.obj -entry:main -out:%t.exe
+
+;--- main.ll
+; ModuleID = 'a.obj'
+source_filename = "a.c"
+target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-windows-msvc19.11.0"
+
+; Function Attrs: norecurse nounwind sspstrong uwtable
+define dso_local i32 @main() local_unnamed_addr {
+entry:
+ %call = tail call i32 @foo()
+ ret i32 %call
+}
+
+declare dllimport i32 @foo() local_unnamed_addr
+
+!llvm.module.flags = !{!1}
+
+!1 = !{i32 1, !"ThinLTO", i32 0}
+
+^0 = module: (path: "a.obj", hash: (0, 0, 0, 0, 0))
+^1 = gv: (name: "foo") ; guid = 2709792123250749187
+^2 = gv: (name: "main", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 1, live: 0, dsoLocal: 1), insts: 2, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 1, returnDoesNotAlias: 0), calls: ((callee: ^1))))) ; guid = 15822663052811949562
+
+;--- other.ll
+; ModuleID = 'b.obj'
+source_filename = "b.c"
+target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-windows-msvc19.11.0"
+
+; Function Attrs: norecurse nounwind readnone sspstrong uwtable
+define dso_local i32 @foo() local_unnamed_addr {
+entry:
+ ret i32 42
+}
+
+!llvm.module.flags = !{!1}
+
+!1 = !{i32 1, !"ThinLTO", i32 0}
+
+^0 = module: (path: "b.obj", hash: (0, 0, 0, 0, 0))
+^1 = gv: (name: "foo", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 1, live: 0, dsoLocal: 1), insts: 1, funcFlags: (readNone: 1, readOnly: 0, noRecurse: 1, returnDoesNotAlias: 0)))) ; guid = 2709792123250749187
diff --git a/lld/test/COFF/lto-imp-prefix.ll b/lld/test/COFF/lto-imp-prefix.ll
index 503e4fd0649e41d..e64de6b9229b049 100644
--- a/lld/test/COFF/lto-imp-prefix.ll
+++ b/lld/test/COFF/lto-imp-prefix.ll
@@ -3,9 +3,11 @@
; RUN: rm -rf %t.dir
; RUN: split-file %s %t.dir
; RUN: llvm-as %t.dir/main.ll -o %t.main.obj
-; RUN: llvm-as %t.dir/other.ll -o %t.other.obj
+; RUN: llvm-as %t.dir/other1.ll -o %t.other1.obj
+; RUN: llvm-as %t.dir/other2.ll -o %t.other2.obj
-; RUN: lld-link /entry:entry %t.main.obj %t.other.obj /out:%t.exe /subsystem:console
+; RUN: lld-link /entry:entry %t.main.obj %t.other1.obj /out:%t.exe /subsystem:console
+; RUN: lld-link /entry:entry %t.main.obj %t.other2.obj /out:%t.exe /subsystem:console
;--- main.ll
target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
@@ -22,7 +24,7 @@ declare dllimport void @importedFunc()
declare void @other()
-;--- other.ll
+;--- other1.ll
target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-w64-windows-gnu"
@@ -37,3 +39,19 @@ define void @other() {
entry:
ret void
}
+
+;--- other2.ll
+target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-w64-windows-gnu"
+
+@__imp_importedFunc = global ptr @importedFunc
+
+define void @importedFunc() {
+entry:
+ ret void
+}
+
+define void @other() {
+entry:
+ ret void
+}
diff --git a/llvm/lib/LTO/LTO.cpp b/llvm/lib/LTO/LTO.cpp
index 214c2ef45de0664..52c6a0c2e57674a 100644
--- a/llvm/lib/LTO/LTO.cpp
+++ b/llvm/lib/LTO/LTO.cpp
@@ -610,13 +610,7 @@ void LTO::addModuleToGlobalRes(ArrayRef<InputFile::Symbol> Syms,
assert(ResI != ResE);
SymbolResolution Res = *ResI++;
- StringRef Name = Sym.getName();
- // Strip the __imp_ prefix from COFF dllimport symbols (similar to the
- // way they are handled by lld), otherwise we can end up with two
- // global resolutions (one with and one for a copy of the symbol without).
- if (TT.isOSBinFormatCOFF() && Name.startswith("__imp_"))
- Name = Name.substr(strlen("__imp_"));
- auto &GlobalRes = GlobalResolutions[Name];
+ auto &GlobalRes = GlobalResolutions[Sym.getName()];
GlobalRes.UnnamedAddr &= Sym.isUnnamedAddr();
if (Res.Prevailing) {
assert(!GlobalRes.Prevailing &&
diff --git a/llvm/test/LTO/X86/Inputs/dllimport.ll b/llvm/test/LTO/X86/Inputs/dllimport.ll
deleted file mode 100644
index f3daab1df09742d..000000000000000
--- a/llvm/test/LTO/X86/Inputs/dllimport.ll
+++ /dev/null
@@ -1,17 +0,0 @@
-; ModuleID = 'b.obj'
-source_filename = "b.cpp"
-target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
-target triple = "x86_64-pc-windows-msvc19.11.0"
-
-; Function Attrs: norecurse nounwind readnone sspstrong uwtable
-define dso_local i32 @"?foo@@YAHXZ"() local_unnamed_addr {
-entry:
- ret i32 42
-}
-
-!llvm.module.flags = !{!1}
-
-!1 = !{i32 1, !"ThinLTO", i32 0}
-
-^0 = module: (path: "b.obj", hash: (0, 0, 0, 0, 0))
-^1 = gv: (name: "?foo@@YAHXZ", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 1, live: 0, dsoLocal: 1), insts: 1, funcFlags: (readNone: 1, readOnly: 0, noRecurse: 1, returnDoesNotAlias: 0)))) ; guid = 2709792123250749187
diff --git a/llvm/test/LTO/X86/dllimport.ll b/llvm/test/LTO/X86/dllimport.ll
deleted file mode 100644
index eb9f7df5c8b30a5..000000000000000
--- a/llvm/test/LTO/X86/dllimport.ll
+++ /dev/null
@@ -1,30 +0,0 @@
-; Test requiring LTO to remove the __imp_ prefix for locally imported COFF
-; symbols (mirroring how lld handles these symbols).
-; RUN: llvm-as %s -o %t.obj
-; RUN: llvm-as %S/Inputs/dllimport.ll -o %t2.obj
-; RUN: llvm-lto2 run -r=%t.obj,main,px -r %t.obj,__imp_?foo@@YAHXZ -r %t2.obj,?foo@@YAHXZ,p -o %t3 %t.obj %t2.obj -save-temps
-; RUN: llvm-dis %t3.0.0.preopt.bc -o - | FileCheck %s
-
-; CHECK: define dso_local i32 @"?foo@@YAHXZ"()
-
-; ModuleID = 'a.obj'
-source_filename = "a.cpp"
-target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
-target triple = "x86_64-pc-windows-msvc19.11.0"
-
-; Function Attrs: norecurse nounwind sspstrong uwtable
-define dso_local i32 @main() local_unnamed_addr {
-entry:
- %call = tail call i32 @"?foo@@YAHXZ"()
- ret i32 %call
-}
-
-declare dllimport i32 @"?foo@@YAHXZ"() local_unnamed_addr
-
-!llvm.module.flags = !{!1}
-
-!1 = !{i32 1, !"ThinLTO", i32 0}
-
-^0 = module: (path: "a.obj", hash: (0, 0, 0, 0, 0))
-^1 = gv: (name: "?foo@@YAHXZ") ; guid = 2709792123250749187
-^2 = gv: (name: "main", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 1, live: 0, dsoLocal: 1), insts: 2, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 1, returnDoesNotAlias: 0), calls: ((callee: ^1))))) ; guid = 15822663052811949562
|
Ping (although if Reid is busy this week I guess it can wait, if nobody else wants to have a look before that). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only level that knows of the potential relationship between the
__imp_
prefixed and unprefixed symbol, across regular and bitcode object files, is LLD itself.
Agreed.
Previously, the aliasing of global resolutions at the LTO level would trigger a failed assert with "Multiple prevailing defs are not allowed" for this case, as both importedFunc and __imp_importedFunc could be prevailing. Add a case to the existing LLD test case lto-imp-prefix.ll to test this as well.
Confirmed.
The LLD fix is indeed better than https://reviews.llvm.org/D49138 . Whether the unprefixed is external does not matter now.
lld/test/COFF/lto-imp-prefix.ll
Outdated
|
||
; RUN: lld-link /entry:entry %t.main.obj %t.other.obj /out:%t.exe /subsystem:console | ||
; RUN: lld-link /entry:entry %t.main.obj %t.other1.obj /out:%t.exe /subsystem:console |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can define a __imp_other
and link with -debug:symtab
and check that __imp_other
(not strictly needed) is emitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, as an example of the unnecessary symbols that the implementation of #70777 does leave behind? That sounds like a good idea - although perhaps I should add that in a separate commit and not in this one.
lld/test/COFF/lto-dllimport.ll
Outdated
; RUN: lld-link %t.main.bc %t.other.obj -entry:main -out:%t.exe | ||
|
||
;--- main.ll | ||
; ModuleID = 'a.obj' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the incorrect ModuleID comment (main.ll vs a.obj?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will do
lld/test/COFF/lto-dllimport.ll
Outdated
^2 = gv: (name: "main", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 1, live: 0, dsoLocal: 1), insts: 2, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 1, returnDoesNotAlias: 0), calls: ((callee: ^1))))) ; guid = 15822663052811949562 | ||
|
||
;--- other.ll | ||
; ModuleID = 'b.obj' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consistently use other
or b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I can remove the ModuleID comments altogether.
|
||
@__imp_importedFunc = global ptr @importedFunc | ||
|
||
define void @importedFunc() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps add a comment to emphasize that unprefixed symbol is external.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, to clarify that this testcase explicitly tests providing two external symbols from the same object file, with/without prefix.
The patch changes the behavior of data symbols as well. Consider adding a test to check its current behavior (I think it's fine whether it makes sense or not; the test just serves as a change detector). I haven't thought hard about data symbols. |
The mechanism work exactly the same for data symbols; if you have a dllimport reference to a data symbol in one object file, that symbol can be provided either with an explicit (As a side note; the reverse, where a data symbol is referenced without a dllimport attribute, but the data symbol only is available in dllimport form from another DLL, is the tricky case. In MSVC environments, this is not supported, while in mingw environments, it is supported via the "autoimport" mechanism, which in worst case requires runtime pseudo relocations which are fixed up at startup by the mingw runtime.) |
ea58002
to
0f0bbb4
Compare
Commit b963c0b fixed LTO compilation of cases where one translation unit is calling a function with the dllimport attribute, and another translation unit provides this function locally within the same linked module (i.e. not actually dllimported); see llvm#37453 or https://bugs.llvm.org/show_bug.cgi?id=38105 for full context. This was fixed by aliasing their GlobalResolution structs, for the __imp_ prefixed and non prefixed symbols. I believe this fix to be wrong. This patch reverts that fix, and fixes the same issue differently, within LLD instead. The fix assumed that one can treat the __imp_ prefixed and unprefixed symbols as equal, referencing SVN r240620 (d766653). However that referenced commit had mistaken how this logic works, which was corrected later in SVN r240622 (88e0f92); those symbols aren't direct aliases for each other - but if there's a need for the __imp_ prefixed one and the other one exists, the __imp_ prefixed one is created, as a pointer to the other one. However this fix only works if both translation units are compiled as LTO; if the caller is compiled as a regular object file and the callee is compiled as LTO, the fix fails, as the LTO compilation doesn't know that the unprefixed symbol is needed. The only level that knows of the potential relationship between the __imp_ prefixed and unprefixed symbol, across regular and bitcode object files, is LLD itself. Therefore, revert the original fix from b963c0b, and fix the issue differently - when concluding that we can fulfill an undefined symbol starting with __imp_, mark the corresponding non prefixed symbol as used in a regular object for the LTO compilation, to make sure that this non prefixed symbol exists after the LTO compilation, to let LLD do the fixup of the local import. Extend the testcase to test a regular object file calling an LTO object file, which previously failed. This change also fixes another issue; an object file can provide both unprefixed and prefixed versions of the same symbol, like this: void importedFunc(void) { } void (*__imp_importedFunc)(void) = importedFunc; That allows the function to be called both with and without dllimport markings. (The concept of automatically resolving a reference to __imp_func to a locally defined func only is done in MSVC style linkers, but not in GNU ld, therefore MinGW mode code often uses this construct.) Previously, the aliasing of global resolutions at the LTO level would trigger a failed assert with "Multiple prevailing defs are not allowed" for this case, as both importedFunc and __imp_importedFunc could be prevailing. Add a case to the existing LLD test case lto-imp-prefix.ll to test this as well. This change (together with previous change in 3ab6209) completes LLD to work with mingw-w64-crt files (the base glue code for a mingw-w64 toolchain) built with LTO.
0f0bbb4
to
b38f434
Compare
Commit b963c0b fixed LTO compilation of cases where one translation unit is calling a function with the dllimport attribute, and another translation unit provides this function locally within the same linked module (i.e. not actually dllimported); see #37453 or https://bugs.llvm.org/show_bug.cgi?id=38105 for full context.
This was fixed by aliasing their GlobalResolution structs, for the
__imp_
prefixed and non prefixed symbols.I believe this fix to be wrong.
This patch reverts that fix, and fixes the same issue differently, within LLD instead.
The fix assumed that one can treat the
__imp_
prefixed and unprefixed symbols as equal, referencing SVN r240620 (d766653). However that referenced commit had mistaken how this logic works, which was corrected later in SVN r240622 (88e0f92); those symbols aren't direct aliases for each other - but if there's a need for the__imp_
prefixed one and the other one exists, the__imp_
prefixed one is created, as a pointer to the other one.However this fix only works if both translation units are compiled as LTO; if the caller is compiled as a regular object file and the callee is compiled as LTO, the fix fails, as the LTO compilation doesn't know that the unprefixed symbol is needed.
The only level that knows of the potential relationship between the
__imp_
prefixed and unprefixed symbol, across regular and bitcode object files, is LLD itself.Therefore, revert the original fix from b963c0b, and fix the issue differently - when concluding that we can fulfill an undefined symbol starting with
__imp_
, mark the corresponding non prefixed symbol as used in a regular object for the LTO compilation, to make sure that this non prefixed symbol exists after the LTO compilation, to let LLD do the fixup of the local import.Extend the testcase to test a regular object file calling an LTO object file, which previously failed.
This change also fixes another issue; an object file can provide both unprefixed and prefixed versions of the same symbol, like this:
That allows the function to be called both with and without dllimport markings. (The concept of automatically resolving a reference to
__imp_func
to a locally definedfunc
only is done in MSVC style linkers, but not in GNU ld, therefore MinGW mode code often uses this construct.)Previously, the aliasing of global resolutions at the LTO level would trigger a failed assert with "Multiple prevailing defs are not allowed" for this case, as both
importedFunc
and__imp_importedFunc
could be prevailing. Add a case to the existing LLD test case lto-imp-prefix.ll to test this as well.This change (together with previous change in 3ab6209) completes LLD to work with mingw-w64-crt files (the base glue code for a mingw-w64 toolchain) built with LTO.