-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WebAssembly] Use DebugValueManager only when subprogram exists #77978
Conversation
We previously scanned the whole BB for `DBG_VALUE` instruction even when the program doesn't have debug info, i.e., the function doesn't have a subprogram associated with it, which can make compilation unnecessarily slow. This disables `DebugValueManager` when a `DISubprogram` doesn't exist for a function. This only reduces unnecessary work in non-debug mode and does not change output, so it's hard to add a test to test this behavior. `cfg-stackify-dbg-skip.ll` test change is necessary because its `DISubprogram` was not correctly linked with the function `foo`, so with this PR the compiler incorrectly assumed function `foo` didn't have a subprogram and the test started to fail. Fixes emscripten-core/emscripten#21048. --- - Further thoughts: This PR, which disables `DebugValueManager` in non-debug mode solves the immediate problem at hand (emscripten-core/emscripten#21048), but this means the example in that issue will still take a long time to compile when `-g` is given. The reason why https://github.com/emscripten-core/emscripten/blob/main/test/printf/test_printf.c (the example in emscripten-core/emscripten#21048) took really long time was that file contained a single BB with more than 10,000 instructions, and `WebAssemblyDebugValueManager`'s constructor contains a loop that's O(n^2): https://github.com/llvm/llvm-project/blob/011ba725070360341f5473e88ebb4c956574805f/llvm/lib/Target/WebAssembly/WebAssemblyDebugValueManager.cpp#L32-L40 It was written this way to salvage as many `DBG_VALUE`s as possible. For example, `MachineInstr::collectDebugValues` only collects `DBG_VALUE`s that immediately follow a def: https://github.com/llvm/llvm-project/blob/011ba725070360341f5473e88ebb4c956574805f/llvm/lib/CodeGen/MachineInstr.cpp#L2315-L2329 So that routine cannot detect something like ```ll %def = ... ... some other instructions ... DBG_VALUE %def ... ``` which occurs rather frequently. Our `WebAssemblyDebugValueManager` constructor tried to capture those non-consecutive `DBG_VALUE`s as well, but as a result, we scan the whole BB. This did not cause a significant slowdown on programs with normal BB sizes, but can be a problem with this kind of program that contains giant BBs with thousands of instructions. Actually `DBG_VALUE`s that refer to `%def` can even be in other BBs, which even us cannot capture. `DBG_INSTR_REF` (https://llvm.org/docs/InstrRefDebugInfo.html) can be a better solution for this but we don't use it (yet). We can probably consider imposing some limits on the number of instructions we can after a def. All this applies to the debug mode compilation, so this is not directly related to what this PR does though.
@llvm/pr-subscribers-debuginfo @llvm/pr-subscribers-backend-webassembly Author: Heejin Ahn (aheejin) ChangesWe previously scanned the whole BB for This only reduces unnecessary work in non-debug mode and does not change output, so it's hard to add a test to test this behavior.
Fixes emscripten-core/emscripten#21048.
The reason why
It was written this way to salvage as many llvm-project/llvm/lib/CodeGen/MachineInstr.cpp Lines 2315 to 2329 in 011ba72
%def = ...
... some other instructions ...
DBG_VALUE %def ... which occurs rather frequently. Our We can probably consider imposing some limits on the number of instructions we can after a def. All this applies to the debug mode compilation, so this is not directly related to what this PR does though. Full diff: https://github.com/llvm/llvm-project/pull/77978.diff 2 Files Affected:
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyDebugValueManager.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyDebugValueManager.cpp
index fd510f85a8a37a..a2a054127d5f65 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyDebugValueManager.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyDebugValueManager.cpp
@@ -17,11 +17,15 @@
#include "WebAssemblyMachineFunctionInfo.h"
#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/IR/DebugInfoMetadata.h"
+#include "llvm/IR/Function.h"
using namespace llvm;
WebAssemblyDebugValueManager::WebAssemblyDebugValueManager(MachineInstr *Def)
: Def(Def) {
+ if (!Def->getMF()->getFunction().getSubprogram())
+ return;
+
// This code differs from MachineInstr::collectDebugValues in that it scans
// the whole BB, not just contiguous DBG_VALUEs, until another definition to
// the same register is encountered.
diff --git a/llvm/test/CodeGen/WebAssembly/cfg-stackify-dbg-skip.ll b/llvm/test/CodeGen/WebAssembly/cfg-stackify-dbg-skip.ll
index a10b9bfdc71af3..5a951d9043df29 100644
--- a/llvm/test/CodeGen/WebAssembly/cfg-stackify-dbg-skip.ll
+++ b/llvm/test/CodeGen/WebAssembly/cfg-stackify-dbg-skip.ll
@@ -15,7 +15,7 @@
target triple = "wasm32-unknown-unknown"
-define void @foo(i64 %arg) {
+define void @foo(i64 %arg) !dbg !37 {
start:
%val = trunc i64 %arg to i32
%cmp = icmp eq i32 %val, 0
@@ -39,7 +39,7 @@ declare void @llvm.dbg.value(metadata, metadata, metadata)
!22 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "&str", file: !6, size: 64, align: 32, elements: !{}, identifier: "111094d970b097647de579f9c509ef08")
!33 = !{i32 2, !"Debug Info Version", i32 3}
!35 = distinct !DILexicalBlock(scope: !37, file: !6, line: 357, column: 8)
-!37 = distinct !DISubprogram(name: "foobar", linkageName: "_fooba", scope: !38, file: !6, line: 353, type: !39, isLocal: true, isDefinition: true, scopeLine: 353, flags: DIFlagPrototyped, isOptimized: true, unit: !0, templateParams: !2, retainedNodes: !42)
+!37 = distinct !DISubprogram(name: "foo", scope: !6, file: !6, line: 353, type: !39, isLocal: true, isDefinition: true, scopeLine: 353, flags: DIFlagPrototyped, isOptimized: true, unit: !0, templateParams: !2, retainedNodes: !42)
!38 = !DINamespace(name: "ptr", scope: null)
!39 = !DISubroutineType(types: !2)
!42 = !{!46}
|
Thanks, this change LGTM and yeah I concur with all of your analysis about DebugValueManager's behavior. I think it does make sense for now to just limit the "lookahead" range for |
Yeah, I think I can run a little test with some sizable binaries on various lookahead limit values to see how debug coverage changes and decide on a number. Will do that as a follow-up. |
…#77978) We previously scanned the whole BB for `DBG_VALUE` instruction even when the program doesn't have debug info, i.e., the function doesn't have a subprogram associated with it, which can make compilation unnecessarily slow. This disables `DebugValueManager` when a `DISubprogram` doesn't exist for a function. This only reduces unnecessary work in non-debug mode and does not change output, so it's hard to add a test to test this behavior. Test changes were necessary because their `DISubprogram`s were not correctly linked with the functions, so with this PR the compiler incorrectly assumed the functions didn't have a subprogram and the tests started to fail. Fixes emscripten-core/emscripten#21048.
We previously scanned the whole BB for
DBG_VALUE
instruction even when the program doesn't have debug info, i.e., the function doesn't have a subprogram associated with it, which can make compilation unnecessarily slow. This disablesDebugValueManager
when aDISubprogram
doesn't exist for a function.This only reduces unnecessary work in non-debug mode and does not change output, so it's hard to add a test to test this behavior.
Test changes were necessary because their
DISubprogram
s were not correctly linked with the functions, so with this PR the compiler incorrectly assumed the functions didn't have a subprogram and the tests started to fail.Fixes emscripten-core/emscripten#21048.
DebugValueManager
in non-debug mode solves the immediate problem at hand(test/printf/test.c take 22 seconds to compile emscripten-core/emscripten#21048), but this means the example in that issue will still take a long time to compile when
-g
is given.The reason why
https://github.com/emscripten-core/emscripten/blob/main/test/printf/test_printf.c (the example in
emscripten-core/emscripten#21048) took really long time was that file contained a single BB with more than 10,000 instructions, and
WebAssemblyDebugValueManager
's constructor contains a loop that's O(n^2):llvm-project/llvm/lib/Target/WebAssembly/WebAssemblyDebugValueManager.cpp
Lines 32 to 40 in 011ba72
It was written this way to salvage as many
DBG_VALUE
s as possible. For example,MachineInstr::collectDebugValues
only collectsDBG_VALUE
s that immediately follow a def:llvm-project/llvm/lib/CodeGen/MachineInstr.cpp
Lines 2315 to 2329 in 011ba72
which occurs rather frequently. Our
WebAssemblyDebugValueManager
constructor tried to capture those non-consecutiveDBG_VALUE
s as well, but as a result, we scan the whole BB. This did not cause a significant slowdown on programs with normal BB sizes, but can be a problem with this kind of program that contains giant BBs with thousands of instructions. ActuallyDBG_VALUE
s that refer to%def
can even be in other BBs, which even we cannot capture.DBG_INSTR_REF
(https://llvm.org/docs/InstrRefDebugInfo.html) can be a better solution for this but we don't use it (yet).We can probably consider imposing some limits on the number of instructions we scan after a def. All this applies to the debug mode compilation, so this is not directly related to what this PR does though.