New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add filename and line number to backtrace #3343
Add filename and line number to backtrace #3343
Conversation
…ent (fixes linux-x86 builds)
https://github.com/indoorvivants/macho-elf-coff-parser Co-authored-by: Anton Sviridov <keynmol@gmail.com>
One last thing to fix is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job on this one! So far looks good, in long term it would be great to lower ammount of class allocations via tuples, Options or Try, mostly becouse it's very low level part of runtime and we should make sure not the impose to much overhead. However, these changes could wait until it would works correctly.
val maybeFileline = | ||
if (recur) None | ||
else | ||
Try(Backtrace.decodeFileline(ip.toLong)) match { | ||
// Ignore the exception, should we expose the internal error somehow? | ||
case Failure(exception) => None | ||
case Success(value) => value | ||
} | ||
|
||
val updated = | ||
maybeFileline match { | ||
case None => elem | ||
case Some(v) => | ||
new StackTraceElement( | ||
elem.getClassName, | ||
elem.getMethodName, | ||
v._1, | ||
v._2 | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Becouse it is really low level part of the code, it might be better to use try-catch
instead of util.Try
, and if possible to get rid of Option
which when not optimized out would give us additional instantisation of Option per each element of buffer.
Probably something like following should be fine as well:
if(recur) elem
else try {
val v = Backtrace.decodeFileline(ip.toLong)
new StackTraceElement(..,.., v._1, v._2)
} catch {case ex: ??? => elem }
val maybeFileline = | ||
if (recur) None | ||
else | ||
Try(Backtrace.decodeFileline(ip.toLong)) match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder can it decoding filename be a part of makeStackTraceElement
. This which is effectively cached in cachedStackTraceElemen
. This way we would need to decode file name only once per IP. In such case we also would not be required to allocate Tuple of (StackTraceElement, CULong)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally, we used to call Backtrace.decodeFileline from within the makeStackTraceElement function. However, I intentionally moved this functionality out to a different location, as can be seen in this commit: 0e2c67a
This change was made because we were checking for recursive calls to currentStackTrace
by examining the stack trace and counting the occurrences of currentStackTrace
entries.
I thought we can call it from within makeStackTraceElement
in other way (which I found doesn't work)
The idea was that the first call to currentStackTrace
would record the stack pointer's address at that moment. Subsequent calls to currentStackTrace
would then compare their stack pointer addresses to the initial one. If the address was lower in the subsequent call, it would indicate a recursive invocation, and thus could be treated as such.
However, I realized that this approach doesn't always work as expected, particularly if the exception object is generated multiple times within the application.
In order to tell we have recursive call to currentStackTrace
, we have to know the whole stacktrace.
) | ||
} | ||
// Update cache with the updated stacktrace element | ||
cache.update(ip, updated) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably should check if value in cache already has a filename set. currently we set filename even if StackTraceElem already contains it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, now we check the stacktrace already has filename before decoding their filename and line number by
if (...
elem.getFileName != null // Skip decoding if we already have filename information
) elem
in Throwables.scala, currentStackTrace()
// | ||
// Consequently, to mitigate the risk of cascading recursive exceptions, | ||
// skip executing "BackTrace.decodeFileline" if "currentStackTrace" is already being invoked recursively. | ||
val recur = buffer.count(e => e._1.getMethodName == "currentStackTrace") > 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At some point to would be good to eliminate this check, becouse it would happen each time we create stack trace.
Based on the information where the exception in decodeFileline
is thrown we should be able to guard it with runtime checks and not allow to throw the exception in the first place.
An alternative would be also to use ThreadLocal boolean set at the beginning of currentStacktrace
This way if we renter this method recursively we would know that the current thread throwed the exception.
Since it is really low level part of the code, and it might be better to use `try-catch`? Also, skip decoding if the elem already has an filename
The rest thing to do is to work on #3343 (comment) by using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. When testing this out I've made some changes in my local branch which could be potentially integrated into this PR: WojciechMazur@1cdb0c7
1 remaining thing would be the automatic invoke fo dsymutils
, I've added a bigger comment below
} | ||
} | ||
} | ||
|
||
buffer.toArray | ||
if (LinktimeInfo.isMac) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add additional LinktimeInfo.hasDebugMetadata
which would be set by the NativeConfig value and check it here as well.. This way we would not risk the penalty of trying to decode positions which would not be existing in the output file anyway.
.getProperty("os.name") | ||
.toLowerCase(Locale.ROOT) | ||
.startsWith("mac")) { | ||
Process(s"dsymutil ${path.getAbsolutePath()}") ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This step is necessary to make the positions available. Can we try to integrate it with the main workflow? We can try to run it, and then if it's not present we should log a warning for the users that dsumutil and missing and stacktraces would not contain positions, and potentially if executiono fit fails we should give another warning.
Streaminline this into the main workflow would actually make this feature the most usefull in tests where frequent exceptions would guide us in resolving bugs and slower startup is not an issue.
dsymutil
is part of LLVM toolchain so we can try to discover it similary to llvm-ar
here: https://github.com/WojciechMazur/scala-native/blob/1cdb0c71d9dcd1b527d70d0d36a0ce81855977af/tools/src/main/scala/scala/scalanative/build/LLVM.scala#L232
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This step is necessary to make the positions available. Can we try to integrate it with the main workflow? ...
dsymutil is part of LLVM toolchain so we can try to discover it similary to llvm-ar here:
Ah, that sounds better! 👍 Actually clang
also integrate dsymutil
into its workflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just as a note, Apple toolchain does supply dsymutil
but not llvm-ar
but I understand ar
will work. I am thinking the archive won't work on Apple toolchain with our current setup.
if (result != 0) { | ||
throw new BuildException(s"Failed to run dsymutil ${path}") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the dsymutil
fails it should rather be a warning saying that we failed to postprocess binary and the stack traces would not work correctly.
@@ -150,6 +150,16 @@ private[scalanative] object LLVM { | |||
copyOutput(config, buildPath) | |||
} | |||
|
|||
def dsymutil(config: Config, path: Path) = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a public method so explicit result type would be recommended
@@ -150,6 +150,16 @@ private[scalanative] object LLVM { | |||
copyOutput(config, buildPath) | |||
} | |||
|
|||
def dsymutil(config: Config, path: Path) = { | |||
val dsymutil = Discover.discover("dsymutil", "LLVM_BIN") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently discover
is always failing if it cannot find given binary. Maybe let's introduce an additional variant of it def tryDiscover(String, String): Option[Path]
. If we won't find it should not be a fatal error, but warning instead. It would be great if in case when we don't find dsymutil we point to the user that we couldn't find this program, possibly show the paths we tried, and point him up that he can use the LLVM_BIN
env to try fix this issue.
.withMode(Mode.debug) // otherwise, clang O2 inlines the call stack in Linux | ||
} | ||
|
||
lazy val debugBuild = taskKey[Unit]("Compile and run dsymutil if exists") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be required anymore
.map { p => | ||
val linked = backend.link(p) | ||
if (Platform.isMac && config.compilerConfig.debugMetadata) { | ||
backend.dsymutil(linked) | ||
} | ||
linked | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe let's create an additional method backend.postprocess
instead which would call the dsymutil and possibly other non-crucial post processing operations.
private def prepareDSymUtilCommand() = { | ||
val dsymutil = Discover.discover("dsymutil", "LLVM_BIN") | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused method?
@@ -172,6 +178,12 @@ object Build { | |||
) { | |||
LLVM.link(config, linkerResult, compiled) | |||
} | |||
|
|||
def dsymutil(linked: Path): Unit = time( | |||
s"Running dsymutil on ${linked.getFileName}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not require users to now what dsymutil is and what it does. Maybe let's just introduce a single logging of time spent in whole postprocessing stage (based on suggestion to move all postprocessing to seperate step). At that point can remove this method and use LLVM.dsymutil
directly in postprocessing stage.
Based on #2869
TL;DR
This PR adds filename and line number to scala-native backtrace using DWARF debug information.
Read/parse myself executable at runtime (when
fillInStacktrace
is called), and extract debug information and add the filename and line number information to backtrace. Also, the runtime will read/proc/self/maps
or call_dyld_get_image_vmaddr_slide to calculate the ASLR offset.vmmap <pid>
_dyld_get_image_vmaddr_slide
)/proc/self/maps
)Background
Current
scala-native
backtrace doesn't have filename and line number information.For example:
scala-native will emit the following backtrace, that have
Unknown Source
instead of filename and line number.scala-native construct backtrace using (llvm) libunwind around here and it gives us the (mangled) symbol name and the call address, but nothing beyond.
However, thanks to @keynmol's recent effort #2869 scala-native will be able to generate DWARF debug information. And they (filename and line number) are in the DWARF information of the binary, and given the call address we can extract the exact call location.
Also, he developed DWARF parser (+ macho parser) in Scala, which can be re-used in scala-native implementation https://github.com/indoorvivants/macho-elf-coff-parser/ 🎉
Algorithm overview
fillInStackTrace
llvm-dwarfdump
AFAIK, and it's only one time. However, still it hurts performance if the program contains a Throwable object 🤔_dyld_get_image_vmaddr_slide (for OSX, for Linux we will need to checkvmmap
command against for myself PID/proc/<pid>/maps
).unwind.get_reg(cursor, unwind.UNW_REG_IP, ip)
and adjusted for ASLR offset)What this PR does
This PR
argv[0]
) and read/parse DWARF information_dyld_get_image_vmaddr_slide to calculate ASLR offsetvmmap
(in OSX) //proc/self/maps
StacktraceElement
As a result, scala-native's backtrace (for the above Scala snippet) looks like this:
References