Add filename and line number to backtrace #3343

Merged
merged 65 commits into from
Aug 25, 2023
65 commits
7b19254
WIP somewhat functional DWARF renderer
keynmol Sep 19, 2022
72b9575
WIP: generate dwarf metadata
keynmol Sep 25, 2022
270d0e8
Try anything at this point
keynmol Sep 25, 2022
33695fd
Quick, commit while it works
keynmol Sep 25, 2022
648e4a0
Formatting
keynmol Sep 25, 2022
147bfd1
Handle unit return types
keynmol Oct 1, 2022
9fd664e
Cache call position by scope as well
keynmol Oct 1, 2022
654835e
Add location metadata to `invoke`
keynmol Oct 1, 2022
8655222
Return sandbox to the way it was
keynmol Oct 9, 2022
c843ba0
Fix compilation errors
keynmol Oct 9, 2022
702eae9
Erase GenIdx from public API, and cleanup
keynmol Oct 9, 2022
d14c74d
WIP
keynmol Feb 16, 2023
8d67657
Fix issues after rebase, rename to LLVMDebugInformation
keynmol Feb 16, 2023
483aeaa
Merge branch 'main' into they-are-taking-DWARFs-to-isengard
keynmol Feb 27, 2023
1e884a1
Hacky resolution of filename for URLs
keynmol Feb 27, 2023
ee966f6
chore: run clangfmt
keynmol Feb 27, 2023
f01161b
chore: scalafmt
keynmol Feb 27, 2023
543ef98
Be even more restrictive about positions
keynmol Feb 27, 2023
8aeb2c6
Back to making file a required parameter
keynmol Feb 27, 2023
deb5e03
Try to satisfy JVM 8
keynmol Feb 27, 2023
9b2c0dc
just a gift that keeps on giving
keynmol Feb 27, 2023
b4482ab
Make LLVM metadata addition configurable and disable it by default
keynmol Feb 27, 2023
d785864
Merge branch 'main' into they-are-taking-DWARFs-to-isengard
keynmol Mar 1, 2023
f55c49b
Code review comments
keynmol Mar 6, 2023
c4dc4b3
Merge branch 'main' into they-are-taking-DWARFs-to-isengard
keynmol May 16, 2023
bd92200
Add mising dbg metadata for invoke calls.
WojciechMazur May 18, 2023
4cf6888
Ensure `dereferenceable_or_null` is always using i64 type as an argum…
WojciechMazur May 18, 2023
43f535d
Copy macho-elf-coff-parser to scala-native
tanishiking Jun 13, 2023
ac88a4c
Make dwarf parser compile
tanishiking Jun 13, 2023
4cedf1a
WIP
tanishiking Jun 15, 2023
7130e59
Test with no-pie
tanishiking Jun 15, 2023
2a91827
Add fileline info for Position Indenpendent executable
tanishiking Jun 19, 2023
4fba5cc
Read separate dSYM file
tanishiking Jun 21, 2023
22d3cf8
Merge branch 'main' into fileline-stacktrace
tanishiking Jul 3, 2023
6bb042e
binary search
tanishiking Jul 6, 2023
6b03563
Add backtrace filename line number test
tanishiking Jul 7, 2023
ba8b21b
Seekable buffer
tanishiking Jul 7, 2023
0a690c6
Don't parse .debug_abbrev again
tanishiking Jul 10, 2023
f31e196
Clean up
tanishiking Jul 10, 2023
79b4589
scalafmt
tanishiking Jul 11, 2023
dc4ab4b
Add some documentation
tanishiking Jul 11, 2023
edd3034
Merge branch 'main' into fileline-stacktrace
tanishiking Aug 1, 2023
0e2c67a
Fix: do not stack overflow when exception occurs in Backtrace.decodeF…
tanishiking Aug 2, 2023
5892f90
Retrieve image offset of myself using `dyld` in OSX
tanishiking Aug 3, 2023
3bcd8ce
Don't test for fileline in Linux
tanishiking Aug 3, 2023
ad42fba
Fix CI
tanishiking Aug 3, 2023
1b85b08
Extend timeout in OptimizerSpec since it started timeout in CI
tanishiking Aug 3, 2023
8b623af
Fix scripted tests backtrace in Windows and Linux
tanishiking Aug 4, 2023
04d7f6a
Run dsymutil only on Mac
tanishiking Aug 4, 2023
38cb60e
Use toString for stacktrace
tanishiking Aug 4, 2023
69ab6a7
Remove unused getpid
tanishiking Aug 4, 2023
35eb304
Don't try to run decodeFileline on Linux and Windows
tanishiking Aug 8, 2023
8223e68
Use try-catch block instead of util.Try
tanishiking Aug 8, 2023
a08556b
Test: print raw stacktraces
tanishiking Aug 8, 2023
ebdf020
try noinline
tanishiking Aug 9, 2023
11d3e46
Use LinktimeInfo + fix for Linux CI
tanishiking Aug 13, 2023
82335e8
Merge branch 'main' into fileline-stacktrace
tanishiking Aug 13, 2023
7d2e3eb
Fix scripted test in Linux by running clang with O0
tanishiking Aug 14, 2023
dda0271
Merge branch 'main' into fileline-stacktrace
tanishiking Aug 22, 2023
51527c4
Integrate dsymutil into scala-native
tanishiking Aug 24, 2023
5b32df9
Guard not to enhance stacktrace if no debug metadata
tanishiking Aug 24, 2023
aa87156
Running dsymutil in postprocess
tanishiking Aug 25, 2023
7c41c00
Remove unused method
tanishiking Aug 25, 2023
042abed
Don't fail if dsymutil didn't work
tanishiking Aug 25, 2023
c5ccc3e
Remove unnecessary debugBuild task from test
tanishiking Aug 25, 2023
49 changes: 45 additions & 4 deletions javalib/src/main/scala/java/lang/Throwables.scala
@@ -7,6 +7,7 @@ import scalanative.runtime.unwind
import scala.scalanative.meta.LinktimeInfo
// TODO: Replace with j.u.c.ConcurrentHashMap when implemented to remove scalalib dependency
import scala.collection.concurrent.TrieMap
import scala.scalanative.runtime.Backtrace

private[lang] object StackTrace {
private val cache = TrieMap.empty[CUnsignedLong, StackTraceElement]
@@ -39,23 +40,63 @@ private[lang] object StackTrace {
cache.getOrElseUpdate(ip, makeStackTraceElement(cursor))

@noinline private[lang] def currentStackTrace(): Array[StackTraceElement] = {
var buffer = mutable.ArrayBuffer.empty[StackTraceElement]
var buffer = mutable.ArrayBuffer.empty[(StackTraceElement, CUnsignedLong)]
if (!LinktimeInfo.asanEnabled) {
Zone { implicit z =>
val cursor = alloc[scala.Byte](unwind.sizeOfCursor)
val context = alloc[scala.Byte](unwind.sizeOfContext)
val ip = stackalloc[CSize]()

unwind.get_context(context)
unwind.init_local(cursor, context)
while (unwind.step(cursor) > 0) {
unwind.get_reg(cursor, unwind.UNW_REG_IP, ip)
buffer += cachedStackTraceElement(cursor, !ip)
val elem = (cachedStackTraceElement(cursor, !ip), !ip)
buffer += elem
}
}
}

buffer.toArray
if (LinktimeInfo.isMac && LinktimeInfo.hasDebugMetadata) {
// Add filename and line number information.
// If the entry "currentStackTrace" appears more than once in the stack trace, a Throwable was created
// inside "currentStackTrace" itself, i.e. the method is being called recursively. Left unchecked, this recursion
// can continue until the stack overflows.
//
// To prevent excessive recursion, nested calls to "currentStackTrace" should do as little work as possible.
//
// "currentStackTrace" itself is straightforward and typically doesn't throw.
// "Backtrace.decodeFileline", on the other hand, is intricate, and if an exception is thrown, it most likely originates there.
//
// Consequently, to mitigate the risk of cascading recursive exceptions,
// skip "Backtrace.decodeFileline" if "currentStackTrace" is already being invoked recursively.
val recur =
buffer.count(e => e._1.getMethodName == "currentStackTrace") > 1
buffer.map { e =>
val elem = e._1
val ip = e._2
if (recur || // Skip decoding if currentStackTrace is being called recursively
elem.getFileName != null // Skip decoding if we already have filename information
) elem
else {
try {
Backtrace.decodeFileline(ip.toLong) match {
case None =>
elem
case Some(v) =>
val updated = new StackTraceElement(
elem.getClassName,
elem.getMethodName,
v._1,
v._2
)
// Update cache with the updated stacktrace element
cache.update(ip, updated)
updated
}
} catch { case ex: Throwable => elem }
}
}.toArray
} else buffer.map(_._1).toArray
}
}
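
The recursion guard described in the comments above can be sketched in isolation. This is a minimal illustration that runs on plain Scala, not the code from the PR: `RecursionGuardSketch`, `isReentrant` and `frame` are hypothetical names, and the frames are built by hand rather than collected through libunwind.

```scala
object RecursionGuardSketch {
  // Hypothetical helper (not part of the PR): true when `methodName`
  // appears more than once among the captured frames, which the PR
  // treats as a sign that the capture routine re-entered itself.
  def isReentrant(
      frames: Seq[StackTraceElement],
      methodName: String
  ): Boolean =
    frames.count(_.getMethodName == methodName) > 1

  // Hypothetical helper: a frame without file information
  // (fileName = null, lineNumber = -1).
  def frame(method: String): StackTraceElement =
    new StackTraceElement("java.lang.StackTrace$", method, null, -1)

  def main(args: Array[String]): Unit = {
    val normal =
      Seq(frame("currentStackTrace"), frame("fillInStackTrace"), frame("main"))
    // Simulates an exception raised while decoding, which captures a new trace.
    val nested =
      Seq(frame("currentStackTrace"), frame("decodeFileline")) ++ normal
    println(isReentrant(normal, "currentStackTrace"))
    println(isReentrant(nested, "currentStackTrace"))
  }
}
```

When the check is true, the decoding step is skipped and the frames are returned without file and line information, so a failure inside the decoder cannot trigger another round of decoding.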

29 changes: 29 additions & 0 deletions nativelib/src/main/resources/scala-native/vmoffset.c
@@ -0,0 +1,29 @@
#if (defined(__APPLE__) && defined(__MACH__))
#include <mach-o/dyld.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>

// see:
// https://stackoverflow.com/questions/10301542/getting-process-base-address-in-mac-osx
// https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man3/dyld.3.html
intptr_t scalanative_get_vmoffset() {
char path[1024];
uint32_t size = sizeof(path);
if (_NSGetExecutablePath(path, &size) != 0)
return -1;
for (uint32_t i = 0; i < _dyld_image_count(); i++) {
if (strcmp(_dyld_get_image_name(i), path) == 0)
return _dyld_get_image_vmaddr_slide(i);
}
return 0;
}

#else

// Should be unused: on Linux, at least, the vmoffset can be read from /proc/<pid>/maps.
// Not sure about Windows.
#include <stdint.h>
intptr_t scalanative_get_vmoffset() { return 0; }

#endif
@@ -6,6 +6,9 @@ import scala.scalanative.unsafe._
* discard some parts of NIR instructions when linking
*/
object LinktimeInfo {
@resolvedAtLinktime("scala.scalanative.meta.linktimeinfo.hasDebugMetadata")
def hasDebugMetadata: Boolean = resolved

@resolvedAtLinktime("scala.scalanative.meta.linktimeinfo.debugMode")
def debugMode: Boolean = resolved

208 changes: 208 additions & 0 deletions nativelib/src/main/scala/scala/scalanative/runtime/Backtrace.scala
@@ -0,0 +1,208 @@
package scala.scalanative.runtime

import scala.scalanative.runtime.dwarf.BinaryFile
import scala.scalanative.runtime.dwarf.MachO
import scala.scalanative.runtime.dwarf.DWARF
import scala.scalanative.runtime.dwarf.DWARF.DIE
import scala.scalanative.runtime.dwarf.DWARF.CompileUnit

import scala.scalanative.unsafe.CSize
import scala.scalanative.unsafe.Tag
import scala.scalanative.unsafe.Zone
import scala.scalanative.unsigned.UInt
import scalanative.unsigned._

import scala.collection.mutable
import scala.collection.concurrent.TrieMap
import scala.annotation.tailrec

import java.io.File

object Backtrace {
private sealed trait Format
private case object MACHO extends Format
private case object ELF extends Format
private case class DwarfInfo(
subprograms: IndexedSeq[SubprogramDIE],
strings: DWARF.Strings,
/** ASLR offset (minus __PAGEZERO size for macho) */
offset: Long,
format: Format
)

private case class SubprogramDIE(
lowPC: Long,
highPC: Long,
line: Int,
filenameAt: Option[UInt]
)

private val MACHO_MAGIC = "cffaedfe"
private val ELF_MAGIC = "7f454c46"

private val cache = TrieMap.empty[String, Option[DwarfInfo]]

def decodeFileline(pc: Long): Option[(String, Int)] = {
cache.get(filename) match {
case Some(None) =>
None // cached, there's no debug section
Comment on lines +47 to +48
Member Author

Cache the debug information so that we don't need to parse the executable more than once.
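
The caching idea in this comment (remember both successful parses and the absence of debug info) can be sketched on its own. This is an illustration under assumptions, not the PR's code: `NegativeCacheSketch`, `parse` and `parseCalls` are hypothetical, and the "parser" only inspects the file name.

```scala
import scala.collection.concurrent.TrieMap

object NegativeCacheSketch {
  // Counts how often the (hypothetical) expensive parser actually runs.
  var parseCalls = 0

  // Hypothetical stand-in for parsing an executable: only paths ending
  // in ".debug" are treated as containing debug information.
  def parse(path: String): Option[Int] = {
    parseCalls += 1
    if (path.endsWith(".debug")) Some(path.length) else None
  }

  // Outer Option (from `get`): was this path looked up before?
  // Inner Option (the stored value): did the lookup find debug info?
  private val cache = TrieMap.empty[String, Option[Int]]

  def lookup(path: String): Option[Int] =
    cache.get(path) match {
      case Some(cached) => cached // hit: Some(info) or a remembered None
      case None =>
        val result = parse(path)
        cache.put(path, result) // store failures too
        result
    }
}
```

Storing `None` is what keeps a binary without debug sections from being re-parsed for every frame of every stack trace.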

case Some(Some(info)) =>
impl(pc, info)
case None =>
processFile(filename, None) match {
case None =>
// there's no debug section; cache that result so we don't parse the executable again
cache.put(filename, None)
None
case file @ Some(info) =>
cache.put(filename, file)
impl(pc, info)
}
}
}

private def impl(
pc: Long,
info: DwarfInfo
): Option[(String, Int)] = {
// The addresses in the debug information (DW_AT_low_pc / DW_AT_high_pc) are file offsets (the offset in the executable + __PAGEZERO in Mach-O).
// The pc retrieved from libunwind at runtime, on the other hand, is a location in virtual memory,
// shifted by a random offset (called the ASLR offset or slide) that differs on every run because of
// Address Space Layout Randomization (ASLR) when the executable is built as PIE.
// Subtract the offset from the runtime pc (libunwind) so that it matches the addresses in the debug info (compile/link time).
val address = pc - info.offset
for {
subprogram <- search(info.subprograms, address)
at <- subprogram.filenameAt
} yield {
val filename = info.strings.read(at)
(filename, subprogram.line + 1) // line number in DWARF is 0-based
}
}

private def search(
dies: IndexedSeq[SubprogramDIE],
address: Long
): Option[SubprogramDIE] = {
val length = dies.length
@tailrec
def binarySearch(from: Int, to: Int): Option[SubprogramDIE] = {
if (from < 0) binarySearch(0, to)
else if (to > length) binarySearch(from, length)
else if (to <= from) None
else {
val idx = from + (to - from - 1) / 2
val die = dies(idx)
if (die.lowPC <= address && address <= die.highPC) Some(die)
else if (address < die.lowPC) binarySearch(from, idx)
else // die.highPC < address
binarySearch(idx + 1, to)
}
}
binarySearch(0, length)
}

private def processMacho(
macho: MachO
)(implicit bf: BinaryFile): Option[(Vector[DIE], DWARF.Strings)] = {
val sections = macho.segments.flatMap(_.sections)
for {
debug_info <- sections.find(_.sectname == "__debug_info")
debug_abbrev <- sections.find(_.sectname == "__debug_abbrev")
debug_str <- sections.find(_.sectname == "__debug_str")
debug_line <- sections.find(_.sectname == "__debug_line")
} yield {
readDWARF(
debug_info = DWARF.Section(debug_info.offset, debug_info.size),
debug_abbrev = DWARF.Section(debug_abbrev.offset, debug_abbrev.size),
debug_str = DWARF.Section(debug_str.offset, debug_str.size)
)
}
}

private def filterSubprograms(dies: Vector[CompileUnit]) = {
var filenameAt: Option[UInt] = None
dies
.flatMap { die =>
if (die.is(DWARF.Tag.DW_TAG_subprogram)) {
for {
line <- die.getLine
low <- die.getLowPC
high <- die.getHighPC(low)
} yield SubprogramDIE(low, high, line, filenameAt)
} else if (die.is(DWARF.Tag.DW_TAG_compile_unit)) {
// Debug Information Entries (DIEs) in DWARF form a tree, and
// the DIEs that follow a Compile Unit DIE belong to that compile unit (a file in Scala).
// TODO: Parse the `.debug_line` section, and decode the filename using
// the `DW_AT_decl_file` attribute of the `subprogram` DIE.
filenameAt = die.getName
None
} else None
}
.sortBy(_.lowPC)
.toIndexedSeq
}

private def processFile(
Member Author

This is the entry point for parsing the executable; it returns a DwarfInfo instance.
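
The first step of this entry point, telling the container format from the first four bytes, can be sketched without the PR's `BinaryFile` class. The names below (`MagicSketch`, `detect`, the `Format` cases) are hypothetical; the hex strings are the ones used in the diff. The sketch assumes the bytes are read in file order as a big-endian `Int`, which is why a little-endian 64-bit Mach-O header (`0xfeedfacf`) shows up as `cffaedfe`.

```scala
import java.nio.ByteBuffer

object MagicSketch {
  sealed trait Format
  case object MachO64 extends Format
  case object Elf extends Format
  case object Unknown extends Format

  // Classify a file by its first four bytes.
  def detect(header: Array[Byte]): Format =
    if (header.length < 4) Unknown
    else {
      // ByteBuffer is big-endian by default, so the hex string lists the
      // bytes in the order they appear in the file.
      val magic = Integer.toHexString(ByteBuffer.wrap(header, 0, 4).getInt)
      magic match {
        case "cffaedfe" => MachO64 // 0xfeedfacf stored little-endian
        case "7f454c46" => Elf // 0x7f 'E' 'L' 'F'
        case _          => Unknown // e.g. COFF, which has several magic numbers
      }
    }
}
```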

filename: String,
matchUUID: Option[List[UInt]]
): Option[DwarfInfo] = {
implicit val bf: BinaryFile = new BinaryFile(new File(filename))
val head = bf.position()
val magic = bf.readInt().toUInt.toHexString
bf.seek(head)
if (magic == MACHO_MAGIC) {
val macho = MachO.parse(bf)
val dwarfOpt: Option[(Vector[DIE], DWARF.Strings)] =
processMacho(macho).orElse {
val basename = new File(filename).getName()
// `dsymutil foo` assembles the debug information into `foo.dSYM/Contents/Resources/DWARF/foo`.
// Couldn't find an official source for this, but at least libbacktrace locates the dSYM file at this location.
// https://github.com/ianlancetaylor/libbacktrace/blob/cdb64b688dda93bbbacbc2b1ccf50ce9329d4748/macho.c#L908
val dSymPath =
s"$filename.dSYM/Contents/Resources/DWARF/${basename}"
if (new File(dSymPath).exists()) {
val dSYMBin: BinaryFile = new BinaryFile(
new File(dSymPath)
)
val dSYMMacho = MachO.parse(dSYMBin)
if (dSYMMacho.uuid == macho.uuid) // Validate the macho in dSYM has the same build uuid.
processMacho(dSYMMacho)(dSYMBin)
else None
} else None
}

for {
dwarf <- dwarfOpt
dies = dwarf._1.flatMap(_.units)
subprograms = filterSubprograms(dies)
offset = vmoffset.get_vmoffset()
} yield {
DwarfInfo(
subprograms = subprograms,
strings = dwarf._2,
offset = offset,
format = MACHO
)
}
} else if (magic == ELF_MAGIC) {
None
} else { // COFF has various magic numbers
None
}

}
def readDWARF(
debug_info: DWARF.Section,
debug_abbrev: DWARF.Section,
debug_str: DWARF.Section
)(implicit bf: BinaryFile) = {
DWARF.parse(
debug_info = DWARF.Section(debug_info.offset, debug_info.size),
debug_abbrev = DWARF.Section(debug_abbrev.offset, debug_abbrev.size)
) ->
DWARF.Strings.parse(
DWARF.Section(debug_str.offset, debug_str.size)
)
}
}
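
The lookup performed by `impl` and `search` (subtract the slide from the runtime pc, then binary-search the sorted subprogram ranges) can be sketched as a self-contained example. This is an illustration under assumptions, not the PR's code: `RangeLookupSketch` and `Range` are hypothetical, the ranges are assumed to be sorted by `lowPC` and non-overlapping, and `highPC` is treated as inclusive only because the diff compares it that way.

```scala
import scala.annotation.tailrec

object RangeLookupSketch {
  // Hypothetical simplified record; the PR's SubprogramDIE also carries
  // a reference to the file name.
  final case class Range(lowPC: Long, highPC: Long, line: Int)

  // Binary search over ranges sorted by lowPC (assumed non-overlapping).
  // `from` is inclusive, `until` is exclusive.
  def search(ranges: IndexedSeq[Range], address: Long): Option[Range] = {
    @tailrec
    def loop(from: Int, until: Int): Option[Range] =
      if (from >= until) None
      else {
        val mid = from + (until - from) / 2
        val r = ranges(mid)
        if (address < r.lowPC) loop(from, mid)
        else if (address > r.highPC) loop(mid + 1, until)
        else Some(r)
      }
    loop(0, ranges.length)
  }

  // Runtime pc -> link-time address (pc minus slide) -> matching range.
  def lookup(ranges: IndexedSeq[Range], pc: Long, slide: Long): Option[Range] =
    search(ranges, pc - slide)
}
```

For example, with a slide of `0x5000`, a runtime pc of `0x6150` maps to the link-time address `0x1150`, which is then matched against the ranges recorded in the debug information.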