
Knowledge Dump Debugging Epoch Programs

Mike Lewis edited this page Jul 18, 2016 · 5 revisions

Epoch Knowledge Dump Series

Debugging Epoch Programs Using Native Windows Tools

One of the key usability traits of a new programming language is the debugging experience. Without a solid set of debugging tools, any novel language faces a serious uphill battle for adoption. Since its inception, Epoch has been a pragmatic language first and foremost; if it doesn't help get work done, it isn't doing its job. Debugging is no exception (pardon the pun), so we want a first-class debugging experience ready for future Epoch programmers.

One option for great debuggability of new languages is to build the debug tools by hand. In fact, this was largely the plan for Epoch for a long time. We need to emit comprehensive metadata about the code anyway for garbage collection purposes, so why not hitchhike on that data and deliver a custom debugger?

The problem, of course, is that building a world-class debugger is a monumental undertaking, and the result is not even guaranteed to hit the bar for quality. More specifically, a home-grown debugger is likely to offer a very different UX than the tools developers already know. So the gold standard is to integrate cleanly with existing tools.

Since Epoch is primarily targeting Windows (for now!), the ideal debugging experience is one that works seamlessly with tools like Visual Studio and WinDbg. Moreover, it means adopting the PDB debug file format so that things like DbgHelp.dll can generate stack traces, minidumps, and so forth.

It doesn't take much research into the PDB format to discover that very little is actually publicly known about how these files work. There are a tiny number of projects that have interfaced successfully with PDB files, most notably cv2pdb. The strategy used by this tool is to talk directly to MSPDB140.dll (or a similarly named file depending on local Visual Studio version) and use its APIs to build up and emit a PDB.
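As a rough illustration of the "similarly named file" problem, the sketch below tries a list of candidate DLL names and returns the first one that loads. This is not the Epoch compiler's code or cv2pdb's code. The candidate list and the `load_first_available` helper are assumptions made for the example; the only name confirmed above is MSPDB140.dll. The loader is injectable so the lookup logic can be exercised on any platform.

```python
import ctypes

# Candidate DLL names, newest first. This list is an assumption for
# illustration; the text above only names MSPDB140.dll and notes that the
# file name depends on the installed Visual Studio version.
CANDIDATE_DLLS = [
    "mspdb140.dll",
    "mspdb120.dll",
    "mspdb110.dll",
    "mspdb100.dll",
    "mspdb80.dll",
]


def load_first_available(names, loader=None):
    """Return (name, handle) for the first DLL that loads, else (None, None).

    `loader` is any callable that takes a DLL name and either returns a
    handle or raises OSError. It defaults to ctypes.WinDLL when available.
    """
    if loader is None:
        # ctypes.WinDLL only exists on Windows; elsewhere nothing can load.
        loader = getattr(ctypes, "WinDLL", None)
        if loader is None:
            return None, None
    for name in names:
        try:
            return name, loader(name)
        except OSError:
            # This version is not installed or not on the search path.
            continue
    return None, None
```

In practice the DLL usually lives inside the Visual Studio installation directory rather than on the default search path, so a real tool would likely need to locate that directory first and pass full paths to the loader.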

We've adopted this approach for the Epoch 64-bit compiler. After much tweaking and exploring, the compiler is able to emit debug symbols that partially work with popular Windows debugging tools. More specifically:

  • DBH.exe can dump the function names and source mapping correctly from our generated PDB
  • Visual Studio 2015 correctly generates callstacks with function names
  • Visual Studio 2015 does not correctly use source-file mapping data, so it can't show source code
  • WinDbg correctly generates callstacks but function names are all empty strings
  • WinDbg can show source files from a given instruction in the disassembly
  • DbgHelp generates correct callstacks but again with empty-string names for functions
  • DbgHelp does not correctly follow the source mapping data

There are several notable holes in the current PDB generation code:

  • Type metadata is not emitted yet; this may be a large part of why the experience is inconsistent
  • PDB data is generated using the AddSymbols API of MSPDB140.dll fed with CodeView data generated by LLVM; it is entirely possible that this data is not converting cleanly
  • Many aspects of the PDB API are still mysterious and make it difficult to know what's missing
  • The raw debug data being generated by LLVM is in some cases bogus, because we currently feed it placeholder data as a shortcut

Ultimately, we're very close to having a usable debug experience on Windows. The last bits of work will focus on compiler front-end improvements to generate better debug data (such as accurate line and column information for source mappings). Right now the main hurdle is getting the PDB generation nailed down, but it feels like it's within reach.
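To make "accurate line and column information" concrete, here is a minimal sketch of one common way a front-end can turn a character offset in a source file into a 1-based line and column pair. It is an assumption about how such a mapping could be computed, not a description of the Epoch compiler; `build_line_starts` and `offset_to_line_col` are hypothetical helper names.

```python
import bisect


def build_line_starts(source):
    """Return the offsets at which each line begins (line 1 starts at 0)."""
    starts = [0]
    for index, char in enumerate(source):
        if char == "\n":
            starts.append(index + 1)
    return starts


def offset_to_line_col(line_starts, offset):
    """Map a 0-based character offset to a 1-based (line, column) pair."""
    # Find the last line whose starting offset is <= the given offset.
    line_index = bisect.bisect_right(line_starts, offset) - 1
    column = offset - line_starts[line_index] + 1
    return line_index + 1, column


# Example with a generic two-line string (not real Epoch syntax).
text = "a\nbc\n"
starts = build_line_starts(text)
print(offset_to_line_col(starts, 3))  # offset 3 is the "c" on line 2
```

Building the table of line starts once per file keeps each lookup to a binary search, which matters when every token or instruction needs a source position attached.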