Implement DWARF parser for better backtraces #58

Closed
mathetake opened this issue Nov 23, 2021 · 6 comments · Fixed by #881

@mathetake
Member

mathetake commented Nov 23, 2021

Background

LLVM-based compilers for Wasm, for example C/C++, Rust, Zig, and TinyGo (virtually all of the viable languages),
emit DWARF information into .debug_* custom sections. The following are the sections contained in a TinyGo binary:

$ wasm-objdump main.wasm -h

main.go.wasm:	file format wasm 0x1

Sections:

     Type start=0x0000000b end=0x00000158 (size=0x0000014d) count: 42
   Import start=0x0000015b end=0x000003df (size=0x00000284) count: 18
 Function start=0x000003e2 end=0x000004e8 (size=0x00000106) count: 260
    Table start=0x000004ea end=0x000004ef (size=0x00000005) count: 1
   Memory start=0x000004f1 end=0x000004f4 (size=0x00000003) count: 1
   Global start=0x000004f6 end=0x000004fe (size=0x00000008) count: 1
   Export start=0x00000501 end=0x000007ac (size=0x000002ab) count: 31
     Code start=0x000007b0 end=0x0001b258 (size=0x0001aaa8) count: 260
     Data start=0x0001b25c end=0x00020862 (size=0x00005606) count: 2
   Custom start=0x00020866 end=0x00034e9a (size=0x00014634) ".debug_info"
   Custom start=0x00034e9d end=0x00035f66 (size=0x000010c9) ".debug_pubtypes"
   Custom start=0x00035f6a end=0x000431a7 (size=0x0000d23d) ".debug_loc"
   Custom start=0x000431aa end=0x00044f60 (size=0x00001db6) ".debug_ranges"
   Custom start=0x00044f62 end=0x00044fa1 (size=0x0000003f) ".debug_aranges"
   Custom start=0x00044fa4 end=0x00046ef6 (size=0x00001f52) ".debug_abbrev"
   Custom start=0x00046efa end=0x00059503 (size=0x00012609) ".debug_line"
   Custom start=0x00059507 end=0x0006510b (size=0x0000bc04) ".debug_str"
   Custom start=0x0006510f end=0x0006bf6b (size=0x00006e5c) ".debug_pubnames"
   Custom start=0x0006bf6e end=0x0006e6e8 (size=0x0000277a) "name"
   Custom start=0x0006e6eb end=0x0006e778 (size=0x0000008d) "producers"

By reading the debug sections, we can associate each Wasm instruction in a function with the specific line of the source code that the binary was compiled from.

Why?

Some of the de facto standard Wasm tools already support the DWARF format. For example, Google Chrome[3] allows users to debug Wasm programs in the browser. Another example is Wasmtime: when you run the panic example in this repo with WASMTIME_BACKTRACE_DETAILS=1, you can see the backtrace with source code information:

$ WASMTIME_BACKTRACE_DETAILS=1 wasmtime run examples/wasm/trap.wasm --invoke cause_panic
panic: causing panic!!!!!!!!!!
Error: failed to run main module `examples/wasm/trap.wasm`

Caused by:
    0: failed to invoke `cause_panic`
    1: wasm trap: unreachable
       wasm backtrace:
           0:  0x92a - runtime.abort
                           at /usr/local/lib/tinygo/src/runtime/runtime_tinygowasm.go:63:6
                     - runtime._panic
                           at /usr/local/lib/tinygo/src/runtime/panic.go:13:7
           1:  0x9ba - main.three
                           at /home/mathetake/gasm/examples/wasm/trap.go:19:7
           2:  0x9b0 - main.two
                           at /home/mathetake/gasm/examples/wasm/trap.go:15:7
           3:  0x9a6 - main.one
                           at /home/mathetake/gasm/examples/wasm/trap.go:11:5
           4:  0x99c - cause_panic
                           at /home/mathetake/gasm/examples/wasm/trap.go:7:5

On the other hand, at the time of writing, our backtrace does not use DWARF; it just parses the "name" custom section and attaches a function name to each frame:

panic: causing panic!!!!!!!!!!
wasm runtime error: unreachable
wasm backtrace:
	0: runtime._panic
	1: main.three
	2: main.two
	3: main.one
	4: cause_panic

This will be much more useful when users run non-TinyGo Wasm binaries: function names are usually mangled by compilers (luckily TinyGo does not mangle them), so they are basically not human-readable. For example, a Rust binary's backtrace built from the "name" custom section would look like this:

  0:  0x42deb - __rust_start_panic
  1:  0x42c0c - rust_panic
  2:  0x42882 - _ZN3std9panicking20rust_panic_with_hook17h072472ae3822b936E
  3:  0x32914 - _ZN3std9panicking11begin_panic28_$u7b$$u7b$closure$u7d$$u7d$17hed88036b12f483dfE
  4:  0x34891 - _ZN3std10sys_common9backtrace26__rust_end_short_backtrace17h9133fcc3e85035deE
  5:  0x32810 - _ZN3std9panicking11begin_panic17he6f6e918174263cfE
  6:  0x39eb - _ZN77_$LT$http_headers..HttpHeaders$u20$as$u20$proxy_wasm..traits..HttpContext$GT$6on_log17hde90e85ea16e616eE
  7:  0x2ae53 - _ZN10proxy_wasm10dispatcher10Dispatcher6on_log17hc6cd4fb35c538b86E
  8:  0x2d3dd - _ZN10proxy_wasm10dispatcher12proxy_on_log28_$u7b$$u7b$closure$u7d$$u7d$17h3f864ec735f41e70E
  9:  0x311bd - _ZN3std6thread5local17LocalKey$LT$T$GT$8try_with17hc87d8e9cf2d2494cE

With DWARF information, we don't need to parse the "name" custom section, so we won't suffer from these mangled symbols; instead, we can emit each trace with human-readable function names plus source code info.

How?

The Wasm DWARF format[1] is almost the same as the standard DWARF specification (version 5, as far as I can tell)[2]. The difference is that an address should be interpreted as an offset from the beginning of the code section, rather than from the beginning of the binary as in non-Wasm formats.

So it should be simple to write a parser by getting insights from other DWARF implementations.
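To make the address rule concrete, here is a hedged sketch in Go. Note that it swaps in the standard library's `debug/dwarf` package for the line lookup instead of a from-scratch parser as proposed here, and whether that package handles every Wasm producer's output is untested. `codeRelative` and `lookupLine` are hypothetical names, and the sketch assumes `codeSectionStart` is the offset at which the code section's contents begin (the `start=` value that wasm-objdump prints for Code).

```go
package main

import (
	"debug/dwarf"
	"errors"
)

// codeRelative (hypothetical helper) converts an instruction offset measured
// from the start of the binary into the code-section-relative address that
// Wasm DWARF uses.
func codeRelative(fileOffset, codeSectionStart uint64) (uint64, error) {
	if fileOffset < codeSectionStart {
		return 0, errors.New("offset lies before the code section")
	}
	return fileOffset - codeSectionStart, nil
}

// lookupLine (hypothetical helper) scans each compile unit's line table for
// the given code-section-relative address and returns the file and line.
func lookupLine(d *dwarf.Data, pc uint64) (string, int, error) {
	r := d.Reader()
	for {
		cu, err := r.Next()
		if err != nil {
			return "", 0, err
		}
		if cu == nil {
			return "", 0, errors.New("no line entry for address")
		}
		if cu.Tag != dwarf.TagCompileUnit {
			r.SkipChildren()
			continue
		}
		lr, err := d.LineReader(cu)
		if err != nil {
			return "", 0, err
		}
		if lr != nil {
			var le dwarf.LineEntry
			err := lr.SeekPC(pc, &le)
			if err == nil && le.File != nil {
				return le.File.Name, le.Line, nil
			}
			if err != nil && err != dwarf.ErrUnknownPC {
				return "", 0, err
			}
		}
		r.SkipChildren() // move on to the next compile unit
	}
}
```

A `*dwarf.Data` value could be built with `dwarf.New` from the .debug_abbrev, .debug_info, .debug_line, .debug_ranges, and .debug_str payloads, passing nil for sections that are absent. A runtime would then call `codeRelative` on the trapping instruction's offset before passing it to `lookupLine`.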

Links

[1] https://yurydelendik.github.io/webassembly-dwarf/
[2] https://dwarfstd.org/doc/DWARF5.pdf (PDF)
[3] https://twitter.com/ChromeDevTools/status/1192803818024710145

@r8d8
Contributor

r8d8 commented Feb 8, 2022

Hi @mathetake
Is this issue still relevant?
I'm looking to contribute and landed on it.

@codefromthecrypt
Contributor

@r8d8 I think this is definitely still relevant, and we'd want to start with the backtrace enhancement. If I understand you correctly, you are interested in contributing this? If so, I'd recommend starting small, as this may touch a few different spots, and iterating in small steps will be less of a burden, especially in an area that is not 100% defined in a spec.

@mathetake https://github.com/yurydelendik/webassembly-dwarf is abandoned and the author isn't replying to issues anymore. We should ask the actual spec about this and cite something with a future to avoid compatibility drift, possibly asking other implementers which "specs" they plan to use. We can tentatively use the dead one of course, but some time before 1.0 we need to firm this up. wdyt?

@mathetake
Member Author

DWARF in Wasm has been added to tool-conventions, which says:

These conventions are not part of the WebAssembly standard, and are not required of WebAssembly-consuming implementations to execute WebAssembly code. Tools producing and working with WebAssembly in other ways also need not follow any of these conventions. They exist only to support tools that wish to interoperate with other tools at a higher abstraction level than just WebAssembly itself.

This means there won't be any formal specification; instead we have to follow the (personally hosted) specification (https://yurydelendik.github.io/webassembly-dwarf/). So we'll have to choose an approach and possibly compare against another implementation. Fortunately, the format is stable in the sense that major Wasm runtimes and compilers implement it (clang/LLVM, wasmtime, V8).

As for contribution, I think this could be multiple weeks or even months of full-time work. This includes: implementing a binary parser for DWARF 5 (note that there's nothing we can reference or use in the existing Go ecosystem, meaning that we have to implement it literally from scratch), refactoring the JIT compiler and interpreter so they can map each original Wasm instruction address to our runtime representation, etc.

That said, as @codefromthecrypt suggested, I would recommend starting with something small rather than an overwhelming task like this.

@r8d8
Contributor

r8d8 commented Feb 9, 2022

@codefromthecrypt @mathetake
Thanks for your reply.
I'll take a look in the WASI support direction.

@justinclift

As a general thought, there were DWARF 5 pieces added to the main Go tool chain a while back:

So, things might not be completely from scratch. 😄

@mathetake
Member Author

Oh that's cool! Thank you for the info! @justinclift
