Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have a mechanism for moving CodeSourceMap information out of AOT app into a mapping file. #35851

Closed
mkustermann opened this issue Feb 4, 2019 · 26 comments
Assignees
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. customer-flutter type-performance Issue relates to performance or code size vm-aot-code-size Related to improvements in AOT code size

Comments

@mkustermann
Copy link
Member

mkustermann commented Feb 4, 2019

It appears we preserve CodeSourceMap objects in obfuscated mode. They are used for building string representations of stack traces (one frame might be decoded into multiple functions) as well as for storing pc offset -> token position mappings.

This information is not necessary for the AOT runtime for correctness, only for stack traces and exception messages. We could therefore consider moving this information out of AOT snapshot, similarly to how we have an external obfuscation map.

These CodeSourceMap's sometimes take up 15% of the isolate snapshot (upwards of half a MB).

@mkustermann mkustermann added the area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. label Feb 4, 2019
@a-siva a-siva added the type-enhancement A request for a change that isn't a bug label Feb 5, 2019
@a-siva
Copy link
Contributor

a-siva commented Feb 5, 2019

Would this map not be part of the APK, we would provide some tool for folks to run in order to get sane exception messages?

@mkustermann
Copy link
Member Author

mkustermann commented Feb 5, 2019

Would this map not be part of the APK,

No.

we would provide some tool for folks to run in order to get sane exception messages?

Precisely.

With --obfuscate mode the stack trace is already now not readable. To make it readable one needs the obfuscation map, which is generated at AOT compile-time (via --save-obfuscation-map=<file>) but not included in the APK.

What would change is that the stack trace would become shorter (i.e. we wouldn't expand frames on the stack to the actual inlined functions) and have special markers as line numbers (probably token positions). Some exception messages (in addition to stack traces) would have special markers in them.

Just like with the obfuscation map we already produce, we would produce more metadata which can expand stack frames, map token positions to source file + line numbers and special tokens in exceptions to the actual names referenced.

A tool would read in the already exiting obfuscation map and the other metadata and can produce the original stack + exception message.

@rmacnak-google
Copy link
Contributor

When --dwarf_stack_traces=true, this information in encoded in DWARF form and the CodeSourceMaps are dropped.

@mkustermann
Copy link
Member Author

mkustermann commented Feb 8, 2019

When --dwarf_stack_traces=true, this information in encoded in DWARF form and the CodeSourceMaps are dropped.

That is a bug then: This information is used in AOT runtime for generation of exception messages. It probably crashes the VM if the code source maps are missing:

From runtime/vm/runtime_entry.cc:

DEFINE_RUNTIME_ENTRY(NullError, 0) {
  ...
  const uword pc_offset = caller_frame->pc() - code.PayloadStart();
  ...
  const CodeSourceMap& map =
      CodeSourceMap::Handle(zone, code.code_source_map());
  ASSERT(!map.IsNull());

  CodeSourceMapReader reader(map, Array::null_array(),
                             Function::null_function());
  const intptr_t name_index = reader.GetNullCheckNameIndexAt(pc_offset);
  RELEASE_ASSERT(name_index >= 0);

  const ObjectPool& pool = ObjectPool::Handle(zone, code.GetObjectPool());
  const String& member_name =
      String::CheckedHandle(zone, pool.ObjectAt(name_index));

  NullErrorHelper(zone, member_name);
}

Even if we fix this particular code to use DWARF instead, we still are in a situation where the runtime needs the information. What I'm proposing is we a) move the information out of AOT snapshot and b) make the runtime work without it c) provide a tool to decode runtime stacks (and possibly exception messages) using the metadata on the side

@rmacnak-google
Copy link
Contributor

That's a bug in the NullError runtime entry then, as the DWARF mechanism is older. The general case exception machinery already knows how to handle the absence of the CodeSourceMap and produce stack traces that can be symbolized, either manually with addr2line/llvm-symbolizer or more conveniently with ndk-stack.

dart-bot pushed a commit that referenced this issue Feb 25, 2019
This change fixes null checks in case of missing CodeSourceMap,
which happens when --dwarf_stack_traces is used.

Issue: #35851
Change-Id: I7d79533cc42c988618e2efe988f530d169d9aec1
Reviewed-on: https://dart-review.googlesource.com/c/93989
Commit-Queue: Alexander Markov <alexmarkov@google.com>
Reviewed-by: Ryan Macnak <rmacnak@google.com>
@alexmarkov
Copy link
Contributor

Fixed NullError runtime entry to work in case of missing CodeSourceMap.

@mraleph
Copy link
Member

mraleph commented May 8, 2019

Assigning to @sstrickl to investigate the ways we could split this information out of the snapshot. This might really benefit some internal use cases @mkustermann reports around 10-20% in CodeSourceMap in AOT snapshots.

@mkustermann
Copy link
Member Author

@sstrickl Heads up: Ryan has a change out, namely cl/81323 to directly generate ELF from gen_snapshot. This should allow us to get rid of the CodeSourceMap objects, since that can be encoded in the ELF file.

Once we have the ELF file, we can split out the source map information to reduce the ELF size (see discussion further above) and used on the side for decoding stack traces.

@wangying3426
Copy link

wangying3426 commented Sep 16, 2019

@mkustermann @mraleph @alexmarkov When use --dwarf_stack_traces to build snapshot, the CodeSourceMaps are dropped, but the exception information missed the stack traces, just like this:

flutter: ══╡ EXCEPTION CAUGHT BY WIDGETS LIBRARY ╞═══════════════════════════════════════════════════════════
flutter: The following NoSuchMethodError was thrown building CupertinoAlertDemo:
flutter: The method 'debugDescribeChildren' was called on null.
flutter: Receiver: null
flutter: Tried calling: debugDescribeChildren()
flutter:
flutter: When the exception was thrown, this was the stack:
flutter:
flutter: ════════════════════════════════════════════════════════════════════════════════════════════════════

So, what's the reason and how to get the whole stack traces?

@wangying3426
Copy link

using the metadata on the side

When --dwarf_stack_traces=true, this information in encoded in DWARF form and the CodeSourceMaps are dropped.

That is a bug then: This information is used in AOT runtime for generation of exception messages. It probably crashes the VM if the code source maps are missing:

From runtime/vm/runtime_entry.cc:

DEFINE_RUNTIME_ENTRY(NullError, 0) {
  ...
  const uword pc_offset = caller_frame->pc() - code.PayloadStart();
  ...
  const CodeSourceMap& map =
      CodeSourceMap::Handle(zone, code.code_source_map());
  ASSERT(!map.IsNull());

  CodeSourceMapReader reader(map, Array::null_array(),
                             Function::null_function());
  const intptr_t name_index = reader.GetNullCheckNameIndexAt(pc_offset);
  RELEASE_ASSERT(name_index >= 0);

  const ObjectPool& pool = ObjectPool::Handle(zone, code.GetObjectPool());
  const String& member_name =
      String::CheckedHandle(zone, pool.ObjectAt(name_index));

  NullErrorHelper(zone, member_name);
}

Even if we fix this particular code to use DWARF instead, we still are in a situation where the runtime needs the information. What I'm proposing is we a) move the information out of AOT snapshot and b) make the runtime work without it c) provide a tool to decode runtime stacks (and possibly exception messages) using the metadata on the side

Is there any progress?

@mkustermann
Copy link
Member Author

So, what's the reason and how to get the whole stack traces?

The VM would need to print a list of return addresses / PCs from the stack. Those PCs probably need to be relative to the .text segment of the .so/.dylib file.

That should be sufficient for an external tool to read those PC offsets and use the DWARF information (which is not available on device to safe space) to decode the stack trace.

Is there any progress?

@sstrickl Has been occupied with another task, but we will come back to this in the very near future.

@mraleph mraleph added type-performance Issue relates to performance or code size vm-aot-code-size Related to improvements in AOT code size and removed type-enhancement A request for a change that isn't a bug labels Sep 20, 2019
@mkustermann
Copy link
Member Author

mkustermann commented Oct 30, 2019

To give an update on this:

A cl/121857 is in-progress for making our AOT compiler output a second ELF file containing the necessary dwarf information for decoding stack traces. We still need to ensure this will be sufficient for all use-cases and ensure tooling is available to decode stack traces. We will keep the issue updated.

As a side note, flutter users can already get this size decrease by passing --extra-gen-snapshot-options=--dwarf-stack-traces to flutter tools, e.g.:

f/examples/flutter_gallery % ../../bin/flutter build aot --target-platform=android-arm64
f/examples/flutter_gallery % /bin/ls build/aot/app.so
11318192  build/aot/app.so

f/examples/flutter_gallery %  ../../bin/flutter build aot --target-platform=android-arm64 --extra-gen-snapshot-options=--dwarf-stack-traces
f/examples/flutter_gallery % ls build/aot/app.so 
10433456 build/aot/app.so    // <== around -8%

@Hixie
Copy link
Contributor

Hixie commented Nov 6, 2019

Would a mode with no stack traces at all give an even bigger improvement? (I can imagine some developers being happy with such a trade-off.)

@mkustermann
Copy link
Member Author

mkustermann commented Nov 6, 2019

Would a mode with no stack traces at all give an even bigger improvement? (I can imagine some developers being happy with such a trade-off.)

Probably not:

When code source maps are not included in the AOT snapshot all the VM knows are return addresses on the stack, which it can convert to relative DSO addresses and print.
=> So it doesn't keep any additional data around we could remove.

Those DSO-relative addresses can be used for offline decoding (via DWARF).

(But there is of course other size improvement possibilities.)

@Hixie
Copy link
Contributor

Hixie commented Nov 20, 2019

@rmacnak-google any updates?

@rmacnak-google
Copy link
Contributor

The ability to remove CodeSourceMaps from snapshots and make the information available in DWARF format has existed since March 2017 (https://codereview.chromium.org/2723213002).

@mkustermann
Copy link
Member Author

mkustermann commented Nov 21, 2019

As @rmacnak-google points out the support has been there for a long time.

We are just working on some tools to make it more convenient to use

  • tool for decoding the stack traces on all platforms (avoids relying on addr2line / atos, which is not always available)
  • making our AOT compiler directly emit the metadata on the side (avoids relying on strip / objcopy, which is not necessarily always available)

and small fixes (like making inlined frames work decode correctly)

Our hope is that with the additional tooling available, we can drop the CodeSourceMaps by default - so all flutter users automatically will get smaller AOT apps in flutter-release mode.

dart-bot pushed a commit that referenced this issue Nov 25, 2019
Also fixes the low and high PC information in the DWARF output of
inlining nodes.

Bug: #35851
Cq-Include-Trybots: luci.dart.try:vm-kernel-precomp-linux-debug-x64-try,vm-kernel-precomp-android-release-arm64-try,vm-kernel-mac-release-x64-try,vm-kernel-precomp-mac-release-simarm_x64-try,vm-kernel-precomp-win-release-x64-try
Change-Id: I11c3577bdc5c12abdc88aa799c1aae3f505294df
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/123734
Commit-Queue: Teagan Strickland <sstrickl@google.com>
Reviewed-by: Martin Kustermann <kustermann@google.com>
@gaaclarke
Copy link
Contributor

It might be better for us to not make stripping CodeSourceMaps the default in flutter-release. It will make framework developers jobs a bit harder. The people most likely to care about the binary size are the ones that would be technically savvy enough to "symbolicate" their stacktraces.

I was playing with nim this weekend and they have 2 levers to control this --stackTrace:on and --lineTrace:on. One gives you control of having procedure names, the other lines of code. Maybe there is an option that leaves CodeSourceMaps in the binary but thins them out like nim does? That might be a better default?

@mraleph
Copy link
Member

mraleph commented Nov 25, 2019

@gaaclarke users in general tend to care about the binary size and I would argue that users expect default build settings to provide smallest possible side. I would not be surprised that people take Flutter "Hello, World!" with default settings and then make conclusions based on the size they see.

In any case @sstrickl is going to work on a more formal proposal which we will circulate for discussion among Flutter team.

@Hixie
Copy link
Contributor

Hixie commented Dec 4, 2019

@gaaclarke We get a lot of complaints about our binary size, so I think it's safe to say that a substantial portion of our developer base does care, enough that we should make making the binary size smaller for release the default even at the cost of harder-to-symbolicate stack traces. (We already make C++ stack traces hard to symbolicate for this reason.)

@mraleph awesome. Please keep us posted!

@gaaclarke
Copy link
Contributor

Scene 1

User: Hey Flutter, my customers are getting crashes when the app starts up.
Flutter: Can we see your logs?
User: Here you go.
Flutter: Looks like there is an IO exception getting thrown, can we see your Dart debugger symbol archive for the build you submitted to the store so we can symbolicate the stacktrace?
User: What's that?
Flutter: It's that thing that's get generated when you compile your Flutter app in release mode.
User: Oh, I didn't save that.

Scene 2

Flutter: Hey do you care about binary size?
Users: Oh, yeah!

Flutter: Hey do you care about making it easy to debug customer/Flutter issues?
Users: Ohhhh, yeah!


Stop me if I'm wrong. I'm just sayin: thinned out stack trace info might be a better default with the option to go whole hog.

@Hixie
Copy link
Contributor

Hixie commented Dec 4, 2019

I mean, we need to base this on data. Right now, based on our surveys and such, as far as I can tell the scenes are actually:

Scene 1

User: Hey Flutter, my customers are getting crashes when the app starts up.
Flutter: Figure out the steps to reproduce and the devices affected, submit a bug with a reduced test case.
User: Will do.

Scene 2

Flutter: Hey do you care about making it easy to debug customer/Flutter issues?
Users: Can you make the binary smaller?

dart-bot pushed a commit that referenced this issue Dec 12, 2019
The flag can be used when creating AOT snapshots. The resulting file can be
used with package:vm/dwarf/convert.dart to convert DWARF-based stack traces
to stack traces with function, file, and line number information.

Currently the saved file will be an ELF file with DWARF debugging information,
but this is subject to change in the future. To avoid being affected by any
changes in format, read the DWARF information from the file using
Dwarf.fromFile() in package:vm/dwarf/dwarf.dart.

Also adds --dwarf-stack-traces to the VM global flag list, so its value at
compilation will be read out of snapshots by the precompiled runtime.

Fixes #39512, which was due to
a missing compiler::target:: prefix for Instructions::HeaderSize() in
BlobImageWriter::WriteText().

Exposes the information suggested in
#39490 so that package:vm/dwarf
can appropriately convert absolute PC addresses to the correct virtual
address for a given DWARF line number program.

Bug: #35851
Cq-Include-Trybots: luci.dart.try:vm-kernel-precomp-linux-debug-x64-try,vm-kernel-precomp-android-release-arm64-try,vm-kernel-precomp-mac-release-simarm_x64-try,vm-kernel-precomp-win-release-x64-try
Change-Id: I80d900fd9e5f1e2399ad3f2fd25c85131a7ea43c
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/121857
Commit-Queue: Teagan Strickland <sstrickl@google.com>
Reviewed-by: Martin Kustermann <kustermann@google.com>
dart-bot pushed a commit that referenced this issue Dec 12, 2019
Bug: #35851
Change-Id: I3a9d16a85abab825cb698b491d0846dc50f1fff4
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/127893
Commit-Queue: Teagan Strickland <sstrickl@google.com>
Reviewed-by: Martin Kustermann <kustermann@google.com>
@sstrickl
Copy link
Contributor

Okay, have finished doing my cleanups and tool creation and also have an initial version of documentation on what DWARF stack traces are and how they can work for you. PTAL, @Hixie and others interested.

dart-bot pushed a commit that referenced this issue Dec 18, 2019
Removes an unnecessary change to the names of type testing stubs that
allowed name collisions to happen.

Old commit message:

The flag can be used when creating AOT snapshots. The resulting file can be
used with package:vm/dwarf/convert.dart to convert DWARF-based stack traces
to stack traces with function, file, and line number information.

Currently the saved file will be an ELF file with DWARF debugging information,
but this is subject to change in the future. To avoid being affected by any
changes in format, read the DWARF information from the file using
Dwarf.fromFile() in package:vm/dwarf/dwarf.dart.

Also adds --dwarf-stack-traces to the VM global flag list, so its value at
compilation will be read out of snapshots by the precompiled runtime.

Fixes #39512, which was due to
a missing compiler::target:: prefix for Instructions::HeaderSize() in
BlobImageWriter::WriteText().

Exposes the information suggested in
#39490 so that package:vm/dwarf
can appropriately convert absolute PC addresses to the correct virtual
address for a given DWARF line number program.

Bug: #35851
Change-Id: I6723449dceb0b89c054a1f1e70e7acd7749baf14
Cq-Include-Trybots: luci.dart.try:vm-kernel-precomp-linux-release-x64-try,vm-kernel-precomp-android-release-arm64-try,vm-kernel-precomp-mac-release-simarm64-try,vm-kernel-precomp-win-release-x64-try
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/128730
Reviewed-by: Martin Kustermann <kustermann@google.com>
dart-bot pushed a commit that referenced this issue Dec 18, 2019
…ies."

Bug: #35851
Change-Id: Iae4bb32fd8fc2770277cfc0b7961ff89e5928b29
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/128733
Commit-Queue: Teagan Strickland <sstrickl@google.com>
Reviewed-by: Martin Kustermann <kustermann@google.com>
@jonahwilliams
Copy link
Contributor

I'm very interested in getting this wired up in the tool. But whether or not it's on by default, we'll need the symbolication library before we can proceed. What are the current plans for publishing it?

@mkustermann
Copy link
Member Author

@jonahwilliams We have had discussions on what to name the package and agreed upon package:vm_stack_traces now. The cl/132281 is ready to land.

It will land today or tomorrow and we'll publish and ping the flutter/flutter#48726 once it's on pub

@mkustermann
Copy link
Member Author

All the infrastructure should be landed, the package:native_stack_traces is uploaded to pub, so I'll close the issue.

Please re- open if there's something missing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. customer-flutter type-performance Issue relates to performance or code size vm-aot-code-size Related to improvements in AOT code size
Projects
None yet
Development

No branches or pull requests

10 participants