[doc] Add performance documentation #2739
Merged
925b7e9
[doc] Add performance documentation
dcharkes d254e75
address comments
dcharkes 2209199
Add windows perf tools
dcharkes 7227c6d
add link
dcharkes aac23dc
update the `perf` documentation
dcharkes e9f94d6
Add some community sources
dcharkes
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,122 @@ | ||
| # Performance with Native Code | ||
|
|
||
| How to assess performance of Dart and native code, and how to improve it. | ||
|
|
||
| ## Profiling Performance | ||
|
|
||
| | Tool | Platform | Primary Use Case | Measures (Dart CPU) | Measures (Native CPU) | Measures (Dart Heap) | Measures (Native Heap) | | ||
| | --------------------------------------- | --------- | --------------------------------------- | ---------------------------- | ------------------------ | -------------------- | ---------------------------------------------------------------- | | ||
| | [Dart DevTools] | All | Profiles Dart VM, UI jank, Dart heap | Yes | Opaque "Native" block | Yes | Tracks "External" VM-aware memory only; Misses native-heap leaks | | ||
| | [Xcode Instruments (Time Profiler)] | iOS/macOS | Profiles native CPU call stacks | No | Yes (full symbolication) | No | No | | ||
| | [Xcode Instruments (Leaks/Allocations)] | iOS/macOS | Profiles native heap (malloc, mmap) | No | No | No | Yes | | ||
| | [Android Studio Profiler (CPU)] | Android | Profiles native C/C++ CPU execution | No | Yes (traces C++ calls) | No | No | | ||
| | [Perfetto (heapprofd)] | Android | Advanced native heap profiling | No | No | No | Yes (traces malloc/free call stacks) | | ||
| | [Linux perf] | Linux | Unified Dart AOT + Native CPU profiling | Yes (requires special flags) | Yes | No | No | | ||
| | [Visual Studio CPU Usage Profiler] | Windows | Profiles native C/C++ CPU execution | No | Yes (traces C++ calls) | No | No | | ||
| | [WPA (Heap Analysis)] | Windows | Advanced native heap profiling | No | No | No | Yes (traces malloc/free call stacks) | | ||
|
|
||
| <!-- TODO: Add documentation for the other tools. --> | ||
|
|
||
| ### Dart DevTools | ||
|
|
||
| For only assessing the performance of the Dart code, and treating native code as | ||
| a black box, use the Dart performance tooling. | ||
|
|
||
| See the documentation on https://dart.dev/tools/dart-devtools and | ||
| https://docs.flutter.dev/perf. For FFI, most specifically, you can use | ||
| https://docs.flutter.dev/tools/devtools/cpu-profiler and | ||
| https://docs.flutter.dev/tools/devtools/performance#timeline-events-tab. | ||
| For synchronous FFI calls you can add synchronous timeline events, and for | ||
| asynchronous code (using async callbacks or helper isolates) you can use async | ||
| events. | ||
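As a minimal sketch of the two kinds of timeline events mentioned above, the following uses `Timeline.timeSync` and `TimelineTask` from `dart:developer`. The functions `nativeChecksum` and `nativeChecksumAsync` are hypothetical stand-ins for a synchronous FFI call and for work done through an async callback or helper isolate; they are not part of any real API.

```dart
import 'dart:developer';

// Hypothetical stand-in for a synchronous FFI call.
int nativeChecksum(List<int> bytes) =>
    bytes.fold<int>(0, (sum, b) => (sum + b) & 0xFFFFFFFF);

// Hypothetical stand-in for native work that completes asynchronously.
Future<int> nativeChecksumAsync(List<int> bytes) async {
  await Future<void>.delayed(const Duration(milliseconds: 10));
  return nativeChecksum(bytes);
}

Future<void> main() async {
  final bytes = List<int>.generate(1024, (i) => i & 0xFF);

  // Synchronous event: spans exactly the duration of the closure.
  final syncResult =
      Timeline.timeSync('nativeChecksum', () => nativeChecksum(bytes));

  // Async event: start and finish may be separated by awaits.
  final task = TimelineTask()..start('nativeChecksumAsync');
  try {
    final asyncResult = await nativeChecksumAsync(bytes);
    print(syncResult == asyncResult);
  } finally {
    task.finish();
  }
}
```

When the program runs with DevTools attached, both events should show up in the timeline under the names passed to `timeSync` and `start`.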
|
|
||
| ### `perf` on Linux | ||
|
|
||
| To see both Dart and native symbols in a flame graph, you can use `perf` on | ||
| Linux. | ||
|
|
||
| To run the [FfiCall benchmark] in JIT mode with `perf`: | ||
|
|
||
| ``` | ||
| $ perf record -g dart --generate-perf-events-symbols benchmarks/FfiCall/dart/FfiCall.dart && \ | ||
| perf report --hierarchy | ||
| ``` | ||
|
|
||
| Note that Flutter apps are deployed in AOT mode, so prefer profiling in AOT | ||
| mode. | ||
|
|
||
| For AOT, there is [no single command | ||
| yet](https://github.com/dart-lang/sdk/issues/54254). You need to use the | ||
| `precompiler2` command from the Dart SDK. See [building the Dart SDK] for how to | ||
| build the Dart SDK. | ||
|
|
||
| ``` | ||
| $ pkg/vm/tool/precompiler2 benchmarks/FfiCall/dart/FfiCall.dart benchmarks/FfiCall/dart/FfiCall.dart.bin && \ | ||
| perf record -g pkg/vm/tool/dart_precompiled_runtime2 --generate-perf-events-symbols benchmarks/FfiCall/dart/FfiCall.dart.bin && \ | ||
| perf report --hierarchy | ||
| ``` | ||
|
||
|
|
||
| To analyze a performance issue in Flutter, it is best to reproduce the issue in | ||
| a standalone Dart program. | ||
|
|
||
| ## Improving performance | ||
|
|
||
| There are some typical patterns to improve performance: | ||
|
|
||
| * To avoid dropped frames, move long-running FFI calls to a helper isolate. | ||
| * Avoid copying data where possible: | ||
|   * Keep data in native memory, operating on [`Pointer`][]s and using | ||
|     [`asTypedList`][] to convert the pointers into [`TypedData`][]. | ||
|   * For short calls, if the memory is in Dart, avoid copying by using leaf calls | ||
|     ([`isLeaf`][], [`isLeaf` (2)][], [`isLeaf` (3)][]) and [`address`][]. (Leaf | ||
|     calls prevent the Dart GC from running on all isolates, which makes it safe | ||
|     to pass native code a pointer to an object in Dart memory.) | ||
|   * Use [`Isolate.exit`][] to send large data from a helper isolate to the main | ||
|     isolate after a large computation. | ||
| * For many small calls, limit the overhead per call. This makes a significant | ||
|   difference for calls shorter than 1 us (one millionth of a second), and can be | ||
|   considered for calls of up to 10 us. | ||
|   * Use leaf calls ([`isLeaf`][], [`isLeaf` (2)][], [`isLeaf` (3)][]). | ||
|   * Prefer using [build hooks][] with [`Native`][] `external` | ||
|     functions over [`DynamicLibrary.lookupFunction`][] and | ||
|     [`Pointer.asFunction`][]. | ||
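The helper-isolate pattern with `Isolate.exit` from the list above can be sketched as follows. This is an illustration under assumptions, not the benchmark's code: `computeLargeBuffer`, `helperMain`, and `runOnHelper` are hypothetical names, and the pure-Dart loop stands in for a long-running FFI call that produces a large result.

```dart
import 'dart:isolate';
import 'dart:typed_data';

// Hypothetical stand-in for a long-running FFI call producing a large buffer.
Uint8List computeLargeBuffer(int length) {
  final out = Uint8List(length);
  for (var i = 0; i < length; i++) {
    out[i] = i & 0xFF;
  }
  return out;
}

// Entry point of the helper isolate. `args` is [SendPort, int].
void helperMain(List<Object> args) {
  final sendPort = args[0] as SendPort;
  final length = args[1] as int;
  final result = computeLargeBuffer(length);
  // Terminates this isolate and hands `result` over as its final message,
  // so the buffer does not have to be copied to the receiving isolate.
  Isolate.exit(sendPort, result);
}

// Runs the computation on a helper isolate and returns its result.
Future<Uint8List> runOnHelper(int length) async {
  final port = ReceivePort();
  await Isolate.spawn(helperMain, <Object>[port.sendPort, length]);
  // `first` completes with the single message and closes the port.
  return await port.first as Uint8List;
}

Future<void> main() async {
  final data = await runOnHelper(16 * 1024 * 1024);
  print(data.length);
}
```

The main isolate stays free to render frames while the helper runs. For simple cases, `Isolate.run` wraps the same spawn-and-exit pattern in a single call.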
|
|
||
| For reference, the [FfiCall benchmark][] reports the following run times for 1000 FFI calls in AOT mode on Linux x64: | ||
| ``` | ||
| FfiCall.Uint8x01(RunTime): 234.61104068226345 us. | ||
| FfiCall.Uint8x01Leaf(RunTime): 71.9994712538334 us. | ||
| FfiCall.Uint8x01Native(RunTime): 216.07292770828917 us. | ||
| FfiCall.Uint8x01NativeLeaf(RunTime): 27.64136415181509 us. | ||
| ``` | ||
| Dividing by 1000, a single `Native` leaf call takes about 28 ns, while a | ||
| non-leaf call through `asFunction` takes about 235 ns. So for calls doing | ||
| ~1000 ns of native work (1235 ns versus 1028 ns in total), that's a 20% speedup. | ||
|
|
||
| ## Community sources | ||
|
|
||
| * (Video) Using Dart FFI for Compute-Heavy Tasks: | ||
| https://www.youtube.com/watch?v=eJR5C0VRCjU | ||
| * (Video) Maximize Speed with Dart FFI: Beginner’s Guide to High-Performance | ||
| Integration https://www.youtube.com/watch?v=HF8gHAakb1Q | ||
|
|
||
| [`address`]: https://api.dart.dev/dart-ffi/StructAddress/address.html | ||
| [`asTypedList`]: https://api.dart.dev/dart-ffi/Uint8Pointer/asTypedList.html | ||
| [`DynamicLibrary.lookupFunction`]: https://api.dart.dev/dart-ffi/DynamicLibraryExtension/lookupFunction.html | ||
| [`isLeaf` (2)]: https://api.dart.dev/dart-ffi/NativeFunctionPointer/asFunction.html | ||
| [`isLeaf` (3)]: https://api.dart.dev/dart-ffi/DynamicLibraryExtension/lookupFunction.html | ||
| [`isLeaf`]: https://api.dart.dev/dart-ffi/Native/isLeaf.html | ||
| [`Isolate.exit`]: https://api.dart.dev/dart-isolate/Isolate/exit.html | ||
| [`Native`]: https://api.dart.dev/dart-ffi/Native-class.html | ||
| [`Pointer.asFunction`]: https://api.dart.dev/dart-ffi/NativeFunctionPointer/asFunction.html | ||
| [`Pointer`]: https://api.dart.dev/dart-ffi/Pointer-class.html | ||
| [`TypedData`]: https://api.dart.dev/dart-typed_data/TypedData-class.html | ||
| [Android Studio Profiler (CPU)]: https://developer.android.com/studio/profile | ||
| [build hooks]: https://dart.dev/tools/hooks | ||
| [building the Dart SDK]: https://github.com/dart-lang/sdk/blob/main/docs/Building.md | ||
| [Dart DevTools]: https://dart.dev/tools/dart-devtools | ||
| [FfiCall benchmark]: https://github.com/dart-lang/sdk/blob/main/benchmarks/FfiCall/dart/FfiCall.dart | ||
| [Linux perf]: https://perfwiki.github.io/main/ | ||
| [Perfetto (heapprofd)]: https://perfetto.dev/ | ||
| [Visual Studio CPU Usage Profiler]: https://learn.microsoft.com/en-us/visualstudio/profiling/cpu-usage | ||
| [WPA (Heap Analysis)]: https://learn.microsoft.com/en-us/windows-hardware/test/wpt/windows-performance-analyzer | ||
| [Xcode Instruments (Leaks/Allocations)]: https://developer.apple.com/documentation/xcode/gathering-information-about-memory-use | ||
| [Xcode Instruments (Time Profiler)]: https://developer.apple.com/tutorials/instruments | ||