Improve protobuf decoding performance #35154

Closed
mkustermann opened this issue Nov 13, 2018 · 1 comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. customer-google3

Comments

@mkustermann
Member

Our current protobuf decoding is very slow and there are a number of improvements we can make:

  • Unified object layout of typed data, external typed data and views (maybe also the new read-only classes)
    -> Make VM aware that e.g. all implementations of Uint8List have same base class and use it for the CompileType.

  • Add pure/invariant calls which we know always return the same result.

  • Improve TFA to support limited control-flow sensitive analysis for null-ability propagation.

  • Automatic specialization of List<int> related code on kernel level (maybe)

  • Fast UTF-8 decoding (also tracked in "UTF8.decode is slow (performance)", #31954)

@mkustermann mkustermann self-assigned this Nov 13, 2018
@mkustermann mkustermann added the area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. label Nov 13, 2018
dart-bot pushed a commit that referenced this issue Nov 22, 2018
…ys initialized with _Smi's

Until now a typed data view was first constructed and then afterwards the
constructor body validated that the offsetInBytes/length are non-null
and in range.

This CL moves the range check to the call sites of the constructor,
thereby guaranteeing that we will never initialize the
_TypedListView.{offsetInBytes,length} fields with non-_Smi values. (The
views are constructed only within the "dart:typed_data" library.  Neither the
embedder nor the runtime construct them.)

Furthermore, this CL removes the redundant _defaultIfNull() call, since
the call sites already handled the `length == null` case.

This CL also starts annotating _TypedListView.{offsetInBytes,length}
with @pragma("vm:exact-result-type", "dart:core#_Smi") to ensure any
[LoadFieldInstr] will have the right type attached, thereby avoiding any
null-checks and mint-handling.

This improves dart-aot

On arm7hf:
    MD5: +38%, SHA256: +87%, SHA1: +89% (plus various typed data microbenchmarks)
    JsonParseDefaultReviver/StringBuffer/StringIdiomatic/JsonParseCustomReviver: -5-8% (probably due to not inlining static calls, will find out)

On arm8:
    MD5: +6.5%, SHA256: +12%, SHA1: 3.6%, JsonUtf8RoundTrip: 8% (plus various typed data microbenchmarks)
    DeltaBlue: -6.7% (probably due to not inlining static calls, will find out)

Issue #35154
Issue #31954

Change-Id: I37c822e6879f5a2d17fd9650a68cf2eee4326b01
Reviewed-on: https://dart-review.googlesource.com/c/84241
Commit-Queue: Martin Kustermann <kustermann@google.com>
Reviewed-by: Alexander Markov <alexmarkov@google.com>
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
dart-bot pushed a commit that referenced this issue Nov 22, 2018
A field/function annotated with this pragma must be guaranteed to not
return `null` at runtime.

Make use of this non-nullable annotation in the VM's type propagator.

Annotates the "_TypedListView._typedData" field to ensure the VM knows it
returns a non-nullable _TypedListView.

Furthermore annotates methods on the integer implementation.  Those particular
methods are recognized methods with a "dynamic" return type.  This caused
the type propagator to use CompileType::Dynamic() as result type.  Since a
previous CL started to only utilize the annotated type if it is better than
"dynamic" more integer operations got handled in-line, though with null-checks.
Annotating those methods to return non-null improves the in-line handling of
integer operations.

This improves dart-aot

On arm7hf:
  SHA256: +5%, SHA: +6%, JsonObjectRoundTrip: +7%, ...

On arm8:
  SHA1: +28%, MD5: +25%, SHA256: +15%, TypedData.Int16ListViewBench: +18.5%, StringInterpolation: +18%, ...


Issue #31954
Issue #35154

Change-Id: Ia4263a37241a36c9dc35e8a48893297effa6f4b2
Reviewed-on: https://dart-review.googlesource.com/c/84421
Commit-Queue: Martin Kustermann <kustermann@google.com>
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
Reviewed-by: Alexander Markov <alexmarkov@google.com>
dart-bot pushed a commit that referenced this issue Mar 15, 2019
This removes the 3 fields from the classes in Dart and instead describes
the layout in C++ via a RawTypedDataView class (as we already do with
normal RawTypedData). The existing "semi" TypedDataView handle class is
changed to be a real handle class.

This decreases performance of some microbenchmarks because field guards
are no longer used. (Previously the JIT could add a field guard to the view
classes, guarding that only normal typed data is used as the backing store and
not, e.g., external typed data.)

Issue #35154
Issue #31954

Change-Id: I7a0022b843a4c0fa69f53dedcc4c7bd2117cdc37
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/96806
Commit-Queue: Martin Kustermann <kustermann@google.com>
Reviewed-by: Ryan Macnak <rmacnak@google.com>
dart-bot pushed a commit that referenced this issue Mar 20, 2019
…to work on untagged arrays

Issue #35154

Change-Id: I86db977ce6c618fbbff6186cd75c8dc84546f6f9
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/97302
Commit-Queue: Martin Kustermann <kustermann@google.com>
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
dart-bot pushed a commit that referenced this issue Mar 21, 2019
…x missing deopt env bug

The truncating division can deopt if the divisor is 0, so we have to inform
the compiler that we need a deoptimization environment.

Issue #35154

Cq-Include-Trybots: luci.dart.try:vm-canary-linux-debug-try, vm-dartkb-linux-debug-x64-try, vm-dartkb-linux-release-x64-try, vm-kernel-asan-linux-release-x64-try, vm-kernel-checked-linux-release-x64-try, vm-kernel-linux-debug-ia32-try, vm-kernel-linux-debug-simdbc64-try, vm-kernel-linux-debug-x64-try, vm-kernel-linux-product-x64-try, vm-kernel-linux-release-ia32-try, vm-kernel-linux-release-simarm-try, vm-kernel-linux-release-simarm64-try, vm-kernel-linux-release-simdbc64-try, vm-kernel-linux-release-x64-try, vm-kernel-optcounter-threshold-linux-release-ia32-try, vm-kernel-optcounter-threshold-linux-release-x64-try, vm-kernel-precomp-android-release-arm-try, vm-kernel-precomp-bare-linux-release-simarm-try, vm-kernel-precomp-bare-linux-release-simarm64-try, vm-kernel-precomp-bare-linux-release-x64-try, vm-kernel-precomp-linux-debug-x64-try, vm-kernel-precomp-linux-product-x64-try, vm-kernel-precomp-linux-release-simarm-try, vm-kernel-precomp-linux-release-simarm64-try, vm-kernel-precomp-linux-release-x64-try, vm-kernel-precomp-obfuscate-linux-release-x64-try, vm-kernel-precomp-win-release-simarm64-try, vm-kernel-precomp-win-release-x64-try, vm-kernel-reload-linux-debug-x64-try, vm-kernel-reload-linux-release-x64-try, vm-kernel-reload-rollback-linux-debug-x64-try, vm-kernel-reload-rollback-linux-release-x64-try, vm-kernel-win-debug-ia32-try, vm-kernel-win-debug-x64-try, vm-kernel-win-product-x64-try, vm-kernel-win-release-ia32-try, vm-kernel-win-release-x64-try
Change-Id: I10d52ca68198f20362008d4c1ad4a2736ac2d425
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/97330
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
Commit-Queue: Martin Kustermann <kustermann@google.com>
dart-bot pushed a commit that referenced this issue Mar 21, 2019
When eliminating bounds checks using loop information the length of
GenericBoundsCheckInstr is directly compared to the loop bound.

Instead we should compare the original definitions, ignoring any
boxing/unboxing. This allows the elimination of more bounds checks.
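
The idea of comparing original definitions can be sketched roughly as follows. This is a simplified model and not the VM's actual IR: `Definition`, `Kind`, `OriginalDefinition` and `SameLength` are hypothetical names made up for illustration.

```cpp
#include <cassert>

// Hypothetical, heavily simplified IR node (not the VM's real classes).
enum class Kind { kValue, kBox, kUnbox };

struct Definition {
  Kind kind;
  const Definition* input;  // Only set for kBox / kUnbox nodes.
};

// Strips any chain of box/unbox wrappers to reach the underlying value.
const Definition* OriginalDefinition(const Definition* def) {
  assert(def != nullptr);
  while (def->kind == Kind::kBox || def->kind == Kind::kUnbox) {
    def = def->input;
    assert(def != nullptr);
  }
  return def;
}

// The length used by a bounds check and the loop bound denote the same
// value if they agree once boxing/unboxing is ignored.
bool SameLength(const Definition* check_length, const Definition* loop_bound) {
  return OriginalDefinition(check_length) == OriginalDefinition(loop_bound);
}
```

In this model a direct pointer comparison of `check_length` and `loop_bound` would miss the case where one side is a boxed or unboxed copy of the other, which is the case the commit message describes.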

Issue #35154

Cq-Include-Trybots: luci.dart.try:vm-canary-linux-debug-try, vm-dartkb-linux-debug-x64-try, vm-dartkb-linux-release-x64-try, vm-kernel-asan-linux-release-x64-try, vm-kernel-checked-linux-release-x64-try, vm-kernel-linux-debug-ia32-try, vm-kernel-linux-debug-simdbc64-try, vm-kernel-linux-debug-x64-try, vm-kernel-linux-product-x64-try, vm-kernel-linux-release-ia32-try, vm-kernel-linux-release-simarm-try, vm-kernel-linux-release-simarm64-try, vm-kernel-linux-release-simdbc64-try, vm-kernel-linux-release-x64-try, vm-kernel-optcounter-threshold-linux-release-ia32-try, vm-kernel-optcounter-threshold-linux-release-x64-try, vm-kernel-precomp-android-release-arm-try, vm-kernel-precomp-bare-linux-release-simarm-try, vm-kernel-precomp-bare-linux-release-simarm64-try, vm-kernel-precomp-bare-linux-release-x64-try, vm-kernel-precomp-linux-debug-x64-try, vm-kernel-precomp-linux-product-x64-try, vm-kernel-precomp-linux-release-simarm-try, vm-kernel-precomp-linux-release-simarm64-try, vm-kernel-precomp-linux-release-x64-try, vm-kernel-precomp-obfuscate-linux-release-x64-try, vm-kernel-precomp-win-release-simarm64-try, vm-kernel-precomp-win-release-x64-try, vm-kernel-reload-linux-debug-x64-try, vm-kernel-reload-linux-release-x64-try, vm-kernel-reload-rollback-linux-debug-x64-try, vm-kernel-reload-rollback-linux-release-x64-try, vm-kernel-win-debug-ia32-try, vm-kernel-win-debug-x64-try, vm-kernel-win-product-x64-try, vm-kernel-win-release-ia32-try, vm-kernel-win-release-x64-try
Change-Id: Ie10880f833f3b55d0804a03c4be9bd9d1ad52f66
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/97331
Commit-Queue: Martin Kustermann <kustermann@google.com>
Reviewed-by: Aart Bik <ajcbik@google.com>
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
dart-bot pushed a commit that referenced this issue Mar 30, 2019
…e class

This moves the length field as well as an inner pointer to the start of
the data into the RawTypedDataBase class.  The inner pointer is updated
on allocation, scavenges and old space compactions.

To avoid writing more assembly the typed data view factory constructors
will be generated using IL, which will update the inner pointer. This
required adding new IR instructions and changing the existing
UnboxedIntConverter instruction.

This is the foundation work to de-virtualize calls on the public typed
data types, e.g. Uint8List.
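
As a rough illustration of why a shared length field and inner pointer help, here is a minimal sketch. It assumes a toy object model: all type and function names are hypothetical, and the real VM additionally has to update the pointer when the garbage collector moves objects, which this sketch sidesteps by forbidding copies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical base layout shared by all three kinds of typed data.
struct TypedDataBase {
  std::size_t length = 0;        // Number of elements.
  std::uint8_t* data = nullptr;  // Inner pointer to the first element.
};

// Internal typed data owns its payload; the inner pointer is set on
// allocation. Copying is disabled so the pointer can never dangle.
struct InternalTypedData : TypedDataBase {
  std::vector<std::uint8_t> payload;
  explicit InternalTypedData(std::size_t n) : payload(n) {
    length = n;
    data = payload.data();
  }
  InternalTypedData(const InternalTypedData&) = delete;
  InternalTypedData& operator=(const InternalTypedData&) = delete;
};

// External typed data points at memory owned by someone else.
struct ExternalTypedData : TypedDataBase {
  ExternalTypedData(std::uint8_t* memory, std::size_t n) {
    length = n;
    data = memory;
  }
};

// A view points into the middle of some other typed data object.
struct TypedDataView : TypedDataBase {
  TypedDataView(const TypedDataBase& backing, std::size_t offset,
                std::size_t n) {
    assert(offset <= backing.length && n <= backing.length - offset);
    length = n;
    data = backing.data + offset;
  }
};

// One non-virtual access path works for every implementation.
std::uint8_t LoadByte(const TypedDataBase& list, std::size_t index) {
  assert(index < list.length);
  return list.data[index];
}
```

`LoadByte` needs neither a virtual call nor a check of which concrete kind it was given, which is the property that de-virtualizing calls on the public typed data types relies on.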


Issue #35154

Cq-Include-Trybots: luci.dart.try:vm-canary-linux-debug-try, vm-dartkb-linux-debug-x64-try, vm-dartkb-linux-release-x64-try, vm-kernel-asan-linux-release-x64-try, vm-kernel-checked-linux-release-x64-try, vm-kernel-linux-debug-ia32-try, vm-kernel-linux-debug-simdbc64-try, vm-kernel-linux-debug-x64-try, vm-kernel-linux-product-x64-try, vm-kernel-linux-release-ia32-try, vm-kernel-linux-release-simarm-try, vm-kernel-linux-release-simarm64-try, vm-kernel-linux-release-simdbc64-try, vm-kernel-linux-release-x64-try, vm-kernel-optcounter-threshold-linux-release-ia32-try, vm-kernel-optcounter-threshold-linux-release-x64-try, vm-kernel-precomp-android-release-arm-try, vm-kernel-precomp-bare-linux-release-simarm-try, vm-kernel-precomp-bare-linux-release-simarm64-try, vm-kernel-precomp-bare-linux-release-x64-try, vm-kernel-precomp-linux-debug-x64-try, vm-kernel-precomp-linux-product-x64-try, vm-kernel-precomp-linux-release-simarm-try, vm-kernel-precomp-linux-release-simarm64-try, vm-kernel-precomp-linux-release-x64-try, vm-kernel-precomp-obfuscate-linux-release-x64-try, vm-kernel-precomp-win-release-simarm64-try, vm-kernel-precomp-win-release-x64-try, vm-kernel-reload-linux-debug-x64-try, vm-kernel-reload-linux-release-x64-try, vm-kernel-reload-rollback-linux-debug-x64-try, vm-kernel-reload-rollback-linux-release-x64-try, vm-kernel-win-debug-ia32-try, vm-kernel-win-debug-x64-try, vm-kernel-win-product-x64-try, vm-kernel-win-release-ia32-try, vm-kernel-win-release-x64-try
Change-Id: I1aab0dd93fa0f06a05299ab4cb019cf898b9e1ef
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/97960
Reviewed-by: Ryan Macnak <rmacnak@google.com>
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
dart-bot pushed a commit that referenced this issue Apr 1, 2019
…ata views

Currently internal cids for all element sizes are packed together, so
are external cids and view cids.

This change re-orders them to make internal/external/view for the same
element size be packed together.

This allows is/as checks for those classes to be performed with a single
contiguous cid-range check. For example, "<obj> is Uint8List" becomes
one range check.

(For most applications internal/external/view classes are the only
non-abstract implementations of those interfaces)
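
A minimal sketch of the effect, assuming made-up class ids (the VM's real cid names, values and ordering differ):

```cpp
#include <cstdint>

// Hypothetical class ids, ordered so that all implementations of one
// element size are adjacent.
enum ClassId : std::int32_t {
  kOtherCid = 0,
  // The three Uint8List implementations sit next to each other ...
  kInternalUint8Cid,
  kViewUint8Cid,
  kExternalUint8Cid,
  // ... followed by the three Int16List implementations.
  kInternalInt16Cid,
  kViewInt16Cid,
  kExternalInt16Cid,
};

// "<obj> is Uint8List" as one range check: subtracting the lower bound in
// unsigned arithmetic lets a single comparison cover both ends of the range.
constexpr bool IsUint8List(std::int32_t cid) {
  return static_cast<std::uint32_t>(cid) -
             static_cast<std::uint32_t>(kInternalUint8Cid) <=
         static_cast<std::uint32_t>(kExternalUint8Cid) -
             static_cast<std::uint32_t>(kInternalUint8Cid);
}
```

If the ids were instead grouped by kind (all internal, then all external, then all views), the same test would need three separate comparisons.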

Issue #35154

Cq-Include-Trybots: luci.dart.try:vm-canary-linux-debug-try, vm-dartkb-linux-debug-x64-try, vm-dartkb-linux-release-x64-try, vm-kernel-asan-linux-release-x64-try, vm-kernel-checked-linux-release-x64-try, vm-kernel-linux-debug-ia32-try, vm-kernel-linux-debug-simdbc64-try, vm-kernel-linux-debug-x64-try, vm-kernel-linux-product-x64-try, vm-kernel-linux-release-ia32-try, vm-kernel-linux-release-simarm-try, vm-kernel-linux-release-simarm64-try, vm-kernel-linux-release-simdbc64-try, vm-kernel-linux-release-x64-try, vm-kernel-optcounter-threshold-linux-release-ia32-try, vm-kernel-optcounter-threshold-linux-release-x64-try, vm-kernel-precomp-android-release-arm-try, vm-kernel-precomp-bare-linux-release-simarm-try, vm-kernel-precomp-bare-linux-release-simarm64-try, vm-kernel-precomp-bare-linux-release-x64-try, vm-kernel-precomp-linux-debug-x64-try, vm-kernel-precomp-linux-product-x64-try, vm-kernel-precomp-linux-release-simarm-try, vm-kernel-precomp-linux-release-simarm64-try, vm-kernel-precomp-linux-release-x64-try, vm-kernel-precomp-obfuscate-linux-release-x64-try, vm-kernel-precomp-win-release-simarm64-try, vm-kernel-precomp-win-release-x64-try, vm-kernel-reload-linux-debug-x64-try, vm-kernel-reload-linux-release-x64-try, vm-kernel-reload-rollback-linux-debug-x64-try, vm-kernel-reload-rollback-linux-release-x64-try, vm-kernel-win-debug-ia32-try, vm-kernel-win-debug-x64-try, vm-kernel-win-product-x64-try, vm-kernel-win-release-ia32-try, vm-kernel-win-release-x64-try
Change-Id: If27cbc68dcd1bb93e26fc943e2b2a92ad9c43305
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/98323
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
dart-bot pushed a commit that referenced this issue Apr 4, 2019
This is the first CL which takes advantage of the newly added inner
pointer in the RawTypedDataBase in a polymorphic way (so far this
pointer was only used if we knew the object is external typed data).

Issue #35154

Change-Id: I80f915774aa650b7943bd6b4a1d5ffdcb7952b21
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/98571
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
Commit-Queue: Martin Kustermann <kustermann@google.com>
dart-bot pushed a commit that referenced this issue Apr 4, 2019
…in values after GenericCheckBound

Before this CL the range analysis was only constraining bounds for the
JIT-based CheckArrayBounds. This CL enables doing the same for
GenericCheckBound by extracting a common base class and making the range
analysis use it.

Issue #35154

Change-Id: Ie23b297ea0b133dd9dee2cff460cd9f24301152f
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/98460
Commit-Queue: Martin Kustermann <kustermann@google.com>
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
dart-bot pushed a commit that referenced this issue Apr 5, 2019
…sses

Based on the unified typed data layout, we can now inline accesses to
typed data interface classes if there are no 3rd party implementations
of those interfaces.

Example: If a receiver is of type Uint8List and we call `[]` or `[]=` we
will inline the byte access.

Instead of changing the existing inliner / call specializer we add this
as an extra pass: If the inliner / call specializer infers that the
receiver type is e.g. internal typed data then it will perform the
inlining itself using the more optimized LoadIndexed instruction.

=> Only if those existing optimization passes have not been able to inline
   the access will we, later on in the compilation pipeline, run a
   specialized pass which will inline the accesses using LoadUntagged +
   LoadIndexed (which is slightly less efficient than using only LoadIndexed
   for internal typed data).

As a first step this is only done for AOT.

For ease of writing tests matching certain IR graphs this CL also adds an
IR pattern matcher.

Issue #35154

Cq-Include-Trybots: luci.dart.try:vm-canary-linux-debug-try, vm-dartkb-linux-debug-x64-try, vm-dartkb-linux-release-x64-try, vm-kernel-asan-linux-release-x64-try, vm-kernel-checked-linux-release-x64-try, vm-kernel-linux-debug-ia32-try, vm-kernel-linux-debug-simdbc64-try, vm-kernel-linux-debug-x64-try, vm-kernel-linux-product-x64-try, vm-kernel-linux-release-ia32-try, vm-kernel-linux-release-simarm-try, vm-kernel-linux-release-simarm64-try, vm-kernel-linux-release-simdbc64-try, vm-kernel-linux-release-x64-try, vm-kernel-optcounter-threshold-linux-release-ia32-try, vm-kernel-optcounter-threshold-linux-release-x64-try, vm-kernel-precomp-android-release-arm-try, vm-kernel-precomp-bare-linux-release-simarm-try, vm-kernel-precomp-bare-linux-release-simarm64-try, vm-kernel-precomp-bare-linux-release-x64-try, vm-kernel-precomp-linux-debug-x64-try, vm-kernel-precomp-linux-product-x64-try, vm-kernel-precomp-linux-release-simarm-try, vm-kernel-precomp-linux-release-simarm64-try, vm-kernel-precomp-linux-release-x64-try, vm-kernel-precomp-obfuscate-linux-release-x64-try, vm-kernel-precomp-win-release-simarm64-try, vm-kernel-precomp-win-release-x64-try, vm-kernel-reload-linux-debug-x64-try, vm-kernel-reload-linux-release-x64-try, vm-kernel-reload-rollback-linux-debug-x64-try, vm-kernel-reload-rollback-linux-release-x64-try, vm-kernel-win-debug-ia32-try, vm-kernel-win-debug-x64-try, vm-kernel-win-product-x64-try, vm-kernel-win-release-ia32-try, vm-kernel-win-release-x64-try

Change-Id: I5f2e01a55f46b473f64478b05679f65b9fd7c4c8
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/98662
Commit-Queue: Martin Kustermann <kustermann@google.com>
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
dart-bot pushed a commit that referenced this issue Apr 8, 2019
…yped data to be fast as well in string decoding

This improves the binary protobuf decoding benchmark by 6% on x64/arm64
and 17% on arm32.

Issue #35154

Change-Id: I23ba48b79b6e2a7698707f81ae4122cb9590ed2c
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/98844
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
Commit-Queue: Martin Kustermann <kustermann@google.com>
dart-bot pushed a commit that referenced this issue Apr 26, 2019
…dIndexedInstr produce unboxed values

Currently the LoadIndexedInstr mostly boxes its resulting values, which
might then be untagged right afterwards in order to do binary int64
arithmetic on them.

Similarly, StoreIndexedInstr takes only boxed values as inputs, so tight
code using int64s has to box a value just so that the StoreIndexedInstr
can unbox it again.

To avoid the extra tagging/untagging we make the load return unboxed
values which can be optionally tagged afterwards if need be.
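
The round trip being avoided can be modeled roughly as below. This is only a sketch under simplifying assumptions: `Tag` and `Untag` are hypothetical helpers standing in for boxing and unboxing, modeled as a one-bit Smi-style tag and restricted to small non-negative values. Neither function is the VM's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using Tagged = std::int64_t;

// Hypothetical tagging: payload shifted left by one, low bit zero.
// Restricted to non-negative values below 2^62 to keep the shifts simple.
inline Tagged Tag(std::int64_t value) {
  assert(value >= 0 && value < (std::int64_t{1} << 62));
  return value << 1;
}

inline std::int64_t Untag(Tagged tagged) {
  assert(tagged >= 0 && (tagged & 1) == 0);
  return tagged >> 1;
}

// Boxed style: the load produces a tagged value and the store consumes one,
// so the arithmetic in between pays for an untag and a re-tag.
void AddOneBoxed(std::int64_t* data, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    Tagged loaded = Tag(data[i]);      // Load hands out a tagged value.
    std::int64_t raw = Untag(loaded);  // Arithmetic needs the raw value.
    Tagged result = Tag(raw + 1);      // Re-tag only to feed the store.
    data[i] = Untag(result);           // Store untags it again.
  }
}

// Unboxed style: load, arithmetic and store all work on raw values.
void AddOneUnboxed(std::int64_t* data, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    data[i] = data[i] + 1;
  }
}
```

Both functions compute the same result. The second one drops the tag/untag pairs, and a consumer that really needs a tagged value can still tag the loaded value afterwards.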

Issue #35154

Change-Id: I7e82107c2047e41699854e1534934a7cb9da691d
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/99149
Commit-Queue: Martin Kustermann <kustermann@google.com>
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
@mkustermann
Member Author

We have done a lot of optimizations to improve protobuf decoding. In the string-heavy protobuf benchmark we have reduced decoding time by 60%.

Closing this issue, since we have achieved a major improvement and have no more immediate work planned on this atm.

tekknolagi pushed a commit to tekknolagi/dart-assembler that referenced this issue Nov 3, 2020