Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Access to File.readAsBytes result is slower than expected #29892

Closed
lexaknyazev opened this issue Jun 15, 2017 · 14 comments
Closed

Access to File.readAsBytes result is slower than expected #29892

lexaknyazev opened this issue Jun 15, 2017 · 14 comments
Assignees
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends.

Comments

@lexaknyazev
Copy link
Contributor

Can't provide a small repro atm, but here is my observation:

Data access in this code

// Load data from file
final bytes = new File('data.bin').readAsBytesSync() as Uint8List;
// Do lots of random access with different aligned typed views (Float32, Uint16)

is about 2 times slower than here:

// Load data from file
final bytes = new Uint8List.fromList(new File('data.bin').readAsBytesSync());
// Do lots of random access with different aligned typed views (Float32, Uint16)

Same applies to async version of File.readAsBytes.

@lexaknyazev
Copy link
Contributor Author

Observatory shows that about 45% of time is being spent within Native (TypedData_GetFloat32, TypedData_GetUint16) in the first (slow) case. In the second (fast) case, Native takes only 10%,

@floitschG floitschG added area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. library-io labels Jun 15, 2017
@floitschG
Copy link
Contributor

@zanderso, @a-siva Assigning to dart:io for now. Could be a VM perf problem, too.

@zanderso zanderso added the needs-info We need additional information from the issue author (auto-closed after 14 days if no response) label Jun 15, 2017
@zanderso
Copy link
Member

It's likely that we'll need the full example in order to diagnose this.

@lexaknyazev
Copy link
Contributor Author

lexaknyazev commented Jun 15, 2017

Here's the smallest example, that I could isolate. Run each version with --observe and compare Native time.

Slow

import 'dart:io';
import 'dart:typed_data';

const kSize = 10 * 1024 * 1024;

void main() {
  new File('tmp.bin').writeAsBytesSync(new Uint8List(kSize));

  final foos = <Uint8List>[
    new File('tmp.bin').readAsBytesSync() as Uint8List,
    new Uint8List(kSize)
  ];

  for (var foo in foos) {
    for (var e in new Uint8List.view(foo.buffer)) {}
  }
}

Fast

import 'dart:io';
import 'dart:typed_data';

const kSize = 10 * 1024 * 1024;

void main() {
  new File('tmp.bin').writeAsBytesSync(new Uint8List(kSize));

  final foos = <Uint8List>[
    new Uint8List.fromList(new File('tmp.bin').readAsBytesSync()),
    new Uint8List(kSize)
  ];

  for (var foo in foos) {
    for (var e in new Uint8List.view(foo.buffer)) {}
  }
}

When both elements of foos array contain similarly-produced objects (i.e., both from File or both from new Uint8List(kSize)), Native time is alike in both programs.

EDIT: shorten example code.

@lexaknyazev
Copy link
Contributor Author

@zanderso

It's likely that we'll need the full example in order to diagnose this.

Example code above has the same unusual Native CPU time share as a real application.
VM shows that the slow version deopts:

== Deoptimizing code for 'file:///C:/dart/dev/slow.dart_::_main', frame,
Eager deopt fp=0000008731efe8b0 pc=00000268cdbe50cb
Deoptimizing (reason 4 'CheckClass') at pc=00000268cdbe50cb fp=0000008731efe8b0 'file:///C:/dart/dev/slow.dart_::_main' (count 0)
Deoptimizing 'file:///C:/dart/dev/slow.dart_::_main' (count 0)
  Function: file:///C:/dart/dev/slow.dart_::_main
  Line 15: '    for (var e in new Uint8List.view(foo.buffer)) {}'
  Deopt args: 0
Switching 'dart:typed_data__TypedListIterator@6027147_moveNext' to unoptimized code because guard on field 'Field <_TypedListView@6027147._typedData@6027147>: final' was violated.
Switching 'dart:typed_data__Uint8ArrayView@6027147_[]' to unoptimized code because guard on field 'Field <_TypedListView@6027147._typedData@6027147>: final' was violated.
Switching 'file:///C:/dart/dev/slow.dart_::_main' to unoptimized code because guard on field 'Field <_TypedListView@6027147._typedData@6027147>: final' was violated.

@lexaknyazev
Copy link
Contributor Author

@floitschG @zanderso
Have you had a chance to diagnose this yet?

@floitschG
Copy link
Contributor

/cc @mraleph (in case you are interested or have an idea of what's going on).

@mraleph
Copy link
Member

mraleph commented Jun 23, 2017

Taking a look

@mraleph mraleph self-assigned this Jun 23, 2017
@mraleph
Copy link
Member

mraleph commented Jun 23, 2017

This is a regression originally introduced by ae8e670 at least for the microbenchmark case

In Uint8ArrayView.operator[] we have a polymorphic call to _TypedList._getUint8 with receivers _Uint8List and _ExternalUint8Array. To reach good performance we need to polymorphically inline it. Those methods are natives and have no Dart bodies, so their inlining is handled by FlowGraphInliner::TryInlineRecognizedMethod.

However in the polymorphic inliner we only reach recognized method inlining if we have a single cid call target:

bool PolymorphicInliner::TryInliningPoly(const TargetInfo& target_info) {
  if ((!FLAG_precompiled_mode ||
       owner_->inliner_->use_speculative_inlining()) &&
      target_info.IsSingleCid() &&
      TryInlineRecognizedMethod(target_info.cid_start, *target_info.target)) {
    owner_->inlined_ = true;
    return true;
  }
  // ...

Which means that after ae8e670 we no longer inline _TypedList._getUint8 and friends at polymorphic call-sites - because they are never represented as single-cid call target as all children of _TypedList have the same _getUint8 method.

@ErikCorryGoogle - please take a look.

@mraleph mraleph assigned ErikCorryGoogle and unassigned mraleph Jun 23, 2017
@mraleph mraleph added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. and removed area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. library-io needs-info We need additional information from the issue author (auto-closed after 14 days if no response) labels Jun 23, 2017
@mraleph
Copy link
Member

mraleph commented Jun 23, 2017

When I disable cid range expansion here is what I get (with const kSize = 10 * 10 * 1024 * 1024):

╭─ ~/src/dart/sdk ‹master›
╰─$ time out/ReleaseX64/dart t/bench-fast.dart
out/ReleaseX64/dart t/bench-fast.dart  0.94s user 0.81s system 102% cpu 1.717 total
╭─ ~/src/dart/sdk ‹master›
╰─$ time out/ReleaseX64/dart t/bench-slow.dart
out/ReleaseX64/dart t/bench-slow.dart  1.17s user 0.69s system 102% cpu 1.811 total

@ErikCorryGoogle
Copy link
Contributor

Thanks for tracking this down!

I have a suggested fix here: https://codereview.chromium.org/2956653002

@mraleph
Copy link
Member

mraleph commented Jun 23, 2017

@lexaknyazev how much you can share about your full code in which you are using Dart?

I am really interested in assessing if our approach to implementing typed array views scales well for real world use cases.

@lexaknyazev
Copy link
Contributor Author

@mraleph
The product hasn't been officially released yet, but I can point to the preview repo and datasets.

Source tree (will replace master branch soon):
https://github.com/KhronosGroup/glTF-Validator/tree/dev

Dataset:
https://github.com/KhronosGroup/glTF-Sample-Models/

Cmd-line (cd to dataset dir, assuming that source snapshot was created from /bin/gltf_validator.dart):

$ dart gltf_validator.snapshot -r -p -w ./2.0/

ErikCorryGoogle added a commit that referenced this issue Jun 24, 2017
When dispatching polymorphic calls we try to recognize the case where
adjacent classes have the same method, so that we can group the classes
in the dispatch.  That fails for polymorphic methods, which are pretending to
be the same method, but actually do a class test on the receiver internally. In
this case, we should not expand the classes, because it messes up some other
inlining optimizations.

R=vegorov@google.com
BUG=#29892

Review-Url: https://codereview.chromium.org/2956653002 .
@mraleph
Copy link
Member

mraleph commented Jun 29, 2017

This issue has been fixed.

@mraleph mraleph closed this as completed Jun 29, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends.
Projects
None yet
Development

No branches or pull requests

5 participants