Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refractor compute #99527

Merged
merged 24 commits into from
Mar 17, 2022
Merged

Refractor compute #99527

merged 24 commits into from
Mar 17, 2022

Conversation

jellynoone
Copy link
Contributor

@jellynoone jellynoone commented Mar 4, 2022

This PR fixes #90693 and makes the compute method more efficient by using a single ReceivePort instead of 3.

The code changes were inspired by @irhn 's comment in a Dart SDK issue.

Fixes #90693

Pre-launch Checklist

  • I read the [Contributor Guide] and followed the process outlined there for submitting PRs.
  • I read the [Tree Hygiene] wiki page, which explains my responsibilities.
  • I read and followed the [Flutter Style Guide], including [Features we expect every widget to implement].
  • I signed the [CLA].
  • I listed at least one issue that this PR fixes in the description above.
  • I updated/added relevant documentation (doc comments with ///).
  • I added new tests to check the change I am making, or this PR is [test-exempt].
  • All existing and new tests are passing.

Send all events to one ReceivePort. This is both simpler and more
efficient.
To correctly identify which event was sent the result returned by the
callback passed by the user is wrapped in a List. This way the following
is true.

When the first event received from the port is:
1. null; it is an exit event i. e. the isolate exited prematurely
2. a single element list; it is a success event i. e. the isolate and
computed were successfully completed
3. a double element list; it is an error event i. e. the isolate or user
provided callback failed

Any other type of event is impossible to receive. The first and third
options are guaranteed by the Dart SDK.  While the second we guarantee
ourselves.

We have to wrap the result in a List, because the type of the user
callback could also be a List. Meaning, a simple `result is R` could
return true for an error event. (Which is specified by the Dart SDK)
@flutter-dashboard
Copy link

It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat (don't just cc him here, he won't see it! He's on Discord!).

If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix?

Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.

@flutter-dashboard flutter-dashboard bot added the framework flutter/packages/flutter repository. See also f: labels. label Mar 4, 2022
@jellynoone jellynoone changed the title Refractor compute to use single ReceivePort for all events Refractor compute to use single ReceivePort for all events Mar 4, 2022
@jellynoone
Copy link
Contributor Author

jellynoone commented Mar 4, 2022

There is also an edge case here that also wasn't handled in the previous iteration of the code, which is that no error will be reported if the entry point fails after the sending the result back to the main Isolate.

This is because after receiving the data we close all the ports as such no additional handlers are called.

However, this is only an issue for the development team not the clients / consumers.

I've been thinking a great deal about this and there is no simple or secure way of doing this that would be 100% future-proof. Also it's also impossible to test this because it would not something the caller is responsible for but the developer.

To illustrate when this issue would occur:

// if this were to be changed
Isolate.exit(configuration.resultPort, <R>[result]);
//  to something like this
configuration.resultPort.send(<R>[result]);
configuration.resultPort.send(<R>[result]); // This send event would be ignored
// or 
configuration.resultPort.send(<R>[result]);
throw Error(); // This error would not be reported

The previous implementation also ignored additional success / result events, while reported an additional error. But because close() was called on the ReceivePort it actually wasn't handled. (I tested this).

To actually force the handlers to be called one would have to delay returning and do something like await Future.delay(Duration.zero); before closing the ports. And this STILL isn't guaranteed because the failing code could be some newly added listener or delayed response. I. e.

configuration.resultPort.send(<R>[result]);
await Future.delay(Duration(seconds: 1), () => throw Error());

However, as long as we call Isolate.exit and only send one response through the SendPort we should be fine. Even if some erroneous listener was added in the user provided callback, since the entire Isolate is killed.

Am I overthinking this? For now I'll just add a comment and if anyone has a better solution in mind, please let me know.

@jellynoone
Copy link
Contributor Author

@dnfield Would you mind reviewing this?

The API we're currently using doesn't require the full interface the
ReceivePort provides. Additionally we are not restricted / affected by
the low-levelness of the RawReceivePort.

After all, we are simply scheduling a callback which closes the port on
the first message received.
@jellynoone
Copy link
Contributor Author

I also added additional tests:

  • calling compute from compute callback,
  • testing executing Isolate.exit from the compute callback, (Tests the exit branch (which wasn't tested before))
  • and all of the different types of Isolate.current.kill i.e. immediate / beforeNextEvent

Copy link
Contributor

@dnfield dnfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM. Would appreciate a review from @aam or @lrhn as well.

@dnfield dnfield requested a review from aam March 5, 2022 08:07
@jellynoone
Copy link
Contributor Author

@dnfield from the looks of it in Dart 2.17 a new method Isolate.run is added which is essentially the same as compute (in terms of functionality).

The differences I see in the implementation is:

  1. we time the functions i.e. Timeline
  2. we use Exception instead of RemoteError or the error that was actually thrown by the user provided function
  3. I see there is some additional safety around closing the receive port
  4. and they use non-growable lists

For now, I will address 3. and 4.
Because 1. and 2. would be a breaking change in compute, I will leave those as is.

How should we proceed?

  • deprecate compute and lose the Timeline
  • keep our own version

@jellynoone jellynoone requested a review from dnfield March 6, 2022 19:59
@jellynoone
Copy link
Contributor Author

After taking a second look at the new Isolate.run implementation it does not have exactly the same functionality.

The run method accepts a no arguments function.
While we require a function with an argument and the argument to be passed to it.

Which means the transition for end users wouldn't be just replacing the call but still relatively trivial if we chose to deprecate compute.

Anyways, awaiting additional feedback.

@dnfield
Copy link
Contributor

dnfield commented Mar 6, 2022

I think it would make sense to do the following:

  • Deprecate compute, and point people instead to Isolate.run.
  • Make its functionality forward to Isolate.run. Keep the timeline functionality in compute.

@goderbauer @Piinks wdyt?

@jellynoone
Copy link
Contributor Author

jellynoone commented Mar 6, 2022

If we want to keep timing exactly intact, we need to actually implement the function ourselves. Since the implementation timed the ReceivePort constructing and closing.

If we're OK with simply timing the user callback then the new implementation is essentially:

// Copyright 2014 The Flutter Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

import 'dart:async';
import 'dart:developer';
import 'dart:isolate';

import 'constants.dart';
import 'isolates.dart' as isolates;

/// The dart:io implementation of [isolate.compute].
Future<R> compute<Q, R>(isolates.ComputeCallback<Q, R> callback, Q message, { String? debugLabel }) {
  debugLabel ??= kReleaseMode ? 'compute' : callback.toString();

  return Isolate.run(() => Timeline.timeSync(
    debugLabel!,
    () => callback(message),
  ));
}

or

// Copyright 2014 The Flutter Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

import 'dart:async';
import 'dart:developer';
import 'dart:isolate';

import 'constants.dart';
import 'isolates.dart' as isolates;

void _void() {}

/// The dart:io implementation of [isolate.compute].
Future<R> compute<Q, R>(isolates.ComputeCallback<Q, R> callback, Q message, { String? debugLabel }) async {
  debugLabel ??= kReleaseMode ? 'compute' : callback.toString();

  final Flow flow = Flow.begin();
  final int flowId = flow.id;

  Timeline.timeSync('$debugLabel: start', _void, flow: flow);
  final R result = await Isolate.run(_fn(callback, message, flowId, debugLabel));
  Timeline.timeSync('$debugLabel: end', _void, flow: Flow.end(flowId));

  return result;
}

FutureOr<R> Function() _fn<Q, R>(isolates.ComputeCallback<Q, R> callback, Q message, int flowId, String debugLabel) {
  return () {
    return Timeline.timeSync(
      debugLabel,
      () => callback(message),
      flow: Flow.step(flowId)
    );
  };
}

if we wish to fake the starts and ends. However, should we take the approach of wrapping the Isolate.run I think the second approach should be preferred since Timeline.timeSync won't time async functions. Thus simply returning the Future (first code simple) would for async compute functions time shorter executions even though they would still last.

Of course, regardless of which wrapping we might chose. Test expectations will need to be changed (because of changed error reporting) and wait for the raised issue in the Dart SDK to be resolved.

@goderbauer
Copy link
Member

My preferred way would be to keep the compute functionality, but implement it in terms of the new Dart API. The tracing that compute adds over the raw Dart API has been useful. The important bits to keep for tracing here are the flow events (going from the point the computation was kicked of to the computation to the point where the result of the computation is received) and the duration of the computation. Measuring how long it takes to instantiate and close the ports is not important, IMO. I believe that was just done in the current implementation to get the flow events inserted properly.

I don't think we should deprecate compute altogether since it does offer the timeline tracing as an additional value. We should adjust the documentation, though, to document this difference over the other API.

Is the switch from Exception to RemoteError actually breaking anything?

Copy link
Contributor

@dnfield dnfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM.

Thanks for sticking with this. It may not be apparent but your work here has helped improve things around the Isolate.run API and identify some key areas needed to improve that.

@dnfield
Copy link
Contributor

dnfield commented Mar 17, 2022

This still needs a second LGTM - @goderbauer?

@skia-gold
Copy link

Gold has detected about 1 new digest(s) on patchset 24.
View them at https://flutter-gold.skia.org/cl/github/99527

Copy link
Member

@goderbauer goderbauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jellynoone
Copy link
Contributor Author

Still LGTM.

Thanks for sticking with this. It may not be apparent but your work here has helped improve things around the Isolate.run API and identify some key areas needed to improve that.

@dnfield Thank you very much. It was a fun exercise.

@jellynoone
Copy link
Contributor Author

jellynoone commented Mar 17, 2022

All tests are green, and I just wanted to add one last note for documentation purposes of the changes.

User facing changes (tiny):

  • one (redundant) timing event was removed;
  • the spawned Isolate is now named with the provided debugLabel;

User facing changes (minor):

  • Exceptions and errors thrown by the computation are re-thrown in the main Isolate;
  • Errors thrown in the computation must be sendable between Isolates
  • RemoteError is thrown instead of an Exception when:
  1. a user does something freaky like immediately kill or exit the current Isolate from the computation;
  2. a user sends back a response which cannot be sent through the SendPort.send;
  3. a user throws an error from the computation which cannot be sent through the SendPort.send; or
  4. we made a mistake somewhere (should only happen when/if new code is add.

Internal:

  • use 1 RawReceivePort instead of 3 ReceivePorts,
  • simplify types and related operations,
  • wrap computation result or error in List to easily differentiate between different event types

@jellynoone
Copy link
Contributor Author

@mit-mit I don't know if you're the right person to tag.

I just want to make sure users aren't forgotten to be informed about changes in my previous post (mostly exception handling).

@lrhn
Copy link
Contributor

lrhn commented Mar 18, 2022

(About the assert: Until all programs are soundly null-safe, as Foo is potentially effectively doing is Foo?.)

@jellynoone
Copy link
Contributor Author

Aha!!! Thank you for the explanation.

@jellynoone jellynoone deleted the improve_compute branch March 22, 2022 08:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
framework flutter/packages/flutter repository. See also f: labels.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

compute's use of SendPorts is not safe
8 participants