Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide CFE functionality whether is/as checks can omit checking the type arguments #54998

Open
mkustermann opened this issue Feb 23, 2024 · 13 comments
Assignees
Labels
area-front-end Use area-front-end for front end / CFE / kernel format related issues. cfe-encodings Encoding related CFE issues.

Comments

@mkustermann
Copy link
Member

mkustermann commented Feb 23, 2024

For backends it's usually much simpler & faster to check whether an object is an instance of a class than to check whether it's an instance of an interface type with arguments.

Concretely, we have <obj> is/as <type> expressions where <type> is an InterfaceType but with non-defaults-to-bounds type arguments. Doing those checks is expensive.

Now there's a subset of cases where we actually have a static guarantee that the type arguments will match and therefore don't have to be checked, reducing the check to a simple obj instanceof Class check.

Here's a few examples where type argument checks are not needed:

class B0<T> {}
class B1<T, H> extends B0<int> {}
class B2<T> extends B1<T, T> {}

main() {
  final B1<num, num> a
  a is B2<num>

  final B1<int, int> a
  a is B2<num>
  a is B2<int>

  final B1<List<int>, List<int>> a
  a is B2<List<num>>
  a is B2<List<int>>
}

class X<T extends num> {
  void test1() {
    final Iterable<T> a;
    a is List<T>;

    final B1<T, T> a
    a is B2<T>

    final B1<List<T>, List<T>> a
    a is B2<List<T>>
    a is B2<List<num>>
  }
}

The algorithm to determine whether type arguments don't have to be checked for a <obj> is X<A0, ... An> with <obj> having static type Y<B0, ..., Bm> could be something like this:

  • Take the raw/this-type of X, i.e. X<X::T, X::T2, X::T3>
  • Find the path from X to Y
  • Apply the super types from this path using the raw type
  • Match the resulting Y<Expr-1(X::T1..Tn), ..., Expr-m(X::T1..Tn)> against the static type of operand (Y<B0, ..., Bm>)
    • finding values for all type parameters X::T1 ... X::Tn
    • if we find consistent assignments Operand(X::Ti) for all X::Ti
    • and X<Operand(X::Ti), ..., Operand(X::Tn)> is a subtype of X<A0, ..., An>
    • then we can skip checking the type arguments and only check whether obj is an X.

So the concrete request I have is

  • Could CFE team make such an algorithm available in package:kernel/*.dart
  • Could the CFE annotate AsExpression and IsExpression with a bit indicating whether the type arguments don't need to be checked

/cc @johnniwinther @chloestefantsova?

@mkustermann mkustermann added the area-front-end Use area-front-end for front end / CFE / kernel format related issues. label Feb 23, 2024
@chloestefantsova
Copy link
Contributor

That's a very neat optimization that I believe is quite possible, so my answer to both questions would be "yes". I also think it shouldn't take much time, and I'd be happy to implement it.

A related question is how useful such a bit on AsExpression and IsExpression would be to other backends.

@johnniwinther WDYT?

@johnniwinther
Copy link
Member

That seems like a valuable flag to have on AsExpression and IsExpression.

@mkustermann Would it be useful for your use-case to also mark checks against record types as "only structural" (for instance is (Object?, Object?)?

@johnniwinther johnniwinther added the cfe-encodings Encoding related CFE issues. label Feb 23, 2024
@mkustermann
Copy link
Member Author

mkustermann commented Feb 23, 2024

I also think it shouldn't take much time, and I'd be happy to implement it.

See simple version in cl/353900 (as part of a bugfix in dart2wasm compiler)

@mkustermann Would it be useful for your use-case to also mark checks against record types as "only structural" (for instance is (Object?, Object?)?

That may be helpful yes, as we can have fast ways to check whether an object has a certain shape.

A related question is how useful such a bit on AsExpression and IsExpression would be to other backends.

Definitely useful for wasm, vm. Very likely also for dart2js /cc @rakudrama

copybara-service bot pushed a commit that referenced this issue Feb 23, 2024
The change in [0] was somewhat finishing RTT checks but didn't
fix that part correctly. A later fix to that code in [1] also
didn't fix the underlying issue.

[0] https://dart-review.googlesource.com/c/sdk/+/249641/22/pkg/dart2wasm/lib/types.dart#471
[1] https://dart-review.googlesource.com/c/sdk/+/292881 (which got
relanded)

Closes #54994
Issue #54998

Change-Id: I507070514e98cb66a57f4f7f08906a32993265d5
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/353900
Commit-Queue: Martin Kustermann <kustermann@google.com>
Reviewed-by: Ömer Ağacan <omersa@google.com>
@chloestefantsova chloestefantsova self-assigned this Feb 28, 2024
@chloestefantsova
Copy link
Contributor

@mkustermann I'm looking into this optimization now, and I have multiple design alternatives for the flags on IsExpression and AsExpression. Basically, we have two boolean parameters that we can variate the number and the meaning of the flags with: (1) whether we want type-kind specific flags or not, and (2) whether we want the distinction between operand-dependent and operand-independent shape checks or not.

The type-kind specific flags are more or less self explanatory: they indicate what kind of shape that should be checked. We have the following possibilities for non-simple and not-erased types in Kernel: interface types, record types, function types, and FutureOr types.

The flags that make distinction between the operand-dependent and operand-independent checks signal whether the optimizing flag computation took the static type of the operand expression into account. For example, in the check x is List<num> we may conclude that checking for the runtime type of the value stored in x being any List is sufficient because we know that the static type of the expression x is Iterable<int>. In that case we say that the shape-only check is sufficient, and it is operand-dependent. In contrast with that, in the check x is List<dynamic> we conclude that checking for the runtime type of the value stored in x for being any List is sufficient because List<dynamic> is a supertype of any List. In that case we say that the shape-only check is sufficient, and it is operand-independent.

Please let me know which of the alternatives would work the best for you.

Variant 0: We do want the type-specific flags, and we do want to distinguish between the operand-dependent and operand-independent shape checks

We add the following flags to IsExpression and AsExpression AST nodes in Kernel.

  • isOperandIndependentInterfaceTypeShapeCheckSufficient
  • isOperandIndependentRecordTypeShapeCheckSufficient
  • isOperandIndependentFunctionTypeShapeCheckSufficient
  • isOperandIndependentFutureOrTypeShapeCheckSufficient
  • isOperandDependentInterfaceTypeShapeCheckSufficient
  • isOperandDependentRecordTypeShapeCheckSufficient
  • isOperandDependentFunctionTypeShapeCheckSufficient
  • isOperandDependentFutureOrTypeShapeCheckSufficient

The following is an example of those flags being set.

class A<X> {}
class B<Y extends num> extends A<Y> {}

test(A<int> a, (int,) r, Function(num) f, FutureOr<int> fo) {
  a as B<int>; // isOperandDependentInterfaceTypeShapeCheckSufficient = true
  a as B<num>; // isOperandIndependentInterfaceTypeShapeCheckSufficient = true

  r as (num,); // isOperandDependentRecordTypeShapeCheckSufficient = true
  r as (Object?,); // isOperandIndependentRecordTypeShapeCheckSufficient = true

  f as Function(int); // isOperandDependentFunctionTypeShapeCheckSufficient = true
  f as Function(Never); // isOperandIndependentFunctionTypeShapeCheckSufficient = true

  fo as FutureOr<num>; // isOperandDependentFutureOrTypeShapeCheckSufficient = true
  fo as FutureOr<dynamic>; // isOperandIndependentFutureOrTypeShapeCheckSufficient = true
}

Variant 1: We do want the type-specific flags, but we don’t want to distinguish between the operand-dependent and operand-independent shape checks

We add the following flags to IsExpression and AsExpression AST nodes in Kernel.

  • isInterfaceTypeShapeCheckSufficient
  • isRecordTypeShapeCheckSufficient
  • isFunctionTypeShapeCheckSufficient
  • isFutureOrTypeShapeCheckSufficient

The following is an example of those flags being set.

class A<X> {}
class B<Y extends num> extends A<Y> {}

test(A<int> a, (int,) r, Function(num) f, FutureOr<int> fo) {
  a as B<int>; // isInterfaceTypeShapeCheckSufficient = true
  a as B<num>; // isInterfaceTypeShapeCheckSufficient = true

  r as (num,); // isRecordTypeShapeCheckSufficient = true
  r as (Object?,); // isRecordTypeShapeCheckSufficient = true

  f as Function(int); // isFunctionTypeShapeCheckSufficient = true
  f as Function(Never); // isFunctionTypeShapeCheckSufficient = true

  fo as FutureOr<num>; // isFutureOrTypeShapeCheckSufficient = true
  fo as FutureOr<dynamic>; // isFutureOrTypeShapeCheckSufficient = true
}

Variant 2: We don’t want the type-specific flags, but we do want to distinguish between the operand-dependent and operand-independent shape checks

We add the following flags to IsExpression and AsExpression AST nodes in Kernel.

  • isOperandIndependentShapeCheckSufficient
  • isOperandDependentShapeCheckSufficient

The following is an example of those flags being set.

class A<X> {}
class B<Y extends num> extends A<Y> {}

test(A<int> a, (int,) r, Function(num) f, FutureOr<int> fo) {
  a as B<int>; // isOperandDependentShapeCheckSufficient = true
  a as B<num>; // isOperandIndependentShapeCheckSufficient = true

  r as (num,); // isOperandDependentShapeCheckSufficient = true
  r as (Object?,); // isOperandIndependentShapeCheckSufficient = true

  f as Function(int); // isOperandDependentShapeCheckSufficient = true
  f as Function(Never); // isOperandIndependentShapeCheckSufficient = true

  fo as FutureOr<num>; // isOperandDependentShapeCheckSufficient = true
  fo as FutureOr<dynamic>; // isOperandIndependentShapeCheckSufficient = true
}

Variant 3: We don’t want the type-specific flags, and we don’t want to distinguish between the operand-dependent and operand-independent shape checks

We add the following flags to IsExpression and AsExpression AST nodes in Kernel.

  • isShapeCheckSufficient
class A<X> {}
class B<Y extends num> extends A<Y> {}

test(A<int> a, (int,) r, Function(num) f, FutureOr<int> fo) {
  a as B<int>; // isShapeCheckSufficient = true
  a as B<num>; // isShapeCheckSufficient = true

  r as (num,); // isShapeCheckSufficient = true
  r as (Object?,); // isShapeCheckSufficient = true

  f as Function(int); // isShapeCheckSufficient = true
  f as Function(Never); // isShapeCheckSufficient = true

  fo as FutureOr<num>; // isShapeCheckSufficient = true
  fo as FutureOr<dynamic>; // isShapeCheckSufficient = true
}

@chloestefantsova
Copy link
Contributor

See simple version in cl/353900 (as part of a bugfix in dart2wasm compiler)

Thanks! I see it may not handle some of the type combinations, for example, the one in the example below won't be optimized even though it could be, but it gives a good idea.

class A<X> {}
class B<Y> extends A<Y> {}
test(A<Function(num)> a) => a is B<Function(num)>;

@mkustermann
Copy link
Member Author

mkustermann commented Feb 28, 2024

For backends it's somewhat irrelevant why a check can be simplified. It only matters that the check can be simplified. (So I think it's not important to distinguish whether the optimization can be made because it took the static type into consideration or not)

The kind of simplification depends on the DartType:

  • on <obj> is/as InterfaceType<...>: It means we can avoid checking the type arguments
  • on <obj> is (Object?, Object?): It means we can check the record shape only, not the individual types

I think it makes sense to separate the two, as they are different concepts.

Guess this would then be Variant 1?

One additional thing that may(?) make sense to add a flag for:

T? value;
value is T;
value as T;

=> e.g. a IsExpression.isNullCheckSufficient.

@chloestefantsova
Copy link
Contributor

Guess this would then be Variant 1?

Thanks! I'll go with this one then. A related clarification: would it be better to have the set of four flags or an enum instead, where one of the values will signal that no optimizations can be made?

=> e.g. a IsExpression.isNullCheckSufficient.

Good point about nullability check! I'll make sure to include that.

@rakudrama
Copy link
Member

I am skeptical that this will be much use to dart2js. I propose an alternative.

dart2js already detects the nullable-removal pattern x as T and compiles this as x == null ? x as T : x. The logic here is single test. It does not seem worth encoding that test in a serialized format.
Are there cases that the CFE could detect that are missed here?

Another way to express the 'shape' optimization would be to replace the type expression with one instantiated to bounds, or the structural equivalent. Often the only reason we write x is List<T> is because promotion does not work for x is List, but the front end could generate an IsExpression that uses a maximal type instead of the written type (replacing List<T> with List<Object?> or List<dynamic>). The main problem with using a maximal type for an as check is that it will change the error message, which could be confusing.

In many cases these maximal types are already optimized (obj instanceof Class).

Annotating IsExpression and AsExpression with flags solves only half of the problem.

  • It only addresses syntactic is and as tests, not other tests, like inlined parametric covariant checks
  • It is all-or-nothing, whereas the tested type might be partially simplified (for example, it is reasonable to expect is Map<int, dynamic> is faster than is Map<int, List<int>>)
  • Mutable flags open the door to inconsistencies when a transform rewrites part of the code.

dart2js and the VM both use some kind of compiled type testing stubs. This allows x as T to be fast when the type is simple, including parametric covariance checks. For dart2js, the stubs are somewhat generated at run time. It is not clear how to connect the flag on AsExpression to the lazy runtime generation of the stub.

A preferable scheme is the following

  • Compute a maximal equivalent type, one of:
    • CFE stores a maximal equivalent type on the AsExpression/IsExpression nodes, or
    • CFE provides a utility for computing a maximal type from the static type of the value operand and the tested type, or
  • All compilers may use the maximal type for is-tests.
  • Optimizing compilers can use the maximal type for as-checks if permissible to change the error message, or have a special kind of test that uses the maximal type for the test, but has at-hand the 'true' type for creating the Error.

@chloestefantsova
Copy link
Contributor

  • Compute a maximal equivalent type

I really like this! The maximal equivalent types cover a wider range of checks, and generating such helper types can be done with an extension of the algorithm I already had in mind for computing the bits. I think it’s a good idea to store the maximal equivalent types in IsExpressions and AsExpressions and preserve the level of precision of the runtime error messages.

I can see the appeal of the shape-check bits too: even if the maximal type is present, sometimes it can be helpful to know that only the shape of such type needs checking. Consider the following example.

class A<X> {}
class B<X, Y> extends A<X> {}

test(A<int> a) {
  // Maximal equivalent type: B<dynamic, String>.
  // isInterfaceTypeShapeCheckSufficient = false.
  a as B<num, String>;

  // Maximal equivalent type: B<dynamic, dynamic>.
  // isInterfaceTypeShapeCheckSufficient = true.
  a as B<num, dynamic>;
}
  • CFE stores a maximal equivalent type on the AsExpression/IsExpression nodes, or
  • CFE provides a utility for computing a maximal type from the static type of the value operand and the tested type, or

I see value in providing both, together with the shape-check bits or a shape-check enum value. @mkustermann, @rakudrama WDYT?

@mkustermann
Copy link
Member Author

My main request is to have the functionality shared inpackage:kernel instead of all backends doing their own (slightly different) implementation. If the functionality is there, then it's easy to compute the bits - so having them on the IsExpression / AsExpression isn't so important.

The bits on the AST nodes would be a convenience - and wouldn't incur additional memory overhead as those nodes already an int flags. Adding more information to the nodes makes them consume more memory. Not sure what impact that would have (and as already mentioned, it wouldn't cover all cases as e.g. inlining would need to re-compute this for the operand type on call site and the implicit as check for parameter type)

In terms of precision: Only performing shape checks vs full type check covers many important cases, but it would of course be nice if it's more general.

Rewriting the existing type of the is/as check may loose information that backends may rely on in their type propagator. Maybe some backends have an advanced intersection type that can infer from Iterable<T> a; a as List<dynamic>; that a must be List<T> but this isn't true for all backends - it may also make getStaticType() on the nodes more expensive or less precise (?).

It is not clear how to connect the flag on AsExpression to the lazy runtime generation of the stub.

Can only speak for the VM: The flag actually tells what to do:

  • if flag says only nullability has to be checked which we can do inline
  • if flag says check only record shape, we only compare the shape which we can do inline
  • if flag says check subclass check ignoring type args, call a stub that checks the defaults-to-bounds type which will only check for subclasses

So maybe we can start out with having the algorithm in package:kernel?

@chloestefantsova
Copy link
Contributor

Rewriting the existing type of the is/as check may loose information that backends may rely on in their type propagator.

Yep, that would be an inconvenience. But I suggest to store the maximal equivalent type alongside the target check type, not to replace one with the other.

So maybe we can start out with having the algorithm in package:kernel?

Yep, we can certainly do that.

copybara-service bot pushed a commit that referenced this issue Mar 7, 2024
Part of #54998

Change-Id: Ib60e8fbce0ca3bef9b12b6a1950b1943ab86bd99
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/354861
Commit-Queue: Chloe Stefantsova <cstefantsova@google.com>
Reviewed-by: Martin Kustermann <kustermann@google.com>
Reviewed-by: Johnni Winther <johnniwinther@google.com>
copybara-service bot pushed a commit that referenced this issue Mar 14, 2024
…mization

The CFE added this functionality (in a more general form) in [0] so we can
remove our implementation in dart2wasm.

[0] https://dart-review.googlesource.com/c/sdk/+/354861

Issue #54998

Change-Id: I4f3159fc1697d09aa1af68cc333e4b377b71f497
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/357561
Commit-Queue: Martin Kustermann <kustermann@google.com>
Reviewed-by: Ömer Ağacan <omersa@google.com>
Reviewed-by: Chloe Stefantsova <cstefantsova@google.com>
copybara-service bot pushed a commit that referenced this issue Mar 25, 2024
Part of #54998

Change-Id: Ib33391d56208e66dba0396da712ca05d21faece0
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/358453
Commit-Queue: Chloe Stefantsova <cstefantsova@google.com>
Reviewed-by: Johnni Winther <johnniwinther@google.com>
@chloestefantsova
Copy link
Contributor

Sorry for the silent on the issue! Some time ago I implemented the boolean predicate @mkustermann asked, and it can be updated into an algorithm that produces the maximal equivalent types for the type checks too. @rakudrama is there still interest in such API?

@rakudrama
Copy link
Member

rakudrama commented Jun 12, 2024

@chloestefantsova Yes, a function to compute a maximal equivalent type would still be used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-front-end Use area-front-end for front end / CFE / kernel format related issues. cfe-encodings Encoding related CFE issues.
Projects
None yet
Development

No branches or pull requests

4 participants