Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request: Lint for near-guaranteed cast failure due to inference/covariance clash #1133

Open
eernstg opened this issue Aug 17, 2018 · 3 comments
Labels
lint-request type-code-health Internal changes to our tools and workflows to make them cleaner, simpler, or more maintainable type-enhancement A request for a change that isn't a bug

Comments

@eernstg
Copy link
Member

eernstg commented Aug 17, 2018

This is a proposal for a lint which is intended to flag some statically detectable situations in code where a failure at run-time is near-guaranteed.

Basic idea

To set the scene, here's a scenario with the structure that I'm recommending we should detect and flag:

  • Customer: Hello, I'd like to buy a phone!
  • Salesperson: Sounds great! Prices start at $10!
  • Customer: Here's exactly $10, now give me a Pixel 2 XL 128GB!
  • Salesperson: ... ☈ ...

The basic idea is that we have situations where inference is relying on a straightforward (invariant) type assumption to choose certain inferred types, and those inferred types are then used to create entities whose dynamic type is directly derived from the static assumption. The result of evaluating e at run time may then work in the single special case where the type was actually invariant, but it is guaranteed to fail in all other cases.

Lint criterion

Let e be an expression whose context type is covariant, and assume that inference provides actual type arguments or type annotations on the formal parameters or return value of a function literal in e. This lint shall then flag e if any of said inferred type annotations or actual type arguments occur covariantly or invariantly in the static type of e.

[Edit: A simple special case is described here].

Examples

In order to make it clearer what the lint criterion means, here's a bunch of examples. Further down there is an alternative explanation which uses existential types to show how it works in general.

class C<X> {
  C<X> c;
  List<X> list;
  Map<X, X> map;
  X Function() fun;
}

main() {
  C<num> c = C<int>(); // `num` vs. `int`: Non-trivial covariance.

  // Certain constructs create fresh objects, so they are vulnerable.
  c.c = C(); // Inferred as `C<num>()`: TypeError.
  c.list = []; // Inferred as `<num>[]`: TypeError.
  c.map = {}; // Inferred as `<num, num>{}`: TypeError.
  c.fun = () => c.list[0]; // Inferred as `num Function()`: TypeError.

  // But generic function invocations can also receive an inferred type
  // argument, and it will propagate during execution, potentially any
  // number of steps down the run-time stack, and may thus become
  // part of the type of a fresh object.

  C<X> f_c<X>() => C(); // Inferred as `C<X>()`.
  List<X> f_list<X>() => []; // Inferred as `<X>[]`.
  Map<X, X> f_map<X>() => <X, X>{}; // Issue arises also without inference.
  X Function() f_fun<X>(C<X> c) => () => c.map[c.list[0]];
  C<X> f_c2<X>(C<X> c) => c;

  c.c = f_c(); // Inferred as `f_c<num>()`, returns `C<num>()`: TypeError.
  c.list = f_list(); // Returns a `List<num>`: TypeError.
  c.map = f_map(); // Returns a `Map<num, num>`: TypeError.
  c.fun = f_fun(C()); // Returns a `num Function()`: TypeError.
  c.fun = f_fun(c); // Ditto.
  c.c = f_c2(C()); // Infers `f_c<num>(C<num>())`, returns C<num>: TypeError.

  // But this is fine: No fresh objects.
  c.c = f_c2(c);
}

In all these situations inference relies on the statically known type, which causes a fresh object to be created with elements from that statically known type baked in, but the target which receives said fresh object requires a subtype thereof.

This situation calls more loudly for static detection than ordinary covariance, because the fresh object(s) being created are guaranteed to violate the covariant constraints whenever we have any actual covariance.

The last line in main gives an example of a situation where no fresh objects are created. Regular downcasts are present, but they succeed at run time because said objects "fit together" (both the dynamic type of c and the actual type annotation for c.c are C<S> for some type S which is a subtype of num, and please note that it is the same S!), which is likely to be true for many situations where objects are used in various interactions/collaborations via some covariant types.

As a counter-counter-example, consider c.fun = f_fun(c): This fails because we are creating a function object whose type has num baked in as its return type, but it is required to be int. It is true that c given as an argument to f_fun and c.c have a type C<S> for the same S <: num, but the function literal which is being created in the body of f_fun will still have the return type which is obtained from the statically known type of c.

Hence this proposal to have a lint that flags this "over-optimistic static type baked in" scenario.

Spelling out the lint criterion using existential types

The notion of a context type which is a covariant type is not something that we have defined in the language specification or in any feature specification, but it might already ring a bell (which is the reason why I used it without introducing it ;-). To make it more precise, the most obvious approach is to make it explicit that dart generic types are existential:

Assume that e has static type List<num> but the actual value of e at some point during an execution is of type List<int>. This is perfectly normal in Dart, but we might describe that a bit more directly by saying so:

A static type in Dart which is a parameterized type G<T1, .. Tk> where G is a generic class actually denotes the existential type ∃X1 <: T1, .. Xk <: Tk. G<X1, .. Xk>, that is, we can, at each point in time during execution find some actual type arguments which are subtypes of the statically known ones (which are now bounds in the existential type), and those actual type arguments yields the actual (dynamic) type of the value of an expression of that static type. One statement later we might need to find some other actual type arguments, but there will always be some actual type arguments that are subtypes of the bounds which will work.

In the following, we'll use 'Dart type notation' when we refer to static types where the existential nature is implicit (this is the kind of types that we can write in actual Dart programs), and a plain 'type' for the case where existential types are written explicitly, and a type which is not existential is invariant. That is, a variable x may have a type annotation which is the Dart type notation List<num>, which means that its static type is ∃S <: num. List<S>, and its value at some point in time could be List<Null>.

With that, let us consider an expression e whose context type is ∃X1 <: T1, .. Xk <: Tk. G<X1, .. Xk>. Assume that we run inference on e as if it had had the context type G<T1, .. Tk>.

With that, a successful inference step would provide actual type arguments and/or function literal type annotations such that the static type of e is a subtype of G<T1, .. Tk>. In some situations we may know for sure that a fresh object is created, and that this object has these actual type arguments baked in, and this is the situation where any non-trivial covariance (that is, some Xi is not equal to Ti) is guaranteed to cause a dynamic type error. So we should flag those situations.

Moreover, we may also have the situation where the inferred type arguments to an invocation of a generic function is passed along some number of steps and then used to create a fresh object whose type has some of said type arguments baked in, which will then similarly guarantee failure if such an object is returned. Of course, there may be any number of steps between those two points, e.g., we may call many functions and create many objects, where the overly optimistic actual type arguments are baked into objects and generic function calls in some long causal chain.

The crucial point here is that when inference on e has provided some actual type arguments and/or added some type annotations (parameters or return type) in some function literals, we need to consider whether it matters. So let T0 denote the static type of e with the given inference.

The criterion for flagging this situation is that if we replace one or more of the inferred types by the types that we would have obtained if inference had been based on G<S1, .. Sk> for some Sj where Sj <: Tj for all j, then we'll get another type for e as a whole, call it S0: Flag it if S0 is not a supertype of T0 (which includes the case where they are the same type). That means, we flag this situation if the actual covariance that we must expect will invalidate the subtype relationship that we expect statically when assuming that the Dart type notation is actually the type of e.

In short, we will flag this situation iff some of the inferred types occur in covariant or invariant positions in the static type of e, because we would then, with knowledge about the actual type arguments, have created a fresh object with a type which is not a supertype of the one that we are actually creating, that is, the one that we are actually creating would then not work (because it can masquerade as an instance of a supertype, but in other situations it can't masquerade as the type that we would have wanted).

Be strict or less strict?

We could make the choice to flag only the situations where it is guaranteed that some object will be created afresh with the over-optimistic static types baked in, that is, when inference is used to provide type annotations to a function literal, or it provides actual type arguments to the invocation of a generative constructor or a list or map literal.

We could also make the choice to flag the situations where a generic function is called with inferred type arguments, because that invocation might in turn give rise to creation of a fresh object/function with an over-optimistic type argument baked in at a covariant position. It would not be easy in general to mark any generic functions as being "fresh-safe" in the sense that they would never create an instance whose type has any of the type arguments given to the function invocation baked in. Hence, it might be an overly restrictive approach to flag generic function invocations.

Documenting the actual need for invariance

One extremely common reason for having covariant context types is that we are passing some actual argument a in an instance method invocation where the receiver has a generic type, and the formal parameter corresponding to a has a type annotation which is 'covariant from class'. For instance, List<num> myList = ...; myList.add(42); creates this situation even though there is no downcast statically, because myList could be a List<double>, so we must check.

However, we can eliminate the notion of covariant context types entirely by introducing support for invariant types. I'll use the notation List<exactly num> as the Dart type notation for the type List<num> (that is, it cannot be a List<int>).

We may then change type annotations such that specific expressions get an invariant type, and if they are used as receivers of instance method invocations, it does not matter whether any of the formal parameters are covariant-from-class, because there is no class covariance, so there is no covariance caused by class covariance.

main() {
  C<exactly num> c = ...; // Invariant, actual type argument is `num`.

  // .. same as before, now safely ...
}

This, of course, offers safety, but takes away flexibility, because the code will no longer allow a List<int>, and it might be the case that almost all code paths down in main and whatever is called from there would work just fine with a List<int> and only a couple of corner cases would actually need the invariance.

A more powerful alternative to invariance based on statically known types is to have an existential open construct, that is, the ability to say "let me give a name to the actual type argument of this particular object at that particular type":

main() {
  final C<num> c = ...; // Actual type argument not statically known.
  if (c is C<?N>) {
    // At this point `N` is bound to the actual type argument
    // of the value of `c`, so we can go invariant from here:
    // Promotion gives `c` the type `C<exactly N>`.
    
    // ... same as before, now safely _and_ without committing
    // to a single statically known type argument ...
  }
}
@eernstg eernstg added lint-request type-code-health Internal changes to our tools and workflows to make them cleaner, simpler, or more maintainable labels Aug 17, 2018
@matanlurey
Copy link
Contributor

Duplicate of dart-lang/sdk#33372.

It should really be enabled by default (as a hint or warning) and not hidden behind a lint.

@eernstg
Copy link
Member Author

eernstg commented Aug 17, 2018

It is not quite a duplicate of dart-lang/sdk#33372: That issue is concerned with the situation where an expression whose type is statically known exactly (not just as "this will be a subtype of T" for some T) is subject to a downcast to some other type which is statically known to be a proper subtype thereof.

In that situation it is guaranteed, 100%, to fail at run time, and there is no need for any types to be covariant: For instance, int i = Object(); or List<int> xs = <num>[];.

This lint is concerned with a situation where a covariant context type receives a value whose type is not given: It is crucial that this value is subject to inference, and that the inferred type is "baked into" the dynamic type of a freshly created object. That object then causes a dynamic type error iff the covariant type corresponds to actual covariance (but not if we happen to have invariance at run time; in my example you could use C<num> c = C<num>(); and everything would succeed).

So with this lint there need not be a static downcast at all. We do know statically that the context type is covariant, and we will generate dynamic checks on one or the other side as needed. But as the last line in main in my example shows (it runs without error, and only differs from a previous statement at runtime, not in static types), we may have exactly the same static analysis and still no failure at run time.

It is really crucial that the error is guaranteed if we have covariance, inference, and fresh objects, but not if we have just two of those three.

I think this should suffice to say that it's not the same thing.

.. should .. not [be] a lint

I agree that we can move it around, but introducing it as a lint would be an easy option, organizationally, and it would then be possible to discuss whether we'll take the breakage associated with some other status.

@eernstg
Copy link
Member Author

eernstg commented Aug 20, 2018

Here's a special case that would be a useful starting point for this lint:

Lint rule: Consider an instance method invocation e.m(a1, ..., an) where the static type of e is the parameterized type G<T1, ..., Tk>, the jth formal parameter is covariant-from-class, and aj is a generative constructor invocation, a list literal, a map literal, or a function literal which is subject to type inference (that is, type arguments are accepted but not given, or it is a function literal where some parts of the signature is inferred). Let T be the static type of aj after inference. Next, make the assumption that the static type of e is G<Never, ..., Never> and let S be the static type of aj after inference based on that assumption. Then emit a lint message indicating that the inference on aj is unsafe iff S is not a supertype of T (being a supertype includes S == T).

PS: It is harmless to perform this check also in the case where no parts of aj are subject to type inference (because S == T, so it's considered safe and no messages are emitted), so performance considerations may be used to decide on how much work to do in order to detect whether inference will do anything.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lint-request type-code-health Internal changes to our tools and workflows to make them cleaner, simpler, or more maintainable type-enhancement A request for a change that isn't a bug
Projects
None yet
Development

No branches or pull requests

3 participants