
Avoid inference getting stuck when the expected type contains a union/intersection #8635

Merged
1 commit merged on Apr 2, 2020

Conversation


@smarter smarter commented Mar 30, 2020

Avoid inference getting stuck when the expected type contains a union/intersection

When we type a method call, we infer constraints based on its expected
type before typing its arguments. This way, we can type these arguments
with a precise expected type. This works fine as long as the constraints
we infer based on the expected type are necessary constraints, but in
general type inference can go further and infer sufficient
constraints, meaning that we might get stuck with a set of constraints
which does not allow the method arguments to be typed at all.

Since 8067b95 we work around the
problem by simply not propagating any constraint when the expected type
is a union, but this solution is incomplete:

  • It only handles unions at the top-level, but the same problem can
    happen with unions in any covariant position (method b of or-inf.scala)
    as well as intersections in contravariant positions (and-inf.scala,
    i8378.scala)
  • Even when a union appears at the top-level, there might be constraints
    we can propagate, for example if only one branch can possibly match
    (method c of or-inf.scala)

Thankfully, we already have a solution that works for all these
problems: TypeComparer#either is capable of inferring only necessary
constraints. So far, this was only done when inferring GADT bounds to
preserve soundness, this commit extends this to use the same logic when
constraining a method based on its expected type (as determined by the
ConstrainResult mode). Additionally, ConstraintHandling#addConstraint
needs to also be taught to only keep necessary constraints under this
mode.

Fixes #8378 which I previously thought was unfixable :).
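To make the failure mode concrete, here is a minimal sketch under Scala 3 with this fix in place (the object and method names are illustrative, not taken from the PR's test files):

```scala
object StuckSketch:
  // A method whose result type has a union in covariant position,
  // in the spirit of or-inf.scala.
  def f[X](x: X): List[X] | Int = List(x)

  // Constraining X against the expected type `List[String] | Int`
  // must not commit to a branch that later makes the argument
  // untypeable; only necessary constraints are kept, and the
  // argument then fixes X = String.
  val r: List[String] | Int = f("hello")
```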

@dottybot dottybot left a comment


Hello, and thank you for opening this PR! 🎉

All contributors have signed the CLA, thank you! ❤️

Commit Messages

We want to keep history, but for that to actually be useful we have
some rules on how to format our commit messages (relevant xkcd).

Please stick to these guidelines for commit messages:

  1. Separate subject from body with a blank line
  2. When fixing an issue, start your commit message with Fix #<ISSUE-NBR>:
  3. Limit the subject line to 72 characters
  4. Capitalize the subject line
  5. Do not end the subject line with a period
  6. Use the imperative mood in the subject line ("Add" instead of "Added")
  7. Wrap the body at 80 characters
  8. Use the body to explain what and why vs. how

adapted from https://chris.beams.io/posts/git-commit

Have an awesome day! ☀️

@odersky

odersky commented Mar 30, 2020

Following up on the discussion of today's meeting, here's why

A | Null <:< $X | Null 

infers A <:< $X: From the initial goal we derive the subgoals

A  <:< $X | Null 
Null <:< $X | Null

To check the first goal, we try both A <:< $X and A <:< Null. If A is a subtype of Null, the second
alternative succeeds and $X is left unconstrained. Otherwise we get the constraint A <:< $X.

The second goal is always true.

So, the scheme is actually a lot more robust than was implied in our discussions.

/cc @noti0na1
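The derivation above can be exercised with a hypothetical user-level notNull, standing in for the one under discussion (the real one relies on explicit-nulls support; this sketch just casts):

```scala
object NotNullDemo:
  // Hypothetical notNull: runtime check plus an unchecked cast.
  def notNull[T](x: T | Null): T =
    if x == null then throw NullPointerException() else x.asInstanceOf[T]

  val a: Int | Null = 1
  // Checking `Int | Null <:< T | Null`: since `Int <:< Null` fails,
  // the necessary constraint `Int <:< T` is recorded and T = Int.
  val b: Int = notNull(a)
```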

@smarter

smarter commented Mar 31, 2020

So, the scheme is actually a lot more robust than was implied in our discussions.

It turns out that this only works because of https://github.com/lampepfl/dotty/blob/8a42819bf352a25468faac3fe862d5353ea48e73/compiler/src/dotty/tools/dotc/typer/ProtoTypes.scala#L62-L63

If the parameter type of notNull is not exactly a union type, then it stops working on master too:

```scala
object Test {
  def test1 = {
    def notNull[T](x: T | Null): x.type & T = ???

    val x: Int | Null = ???
    val y = notNull(identity(x))
    val yc: Int = y // OK
  }

  def test2 = {
    def notNull[T](x: Any & (T | Null)): x.type & T = ???

    val x: Int | Null = ???
    val y = notNull(identity(x))
    val yc: Int = y // FAIL
  }
}
```

The problem is that depending on how exactly the type variable of identity gets constrained, we might not get to check A | Null <:< $X | Null but rather A | Null <:< $Y where $Y <:< $X | Null. Therefore, I maintain that the current scheme is not robust. And I don't see any way to make it robust apart from introducing a magic match type NotNull[A] that removes the top-level nulls.

@smarter

smarter commented Mar 31, 2020

I wonder if we could get it to work reliably by making notNull transparent inline? I tried it, and it seems that the flow analysis for nullability doesn't currently work with inlining, but that seems like a reasonable thing to have.

@odersky

odersky commented Mar 31, 2020

But if you check A | Null <:< $Y where $Y <:< $X | Null then the constraint satisfaction check will reduce that to A | Null <:< $X | Null, so the result should be the same. Or what am I missing here?

The primary reason not to check expected types that are unions is that in general that can lead to a loss
of precision since we have to pick one or the other type in an either. So I believe it would be dangerous to change that. Maybe we should refine the criteria and also discard prototypes of the form A & (B | C). That would help. But going the other way and discarding less looks wrong to me.

@smarter

smarter commented Mar 31, 2020

But if you check A | Null <:< $Y where $Y <:< $X | Null then the constraint satisfaction check will reduce that to A | Null <:< $X | Null, so the result should be the same. Or what am I missing here?

Apparently not. The upper-bound of $Y is split in two parts: the param bound is $X and the non-param bound is Any | Null. When we check A | Null <:< $Y we try the non-param bound: A | Null <:< Any | Null. This fails, so we add a lower-bound A | Null to $Y, the same lower-bound is automatically added to $X.

The primary reason not to check expected types that are unions is that in general that can lead to a loss of precision since we have to pick one or the other type in an either

Right, that's precisely the problem that this PR fixes by using necessaryEither, which never picks one alternative over the other.

@odersky

odersky commented Mar 31, 2020

$Y is split in two parts: the param bound is $X and the non-param bound is Any | Null.

Hmm. That looks wrong to me. $Y <:< $X | Null does not imply that $Y <:< $X.

@odersky

odersky commented Mar 31, 2020

I believe it is clear that in

A | B <:< $X | B

we have to infer A <:< $X, unless A <:< B. So why does this fail? What does necessaryEither have to do with this?
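A runnable sketch of this case, with B = String and A = Int (notString is a hypothetical helper, not from the PR's test files):

```scala
object OrDemo:
  // Hypothetical helper: removes the String branch of a union.
  def notString[A](x: A | String): A = x match
    case _: String => throw IllegalArgumentException("got a String")
    case a         => a.asInstanceOf[A]

  val v: Int | String = 1
  // Int | String <:< A | String: since Int <:< String is false, the
  // only alternative that can succeed is Int <:< A, so it is a
  // necessary constraint and A is inferred as Int.
  val w: Int = notString(v)
```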

@smarter

smarter commented Mar 31, 2020

Hmm. That looks wrong to me. $Y <:< $X | Null does not imply that $Y <:< $X.

Indeed, but apparently this is intentional and cannot be avoided: https://github.com/lampepfl/dotty/blob/7a28eef2a65e769a4439aff67ff0d158db589003/compiler/src/dotty/tools/dotc/core/ConstraintHandling.scala#L480-L489

What does necessaryEither have to do with this?

necessaryEither isn't directly related. Things break down when I remove the OrType case of disregardProto, which is problematic right now as you mentioned, but won't be if we use necessaryEither as proposed by this PR.

@smarter

smarter commented Apr 1, 2020

Indeed, but apparently this is intentional and cannot be avoided:

... in fact, I can work around this by wrapping the type of the parameter of notNull in an identity type lambda:

```scala
type Id[A] = A
def notNull[T](x: Id[T | Null]): x.type & T = ...
```

And then everything typechecks! I wonder if that also means we can still run into the cycles the comment above prune warns about.

@odersky

odersky commented Apr 1, 2020

So, we have two sources of incompleteness with | on the right or & on the left:

  • introduced by either, to avoid combinatorial explosion
  • introduced by prune to avoid cycles.

This means that making use of a disjunction in the prototype risks over-approximating; that's why it is disabled. Using necessaryEither for compareResultType gets rid of the first incompleteness but not the second. So it's at best a sideways move from the current disabling - we solve some issues but cause new ones. Fiddling with notNull is not the solution IMO, since the general problem of causing incompleteness remains.

A way forward would be if we could avoid the second form of incompleteness. That seems doable at first glance - cycle detection on the toplevel is a solvable problem.

@smarter

smarter commented Apr 1, 2020

A way forward would be if we could avoid the second form of incompleteness.

Yes, I just came to this conclusion too! :) And the current behavior of prune is also probably incorrect for GADT handling (/cc @AleksanderBG). I just pushed a commit tweaking prune to prefer widening instead of narrowing the constraint when going through constrainResult and it did fix notNull.scala! Though t8230a.scala regressed for some reason, so I'll investigate this next, as well as the GADT connection.

cycle detection on the toplevel is a solvable problem.

Do you mean detecting them proactively whenever we add a constraint, or reactively as we kind of already do with the deep subtype checks, etc.? If the former, what would be the correct way to heal a detected cycle? If it's a cycle like $X <:< $Y <:< $X | $Y, it seems that we cannot simply unify the variables which are part of the cycle without overconstraining.

@odersky

odersky commented Apr 1, 2020

Do you mean detecting them proactively whenever we add a constraint, or reactively as we kind of already do with the deep subtype checks, etc.? If the former, what would be the correct way to heal a detected cycle? If it's a cycle like $X <:< $Y <:< $X | $Y, it seems that we cannot simply unify the variables which are part of the cycle without overconstraining.

I mean retroactively. When adding a bound P <:< T to a constraint, we should split it into a
set of type parameters Q1, ..., Qn and a residual type Tr such that

T =:= Tr & Q1 & ... & Qn

So, P <:< P | T would leave P in the bound. We have to prevent the cycles afterwards, i.e. not make isSubType fall off a cliff when it encounters such a cycle.
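The splitting step can be sketched over plain data, assuming constrained type parameters are recognizable (here a leading `$` in a name stands in for a parameter reference; this is not the compiler's actual representation):

```scala
object SplitBound:
  // Given the components of an intersection T =:= Tr & Q1 & ... & Qn,
  // separate the type-parameter components Qi from the residual Tr.
  def split(components: List[String]): (List[String], List[String]) =
    components.partition(_.startsWith("$"))

  // E.g. the bound ($Q & Int & $P) splits into parameters ($Q, $P)
  // and the residual (Int), which can then be handled separately.
  val (params, residual) = split(List("$Q", "Int", "$P"))
```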

@smarter

smarter commented Apr 1, 2020

Got everything to pass, phew! Thanks a lot for the troubleshooting help!

@smarter smarter assigned odersky and unassigned smarter Apr 1, 2020
@smarter smarter requested a review from odersky April 1, 2020 20:26
Comment on lines +560 to +561
// Unlike in `TypeComparer#either`, the same reasoning does not apply
// to GADT mode because this code is never run on GADT constraints.


@AleksanderBG Can you confirm that this comment is correct ?



@smarter missed your comment! GADT code only goes through addToConstraint/addUpperBound/addLowerBound. AFAICT none of these invoke addBound.

```scala
// Unlike in `TypeComparer#either`, the same reasoning does not apply
// to GADT mode because this code is never run on GADT constraints.
if ctx.mode.is(Mode.ConstrainResult) && constraintsNarrowed then
  constraint = savedConstraint
```


That looks good to me. It would be even nicer to have a prune that does not narrow. But let's save this for another day.

smarter added a commit to dotty-staging/dotty that referenced this pull request Apr 7, 2020
Given an expected type `*T`, we allow a sequence argument `xs: _*` to be
either a `Seq[T]` or an `Array[_ <: T]`, irrespective of whether the
method we're calling is a Java or Scala method. So far we typed sequence
arguments without an expected type, meaning that adaptation did not take
place. But thanks to scala#8635, we can type them with an expected type of
`Seq[T] | Array[_ <: T]` and type inference works out. This is what this
commit does.
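A sketch of the adaptation this enables, using current Scala 3 spread syntax (`xs*`) in place of the `xs: _*` form mentioned in the commit message:

```scala
object SeqArgDemo:
  def sum(xs: Int*): Int = xs.sum

  val fromSeq: Seq[Int]   = Seq(1, 2, 3)
  val fromArr: Array[Int] = Array(4, 5)

  // Both a Seq and an Array adapt to the vararg parameter; per the
  // commit above, the sequence argument can be typed with the
  // expected type `Seq[Int] | Array[? <: Int]`.
  val a = sum(fromSeq*)
  val b = sum(fromArr*)
```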
Successfully merging this pull request may close these issues.

Type inference issue with intersection types