Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bidirectional type narrowing for generic types (would close an obscure hole in type checking!) #6732

Open
00dani opened this issue Mar 4, 2022 · 10 comments

Comments

@00dani
Copy link

00dani commented Mar 4, 2022

Feature request

Currently an assignment of the form$this->attribute = $value narrows the type of $this->attribute to the same type as $value for the remainder of the method. In most cases, this makes sense.

For example, the following code is statically sound. Iterables don't always have a forAll method, but there's no possible way $this->stuff can be of a type other than ArrayCollection when we attempt to call the forAll method.

/** @var iterable<int, Thing> */
private iterable $stuff;

public function __construct() {
  $this->stuff = new ArrayCollection();
  $this->stuff->forAll($doAThing);
}

However, a problem arises if the type of $value is in some way wider than the type of $this->attribute. This is only possible when generic types are used: $this->attribute might have narrower type parameters than $value does, although $value might have a more specific class type. Since the assignment only affects the inferred type of $this->attribute, it actually widens those type parameters.

I demonstrated an example of this problem a few minutes ago, which I'll reproduce here with dumpType information included:

/** @var Collection<int, int> $ints */
public iterable $ints;

/** @var Collection<int, string> $strings */
public iterable $strings;

public function __construct() {
	dumpType($this->ints); // Doctrine\Common\Collections\Collection<int, int>
	dumpType($this->strings); // Doctrine\Common\Collections\Collection<int, string>
	$array = new ArrayCollection();
	dumpType($array); // Doctrine\Common\Collections\ArrayCollection<*NEVER*, *NEVER*>
	$this->ints = $array;
	dumpType($this->ints);  // Doctrine\Common\Collections\ArrayCollection<*NEVER*, *NEVER*>
	$this->strings = $array;
	dumpType($this->strings); // Doctrine\Common\Collections\ArrayCollection<*NEVER*, *NEVER*>
}

As you can see, PHPStan "forgets" that $this->ints and $this->strings have different type parameters. This leads to an admittedly-minor hole in the type system, since this mechanism may be used to statically cast from any type to any other, without a corresponding runtime conversion occurring.

In a situation like this, I believe that PHPStan should effectively unify the types of $this->attribute and $value, in the logic programming sense. The inferred types of both variables ought to be narrowed by the assignment expression, preserving type compatibility for both the outer class type and its type parameters. In other words, this ought to happen:

dumpType($this->ints); // Doctrine\Common\Collections\Collection<int, int>
$array = new ArrayCollection();
dumpType($array);  // Doctrine\Common\Collections\ArrayCollection<*NEVER*, *NEVER*>
$this->ints = $array;
dumpType($this->ints); // Doctrine\Common\Collections\ArrayCollection<int, int>
dumpType($array); // Doctrine\Common\Collections\ArrayCollection<int, int>

This narrowing step would cause the subsequent assignment, $this->strings = $array, to fail type-checking: a Collection<int, int> cannot be assigned to a variable of type Collection<int, string>. The hole in the type system is sealed.

I do think this is an unlikely situation, since one would typically write $this->ints = new ArrayCollection() in the first place rather than using a temporary variable. Additionally, Psalm has the exact same type system hole at time of writing. Still, this is an issue worth being aware of. Also, narrowing bidirectionally would allow the constructor to go ahead and populate $this->ints in a properly typesafe way, which currently doesn't occur.

Did PHPStan help you today? Did it make you happy in any way?

Yes! I've been experimenting with API Platform at work, since we're hoping to adopt it for newly planned projects, and API Platform's very structured design has turned out to mesh wonderfully with PHPStan's type checking. As a Haskell fan I generally love powerful type systems, and PHPStan has been giving me those warm fuzzy TypeScript feels despite working in PHP. 🐱

@ondrejmirtes ondrejmirtes added this to the Generics milestone Mar 5, 2022
@arnaud-lb
Copy link
Contributor

Here is an other example: https://phpstan.org/r/064e69da-8b94-461b-872c-a2322d69e2a6

@arnaud-lb
Copy link
Contributor

In a situation like this, I believe that PHPStan should effectively unify the types of $this->attribute and $value, in the logic programming sense. The inferred types of both variables ought to be narrowed by the assignment expression, preserving type compatibility for both the outer class type and its type parameters. In other words, this ought to happen:

Agreed.

So we would have these rules:

(Rules 1-3 are the current behavior)

  1. There should be a notion of "unresolved type parameter" on generic types. The *NEVER* type could denote that, or an other specific type
  2. When a type parameter can not be resolved because of a lack of argument to infer with, the type parameter should be set to unresolved
  3. When considering type parameters, generic types should accept any unresolved type (so that Foo<int> accepts Foo<unresolved>)
  4. Sending a variable resolves all the type parameters of the variable with the receiver's type (as long as the original type is accepted by the receiver). This includes assignment to a property, argument passing, assignment to an other variable, use in a closure, implicit capture in an arrow function.
  5. Method calls should resolve all the type parameters of the receiver.

Rules 4 and 5 prevent aliasing the same value from different variables of different types, which closes the hole. I think that its important that all type parameters are resolved in 4 and 5, otherwise it's still possible to create aliases with different types.

Examples for rule 4 and 5:

/** @var Collection<int> */
private Collection $ints;

/** @var Collection<int> */
private Collection $strings;

public function __construct() {
    $ints = new Collection(); // $ints is a Collection<unresolved>
    $this->ints = $ints; // $ints is now a Collection<int>
    $this->strings = $ints; // error: Collection<int> is not accepted by Collection<string> 
}
/** @param Collection<int> $ints */
function takeInts(Collection $ints): void {}

/** @param Collection<string> $strings */
function takeStrings(Collection $strings): void {}

$ints = new Collection(); // $ints is a Collection<unresolved>
takeInts($ints); // $ints is now a Collection<int>
takeStrings($ints); // error
/** @var Collection<int>|null */
$ints = null;

$tmp = new Collection(); // $tmp is a Collection<unresolved>
$ints = $tmp; // $ints is now a Collection<int>
$ints = new Collection(); // $tmp is a Collection<unresolved>
function () use ($ints): void { // $ints is now a Collection<ints>
    takeInts($ints);
}
$ints = new Collection(); // $tmp is a Collection<unresolved>
fn () => // $ints is now a Collection<ints>
    takeInts($ints);
$ints = new Collection(); // $ints is a Collection<unresolved>
$ints->add(1); // $ints is now a Collection<int>
$ints->add("a"); // error
$ints = new Collection(); // $ints is a Collection<unresolved>

takeInts(
    $foo ? $ints : new Collection()
) // $ints is now a Collection<int>

@marcospassos
Copy link

In my experience, most use cases fall into the argument narrowing use case.

As discussed with @ondrejmirtes here, most programming languages with generic support can proper infer cases like the following:

/**
 * @template T
 */
final class Collection
{
    /**
     * @param T $value
     */
    public function add(mixed $value): void
    {
    }
}

/**
 * @template T
 */
class Example
{
    public function fill(): void
    {
        $this->addAll(new Collection());
    }

    /**
     * @param Collection<T> $collection
     */
    public function addAll(Collection $collection): void
    {
    }
}

Unfortunately, casting is the only workaround available, but it introduces many unnecessary variable assignments and, consequently, a cognitive burden.

@marcospassos
Copy link

marcospassos commented Apr 27, 2022

This example shows how painful this limitation is currently:

<?php declare(strict_types = 1);

class Key {
}

final class Example
{
    /** @var \WeakMap<Key, int> */
    public \WeakMap $map;

    /**
     * @param \WeakMap<Key, int> $map
     */
    public function __construct(\WeakMap $map = new \WeakMap()) {
        $this->map = $map;
    }
}
Property Example::$map (WeakMap<Key, int>) does not accept WeakMap<Key, int>|WeakMap<object, mixed>.

To make it work today, you need this:

<?php declare(strict_types = 1);

class Key {
}

final class Example
{
    /** @var \WeakMap<Key, int> */
    public \WeakMap $map;

    /**
     * @param \WeakMap<Key, int> $map
     */
    public function __construct(?\WeakMap $map = null) {
        if ($map === null) {
            /** @var \WeakMap<Key, int> $map **/
            $map = new \WeakMap();
        }

        $this->map = $map;
    }
}

As expected, due to the same issue, this also doesn't work:

<?php declare(strict_types = 1);

class Key {
}

final class Example
{
    /** @var \WeakMap<Key, int> */
    public \WeakMap $map;

    /**
     * @param \WeakMap<Key, int> $map
     */
    public function __construct(?\WeakMap $map = null) {
        $this->map = $map ?? new \WeakMap();
    }
}

@michelv
Copy link

michelv commented Sep 9, 2022

Hi! Could this issue be the cause of the 2 errors here? https://phpstan.org/r/814c4824-d65a-4108-8547-59058c1c901f

<?php declare(strict_types = 1);

use Ds\Set;

class MyTest {
    /**
     * @return Set<mixed>
     */
    public function provideSets1(): Set
    {
        return new Set(['a']);
    }

    /**
     * @return Set<string>
     */
    public function provideSets2(): Set
    {
        return new Set(['a']);
    }

    /**
     * @return Set<string|int>
     */
    public function provideSets3(): Set
    {
        return new Set(['a']);
    }
}

The errors:

Line | Error
---- | --
  11 | Method MyTest::provideSets1() should return Ds\Set<mixed> but returns Ds\Set<string>.  
  27 | Method MyTest::provideSets3() should return Ds\Set<int\|string> but returns Ds\Set<string>.

@muglug
Copy link
Contributor

muglug commented Sep 13, 2022

I opened a duplicate issue in Psalm's repo: vimeo/psalm#8482

Hack solves this with the solution I give there, but I'm hesitant to copy that implementation because the type hole is a pretty small edge-case.

@marcospassos
Copy link

Hack solves this with the solution I give there, but I'm hesitant to copy that implementation because the type hole is a pretty small edge-case

We've several cases like the one I've shared in a single application and dozens throughout the codebase. I don't think it's uncommon at all.

@muglug
Copy link
Contributor

muglug commented Sep 13, 2022

The edge-case I was talking about is the false-negative I mention in the linked issue. This issue also mentions a number of false-positives in PHPStan (some of which also appear in Psalm), but I wasn't talking about those. In a tool that has few of either, I see false-negatives as inherently more dangerous than false-positives.

@ondrejmirtes
Copy link
Member

because the type hole is a pretty small edge-case

I consider the implementation suggestion above with Collection<unresolved> etc. to be a huge deal, because once implemented, it will allow us to get rid of generic types generalization (where 1 becomes an int). More and more issues are being reported connected to that problem.

Closing a small type hole is just a small part of that...

@ondrejmirtes
Copy link
Member

After this we'll also be able to do phpstan/phpstan-src#2110.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants