Uninstantiable defaulted constructor parameter injection issue #168

Closed
wants to merge 1 commit into
from

Conversation

Projects
None yet
2 participants
@ezzatron
Contributor

ezzatron commented Jun 13, 2014

I keep encountering this issue, so I thought I'd have a crack at a fix. Unfortunately it wasn't as simple as it initially seemed, so this is only a test case to demonstrate the issue.

A common pattern we use in my workplace is to have defaulted constructor parameters. We also use interfaces for just about everything, so the class may look something like this:

class Foo implements FooInterface
{
    public function __construct(BarInterface $bar, BazInterface $baz = null)
    {
        if (null === $baz) {
            $baz = new Baz;
        }

        $this->bar = $bar;
        $this->baz = $baz;
    }

    // ...
}

So I would add definitions to our container something like:

return array(
    'BarInterface' => DI\object('Bar'),
    'FooInterface' => DI\object('Foo'),
);

Leaving out a definition for BazInterface beacuse we just want to use the default. Now when you request FooInterface from the container:

$container->get('FooInterface');

You get a nasty exception something like:

Entry BazInterface cannot be resolved: BazInterface is not instantiable

The expected result is that get() returns an instance of Foo using the default value for $baz.

This PR includes a rough integration test that currently fails because of the issue mentioned above. It is not intended to be merged as it doesn't follow the style of your other tests, but should be good enough to aid in diagnosing the cause of the issue.

Thanks!

@mnapoli

This comment has been minimized.

Show comment
Hide comment
@mnapoli

mnapoli Jun 13, 2014

Member

Hi, thank you for taking the time to report this and write the test!

I will have a look at it as soon as possible because indeed ideally it should work.

Member

mnapoli commented Jun 13, 2014

Hi, thank you for taking the time to report this and write the test!

I will have a look at it as soon as possible because indeed ideally it should work.

@mnapoli

This comment has been minimized.

Show comment
Hide comment
@mnapoli

mnapoli Jun 16, 2014

Member

OK so here is a (hopefully generic) solution I was thinking about (I am using the same class names as your example):

Right now, has('BazInterface') would return true because a definition exists for that interface. The fact that the definition can't be resolved (i.e. an interface is not instantiable) is known later, when we try to resolve the definition (i.e. when we try to create BazInterface). So both has() and the code that resolves parameters for the constructor of Foo don't know if an entry really can be resolved or not. They can only know if a definition exist.

What I first thought: let's change has() to return true only if a definition exist and the definition can really be created (i.e. if it's an interface, we check that it's mapped to a real class). That way, when PHP-DI resolves the parameters to create Foo, it will call has() and see the optional parameter can't be resolved but we can use the default value instead.

I'm not sure this is easy to do because that would mean introducing a concept of "validating" a definition (i.e. making sure it can be resolved), and that doesn't exist today. That seems like a lot of work.

But the main question is: does that even make sense for has() to return false if a definition exist but is incomplete/invalid and will throw an error if get() is called?

Because what do you expect if you call has('BazInterface')? Do you expect to get false (even if the interface really exist!), whereas you would get true for has('Baz')? I'm not sure I would expect that…

And the second question is: what do we validate? We can check that the class is instantiable. But do we check if all the constructors parameters are set/guessable too? Do we check then recursively for all the dependencies too? It seems a lot of work and performance impact :/

I'm thinking out loud, maybe you see a clear solution that I'm missing? Or maybe you have another opinion on all these questions? Suggestions are welcome.

Member

mnapoli commented Jun 16, 2014

OK so here is a (hopefully generic) solution I was thinking about (I am using the same class names as your example):

Right now, has('BazInterface') would return true because a definition exists for that interface. The fact that the definition can't be resolved (i.e. an interface is not instantiable) is known later, when we try to resolve the definition (i.e. when we try to create BazInterface). So both has() and the code that resolves parameters for the constructor of Foo don't know if an entry really can be resolved or not. They can only know if a definition exist.

What I first thought: let's change has() to return true only if a definition exist and the definition can really be created (i.e. if it's an interface, we check that it's mapped to a real class). That way, when PHP-DI resolves the parameters to create Foo, it will call has() and see the optional parameter can't be resolved but we can use the default value instead.

I'm not sure this is easy to do because that would mean introducing a concept of "validating" a definition (i.e. making sure it can be resolved), and that doesn't exist today. That seems like a lot of work.

But the main question is: does that even make sense for has() to return false if a definition exist but is incomplete/invalid and will throw an error if get() is called?

Because what do you expect if you call has('BazInterface')? Do you expect to get false (even if the interface really exist!), whereas you would get true for has('Baz')? I'm not sure I would expect that…

And the second question is: what do we validate? We can check that the class is instantiable. But do we check if all the constructors parameters are set/guessable too? Do we check then recursively for all the dependencies too? It seems a lot of work and performance impact :/

I'm thinking out loud, maybe you see a clear solution that I'm missing? Or maybe you have another opinion on all these questions? Suggestions are welcome.

@ezzatron

This comment has been minimized.

Show comment
Hide comment
@ezzatron

ezzatron Jun 17, 2014

Contributor

When I was trying to find a solution, and I discovered that has('BazInterface') returned true, it was actually a bit of a WTF moment. What part of PHP-DI is actually responsible for that? Is it some kind of automated step that when you add a class, all of its hinted parameters are added also?

As someone coming in with limited knowledge, I would have expected both has('BazInterface')and has('Baz') to return false until I explicitly added them. Could you explain why they are added automatically?

If they can not be added, at least in certain circumstances, like perhaps, when the hint is an uninstantiable class/interface AND it is optional, I think it might solve this particular issue. But I don't have enough knowledge of PHP-DI to say whether that would break other features.

@jmalloc it would be good to get your opinion here, too.

Contributor

ezzatron commented Jun 17, 2014

When I was trying to find a solution, and I discovered that has('BazInterface') returned true, it was actually a bit of a WTF moment. What part of PHP-DI is actually responsible for that? Is it some kind of automated step that when you add a class, all of its hinted parameters are added also?

As someone coming in with limited knowledge, I would have expected both has('BazInterface')and has('Baz') to return false until I explicitly added them. Could you explain why they are added automatically?

If they can not be added, at least in certain circumstances, like perhaps, when the hint is an uninstantiable class/interface AND it is optional, I think it might solve this particular issue. But I don't have enough knowledge of PHP-DI to say whether that would break other features.

@jmalloc it would be good to get your opinion here, too.

@mnapoli

This comment has been minimized.

Show comment
Hide comment
@mnapoli

mnapoli Jun 17, 2014

Member

What part of PHP-DI is actually responsible for that? Is it some kind of automated step that when you add a class, all of its hinted parameters are added also?

Well it's the "autowiring" thing that is responsible of that. To be clear, there are 3 definition sources:

  • autowiring: any class/interface automatically gets a definition (it scans the constructor too)
  • annotation: enrich the definition generated by the autowiring using annotations (usually it fills unknown parameters and add property and setter injection)
  • array/file: here definitions are explicit as you can guess

For autowiring and annotation, definitions are generated on the fly when get or has is called (of course this is cached, so it's actually generated once). So there isn't a definition for every class that exist, only for those that have been injected at least once.

So when has('BazInterface') and has('Baz') return true, it's actually saying "yes I can generate those classes if you ask get". (this method is defined in a standard, see here)

So maybe it does actually make sense for has to return false if the class is not instantiable…

Member

mnapoli commented Jun 17, 2014

What part of PHP-DI is actually responsible for that? Is it some kind of automated step that when you add a class, all of its hinted parameters are added also?

Well it's the "autowiring" thing that is responsible of that. To be clear, there are 3 definition sources:

  • autowiring: any class/interface automatically gets a definition (it scans the constructor too)
  • annotation: enrich the definition generated by the autowiring using annotations (usually it fills unknown parameters and add property and setter injection)
  • array/file: here definitions are explicit as you can guess

For autowiring and annotation, definitions are generated on the fly when get or has is called (of course this is cached, so it's actually generated once). So there isn't a definition for every class that exist, only for those that have been injected at least once.

So when has('BazInterface') and has('Baz') return true, it's actually saying "yes I can generate those classes if you ask get". (this method is defined in a standard, see here)

So maybe it does actually make sense for has to return false if the class is not instantiable…

@ezzatron

This comment has been minimized.

Show comment
Hide comment
@ezzatron

ezzatron Jun 17, 2014

Contributor

To be honest, I don't know :)

What would happen if when this happens:

  • autowiring: any class/interface automatically gets a definition (it scans the constructor too)

And the autowiring sees BazInterface $baz = null, it doesn't add it to the container beacuse it is both uninstantiable and optional. Would that affect any other features?

If you could point me to where the constructor is scanned I might have another go at a fix, too.

Contributor

ezzatron commented Jun 17, 2014

To be honest, I don't know :)

What would happen if when this happens:

  • autowiring: any class/interface automatically gets a definition (it scans the constructor too)

And the autowiring sees BazInterface $baz = null, it doesn't add it to the container beacuse it is both uninstantiable and optional. Would that affect any other features?

If you could point me to where the constructor is scanned I might have another go at a fix, too.

@mnapoli

This comment has been minimized.

Show comment
Hide comment
@mnapoli

mnapoli Jun 17, 2014

Member

Constructors are scanned by "autowiring" (called Reflection in the code) here: ReflectionDefinitionSource

However BazInterface doesn't get added to the container when the autowiring sees BazInterface $baz = null. It will be added later when the container tries to resolve Foo and thus needs to resolve the $baz parameter.

Here is how it goes:

  • get('Foo')
  • scan Foo (to create the definition)
    • ReflectionDefinitionSource (autowiring) see the constructor with BazInterface $baz = null, so bind $baz to an entry named BazInterface (that doesn't get scanned! Maybe the entry doesn't exist, we don't know yet)
    • AnnotationDefinitionSource sees no annotation -> skip
    • ArrayDefinitionSource sees no entry for Foo -> skip
  • resolve the definition of Foo (to create it)
    • See that the constructor need entry BazInterface, so get('BazInterface')
      • scan BazInterface
        • ReflectionDefinitionSource sees that it's an interface, scan its constructor
        • AnnotationDefinitionSource -> skip
        • ArrayDefinitionSource sees -> skip
      • resolve BazInterface -> exception

edit: a better wording for scan would be "get definition" because that's how it's called in the code…

Member

mnapoli commented Jun 17, 2014

Constructors are scanned by "autowiring" (called Reflection in the code) here: ReflectionDefinitionSource

However BazInterface doesn't get added to the container when the autowiring sees BazInterface $baz = null. It will be added later when the container tries to resolve Foo and thus needs to resolve the $baz parameter.

Here is how it goes:

  • get('Foo')
  • scan Foo (to create the definition)
    • ReflectionDefinitionSource (autowiring) see the constructor with BazInterface $baz = null, so bind $baz to an entry named BazInterface (that doesn't get scanned! Maybe the entry doesn't exist, we don't know yet)
    • AnnotationDefinitionSource sees no annotation -> skip
    • ArrayDefinitionSource sees no entry for Foo -> skip
  • resolve the definition of Foo (to create it)
    • See that the constructor need entry BazInterface, so get('BazInterface')
      • scan BazInterface
        • ReflectionDefinitionSource sees that it's an interface, scan its constructor
        • AnnotationDefinitionSource -> skip
        • ArrayDefinitionSource sees -> skip
      • resolve BazInterface -> exception

edit: a better wording for scan would be "get definition" because that's how it's called in the code…

@mnapoli

This comment has been minimized.

Show comment
Hide comment
@mnapoli

mnapoli Jun 17, 2014

Member

As you can see BazInterface gets scanned and added later (when "get").

So maybe calling has('BazInterface') for the step (before calling get):

  • See that the constructor need entry BazInterface, so get('BazInterface')

is a good idea.

We just have to make has return false if the entry is not instantiable (which means the definition can't be resolved).

Member

mnapoli commented Jun 17, 2014

As you can see BazInterface gets scanned and added later (when "get").

So maybe calling has('BazInterface') for the step (before calling get):

  • See that the constructor need entry BazInterface, so get('BazInterface')

is a good idea.

We just have to make has return false if the entry is not instantiable (which means the definition can't be resolved).

@mnapoli

This comment has been minimized.

Show comment
Hide comment
@mnapoli

mnapoli Jun 17, 2014

Member

A generic way to determine if a definition can be resolved would maybe be to edit the DefinitionResolverInterface to add a canBeResolved() method:

interface DefinitionResolver
{
    public function resolve(Definition $definition, array $parameters = array());

    public function canBeResolved(Definition $definition, array $parameters = array());
}

Then each implementation needs to have this additional method, and can perform checks (or not if not necessary).

I'm beginning to like this actually.

Member

mnapoli commented Jun 17, 2014

A generic way to determine if a definition can be resolved would maybe be to edit the DefinitionResolverInterface to add a canBeResolved() method:

interface DefinitionResolver
{
    public function resolve(Definition $definition, array $parameters = array());

    public function canBeResolved(Definition $definition, array $parameters = array());
}

Then each implementation needs to have this additional method, and can perform checks (or not if not necessary).

I'm beginning to like this actually.

@mnapoli mnapoli added the bug label Jun 17, 2014

@mnapoli mnapoli added this to the 4.2 milestone Jun 17, 2014

@ezzatron

This comment has been minimized.

Show comment
Hide comment
@ezzatron

ezzatron Jun 17, 2014

Contributor

Okay, if that sounds good to you, and it solves the problem, then I am happy! But...

I also managed to get the tests to pass by implementing the idea I was trying to explain before, by changing ReflectionDefinitionSource::getConstructorInjection() like so:

--- a/src/DI/Definition/Source/ReflectionDefinitionSource.php
+++ b/src/DI/Definition/Source/ReflectionDefinitionSource.php
@@ -66,7 +66,7 @@ class ReflectionDefinitionSource implements DefinitionSource
         foreach ($constructor->getParameters() as $index => $parameter) {
             $parameterClass = $parameter->getClass();

-            if ($parameterClass) {
+            if ($parameterClass && ($parameterClass->isInstantiable() || !$parameter->isOptional())) {
                 $parameters[$index] = new EntryReference($parameterClass->getName());
             }
         }
Contributor

ezzatron commented Jun 17, 2014

Okay, if that sounds good to you, and it solves the problem, then I am happy! But...

I also managed to get the tests to pass by implementing the idea I was trying to explain before, by changing ReflectionDefinitionSource::getConstructorInjection() like so:

--- a/src/DI/Definition/Source/ReflectionDefinitionSource.php
+++ b/src/DI/Definition/Source/ReflectionDefinitionSource.php
@@ -66,7 +66,7 @@ class ReflectionDefinitionSource implements DefinitionSource
         foreach ($constructor->getParameters() as $index => $parameter) {
             $parameterClass = $parameter->getClass();

-            if ($parameterClass) {
+            if ($parameterClass && ($parameterClass->isInstantiable() || !$parameter->isOptional())) {
                 $parameters[$index] = new EntryReference($parameterClass->getName());
             }
         }
@mnapoli

This comment has been minimized.

Show comment
Hide comment
@mnapoli

mnapoli Jun 17, 2014

Member

I see what you suggest, but wouldn't there be problems if at some point you bind BazInterface to Baz? Then you would still get the default parameter value, not Baz right?

Member

mnapoli commented Jun 17, 2014

I see what you suggest, but wouldn't there be problems if at some point you bind BazInterface to Baz? Then you would still get the default parameter value, not Baz right?

@ezzatron

This comment has been minimized.

Show comment
Hide comment
@ezzatron

ezzatron Jun 17, 2014

Contributor

Okay, yeah, I see why that's not such a great solution now :)

Contributor

ezzatron commented Jun 17, 2014

Okay, yeah, I see why that's not such a great solution now :)

@mnapoli

This comment has been minimized.

Show comment
Hide comment
@mnapoli

mnapoli Jun 17, 2014

Member

OK then I'll have a look at the other solution. It's going to be a bit more complex, but hopefully it will be in 4.2. If you want to give it a shot let me know, but don't feel obligated I understand this is work all over the place so it requires knowing the internals ;)

Member

mnapoli commented Jun 17, 2014

OK then I'll have a look at the other solution. It's going to be a bit more complex, but hopefully it will be in 4.2. If you want to give it a shot let me know, but don't feel obligated I understand this is work all over the place so it requires knowing the internals ;)

mnapoli added a commit that referenced this pull request Jun 17, 2014

mnapoli added a commit that referenced this pull request Jun 17, 2014

Bug #168 Default parameter values are used when a parameter is mapped…
… to an unresolvable entry

Optional parameters that are type-hinted with interfaces should use the default parameter value if the interface is not mapped to a concrete class.
Else that makes autowiring pretty much useless if a class takes optionally a parameter that we don't want to specify (use default value).
@mnapoli

This comment has been minimized.

Show comment
Hide comment
@mnapoli

mnapoli Jun 17, 2014

Member

@ezzatron I've implemented what I was thiking about and pushed a branch: #169

I'll give it a few days to think about it a bit more. If you can review, and if others want to give their opinion too it's welcome.

The main problem I have with this is that concept of "is the definition resolvable?" Which is a good concept, but here it is just used to spot uninstantiable classes. IMO "un-resolvable" should also cover every case where the object can't be created (in theory).

But then if we do that, should we really fallback to the parameter's default value if the user has an error in his DI configuration? I'm afraid there will be lots of "WTF, why is it not injecting X??" (the user will not get alerted that his definitions are invalid).

Also, even if we keep the "unresolvable == uninstantiable" compromise, here is a use case that can lead to WTF:

return [
    // Woops I mapped the interface to an abstract class but I didn't spot my mistake!
    'MyInterface' => 'MyAbstractClass'
];

class Foo {
    public function __construct(MyInterface $bar = null) {
    }
}

Here the user will get Foo instantiated with null instead of an error telling him that his mapping is wrong (he mapped to an abstract class by mistake).

So I'm not sure that fix is right :/

Member

mnapoli commented Jun 17, 2014

@ezzatron I've implemented what I was thiking about and pushed a branch: #169

I'll give it a few days to think about it a bit more. If you can review, and if others want to give their opinion too it's welcome.

The main problem I have with this is that concept of "is the definition resolvable?" Which is a good concept, but here it is just used to spot uninstantiable classes. IMO "un-resolvable" should also cover every case where the object can't be created (in theory).

But then if we do that, should we really fallback to the parameter's default value if the user has an error in his DI configuration? I'm afraid there will be lots of "WTF, why is it not injecting X??" (the user will not get alerted that his definitions are invalid).

Also, even if we keep the "unresolvable == uninstantiable" compromise, here is a use case that can lead to WTF:

return [
    // Woops I mapped the interface to an abstract class but I didn't spot my mistake!
    'MyInterface' => 'MyAbstractClass'
];

class Foo {
    public function __construct(MyInterface $bar = null) {
    }
}

Here the user will get Foo instantiated with null instead of an error telling him that his mapping is wrong (he mapped to an abstract class by mistake).

So I'm not sure that fix is right :/

@mnapoli

This comment has been minimized.

Show comment
Hide comment
@mnapoli

mnapoli Jun 21, 2014

Member

I am beginning to wonder if it wouldn't be better for autowiring to simply ignore all optional parameters?

I mean when you think about it, it makes more sense that way. For example this kind of pattern is quite common:

class Foo
{
    public function __construct(Bar $bar = null)
    {
        $this->bar = $bar ?: $this->createDefaultBar();
    }
}

Optional parameters are used to customize things that would anyway work without it. So it makes not much sense for the container to want to resolve and inject it.

In the end, if you want an optional parameter to be injected, then you have to specify it manually in the config (or with annotations).

What do you think?

Member

mnapoli commented Jun 21, 2014

I am beginning to wonder if it wouldn't be better for autowiring to simply ignore all optional parameters?

I mean when you think about it, it makes more sense that way. For example this kind of pattern is quite common:

class Foo
{
    public function __construct(Bar $bar = null)
    {
        $this->bar = $bar ?: $this->createDefaultBar();
    }
}

Optional parameters are used to customize things that would anyway work without it. So it makes not much sense for the container to want to resolve and inject it.

In the end, if you want an optional parameter to be injected, then you have to specify it manually in the config (or with annotations).

What do you think?

@mnapoli mnapoli referenced this pull request Jun 21, 2014

Merged

4.2 #165

11 of 11 tasks complete
@ezzatron

This comment has been minimized.

Show comment
Hide comment
@ezzatron

ezzatron Jun 21, 2014

Contributor

Yeah, actually, that does make sense. Personally, it would make my life easier, so I'm all for it. Although, if you change it now, I can see it causing problems for people who currently rely on that feature. It's a tough call.

Contributor

ezzatron commented Jun 21, 2014

Yeah, actually, that does make sense. Personally, it would make my life easier, so I'm all for it. Although, if you change it now, I can see it causing problems for people who currently rely on that feature. It's a tough call.

@mnapoli

This comment has been minimized.

Show comment
Hide comment
@mnapoli

mnapoli Jul 12, 2014

Member

I merged the fix into 4.2, and I think I will still change how optional parameters are handled: the current way is bugged, optional parameters should not be injected by default.

That will be a small BC break, but I think it's much more sane to change it now (#171).

Member

mnapoli commented Jul 12, 2014

I merged the fix into 4.2, and I think I will still change how optional parameters are handled: the current way is bugged, optional parameters should not be injected by default.

That will be a small BC break, but I think it's much more sane to change it now (#171).

@mnapoli mnapoli closed this Jul 12, 2014

@mnapoli

This comment has been minimized.

Show comment
Hide comment
@mnapoli

mnapoli Jul 30, 2014

Member

@ezzatron FYI everything was released yesterday in 4.2!

Member

mnapoli commented Jul 30, 2014

@ezzatron FYI everything was released yesterday in 4.2!

@ezzatron

This comment has been minimized.

Show comment
Hide comment
@ezzatron

ezzatron Aug 3, 2014

Contributor

Awesome, thanks for all the hard work! I'm on holidays right now, but I'll be sure to check out when I get back.

Contributor

ezzatron commented Aug 3, 2014

Awesome, thanks for all the hard work! I'm on holidays right now, but I'll be sure to check out when I get back.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment