Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement mechanism for finding prop_info for property slot #3573

Closed
wants to merge 2 commits into from

Conversation

nikic
Copy link
Member

@nikic nikic commented Oct 3, 2018

Currently there is no efficient way of finding the property_info
which corresponds to a given property slot. This patch implements
such a mechanism, by storing an array of property_infos in offset
order on the class. This structure is lazily initialized when it
is needed, though we could also compute it during inheritance.

This patch only uses it to optimize visibility checks during
foreach (and get_object_vars etc). We avoid having to look up the
property by name and can directly check the accessibility.

@nikic
Copy link
Member Author

nikic commented Oct 3, 2018

@bwoebi This would be my solution for #3313 (comment).

@dstogov What do you think about this? Part of the motivation here is to fix a tricky edge case for typed properties, but I think that this will also be useful for other things. (E.g. we could implement object iteration without forcing creation of properties HT, which can be a big memory usage problem.)

Copy link
Member

@dstogov dstogov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, this idea may make sense for typed properties, but I don't see benefits for foreach, in current state. This new table is going to be created and accessed even for objects with public properties only, that should be more expensive than additional check for public/private properties.

If this is really necessary, probably, better to combine this patch with typed properties.
I'll try to think about this tomorrow once again.

{
return zend_check_property_info_access(zend_get_property_info_for_slot(obj, slot));
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no reason to inline this function. Better, merge it with zend_check_property_info_access().

return ce->properties_info_table;
}

return zend_build_properties_info_table(ce);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cached classes with ZEND_ACC_LINKED may build this table once and keep it in SHM, instead of rebuilding on each request.

@@ -135,6 +135,8 @@ struct _zend_class_entry {
HashTable properties_info;
HashTable constants_table;

struct _zend_property_info **properties_info_table;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opcache might need special handling for this field. At least, add an assertion that it's NULL.

if (zend_check_property_access(zobj, key) != SUCCESS) {
if (type) {
const char *tmp;
if (zend_check_property_slot_access(Z_OBJ_P(type), zdata) != SUCCESS) {
/* private or protected property access outside of the class */
continue;
}
zend_unmangle_property_name_ex(key, &tmp, &prop_name, &prop_len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We, probably, may get property name from property_info->name, instead of unmangling.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

property_info->name is a mangled name though. Unmangling is unavoidable here.

@bwoebi
Copy link
Member

bwoebi commented Oct 3, 2018

I like the approach (in particular as I cannot imagine a much better approach to this issue), though I'm questioning whether it's worth delaying the initialization at all. At least, like Dmitry hinted at, classes with ZEND_ACC_LINKED will want to persist that information.

@dstogov I had that same assumption than you, but as it turns out, you can redeclare a non-static property with the same name in a child class. And we cannot just optimize that redeclaration away as we'd then loose docblock info for reflection. Thus just checking for protected won't work.

@nikic
Copy link
Member Author

nikic commented Oct 3, 2018

Unfortunately there is an ugly test failure relating to ArrayObject that I'm not sure what to do about. The problem is basically that ArrayObject get_properties (depending on flags) may directly return the underlying HT, which may be an object HT, which may contain INDIRECTs. However, these INDIRECTs will point into the properties_table of the inner object, not the ArrayObject.

This is not a problem for the typed properties use case (where we'd be directly working on the properties_table, not going through properties HT), but it is a problem for implementing visibility checks.

@nikic
Copy link
Member Author

nikic commented Oct 3, 2018

This behavior also causes issues without this patch in some other cases. For example:

class Test {
    public $test;
    public $test2;
}

class AO extends ArrayObject {
    private $test;

    public function getObjectVars() {
        return get_object_vars($this);
    }
}

$ao = new AO(new Test);
var_dump(get_object_vars($ao));
var_dump($ao->getObjectVars());

This gives:

array(1) {
  ["test2"]=>
  NULL
}
php: /home/nikic/php-src/Zend/zend_object_handlers.c:570: zend_check_property_access: Assertion `property_info->flags & (1 << 0)' failed.
Aborted (core dumped)

First, the array only contains the test2 property, because visibility ends up being checked against the AO class, which has a private test property (even though it's completely unrelated to the inner object).

In the second case we get an assertion failure because https://github.com/php/php-src/blob/master/Zend/zend_object_handlers.c#L570 does not hold.

I think we need to change get_properties for ArrayObject to always return its own object properties, and additionally overload (array) to return the inner contents -- I think controlling array casts is the actual motivation for the current get_properties behavior.

@dstogov
Copy link
Member

dstogov commented Oct 4, 2018

@nikic could you, please, release your ArrayObject related proposal as a bug fix PR and mention compatibility breaks.

@nikic
Copy link
Member Author

nikic commented Oct 4, 2018

@dstogov I've submitted a PR at #3574.

Currently there is no efficient way of finding the property_info
which corresponds to a given property slot. This patch implements
such a mechanism, by storing an array of property_infos in offset
order on the class. This structure is lazily initialized when it
is needed, though we could also compute it during inheritance.

This patch only uses it to optimize visibility checks during
foreach (and get_object_vars etc). We avoid having to look up the
property by name and can directly check the accessibility.

We're also interested in having this mapping to handle some edge
cases in the typed properties implementation.
@nikic
Copy link
Member Author

nikic commented Oct 10, 2018

I just noticed a new issue (again, also existing prior to this patch):

<?php
  
class Test {
    private $prop = "Test";

    function run() {
        foreach ($this as $k => $v) {
            echo "$k => $v\n";
        }
        var_dump(get_object_vars($this));
    }
}
class Test2 extends Test {
    public $prop = "Test2";
}

(new Test2)->run();

This currently prints:

prop => Test2
prop => Test
array(2) {
  ["prop"]=>
  string(5) "Test2"
  ["prop"]=>
  string(4) "Test"
}

So both the private and the public property are shown. The get_object_vars() case returns a corrupted array with duplicate keys.

We can't simply assume that because a property is public that is visible from the current scope, because there might be a more specific private property.

@nikic
Copy link
Member Author

nikic commented Oct 10, 2018

I have fixed the above issue with 4d5d779. It was actually throwing an assertion failure on master.

However, @bwoebi pointed out another, larger problem:

<?php
  
class Test {
    private $prop = "Test";

    function run() {
        foreach ($this as $k => $v) {
            echo "$k => $v\n";
        }
        var_dump(get_object_vars($this));
    }
}
class Test2 extends Test {
}

$test2 = new Test2;
$test2->prop = "Test2";
$test2->run();

This also currently prints:

prop => Test
prop => Test2
array(2) {
  ["prop"]=>
  string(4) "Test"
  ["prop"]=>
  string(5) "Test2"
}

In this case we would have to hide the dynamic property, as the private one is in scope. This goes against our usual assumption that dynamic properties are always visible.

This means we must perform visibility checks for dynamic properties as well :( This is maybe less terrible than it sounds, because we can short-circuit this for the case where the class defines no properties, so at least stdClass and similar will not need checks.

@php-pulls
Copy link

Comment on behalf of petk at php.net:

Labelling

nikic added a commit to bwoebi/php-src that referenced this pull request Jan 7, 2019
Taken from PR php#3573, minus the foreach/iteration parts.
@nikic
Copy link
Member Author

nikic commented Jan 7, 2019

I've included this in the typed properties patch at #3313 (without the foreach/iteration changes -- only used for typed props for now).

@nikic nikic closed this Jan 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants