Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SplFixedArray::__set_state #6261

Closed

Conversation

morrisonlevi
Copy link
Contributor

@morrisonlevi morrisonlevi commented Oct 2, 2020

Add SplFixedArray::__set_state

Fixes bug 75268.

The design is that all string keys must come before all integer
keys. The string keys become properties and the integer keys become
values in the array.

Other serialization/unserialization methods should adopt this same
strategy as there are bugs:
https://3v4l.org/IMnI5

Migrate SplFixedArray to get_properties_for.

Extract spl_fixedarray_import, as I expect to reuse this code for
other cases as well, such as the serialization bug mentioned above.

@morrisonlevi morrisonlevi force-pushed the SplFixedArray.__set_state branch 3 times, most recently from 6b90914 to d11e94e Compare October 4, 2020 23:42
@morrisonlevi morrisonlevi marked this pull request as ready for review October 4, 2020 23:46
@morrisonlevi morrisonlevi force-pushed the SplFixedArray.__set_state branch 2 times, most recently from e006122 to f2ded73 Compare October 10, 2020 03:47
Fixes bug 75268.

The design is that all string keys must come before all integer
keys. The string keys become properties and the integer keys become
values in the array.

Other serialization/unserialization methods should adopt this same
strategy as there are bugs:
https://3v4l.org/IMnI5

Migrate SplFixedArray to get_properties_for.

Extract spl_fixedarray_import, as I expect to reuse this code for
other cases as well, such as the serialization bug mentioned above.
@morrisonlevi
Copy link
Contributor Author

@nikic I simplified it since the last time you reviewed. This is targeting master still, which now means 8.1 not 8.0. Leaves plenty of time to fix the inevitable bugs that still exist in serialization (not this PR). What do you think?

@morrisonlevi
Copy link
Contributor Author

@nikic Can I get this reviewed? I need to rebase on master, but aside from that can you see anything else that still needs changed?

Copy link
Contributor

@TysonAndre TysonAndre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than needing to test what happens with __set and minor comments this looks reasonable (from the implementation of ZEND_VM_HANDLER ZEND_ASSIGN_OBJ, I assume write_property calls set if it exists?)

for (zend_long i = 0; i < intern->array.size; i++) {
zend_hash_index_update(ht, i, &intern->array.elements[i]);
zend_hash_index_add_new(ht, i, &intern->array.elements[i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks correct - If I remember correctly, everything in zend_std_get_properties is a string starting in php 7.x(I forget the exact minor version), and even 123 gets coerced to '123' when added to the properties table.

php > $x = new SplFixedArray(1); $x[0] = 'field'; $x->{'0'} = 'prop'; var_dump($x);
object(SplFixedArray)#1 (2) {
  ["0"]=>
  string(4) "prop"
  [0]=>
  string(5) "field"
}
php > echo serialize($x);
O:13:"SplFixedArray":2:{s:1:"0";s:4:"prop";i:0;s:5:"field";}


/* For performance we do a single pass over the input array and the
* spl_fixedarray elems. This complicates the implementation, but part
* of the reason for choosing SplFixedArray is for peformance, and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* of the reason for choosing SplFixedArray is for peformance, and
* of the reason for choosing SplFixedArray is for performance, and

goto value_error;

} else {
if (UNEXPECTED(max_idx == 0)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: UNEXPECTED may be a bit worse for performance - it's guaranteed to happen exactly once in non-empty SplFixedArrays, and the typical fixed array is small. It moves the code to a cold region far away from the fast code.
(i.e. arrays of size 1 are more common than size 2, and so on)

The UNEXPECTED macro means you think there's a 90%+ chance the branch won't be taken in gcc by default # define UNEXPECTED(condition) __builtin_expect(!!(condition), 0) - newer gcc versions support __builtin_expect_with_probability for specifying a higher probability, but this is off topic

https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

For the purposes of branch prediction optimizations, the probability that a __builtin_expect expression is true is controlled by GCC’s builtin-expect-probability parameter, which defaults to 90%.

const char *ex_msg = NULL; // message for the value exception

ZEND_HASH_FOREACH_KEY_VAL(ht, num_idx, str_idx, val) {
if (str_idx != NULL) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would UNEXPECTED make sense - common use cases wouldn't add dynamic properties

end = fixed->elements + nints;
}

ZEND_ASSERT(num_idx == max_idx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is redundant from the previous condition


} else {
if (UNEXPECTED(max_idx == 0)) {
nints = nht - nstrings;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to check if the size changed to be safe? If there's UNEXPECTEDly a non-empty array when this line is changed, then either

  1. Throw
  2. Clear the array (e.g. call destructors) and (optionally) check for an exception - it's probably safe not to check for an exception from clearing or write_property but I'm not 100% sure

For example, really ill-formed code might do the following - this may be worth adding as a unit test (does it warn about a leak with --enable-debug)

class MyFixedArray extends SplFixedArray {
    public function __set($name, $value) { $this->setSize(1); $this[0] = [$name, $value]; }
}
// write_property calls __set?
MyFixedArray::__set_state(['key' => 'value', 0 => 'other value']);

@TysonAndre
Copy link
Contributor

TysonAndre commented Jan 23, 2021

Out of scope: https://bugs.php.net/bug.php?id=80663 - I noticed when looking at spl classes that SplFixedArray (and probably SplObjectStorage for emalloced key-value entries) doesn't check if the underlying storage is freed in the middle of resizing, i.e. there are missing checks for re-entry due to __destruct being called.

  • e.g. calling SplObjectStorage->detach() from __destruct when updating an existing SplObjectStorage entry

It might be possible to fix those warnings by allocating a temporary buffer as a local variable, copying the range of zvals that need to be released, then calling those destructors. Or by explicitly forbidding resizing while modifying

In realistic applications, nobody would write code like this and there might be other known issues

__set_state would have the same issue in zval_ptr_dtor_range (e.g. if an error handler triggered a call to setSize())

<?php
class InvalidDestructor {
    public function __destruct() {
        $GLOBALS['obj']->setSize(0);
    }
}

$obj = new SplFixedArray(1000);
$obj[0] = new InvalidDestructor();
$obj->setSize(0);

The inner call to setSize frees the buffer twice causing reads from invalid memory.

...other errors
==99097== Invalid read of size 1
==99097==    at 0x7A40E4: zval_ptr_dtor (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x664023: zim_SplFixedArray_setSize (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x817557: execute_ex (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x817A59: zend_execute (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x7A6EFB: zend_execute_scripts (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x732B51: php_execute_script (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x8425DF: do_cli (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x363D91: main (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==  Address 0x949c0d9 is 25 bytes inside a block of size 16,000 free'd
==99097==    at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==99097==    by 0x664034: zim_SplFixedArray_setSize (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x817557: execute_ex (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x7969E7: zend_call_function (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x796E4A: zend_call_known_function (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x835BB3: zend_objects_destroy_object (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x83A672: zend_objects_store_del (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x664023: zim_SplFixedArray_setSize (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x817557: execute_ex (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x817A59: zend_execute (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x7A6EFB: zend_execute_scripts (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x732B51: php_execute_script (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==  Block was alloc'd at
==99097==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==99097==    by 0x76FF4C: __zend_malloc (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x663E1B: zim_SplFixedArray___construct (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x817557: execute_ex (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x817A59: zend_execute (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x7A6EFB: zend_execute_scripts (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x732B51: php_execute_script (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x8425DF: do_cli (in /path/to/php-8.0.1-nts-install/bin/php)
==99097==    by 0x363D91: main (in /path/to/php-8.0.1-nts-install/bin/php)

@cmb69
Copy link
Contributor

cmb69 commented Dec 2, 2021

What's the status here? There are merge conflicts, and @TysonAndre's review comments apparently haven't been addressed. @morrisonlevi, if you find some time, could you please have a look?

@github-actions
Copy link

github-actions bot commented Feb 7, 2022

There has not been any recent activity in this PR. It will automatically be closed in 7 days if no further action is taken.

@github-actions github-actions bot added the Stale label Feb 7, 2022
if (!spl_fixedarray_empty(&intern->array)) {
zend_long j = zend_hash_num_elements(ht);
/* Keep the values and properties separate*/
HashTable *ht = zend_array_dup(zend_std_get_properties(obj));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#8046

In order for a native datastructure to correctly implement
*get_properties_for for var_export's cycle detection,
it would need to return the exact same array every time prior to this PR. (#8046)

I was working with get_properties_for again after the last review, I'm pretty sure this will cause new problems with var_export on get_properties_for if #8046 isn't merged, for

$arr = new SplFixedArray(1);
$arr[0] = $arr;
var_export($arr);  // now calls get_properties_for with purpose *VAR_EXPORT

@@ -1,5 +1,7 @@
--TEST--
Objects with overloaded get_properties are incompatible with ArrayObject
--XFAIL--
SplFixedArray has migrated to get_properties_for; find new test case
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you could add a stub class to ext/zend_test and move this test there for unit testing?

zval *end = NULL; // points one element passed the last element
const char *ex_msg = NULL; // message for the value exception

ZEND_HASH_FOREACH_KEY_VAL(ht, num_idx, str_idx, val) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ZEND_HASH_FOREACH_KEY_VAL(ht, num_idx, str_idx, val) {
ZEND_HASH_FOREACH_KEY_VAL(ht, num_idx, str_idx, val) {
ZVAL_DEREF(val);

This should either dereference references or throw an exception, e.g. for $arr = [&$val]; SplFixedArray::__set_state($arr); there shouldn't be references in the final result.

@github-actions
Copy link

github-actions bot commented Apr 9, 2022

There has not been any recent activity in this PR. It will automatically be closed in 7 days if no further action is taken.

@github-actions github-actions bot added the Stale label Apr 9, 2022
@morrisonlevi
Copy link
Contributor Author

morrisonlevi commented Apr 13, 2022

@TysonAndre I think at this point, you understand what's going on here much better than I do. Feel free to take over the PR, or close it and open a new one that replaces it, etc. I was hoping I could fix this one pretty quickly, but as has happened with many PHP bugs I've picked up over the years, things are rarely so nice to fix :/

I'm going to be focusing on some issues/shortcomings that have a larger impact for me, so I don't think I'll have time to revisit this in time for the next release, if I'm being realistic.

@github-actions github-actions bot removed the Stale label Apr 14, 2022
@github-actions
Copy link

There has not been any recent activity in this PR. It will automatically be closed in 7 days if no further action is taken.

@github-actions github-actions bot added the Stale label Jun 13, 2022
@Girgias
Copy link
Member

Girgias commented Jun 20, 2022

Went stale

@Girgias Girgias closed this Jun 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants