Skip to content

Fix bug 72610 by deferring the calls to __wakeup to the very end. #2004

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

TysonAndre
Copy link
Contributor

The PHP7 source code for all versions of php 7 was keeping pointers
to object/array zvals, not copies of the object properties.

This causes two bugs in __wakeup:

  • Modifying a object property (or property's property) during __wakeup will affect copies/references to that properties
  • If dynamic properties are added during __wakeup, pointers to other dynamic properties are invalidated

There are two ways to fix this. This PR chooses the second. (So does HHVM).

Make the publicly exposed APIs var_unserialize and var_unserialize_ex
perform this finalization at the very end.
Many extensions use var_unserialize to unserialize data
(e.g. ext/session, or extensions outside of php-src),
and this change is made in a way that shouldn't break them.

  • Move the inner zval deserialization code into a static method, so that
    they won't accidentally be called and miss a call to __wakeup.
  • I can't see a use for that method right now, in extensions.

@@ -137,6 +182,57 @@ static zval *var_access(php_unserialize_data_t *var_hashx, zend_long id)
return var_hash->data[id];
}

static int var_wakeup_all(php_unserialize_data_t *var_hashx, const zend_bool is_cleanup)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add comment and explain the return values {0, 1} ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing else adds comments. I assume the conventions are 1 on success, 0 on failure, for all "boolean" functions.

@TysonAndre TysonAndre force-pushed the fix-bug-72610-unserialize-v2 branch 2 times, most recently from b6d5bd6 to 11be29e Compare July 18, 2016 19:56
@laruence laruence added the Bug label Jul 21, 2016
@chtg
Copy link
Contributor

chtg commented Jul 21, 2016

I don't think this is a good idea, it is possible to lead to some security issues.

@TysonAndre
Copy link
Contributor Author

@chtg could you clarify what type of security issues? Do you have alternative proposals to fix https://bugs.php.net/bug.php?id=72610 ?

Overall, this makes php unserialize closer to php5 unserialize. Except that it affects cyclic data structures with objects defining __wakeup. (Objects that haven't had __wakeup called yet will have all of their properties set, instead of only some properties set).

Have you run these examples (See https://3v4l.org/3Ubnc and https://3v4l.org/OGUXG )?
What do you expect https://github.com/TysonAndre/php-src/blob/11be29e6480a679174f5f2cf860270d796f1787c/ext/standard/tests/serialize/bug72601.phpt to do?

The circumstances when I think deferring __wakeup would matter are:

  1. unserialize() called on attacker provided data in php7 (Already pretty bad)
  2. A permitted class defines __wakeup (or all classes permitted)
  3. The attacker provided data has cyclic objects/references (Otherwise, this does the same thing as the old behavior)
  4. The __wakeup function is secure when called if not all of the attacker-provided object properties (of an object B referenced by A(in a cyclic structure such as A->B->A)) are defined, but __wakeup is somehow insecure if all of the attacker-provided object properties are defined

Or are you talking about non-attacker provided data? There are circumstances where this would change the behavior, such as https://github.com/TysonAndre/php-src/blob/11be29e6480a679174f5f2cf860270d796f1787c/ext/standard/tests/serialize/bug70963.phpt , but the behavior seems closer to php 5.

@@ -164,6 +260,7 @@ PHPAPI void var_destroy(php_unserialize_data_t *var_hashx)
efree_size(var_dtor_hash, sizeof(var_dtor_entries));
var_dtor_hash = next;
}
var_wakeup_all(var_hashx, 1); /* Free the copies of objects (which would call __wakeup) if we haven't already. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be clearer to write this the same way as for var_dtor_hash -- I found the intermixed wakeup/no-wakeup code in var_wakeup_all pretty hard to follow.

@nikic
Copy link
Member

nikic commented Jul 25, 2016

This approach looks reasonable to me in general (I didn't check the impl in detail). I'm also not clear on how this would cause security issues that don't already exist.

@TysonAndre
Copy link
Contributor Author

@nikic
I could split it up into a function with a loop to call __wakeup and a function with a loop to free the memory, though that's probably slightly less efficient (visits pointers twice instead of once). Is that what you meant?

I had 2 reasons for interleaving wakeup and non-wakeup code.

  1. If we are given invalid data, we must call var_ptr_dtor (dtors) to avoid memory leaks.
    However, there's no reason to call __wakeup on the objects if we just end up returning false from unserialize().
  2. If an exception is thrown in __wakeup(), the existing behavior is to stop unserializing. This means other __wakeup methods won't be called.

Suppose that an array of 5 objects is being unserialized, and each of them defines __wakeup().

In the existing implementation, if the 3rd object throws an exception, then the 4th and 5th objects don't have __wakeup called (This is just going to return false, so there's no point setting up resources).
See https://3v4l.org/iXPi6

@@ -48,6 +48,8 @@ struct php_unserialize_data {
void *last;
void *first_dtor;
void *last_dtor;
void *first_wakeup;
Copy link
Contributor Author

@TysonAndre TysonAndre Aug 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could someone familiar with the php release process clarify something for me? I think that this might fall under incompatible ABI, and should instead be released in the 7.1 beta? (Or done by adding an additional param in the "inner" API implementation, accepting another parameter with *first_wakeup and *last_wakeup
E.g.

  1. An extension which uses php_serialize_data_t to call serialize is compiled against 7.0.9. A shared object is compiled.
  2. If this changes in 7.0.10, then that extension would break in 7.1.10 if the shared object wasn't recompiled against 7.0.10 (The struct the extension allocates on the stack only has enough space for the first 4 parameters)

Or is it the case that shared objects are already expected to be recompiled every time the subversion changes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, this change would be an ABI break. DSOs are expected to work between patch versions without recompilation. (And the allocation of the structure is part of a macro, so it will be wrong even if extensions don't manually use the structure.)

The PHP source code for all versions of php 7 was keeping **pointers**
to object properties, not copies of the object properties.

There are two ways to fix this. This PR chooses the second. So does HHVM.

- Keep copies of **all** zvals instead of pointers to zvals while
  unserializing, increment reference counts.
- Defer calls to __wakeup until the very end. This requires copying only
  the objects of classes implementing __wakeup.
  hhvm handles __wakeup this way.

  https://github.com/facebook/hhvm/blob/2d5f00afbb033aec0cbc51cbe3a897af79cdcb28/hphp/runtime/base/variable-unserializer.cpp#L333
  https://github.com/facebook/hhvm/blob/2d5f00afbb033aec0cbc51cbe3a897af79cdcb28/hphp/runtime/base/variable-unserializer.cpp#L965

Make the publicly exposed APIs var_unserialize and var_unserialize_ex
perform this finalization at the very end.
Many extensions use var_unserialize to unserialize data
(e.g. ext/session, or extensions outside of php-src),
and this change is made in a way that shouldn't break them.

- Move the inner zval deserialization code into a static method, so that
  they won't accidentally be called and miss a call to __wakeup.
It referred to a value that was deleted in __wakeup, so it used to be
converted to UNDEF.
This conversion to UNDEF is no longer necessary, since we create the copy/reference *before* __wakeup is
called
@TysonAndre TysonAndre force-pushed the fix-bug-72610-unserialize-v2 branch from 11be29e to 04b9498 Compare August 3, 2016 04:09
@TysonAndre TysonAndre force-pushed the fix-bug-72610-unserialize-v2 branch from 04b9498 to 3481ec1 Compare August 3, 2016 04:32
@krakjoe
Copy link
Member

krakjoe commented Jan 3, 2017

@TysonAndre fix conflicts please :)

@nikic can I ask that you review this in detail, and if it's okay, please merge into appropriate branch (once fixed).

@nikic
Copy link
Member

nikic commented Jan 3, 2017

See #2050 (comment), closing.

@nikic nikic closed this Jan 3, 2017
@TysonAndre TysonAndre deleted the fix-bug-72610-unserialize-v2 branch February 1, 2017 20:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants