Forbid dynamic calls to scope introspection functions #1886
Currently we allow performing dynamic calls to functions that introspect the parent stack frame. Apart from this being a source of segmentation faults previously and having wildly varying behavior between PHP5, PHP7 (namespaced and non-namespaced) and HHVM (for examples see bug #71220), this also causes issues for the data-flow optimizer, which is not able to detect these calls and abort optimization (see the dynamic_call_007.phpt test for a simple example of misoptimization).
Solve this by forbidding these dynamic calls.
Affects:
* extract()
* compact()
* get_defined_vars()
* parse_str() with one arg
* mb_parse_str() with one arg
* func_get_args()
* func_get_arg()
* func_num_args()
/cc @dstogov There currently doesn't seem to be a good way to detect whether a call is dynamic. I'm currently scanning for the init opcode, but that's not a nice solution, and it's slow. Do you think we can extend the call info flags by using const_flags as well (it shouldn't be currently used in this context)?
    zend_error(E_WARNING, "func_num_args(): Called from the global scope - no function context");
    RETURN_LONG(-1);
}

if (zend_forbid_dynamic_call("func_num_args()") == FAILURE) {
Actually, you don't need to pass "func_num_args" explicitly, since the name can be obtained from EG(current_execute_data).
func_num_args() is not a problem from a safety perspective, but the current behavior for dynamic func_num_args() calls is inconsistent depending on whether we're in namespaced / non-namespaced code: https://3v4l.org/inP5v -- anything that inspects the "next frame up" will have this issue, this includes all func_* functions.
Furthermore, while func_num_args() is not an issue for the current optimizer, it would become a problem once we implement inlining (in which case func_num_args() would get the argument number of the function we inlined into).
As such I think we should forbid it as well.
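To make the "next frame up" point concrete, here is a minimal toy model in C. It is only a sketch under stated assumptions: `toy_frame` and the `toy_*` functions are hypothetical names invented for illustration and are not the engine's `zend_execute_data` or any real API. It shows why a function that reports on its caller's frame gives a different answer as soon as an extra frame sits between it and the function the user had in mind.

```c
#include <stddef.h>

/* Toy stand-in for an execution frame (hypothetical, not zend_execute_data). */
typedef struct toy_frame {
    const char *func_name;    /* function running in this frame */
    int num_args;             /* number of arguments passed to it */
    struct toy_frame *prev;   /* calling frame, NULL if there is none */
} toy_frame;

/* Models an introspection function that inspects "the next frame up":
 * returns the argument count of the calling frame, or -1 if there is none. */
int toy_func_num_args(const toy_frame *self)
{
    if (self == NULL || self->prev == NULL) {
        return -1;
    }
    return self->prev->num_args;
}

/* Direct call chain: foo(a, b, c) -> func_num_args(). */
int toy_direct_call(void)
{
    toy_frame foo = { "foo", 3, NULL };
    toy_frame introspect = { "func_num_args", 0, &foo };
    return toy_func_num_args(&introspect);
}

/* Chain with an intermediate frame:
 * foo(a, b, c) -> call_user_func("func_num_args") -> func_num_args(). */
int toy_call_via_helper(void)
{
    toy_frame foo = { "foo", 3, NULL };
    toy_frame helper = { "call_user_func", 1, &foo };
    toy_frame introspect = { "func_num_args", 0, &helper };
    return toy_func_num_args(&introspect);
}
```

In this model the direct chain reports foo's three arguments, while the chain with the helper frame reports the helper's single argument. Whether a real engine pushes such an intermediate frame depends on how the call is compiled, which is the kind of difference the linked example demonstrates.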
Wow! Great inconsistency example :)
But the following code is still legal and should provide a consistent result:
function foo() {
$f = 'func_num_args';
var_dump($f());
}
So the problem might be with INIT_USER_CALL only.
Or do you disagree?
hmm, I think maybe I didn't make myself clear, I mean, you don't need
if (zend_forbid_dynamic_call("func_num_args()") == FAILURE) {
you could make it simpler by:
if (zend_forbid_dynamic_call() == FAILURE) {
The function name can be obtained via EG(current_execute_data).
@laruence Ahhh, sorry, I didn't catch what you mean. The reason why I'm passing in the function name is to throw an error like "Cannot call parse_str() with a single argument dynamically" for parse_str/mb_parse_str, as these are only forbidden if the second argument is not used. If I remove this special case we can indeed determine the function name automatically.
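The special case described in this comment can be sketched as follows. This is a toy model, not the patch itself: the `toy_*` names, the static error buffer and the boolean `is_dynamic` parameter are assumptions made for illustration, and the message format is modeled on the error text quoted in the comment. The point is only that passing a descriptive string lets the caller report "with a single argument" for the one-argument form while leaving the two-argument form alone.

```c
#include <stdbool.h>
#include <stdio.h>

#define TOY_SUCCESS 0
#define TOY_FAILURE (-1)

/* Holds the most recent error message; empty string means "no error". */
static char toy_error[128];

const char *toy_last_error(void)
{
    return toy_error;
}

/* Hypothetical check: rejects the call if it was made dynamically.
 * `what` describes the forbidden call and ends up in the message. */
int toy_forbid_dynamic_call(const char *what, bool is_dynamic)
{
    if (!is_dynamic) {
        toy_error[0] = '\0';
        return TOY_SUCCESS;
    }
    snprintf(toy_error, sizeof toy_error, "Cannot call %s dynamically", what);
    return TOY_FAILURE;
}

/* Toy parse_str(): only the one-argument form writes into the caller's
 * scope, so only that form is checked. */
int toy_parse_str(int num_args, bool is_dynamic)
{
    if (num_args == 1) {
        return toy_forbid_dynamic_call("parse_str() with a single argument",
                                       is_dynamic);
    }
    toy_error[0] = '\0';
    return TOY_SUCCESS;
}
```

If the special case were dropped, the `what` parameter could go away and the name could be looked up from the current frame instead, as suggested above.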
The following code is legal. It just causes trouble for the optimiser. Currently we don't have a free bit in the "call info flags", but that would be a better solution. Maybe we should mark "unsafe" internal functions and verify their fn_flags when they are called dynamically? Anyway, I'm not sure we should limit language behavior just because of optimization troubles. Thanks. Dmitry.
Yes, which is why I'm asking if there would be a problem with extending the call flags to use not only the "reserved" bits, but also the "const_flags" bits. This will extend call info to 16 bits, which gives us plenty of space. Is there an issue with doing this?
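The idea of widening the call info field can be illustrated with a small C sketch. Everything here is an assumption for illustration: the layout of the 32-bit word (type, type flags, "const_flags", "reserved", one byte each), the `toy_*` helpers and the bit chosen for `TOY_CALL_DYNAMIC` are hypothetical and do not reproduce the engine's actual macros or flag values. It shows that a flag in bit 8 only becomes representable once call info spans both upper bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy 32-bit info word (assumed layout, lowest byte first):
 * byte 0 = type, byte 1 = type flags, byte 2 = "const_flags",
 * byte 3 = "reserved". Call info is modeled as bytes 2 and 3 together. */
#define TOY_CALL_INFO_SHIFT 16

/* Hypothetical flag: bit 8 does not fit into an 8-bit call info field. */
#define TOY_CALL_DYNAMIC (1u << 8)

/* Extracts the 16-bit call info from the upper half of the word. */
uint16_t toy_call_info(uint32_t word)
{
    return (uint16_t)(word >> TOY_CALL_INFO_SHIFT);
}

/* Sets a call info flag without touching the lower 16 bits. */
uint32_t toy_add_call_flag(uint32_t word, uint16_t flag)
{
    return word | ((uint32_t)flag << TOY_CALL_INFO_SHIFT);
}

/* The check a scope-introspecting function would perform on entry. */
bool toy_is_dynamic_call(uint32_t word)
{
    return (toy_call_info(word) & TOY_CALL_DYNAMIC) != 0;
}
```

With a flag like this set at call-initialization time, the check at the start of each affected function becomes a single bit test rather than a scan for the init opcode.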
Yes, it's tricky... The way I see it, this is the situation:
So, given that this functionality is not useful (and not used), has unclear semantics, and causes issues in the optimizer (currently causing segfaults), I think forbidding these calls is the best solution. I would use the simple rule of "no dynamic calls" rather than try to carve out exceptions like "calling with call_user_func is not okay, but using $func() is allowed".
On 04/25/2016 01:16 PM, Nikita Popov wrote:
Right, but dynamic function call should not be a problem (only call
Another segfault related to this: https://bugs.php.net/bug.php?id=72102 (set_error_handler + func_get_args)
Also found an exciting new way to set a variable in PHP! (https://3v4l.org/NJh0H)
spl_autoload_register('parse_str');
function test() {
    $FooBar = 1;
    class_exists('FooBar');
    var_dump($FooBar); // string(0) ""
}
test();
So clearly, class_exists is a dangerous function :)
What about "new Foo()"? Maybe we should deprecate most of these "dynamic" cases?
Requires switching to 16-bit call info.
I've updated the patch to use a ZEND_CALL_DYNAMIC flag.
@dstogov Yes, the same applies:
spl_autoload_register('parse_str');
spl_autoload_register(function($class) {
    // Avoid undefined class exception
    eval("class $class {}");
});
function test() {
    $FooBar = 1;
    new FooBar;
    var_dump($FooBar); // string(0) ""
}
test();
Regarding the error message: Not sure whether a user will understand what "dynamically" exactly means. Also, I'd eventually move the dynamic call check to the zend_is_callable_ex fetch? [set a flag in arginfo or similar? ... We anyway have 4 bytes of padding (on 64-bit at least) left in that structure...]
This solution is cheaper, because we perform the check only for "dangerous" functions.
We already discussed this OTR, but repeating it here for the record: This patch does also cover $f = 'func_get_args'; $f();, i.e. it forbids dynamic calls in all forms, not just those going through zend_call_function. From a user perspective $f() and call_user_func($f) are the same thing, so if one is forbidden, the other should be as well.
I still feel a bit uncomfortable with this solution. I understand Nikita's point, and it makes sense. BTW: I think we have two problems:
It's easy to fix them with an additional check for prev_execute_data->func->type:
Interesting, but call_user_func('func_num_args') will work, because of the call through the trampoline :) This also works with Nikita's solution. We may also "unwind" to the upper user call frame (I'm not sure).
I really don't know how to solve the problem, but the proposed solution (prohibiting dynamic calls) doesn't look good. The following code is valid, but it's prohibited because it causes trouble for the optimizer.
function foo() {
    array_map("extract", [$_GET, $_POST, ["a"=>"b"]]);
    var_dump(get_defined_vars());
}
foo();
A better solution would require "alias and side-effect analyses" in the optimizer, but this is of course significantly more difficult. Nikita, I think the second option would require an RFC. I won't object to it, but I also won't support it. Thanks. Dmitry.
@dstogov I definitely see your concern here. There's a trade-off here between userland freedom and internal guarantees. As this is clearly more contentious than I thought it would be, I will have to create a discussion on the internals list.
I would like to note that your example using array_map may be "valid", but it's definitely not portable. It will not work in HHVM. The reason is that in HHVM array_map is implemented as a userland function (or rather as a HHAS function), so the "extract" will actually extract the variables into the scope of array_map, not the calling function.
This issue does not exist in PHP right now (because all our functions are implemented in C), but it does show a general issue, a dichotomy between internal and userland functions in this regard. For example, if someone were to implement a map function in userland in order to support Traversables and not only arrays using code like
function map($function, $iterable) {
    foreach ($iterable as $value) {
        yield $function($value);
    }
}
and then used your example map("extract", [$_GET, $_POST, ["a"=>"b"]]), this would not work, even though the same using array_map worked (in PHP at least).
If this is taken into account, i.e. thinking about the implementation of something like array_map in terms of how you'd do it in userland, it's downright weird that "extract" ends up modifying the scope of the calling function, rather than the scope of array_map (which actually does the extract call). (Of course the latter is just not possible because it's an internal function, but that's where the modification should happen logically.)
The best solution in the long term, from my point of view, is to deprecate (and then disable) extract() and parse_str() with one argument.
Thanks for making PHP more sane!
This should have a detailed note in UPGRADING with the list of functions affected.
Merged as 91f5940.