-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Fix bug #63217 #1649
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bug #63217 #1649
Conversation
d95d040
to
ae43375
Compare
Hi Andrea, it is quite late for 7.0.0 in general, but also as RC8 is already tagged and this is definitely not a security patch. I would suggest to target master therefore. Too bad, as it's such a long living ticket. Apart from that, IMHO i would see the heaviness of this bug as low. Clear, the consistency of the ArrayAccess API matters, but the practical win would lay very much on the implementation. Adversely affecting the regular array API fe were IMO not worth it. So some balanced patch would need some time and good review. Thanks |
It could go into a 7.0.x release actually, the BC break risk is very low. I'm going to try optimising it. |
I would really prefer to target master with this. What you say is true - the impact ArrayAccess part is very small. I'm more concerned about the regular array API, changing the array API for the reason to have "1" instead of 1 is not a sufficient reason to touch that so late in the release. Thanks. |
This doesn't change the API anywhere. The change is entirely internal to the executor. |
a64f548
to
469487a
Compare
Okay, that second commit is a new optimisation, intended to mitigate the impact of removing the buggy optimisation somewhat. Let's benchmark! Test script: <?php
$arr = [
"foo" => true,
"foobar" => 3,
"elePHPant" => "qux",
"long_key_name" => "something contrived",
"123" => 76,
"001234" => null,
"6276372623" => "another"
];
$start = microtime(true);
for ($i = 0; $i < 30000000; $i++) {
$foo = $arr["foo"];
$arr["foo"] = $foo;
$exists = isset($arr["foo"]);
$foobar = $arr["foobar"];
$arr["foobar"] = $foobar;
$exists = isset($arr["foobar"]);
$elePHPant = $arr["elePHPant"];
$arr["elePHPant"] = $elePHPant;
$exists = isset($arr["elePHPant"]);
$long_key_name = $arr["long_key_name"];
$arr["long_key_name"] = $long_key_name;
$exists = isset($arr["long_key_name"]);
}
$end = microtime(true);
echo basename(PHP_BINARY), ": ", $end - $start, "s", PHP_EOL; Machine is my MacBook Air 2013 13" (1.3GHz i5, 4GB RAM, OS X 10.11). Compiler: Apple LLVM version 7.0.0 (clang-700.1.76).
Config line: And...
...what the hell? It looks like removing the optimisation we had actually makes things faster! Maybe the removal of the extra branch speeds things up. If this can be replicated, we should just merge the bugfix alone (the first commit).
|
Andrea, I really don't feel comfortable with this patch in 7.0. Please fetch someone for review and target master. Thanks. |
That means the bug won't be fixed for users until late next year, which is a shame, as it makes ArrayAccess less useful. However, if that is how it has to be, so be it. |
Interestingly, if the performance improvement is reproducible and happens on other configs too, we could trivially remove the optimisation yet keep the bug. I wouldn't recommend that, though :p |
For what it's worth, this would be a significant fix for 7.0.0. Here is why: Currently the only mechanism we have to make use of array syntax is via For example, you might have a
It's the only impact of this fix, for which there is no workaround.
This fix doesn't affect the array API in any way.
So not only is this a significant consistency fix, but also potentially increases performance? That to me is a no brainer for a major release, and I would personally be very disappointed if this patch is deferred. Wouldn't this also be a BC break, and therefore not even eligible for 7.0.x? |
Well, we don't know yet on perf. It might perform differently on other compilers and OSes. But we'll see :) |
I think this fix missed the part in opcache, zend_optimizer_update_op2_const do some similar converting.. |
I wondered if there might be an opcache part. I'll look into that. |
469487a
to
c034d4f
Compare
I removed my new optimisation from this pull request as I think it's a distraction, and it might actually make perf worse. It could always be added later, or another approach to re-optimisation. My new optimisation is now in a separate branch at @laruence: Thanks for the pointer, the bugfix now addresses the opcache version of the optimisation as well and updates the test case to check for it. |
I did the same test as before, but with GCC ( Results:
Huh. Even on GCC, the fixed version is faster. But interestingly, my new optimisation is faster still - unlike on clang, where it is slower than the unfixed version. Odd. Going to look into this a little more. |
At @bwoebi's suggestion I looked at the instruction counts with callgrind (valgrind-3.11.0.SVN), with only 300,000 loop iterations (instead of 30,000,000) since valgrind is very slow. Results:
Well, this is interesting. clang generates way more instructions, which explains the huge difference between it and GCC in the benchmark. But more importantly, the unfixed version generates the least instructions in GCC, and the most in clang. My new optimisation generates the most instructions in GCC, yet takes the least time, and generates the least instructions in clang, yet takes the most time. ?! |
Regarding whether or not this is safe to merge, the danger lies in the assumption that constant operands to indexing opcodes are never numeric strings. So long as that assumption remains, it's unsafe to remove the compilation stage numeric string check. However, as this patch removes this assumption in the executor, it's safe to remove the compiler checks. If we wanted to, we could remove the VM assumption but not the compiler check: this doesn't fix the bug, but might provide a performance boost if my numbers are somehow reproducible. This would be a bit silly, though. Note that with the VM assumption removed, it's of no consequence whether or not the compiler still does the check, nor does it matter what the opcodes contain, as the VM will always do a numeric string check now. So we could fix this bug in a 7.0.x release and it shouldn't break anything looking at the opcodes. Opcodes for an unfixed 7.0.0 would be executed correctly by a fixed 7.0.1, though the reverse wouldn't be true. |
instead of fix this in vm, maybe we should fix this in SPL ? I mean, for array access, always convert numeric string to long, which will make thing consistent, and also won't slowdown anything? |
like this patch: http://pastebin.com/S6c7WKYj we enforce array access always convert numeric string to int. which seems also acceptable as we do it long time for plain array access |
Andrea, i was actually talking about the normal array API in userland. As the patch touches the place where the boringly normal access to array happens. So to fix 1 vs. "1" it looks like using a gun to shoot a mouche, because it's unknown how the normal array usage is affected. Clear it's a bug, a long standing one. I'd ask you to please check the Hui's suggestion as well, probably also to discuss this on internals. Touching such core functionality, especially removing optimization, IMHO should be justified and verified. If it has to be - so be, but please lets clear out the concerns. Thanks. |
@laruence Imho you should be able to implement a proper, type-safe map using ArrayAccess, if you want. That patch would only make the problem worse... |
I like laruence's approach. Instead of keeping PHP5 behaviour of ArrayAccess that allows unconverted numeric strings, we should make them behave in consistency with regular arrays. |
I'm scratching my head at the need for the casting to int based on the results @TazeTSchnitzel got, if there is no performance gains from it why have it at all in either ArrayAccess or regular arrays? |
@nikic it depends, if we all can get a rule, that is, in PHP, numeric string is treat as long while doing arrayaccess, then this will be easy for the user. he can either treat all numeric string as int or string as he like, at least he will be sure, that numeric string and int will be passed in as int form. |
ArrayAccess already doesn't behave like an array. You can use objects with ArrayAccess, e.g. SplObjectStorage. You can also use floats. If we are to change ArrayAccess to coerce numeric strings to integers, we should probably change how it handles all other types. But we can't do that. |
Even if this would hurt performance, the difference does not look to be that big even in this synthetic benchmark. I think it's a distraction. |
I disagree with @laruence and @dstogov on this one. The fix is not in the SPL. These must be the same when
Similarly these must also be the same when
Additionally, This cannot be anything except a bug. In this case it's an errant optimization in the engine causing this bug, not the SPL, so the engine must be fixed. |
Independently of whether we want to merge it, the current patch is not complete, in particular adjustments in zend_inference.c will be necessary. |
@TazeTSchnitzel the ship has sailed for 7.0 and 7.1, this PR needs to be rebased on master, completed (see Nikitas comment) and pushed forward there ... sorry about that ... |
I'm aware. I might update this eventually. |
@TazeTSchnitzel if you do not intend to work on this PR, please close it. If no commits are made a month hence, it will be considered abandoned and be closed. |
Okay, I'm closing it. There's no need to keep it open even as a reminder, bugs.php.net reminds me regularly that it needs updating. |
@hikari-no-yume I know you've lost interest in this bug so I would like to self-assign it and try my best to fix it myself. Are you definitely not going to fix it? I'm not entirely sure what's involved but happy to give it a shot. |
Fixes https://bugs.php.net/bug.php?id=63217
I wouldn't merge this just yet, as the current fix most likely hurts performance, as it removes a buggy optimisation. I am going to try and add a replacement optimisation which doesn't have the same issue.
The nature of the bug means that fixing it is possibly a BC break, although one very unlikely to have any impact. For that reason, I'd prefer for it to go into 7.0.0.