Introduce a type check cache (TCC) #5096
Closed
This was an experiment I did while trying to learn more about PHP internals. When the union types feature was merged, it raised some concerns regarding the cost of complex type checks. I was wondering if type checks were cached in some way. It turned out this was not the case. The same type check, whether simple or complex, is redone over and over again.
I also noticed that the JIT compiler does not generate efficient code for complex type checks. A cache would turn complex checks into simple lookups, probably simple enough to implement in the JIT compiler. A double gain.
So here it is, a type check cache for PHP. The PR also extends the JIT compiler to exploit the cache. Where the JIT-generated code previously had to bail out to slow code paths, it now keeps running at full jitty speed.
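To illustrate the general idea of turning a repeated type check into a lookup, here is a minimal sketch of a direct-mapped cache. It is not the data structure used in this PR: the names `tcc`, `tcc_slot`, `tcc_check` and `toy_slow_check`, the integer class and type ids, the fixed capacity and the hash function are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity; a power of two so the index can be masked. */
#define TCC_CAPACITY 256

/* One cached outcome for a (class of the value, declared type) pair. */
typedef struct {
    uint32_t class_id; /* hypothetical id of the value's class */
    uint32_t type_id;  /* hypothetical id of the declared type */
    bool     valid;    /* slot holds a result */
    bool     result;   /* cached outcome of the check */
} tcc_slot;

typedef struct {
    tcc_slot      slots[TCC_CAPACITY];
    unsigned long slow_calls; /* counts cache misses, for illustration */
} tcc;

/* Signature of the full (expensive) type check that the cache fronts. */
typedef bool (*slow_check_fn)(uint32_t class_id, uint32_t type_id);

/* Toy stand-in for a complex check: type_id is treated as a bitmask of
 * accepted class ids. Real checks walk class hierarchies and type lists. */
static bool toy_slow_check(uint32_t class_id, uint32_t type_id)
{
    return class_id < 32 && ((type_id >> class_id) & 1u) != 0;
}

static size_t tcc_index(uint32_t class_id, uint32_t type_id)
{
    /* Simple multiplicative hash; any reasonable mixing function works. */
    uint32_t h = (class_id * 2654435761u) ^ type_id;
    return (size_t)(h & (TCC_CAPACITY - 1));
}

/* Returns the check result, consulting the cache before the slow path. */
static bool tcc_check(tcc *cache, uint32_t class_id, uint32_t type_id,
                      slow_check_fn slow)
{
    assert((TCC_CAPACITY & (TCC_CAPACITY - 1)) == 0);

    tcc_slot *slot = &cache->slots[tcc_index(class_id, type_id)];
    if (slot->valid && slot->class_id == class_id && slot->type_id == type_id) {
        return slot->result; /* hit: one lookup, no matter how complex the type */
    }

    /* Miss: run the full check once and remember the outcome. */
    cache->slow_calls++;
    bool result = slow(class_id, type_id);
    slot->class_id = class_id;
    slot->type_id  = type_id;
    slot->valid    = true;
    slot->result   = result;
    return result;
}
```

With a zero-initialised `tcc`, the first `tcc_check` for a given pair runs the slow check and every repeat of that pair is answered from the slot, which is the property that makes the hit path simple enough to emit inline from a JIT.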
Some key characteristics:
A cache may bring an additional gain: it might allow the PHP type system to continue to develop in directions that are currently not considered because of the performance cost involved.
Weaknesses
Things I am not sure about
Some numbers finally
Using a benchmarking script I measured the performance difference relative to current master. The script is based on the script written by Dmitry Stogov to benchmark union type checks. It can be found here:
https://gist.github.com/dtakken/1539d64170921363dc8d1ed62effcd45
Below, the benchmark results I obtained for the master branch and the tcc branch are placed side by side to compare the overall performance gain. The numbers in the leftmost columns are the time spent doing a large number of operations that trigger type checks in a tight loop. The overhead of the loop itself is subtracted. First, some numbers with JIT turned off:
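The measurement approach (time a loop that triggers the checks, then subtract the time of the same loop with an empty body) can be sketched as follows. This is an illustration in C, not the linked PHP script; `time_loop`, `net_time`, `empty_body` and `bench_body_fn` are hypothetical names.

```c
#include <time.h>

/* Body executed once per iteration of the benchmark loop. */
typedef void (*bench_body_fn)(unsigned long i);

/* Sink that keeps the compiler from discarding the loop bodies. */
static volatile unsigned long bench_sink;

/* Baseline body: does as little as possible, so timing it measures
 * the loop and call overhead only. */
static void empty_body(unsigned long i)
{
    bench_sink = i;
}

/* Runs body() `iterations` times and returns the CPU time in seconds. */
static double time_loop(bench_body_fn body, unsigned long iterations)
{
    clock_t start = clock();
    for (unsigned long i = 0; i < iterations; i++) {
        body(i);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Subtracts the baseline from a measurement. Noise can make the
 * difference slightly negative, so the result is clamped at zero. */
static double net_time(double measured, double baseline)
{
    double diff = measured - baseline;
    return diff > 0.0 ? diff : 0.0;
}
```

A run would time `empty_body` once as the baseline, time each body that triggers type checks, and report `net_time(measured, baseline)` for each.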
The numbers are slightly noisy. Still, the effect of the TCC shows nicely here. With the TCC enabled, the cost of simple and complex checks is similar.
Next, the same run with JIT enabled:
While these numbers look really nice, there are some important things to take into consideration here.
The final measurement compares the performance of the master branch to the performance of the tcc branch while setting the TCC capacity to zero. This shows the worst case scenario of having a cache in place while badly misconfiguring it:
Please note that this is my first significant contribution, and I'm not familiar with the things I had to touch. Careful review would be highly appreciated.