Fix bug #73887 by evicting realpath cache entries #2407
Conversation
The cache evicts buckets with round robin now instead of not caching new entries at all.
@weltling Can this be merged into the next patch release or does it have to wait to go through RC?
It doesn't look like anything ultra-urgent, I'd rather wait for the next RC.
@kelunik thanks for the PR. As usual, everything has to go through QA. Also, the patch looks like it targets master, as there's an ABI break. But I wouldn't be that quick, either. I see that this is an attempt to implement some kind of LRU handling. However, some points are questionable, as for me. Those are
Please expand more on the implementation idea and give a benchmark. This part affects every FS function, so we have to be very conservative when evaluating changes to it. What would be safe for stable branches is to follow the realpath_cache_find logic and walk through the buckets within the same or multiple slots, respecting the TTL. A more comprehensive solution could target the next minor with a proper LRU or LFU implementation. Thanks.
It doesn't. But "random" eviction should be better than no eviction at all. Currently we just don't cache anything in case the cache is full and only evict on finds.
The TTL is a maximum cache period, not a minimum. In case of a full cache, an item can be evicted at any time. Both LFU and LRU are more complex and need additional data structures.
@kelunik yeah, that's my point as well: there's currently no suitable structure to integrate the algorithm into. Is it complicated? Probably not: a queue that pops entries from the end would already be an LRU, though it would need more refactoring. Anyway, depending on the approach, that would be more for master. Fixing in user land can be done by
Both are fine tuning specific to a given app without any patching. But regarding the patch approach, worst case it would free the entries just added in the previous invocation. In that case the risk is doing false work by adding the same things over and over again, that's Murphy's law. If you want to target a stable branch, IMO a less intrusive and more robust approach is needed. Simply mirroring the logic from realpath_cache_find would already be an improvement at low cost. Maybe, as an extension of this, it would be OK to also free items that are about to expire, or by some kind of grace-period criteria. For a given application, some fine tuning will be needed in any case, if one cares. Otherwise, a comprehensive implementation should target master. Thanks.
Both of those aren't fixes.
What works is watching the remaining space and triggering a full cache clear in case it's full, but that's obviously a lot worse.
Regardless of the approach, something like that might happen. In case of LRU it happens in case you repeatedly access
Exactly, it's not perfect, but it's no different from any other cache, like Apache's or MySQL's. There are no endless resources, so one has to fine tune for a specific app. This is devops work, which is hard to do from within PHP.
Yes. So other slots might need to be walked through as well, to free space. The current TTL logic should be kept if you target a stable branch. Things that are configurable INI options should not be ignored.
No. When something is found or inserted, the item is moved to the top of the queue. So then, you'd pop something from the end and that would be the LRU item, which is known exactly. Removing some random entry and hoping for the best is not suitable for this case. LFU could probably be even better, if we're doing the refactoring anyway. In the end, there should be either a solid implementation targeting master, or a compatible approach targeting a stable branch. The patch in its current form is neither. Plus, a benchmark is required, please provide one. Thanks.
I think doing random eviction (like this patch essentially does) is fine as a starting point. As long as the usual assumptions about temporal locality hold true, random eviction still performs better than no eviction, as it is more likely that the newly inserted path will be reused than any random path. Of course we do need some benchmarks to determine the impact of this change. This would also tell us whether or not we should invest in a more sophisticated eviction strategy. A naive LRU implementation will take 16 bytes per realpath cache entry and it's not clear to me whether that's a good tradeoff or not.

We should also reevaluate the default realpath cache size if eviction is implemented. I think the change to a 4MB default size is excessive. I think in the thread where this change was made it was determined that even large applications do not use more than 1MB of realpath cache, so I'm not sure why we went with that particular number. It has been pointed out elsewhere that such a large realpath cache is problematic if you have many worker processes, as the cache is replicated in each one.

With eviction implemented, we can probably go for a lower value here -- but I'm not sure just how well the random eviction will perform in some cases, for example generational use caused by symlink-swapping deployments. An LRU cache would be able to displace all the entries from the previous generation in one iteration, while random displacement will take more time to converge to a stable state.
Zend/zend_virtual_cwd.h
```diff
@@ -232,6 +232,7 @@ typedef struct _virtual_cwd_globals {
 	zend_long realpath_cache_size;
 	zend_long realpath_cache_size_limit;
 	zend_long realpath_cache_ttl;
+	zend_ulong realpath_cache_clean_index;
```
As @weltling pointed out, this is an ABI break.
Do you have a suggestion what to do instead?
In this case just moving the new member to be the last member in the struct is fine.
Done. 👍
@kelunik Symlink deployments resolve symlinks at the webserver level (otherwise the deployment is non-atomic if opcache is used), so the realpath cache does not really matter for consistency.
A usual app can be fine tuned, that is a fact. The actual issue targets servers written in PHP, which is by far not the common case. The common case is an app running under some web server, with known requirements, etc. The increase of the default cache size is already an improvement for any case, even if it's oversized. I'm not talking about no eviction, just that the patch is not yet well thought out and there's currently no repro case to observe the behavior. realpath_cache_lookup() calls can also be made outside the core, so the patch can affect external exts as well.

We currently iterate the same slot in realpath_cache_find(), which e.g. could be optimized. If we find the exact path, but the bucket has expired, a simple action could be to renew its TTL, no free required. In general, the find function should not free anything. Being lazy there would spare processor time, especially if the cache never gets full. Even though the default size is big now, we promise the cache TTL, even if it's not obvious on the user side. If we had a reproduce case, we'd see the further strategy, indeed. Given this promise, only expired buckets should actually go away. If we abandon the promise, even conditionally, it should be done in a consistent way.

All in all, I don't see any reason to handle this quickly in the stable branch. The patch doesn't target the common case, the default cache size is big, and the fine tuning can be done. Far better would be to first work out an idea to fix this in all its complexity, preferably based on a show case that we can observe and taking into account the possible points/issues mentioned. Not only fixing the given case, but taking care of the other parts. @kelunik for a repro, ideally there should be a synthetic case. Or a test repository with a prepared app setup. As a last resort, a VM to fetch would probably work, too, vmware or hyper-v as for me :) Thanks.
A cache doesn't promise any TTL. A value store might, but not a cache. A cache evicts entries as it sees need in case it's full. If you want to see a simple example benchmark:
In case you fine tuned the limit and it's large enough for your app, nothing will change. Entries are only evicted in case of a full cache.
The previous benchmark isn't really fair, as the build I used was configured differently. Compiled with the same options from
@weltling ping
Thanks for the repro case. Likely, the result will be different with a different cache size, especially with the current default cache size, have you tried? For now, the gain seems not very significant.
It is a tautology. A cache consists of storage. There's an INI option to control both the size and the TTL. It is a feature.
But this is the exact point: a doable solution is in place, and we see there's improvement potential in the patch that possibly won't fit into the stable branch or has side effects that we don't see yet. I'm all for improving this part. The current state is not convincing for a stable branch, we should really rethink and go for some refactoring that allows robust handling, especially as there's enough time for it to be well tested for 7.2. As for me, I'd feel more certain with an approach where we would know exactly why the entries are freed. Thanks.
Before I do that... what's significant for you?
Again, the TTL is a maximum. It doesn't make sense to say it is guaranteed, because it's not. Every path accessed after the cache is full is currently not cached at all. Quoting @nikic:
What's the eviction strategy you propose? The patch as is improves the situation for all long running scripts while not affecting applications which are fine tuned to fit exactly in any significant way (a single
There's no solution in place. It just doesn't cache anything in case the cache is full.
I took time to test your patch. The snippet, current 7.0 TS 64-bit, run on 7.0 branch source
Output with patch / without patch: The realpath cache size is the default, 4096k as of today. From what I see, the performance even gets worse, at least with the snippet.

Seems we talk past each other :) TTL is an INI option, even if the cache is full. Your approach is to free some random item, with no regard to the TTL. But even if storage is full, it doesn't mean there are no expired items. In theory, as Nikita also mentioned, the recent items are likely to be used. We need to test this. My concern, in theory, is that items added can be freed just in the next step, thus producing more overhead, because nothing in your patch ensures that the items to be freed are actually expired. I'm not proposing any approach, maybe yet. All I'm saying is that the current approach is not convincing and doesn't fit a stable branch. If I were to propose or start one, it would rather be about a proper LRU/LFU implementation, maybe involving some refactoring and targeting 7.2
Though, I might be able to provide some QA, but likely can't work closely on this for at least the next two weeks. If there's another approach without stability concerns, it might still be included in a stable branch, but otherwise there are still 2-3 months to work out a good solution for 7.2 that could be well tested in the pre-release cycle. Thanks.
I've implemented an alternative in #2431, which is probably more in line with what @weltling has in mind. If the cache is full it will remove all expired entries from the cache. This pruning is limited to intervals >= ttl, as it is a full cache scan. To validate the implementation, I've used the following test script:

```php
<?php

function createFiles($gen, $n) {
    $dir = __DIR__ . '/rctest' . $gen;
    @mkdir($dir);
    for ($i = 0; $i < $n; $i++) {
        touch("$dir/$i.txt");
    }
}

function accessFiles($gen, $n, $k) {
    $dir = __DIR__ . '/rctest' . $gen;
    for ($j = 0; $j < $k; $j++) {
        for ($i = 0; $i < $n; $i++) {
            fopen("$dir/$i.txt", "r");
        }
    }
}

echo "Realpath cache size: ", ini_get('realpath_cache_size'), "\n";
echo "Realpath cache TTL: ", ini_get('realpath_cache_ttl'), "\n";

$n = 1000;
$k = 100;

createFiles(1, $n);
$t = microtime(true);
accessFiles(1, $n, $k);
echo "1. gen access time: ", microtime(true) - $t, "\n";

createFiles(2, $n);
$t = microtime(true);
accessFiles(2, $n, $k);
echo "2. gen access time: ", microtime(true) - $t, "\n";

sleep(4);
$t = microtime(true);
accessFiles(2, $n, $k);
echo "2. gen access time after sleep: ", microtime(true) - $t, "\n";
```

With sample output:
The idea behind this benchmark is that we create two generations of files, where the cache size matches the size of one generation. The expectation was that prior to this change the access to the 1. gen is fast (cached) and to the 2. gen slow both before and after we exceed the TTL. After this change the first access to the 2. gen should still be slow, but the second one after expiry should once again be fast.

To my surprise it turned out that even with the current implementation the second (post-TTL) access is fast and the patch only provides a minor improvement. The reason is that the lookup code will remove expired entries not only for the path that is being looked up, but for any path that falls in the same cache bucket. Because any cache insertion also involves a cache lookup, this means that chances are pretty good that the previous lookup will clean out a bucket, at least in this kind of generational test, in which all cache entries expire simultaneously.

Due to this effect, I believe that the currently implemented invalidation strategy is actually much better than we initially assumed. It implies that at least over longer time frames (>>ttl) the realpath cache remains efficient. Of course, the change proposed in this PR (random invalidation without respect for ttl) may improve the situation by reducing the reaction time of the cache to access pattern changes (below the ttl level).
I've now compared both patches and benchmarks crosswise, with the same ini
With the given ini conditions, one can see some improvement with the bench from #2407 (comment). With the bench from #2407 (comment), it seems this PR loses against the patch in #2431. In the end, what it tells me is that both patches optimize for the corresponding benchmark only. From the previous test, the patch from this PR makes it slower with the bigger cache size.

It also came to my attention that we have a similar situation in ext/pcre. Patterns are cached and freed arbitrarily. From this point, thinking about LRU might be profitable there, too. In regard to realpath, it seems we still need to test with a permutation matrix of size and ttl. I still think it is safer to target master with the developments on this. As soon as the cache is full, the eviction algorithm affects the common case. Probably more benchmarks wouldn't be bad, but there should also be at least some basic test with apps like WordPress or Symfony. I might have some spare time this weekend to experiment on an LRU implementation as well. Thanks.
Unfortunately I don't have much free time to test this extensively at the moment. @staabm could you redo the tests with Symfony / Wordpress you initially did when creating the bug report?
While I still haven't gotten around to playing with an implementation, here's a very fitting read on the subject: https://blogs.dropbox.com/tech/2012/10/caching-in-theory-and-practice/. Thanks.
I was checking a possible LRU implementation some time ago, master...weltling:realpath_lru2, but was led away from it by other topics. So far it's just an experiment that brought me to some more thoughts. Basically, the distinct cases we talk about are

The current realpath cache implementation is good enough for the first case, which is also the base use case. The second one is what this PR tries to improve. The third one is currently not handled: when the cache is full, paths are simply not cached. In general, case one can be solved quite simply by observing the system and configuring the realpath cache. In the arbitrary situation it is possible as well, but in some cases it might be hard to achieve.

From what I've seen in my experiment, especially if paths don't change much, it might make sense to handle that case with the LRU strategy. In that case, it can allow setting the cache size lower while not having to care about the TTL. The basic use case is still the simplest way to achieve speed, but in the long run LRU might win in certain scenarios. For example, one could say: if the TTL setting is set to -1, we'd use the LRU strategy, and otherwise the currently implemented strategy. In my patch, I've tried to combine both strategies, but that doesn't seem to work well, while I've seen neither a huge performance increase nor a significant memory increase. Under certain conditions LRU is still quite an option. If it were possible to combine both, that would be a good reason to reduce the cache size. So I'm dropping this here for anyone interested to perhaps review the patch and the idea. Thanks.
There seems to be a consensus that this isn't really necessary, or at least that there is nothing to gain by merging this implementation (or another that @nikic worked on)... This is a topic that may be revisited at some point in the future, but for the moment, I'm closing this PR as the conversation has died and the will has evaporated for now... Thanks for all the effort folks :)
See https://bugs.php.net/bug.php?id=73887.
/cc @bwoebi @nikic @staabm @krakjoe