New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Cache] Add stampede protection via probabilistic early expiration #27009

Merged
merged 1 commit into from Jun 11, 2018

Conversation

Projects
None yet
6 participants
@nicolas-grekas
Member

nicolas-grekas commented Apr 22, 2018

Q A
Branch? master
Bug fix? no
New feature? yes
BC breaks? no
Deprecations? yes
Tests pass? yes
Fixed tickets -
License MIT
Doc PR

This PR implements probabilistic early expiration on top of $cache->get($key, $callback);

It adds a 3rd arg to CacheInterface::get:

float $beta A float that controls the likelyness of triggering early expiration. 0 disables it, INF forces immediate expiration. The default is implementation dependend but should typically be 1.0, which should provide optimal stampede protection.

@nicolas-grekas nicolas-grekas added this to the next milestone Apr 22, 2018

@nicolas-grekas nicolas-grekas force-pushed the nicolas-grekas:cache-stats branch from 8e556be to 0b647d2 Apr 23, 2018

*/
public function get(string $key, callable $callback, float $beta = null)
{
if (!\is_string($key)) {

This comment has been minimized.

@javiereguiluz

javiereguiluz Apr 23, 2018

Member

Just asking: above we have string $key in the method argument ... will this if (!is_string($key)) code be ever executed if we pass a non-string argument?

$innerItem->set(null);
return $item;
},
null,
CacheItem::class
);
$this->setInnerItem = \Closure::bind(
function (CacheItemInterface $innerItem, array $item) {
if ($stats = $item["\0*\0newStats"]) {

This comment has been minimized.

@javiereguiluz

javiereguiluz Apr 23, 2018

Member

Not sure it it'd help much, but maybe move \0*\0 to private const PREFIX = '\0*\0'; in this class?

This comment has been minimized.

@nicolas-grekas

nicolas-grekas Apr 23, 2018

Member

Not sure also, using the const would just add indirection IMHO, for a purely internal thing.
Other comments addressed thanks.

{
/**
* @param callable(CacheItemInterface $item):mixed $callback Should return the computed value for the given key/item
* @param float $beta A float that controls the likelyness of triggering early expiration.

This comment has been minimized.

@javiereguiluz

javiereguiluz Apr 23, 2018

Member

likelyness -> likeness

/**
* @param callable(CacheItemInterface $item):mixed $callback Should return the computed value for the given key/item
* @param float $beta A float that controls the likelyness of triggering early expiration.
* 0 disables it, INF forces immediate expiration. The default is implementation dependend

This comment has been minimized.

@javiereguiluz

javiereguiluz Apr 23, 2018

Member

dependend -> dependent

/**
* @author Nicolas Grekas <p@tchwork.com>
*/
final class CacheItem implements CacheItemInterface
{
/**
* References the unix timestamp stating when the item will expire, as integer.

This comment has been minimized.

@javiereguiluz

javiereguiluz Apr 23, 2018

Member

unix -> Unix

@nicolas-grekas nicolas-grekas force-pushed the nicolas-grekas:cache-stats branch 3 times, most recently from c2e16bf to 6ae48dc Apr 23, 2018

@nicolas-grekas

This comment has been minimized.

Member

nicolas-grekas commented Apr 23, 2018

ping @Nyholm FYI: here, I deprecate getPreviousTags() and replace it by getMetadata()

@nicolas-grekas nicolas-grekas force-pushed the nicolas-grekas:cache-stats branch 2 times, most recently from 7a76d9f to af895ff Apr 23, 2018

@nicolas-grekas

This comment has been minimized.

Member

nicolas-grekas commented Apr 24, 2018

I don't like unclear default values. Why not 1.0 directly?

I designed it this way so that the interface is generic enough. Hardcoding 1.0 as default would prevent any other implementations from providing a different default. Yes, there are no other implementations than the ones in this PR, but that's not a reason to break the abstraction provided by the interface.
Note that PSR-6 does the same for default expiry of items.

@nicolas-grekas nicolas-grekas force-pushed the nicolas-grekas:cache-stats branch 2 times, most recently from 99634fc to 472b593 Apr 24, 2018

{
/**
* @param callable(CacheItemInterface $item):mixed $callback Should return the computed value for the given key/item
* @param float|null $beta A float that controls the likeness of triggering early expiration.

This comment has been minimized.

@Tobion

Tobion Apr 24, 2018

Member

typo: likeliness

This comment has been minimized.

@nicolas-grekas

nicolas-grekas Apr 24, 2018

Member

I did the same mistake as you, but @javiereguiluz made me fix it: it's really "likeness".

This comment has been minimized.

@Tobion

Tobion Apr 24, 2018

Member

They have different meanings. You mean likeliness here.
likeliness = probability
likeness = resemblance

This comment has been minimized.

@nicolas-grekas

nicolas-grekas Apr 24, 2018

Member

indeed, sorry for the misunderstanding (on my side)! (fixed)

@nicolas-grekas nicolas-grekas force-pushed the nicolas-grekas:cache-stats branch 2 times, most recently from 5e7db51 to d01a58b Apr 24, 2018

@nicolas-grekas nicolas-grekas force-pushed the nicolas-grekas:cache-stats branch from d01a58b to 5076b6d May 21, 2018

@nicolas-grekas

This comment has been minimized.

Member

nicolas-grekas commented May 21, 2018

Rebased, PR ready for review.

@nicolas-grekas nicolas-grekas force-pushed the nicolas-grekas:cache-stats branch from 5076b6d to 7ec3d50 May 28, 2018

@nicolas-grekas

This comment has been minimized.

Member

nicolas-grekas commented May 28, 2018

PR is ready.

The protection needs storing expiry+computation time. This is done by encoding these values in the key of a wrapping array:
$persistedValue = array("\x9D".$expiry.$computeTime."\x5F" => $cachedValue);

by using this magic-numbered structure, we are able to persist both raw + wrapped values in the same store, providing forward/backward compat at the storage level.

Stampede protection is always enabled when accessing values through the new CacheInterface::get() method (one should use the PSR-6 interface if they don't want the overhead of storing expiry+compute-time, thus opting out from stampede protection.)

@Nyholm

This comment has been minimized.

Member

Nyholm commented May 28, 2018

Does the getPreviousTags really need to be deprecated? Wouldn't it be good to still have that? As far as I know that would make us compatible with the "soon to be"-PSR

@nicolas-grekas

This comment has been minimized.

Member

nicolas-grekas commented May 28, 2018

That's the thing we need to discuss :)
So, my current pov is that accessing tags is useful, but also is accessing the expiry. One use case for the expiry is the one present in this PR, another one is for HTTP caches, allowing to compute the max-age directive when serving cached values.
Which means if we spend some effort having an updated PSR, it could be worth doing it once for all.
This getMetadata() method would be my proposal: an array with keyed values, thus extensible, but still having some keys reserved by const name at least (tags, expiry - computation time could be a custom extension - or in the PSR).

@nicolas-grekas nicolas-grekas force-pushed the nicolas-grekas:cache-stats branch from a81d3ba to fdce538 May 31, 2018

@nicolas-grekas

This comment has been minimized.

Member

nicolas-grekas commented May 31, 2018

Rebased

@nicolas-grekas

This comment has been minimized.

Member

nicolas-grekas commented Jun 6, 2018

@Nyholm

I've looked at this PR. It is real hard to understand what you are doing. Even though I know what you are trying to do. :/

I do not mind the API changes but I would be happy to see some more comments.

$item->isHit = $isHit;
$item->defaultLifetime = $defaultLifetime;
if (\is_array($v) && 1 === \count($v) && 10 === \strlen($k = \key($v)) && "\x9D" === $k[0] && "\0" === $k[5] && "\x5F" === $k[9]) {

This comment has been minimized.

@Nyholm

Nyholm Jun 8, 2018

Member

This line looks really complicated. I cannot tell what it is doing. Maybe add a comment?

This comment has been minimized.

@nicolas-grekas

nicolas-grekas Jun 9, 2018

Member

Comments added, I hope it will be more clear.

$innerItem->set(null);
return $item;
},
null,
CacheItem::class
);
$this->setInnerItem = \Closure::bind(
function (CacheItemInterface $innerItem, array $item) {
if (isset(($metadata = $item["\0*\0newMetadata"])[CacheItem::METADATA_TAGS])) {

This comment has been minimized.

@Nyholm

Nyholm Jun 8, 2018

Member

Same here, Im really trying to understand this, but I can't =)

This comment has been minimized.

@nicolas-grekas

nicolas-grekas Jun 9, 2018

Member

Comment added, better?

@nicolas-grekas nicolas-grekas force-pushed the nicolas-grekas:cache-stats branch from fdce538 to f6ca712 Jun 9, 2018

@fabpot

fabpot approved these changes Jun 10, 2018

with minor comments

$item->isHit = $isHit;
$item->defaultLifetime = $defaultLifetime;
// Detect wrapped values that encode for their expiry and creation duration
// For compacity, these values are packed in the key of an array using magic numbers

This comment has been minimized.

@fabpot

fabpot Jun 10, 2018

Member

for compactness?

$item->isHit = $isHit;
$item->defaultLifetime = $defaultLifetime;
// Detect wrapped values that encode for their expiry and creation duration
// For compacity, these values are packed in the key of an array using magic numbers
if (\is_array($v) && 1 === \count($v) && 10 === \strlen($k = \key($v)) && "\x9D" === $k[0] && "\0" === $k[5] && "\x5F" === $k[9]) {

This comment has been minimized.

@fabpot

fabpot Jun 10, 2018

Member

Magic 10 here, not sure if we want to make it explicit (as we would want to make 5 and 9 explicit as well).

}
// For compacity, expiry and creation duration are packed in the key of a array, using magic numbers as separators

This comment has been minimized.

@fabpot

fabpot Jun 10, 2018

Member

compactness as well?

$item->isHit = $innerItem->isHit();
$item->defaultLifetime = $defaultLifetime;
$item->innerItem = $innerItem;
$item->poolHash = $poolHash;
// Detect wrapped values that encode for their expiry and creation duration
// For compacity, these values are packed in the key of an array using magic numbers

This comment has been minimized.

@fabpot

fabpot Jun 10, 2018

Member

same?

@nicolas-grekas nicolas-grekas force-pushed the nicolas-grekas:cache-stats branch from f6ca712 to fe26414 Jun 10, 2018

@nicolas-grekas

This comment has been minimized.

Member

nicolas-grekas commented Jun 10, 2018

comments addressed thanks

@@ -140,10 +157,24 @@ public function tag($tags)
* Returns the list of tags bound to the value coming from the pool storage if any.
*
* @return array
*
* @deprecated since Symfony 4.2. Use the "getMetadata()" method instead.

This comment has been minimized.

@fabpot

fabpot Jun 10, 2018

Member

4.2, use the

This comment has been minimized.

@nicolas-grekas

@nicolas-grekas nicolas-grekas force-pushed the nicolas-grekas:cache-stats branch from fe26414 to 13523ad Jun 10, 2018

@fabpot

fabpot approved these changes Jun 11, 2018

@fabpot

This comment has been minimized.

Member

fabpot commented Jun 11, 2018

Thank you @nicolas-grekas.

@fabpot fabpot merged commit 13523ad into symfony:master Jun 11, 2018

3 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
fabbot.io Your code looks good.
Details

fabpot added a commit that referenced this pull request Jun 11, 2018

feature #27009 [Cache] Add stampede protection via probabilistic earl…
…y expiration (nicolas-grekas)

This PR was merged into the 4.2-dev branch.

Discussion
----------

[Cache] Add stampede protection via probabilistic early expiration

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | yes
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        |

This PR implements [probabilistic early expiration](https://en.wikipedia.org/wiki/Cache_stampede#Probabilistic_early_expiration) on top of `$cache->get($key, $callback);`

It adds a 3rd arg to `CacheInterface::get`:
> float $beta A float that controls the likelyness of triggering early expiration. 0 disables it, INF forces immediate expiration. The default is implementation dependend but should typically be 1.0, which should provide optimal stampede protection.

Commits
-------

13523ad [Cache] Add stampede protection via probabilistic early expiration

fabpot added a commit that referenced this pull request Jun 11, 2018

feature #27031 [Cache] Use sub-second accuracy for internal expiry ca…
…lculations (nicolas-grekas)

This PR was merged into the 4.2-dev branch.

Discussion
----------

[Cache] Use sub-second accuracy for internal expiry calculations

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | not really
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

Embeds #26929, #27009 and #27028, let's focus on the 4th commit for now.

This is my last significant PR in the Cache series :)

By using integer expiries internally, our current implementations are sensitive to abrupt transitions when time() goes to next second: `$s = time(); sleep(1); echo time() - $s;` *can* display 2 from time to time.
This means that we do expire items earlier than required by the expiration settings on items.
This also means that there is no way to have a sub-second expiry. For remote backends, that's fine, but for ArrayAdapter, that's a limitation we can remove.

This PR replaces calls to `time()` by `microtime(true)`, providing more accurate timing measurements internally.

Commits
-------

08554ea [Cache] Use sub-second accuracy for internal expiry calculations

@nicolas-grekas nicolas-grekas deleted the nicolas-grekas:cache-stats branch Jun 11, 2018

@nicolas-grekas nicolas-grekas modified the milestones: next, 4.2 Nov 1, 2018

This was referenced Nov 3, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment