PSR-6: Cache #149

Merged: 131 commits, Jul 11, 2014
@@ -0,0 +1,311 @@
PSR-Cache Meta Document
===================
1. Summary
----------
Caching is a common way to improve the performance of any project, making
caching libraries one of the most common features of many frameworks and
libraries. This has led to a situation where many libraries roll their own
caching layers, with varying levels of functionality. These differences force
developers to learn multiple systems which may or may not provide the
functionality they need. In addition, the developers of caching libraries
themselves face a choice between supporting only a limited number of
frameworks or creating a large number of adapter classes.
2. Why Bother?
--------------
A common interface for caching systems will solve these problems. Library and
framework developers can count on the caching systems working the way they're
expecting, while the developers of caching systems will only have to implement
a single set of interfaces rather than a whole assortment of adapters.
Moreover, the implementation presented here is designed for future extensibility.
It allows a variety of internally-different but API-compatible implementations
and offers a clear path for future extension by later PSRs or by specific
implementers.
Pros:
* A standard interface for caching allows free-standing libraries to support
caching of intermediary data without effort; they may simply (optionally) depend
on this standard interface and leverage it without being concerned about
implementation details.
* Commonly developed caching libraries shared by multiple projects, even if
they extend this interface, are likely to be more robust than a dozen separately
developed implementations.
Cons:
* Any interface standardization runs the risk of stifling future innovation as
being "not the Way It's Done(tm)". However, we believe caching is a sufficiently
commoditized problem space that the extension capability offered here mitigates
any potential risk of stagnation.
3. Scope
--------
### 3.1 Goals
* A common interface for basic and intermediate-level caching needs.
* A clear mechanism for extending the specification to support advanced features,
both by future PSRs or by individual implementations. This mechanism must allow
for multiple independent extensions without collision.
### 3.2 Non-Goals
* Architectural compatibility with all existing cache implementations.
* Advanced caching features such as namespacing or tagging that are used by a
minority of users.
4. Approaches
-------------
### 4.1 Chosen Approach
This specification adopts a "repository model" or "data mapper" model for caching
rather than the more traditional "expire-able key-value" model. The primary
reason is flexibility. A simple key/value model is much more difficult to extend.
The model here mandates the use of a CacheItem object, which represents a cache
entry, and a Pool object, which is a given store of cached data. Items are
retrieved from the pool, interacted with, and returned to it. While a bit more
verbose at times, this offers a good, robust, flexible approach to caching,
especially in cases where caching is more involved than simply saving and
retrieving a string.
Most method names were chosen based on common practice and method names in a
survey of member projects and other popular non-member systems.
Pros:
* Flexible and extensible
* Allows a great deal of variation in implementation without violating the interface
* Does not implicitly expose object constructors as a pseudo-interface.
Cons:
* A bit more verbose than the naive approach

@samdark

samdark Sep 7, 2013

Contributor

I can add "less performance" and "more complexity" compared to key-value approach.

@tedivm

tedivm Sep 7, 2013

Contributor

There are multiple reasons why that doesn't work well (and isn't even true), that are being discussed on the mailing list.

@samdark

samdark Sep 7, 2013

Contributor

Creating a wrapper object for each value will obviously add significantly more memory usage and a performance drop.

@tedivm

tedivm Sep 8, 2013

Contributor

No, it really won't. The PHP Option project did benchmarks because of this very type of argument and showed that the cost is negligible. What you're saying is simply FUD; back it up with numbers if you think otherwise.

@samdark

samdark Sep 8, 2013

Contributor

https://gist.github.com/samdark/6488056
https://gist.github.com/samdark/6488060

objects.php

Done in: 0.005024 s
Memory used: 512 KB

scalars.php

Done in: 0.002922 s
Memory used: 256 KB

Memory usage and execution time are doubled in the case of object wrappers with only 2 private members. If there are more, then more memory will be used. Absolute numbers aren't that bad in both cases; still, overall framework performance is formed of many small decisions.

You can say that's micro-optimization and I agree that it's not a thing to worry about for application developers. Still, at framework level it's a thing to consider since it will improve thousands of applications w/o their authors doing anything.

@philsturgeon

philsturgeon Sep 8, 2013

Contributor

Notice the part where @tedivm said "the cost is negligible", not "non-existent". Pure key/value scalar stuff is half the speed of the objects, but it's drastically less useful.

  1. Using 1000 cache requests in a single page load is going to be rare.
  2. Even when you do 1000, it's well within the realms of being perfectly performant.

This feels like when folks argue over which is more performant: single quotes or double quotes. The answer to that is always "it doesn't matter".

@samdark

samdark Sep 8, 2013

Contributor

@philsturgeon it's a bit more than the overhead of the quotes, which really is negligible, and I agree this overhead alone doesn't matter, but if there will be 9 more places where values are wrapped with objects it would matter.

I still have doubts that an object wrapper is really necessary in this case. The pro is that there is a clear difference between a cached null and a non-existing cache item, but caching null is a pretty rare situation I've personally never experienced in any of the projects I've worked on. If dependency information is separated from the value wrapper, I simply don't see any extra stuff that can be put in it.

The conversation @tedivm mentioned can change my opinion though, so I'm eager to read it.

@samdark

samdark Sep 8, 2013

Contributor

By the way, if you look at Java's JSR-107, it specifically defines that "Cache keys and values must not be null".

@philsturgeon

philsturgeon Sep 8, 2013

Contributor

Because everyone loves Java.

It's not simply a case of "null is bad", it allows other optional features such as TTL control, adding tags and all sorts of fun.

https://groups.google.com/forum/#!msg/php-fig/_HMACz6NzrU/au9k2Vkro2gJ

I am responding with links to the mailing list conversations that have already covered this, hoping to either end this PR discussion or move it to the ML.

Anybody in the FIG will tell you that these discussions need to happen on the mailing list, so PLEASE go over there and discuss it. Find a topic that is relevant and reply, or make your own called "[Cache] Some specific question/feature".

I can almost guarantee you anything you have to say has been discussed before, but post away and we'll try and link to relevant conversations or reply in kind if the new point is new. :)

@samdark

samdark Sep 8, 2013

Contributor

OK, will move to the ML from now on. It's a bit harder to navigate than GitHub but I'll try to handle it.

Examples:
Some common usage patterns are shown below. These are non-normative but should
demonstrate the application of some design decisions.
```php
/**
 * Gets a list of available widgets.
 *
 * In this case, we assume the widget list changes so rarely that we want
 * the list cached forever until an explicit clear.
 */
function get_widget_list()
{
    $pool = get_cache_pool('widgets');
    $item = $pool->getItem('widget_list');
    if (!$item->isHit()) {
        $value = compute_expensive_widget_list();
        $item->set($value);
        $pool->save($item);
    }
    return $item->get();
}
```
```php
/**
 * Caches a list of available widgets.
 *
 * In this case, we assume a list of widgets has been computed and we want
 * to cache it, regardless of what may already be cached.
 */
function save_widget_list($list)
{
    $pool = get_cache_pool('widgets');
    $item = $pool->getItem('widget_list');
    $item->set($list);
    $pool->save($item);
}
```
```php
/**
 * Clears the list of available widgets.
 *
 * In this case, we simply want to remove the widget list from the cache. We
 * don't care if it was set or not; the post condition is simply "no longer set".
 */
function clear_widget_list()
{
    $pool = get_cache_pool('widgets');
    $pool->deleteItems(['widget_list']);
}
```
```php
/**
 * Clears all widget information.
 *
 * In this case, we want to empty the entire widget pool. There may be other
 * pools in the application that will be unaffected.
 */
function clear_widget_cache()
{
    $pool = get_cache_pool('widgets');
    $pool->clear();
}
```
```php
/**
 * Load widgets.
 *
 * We want to get back a list of widgets, of which some are cached and some
 * are not. This of course assumes that loading from the cache is faster than
 * whatever the non-cached loading mechanism is.
 *
 * In this case, we assume widgets may change frequently so we only allow them
 * to be cached for an hour (3600 seconds). We also cache newly-loaded objects
 * back to the pool en masse.
 *
 * Note that a real implementation would probably also want a multi-load
 * operation for widgets, but that's irrelevant for this demonstration.
 */
function load_widgets(array $ids)
{
    $pool = get_cache_pool('widgets');
    $keys = array_map(function ($id) { return 'widget.' . $id; }, $ids);
    $items = $pool->getItems($keys);
    $widgets = array();
    foreach ($items as $key => $item) {
        if ($item->isHit()) {
            $value = $item->get();
        }
        else {
            // Recover the original ID from the cache key.
            $id = substr($key, strlen('widget.'));
            $value = expensive_widget_load($id);
            $item->set($value, 3600);
            $pool->saveDeferred($item, true);
        }
        $widgets[$value->id()] = $value;
    }
    $pool->commit(); // If no items were deferred this is a no-op.
    return $widgets;
}
```
```php
/**
 * This example reflects functionality that is NOT included in this
 * specification, but is shown as an example of how such functionality MIGHT
 * be added by extending implementations.
 */
interface TaggablePoolInterface extends Psr\Cache\PoolInterface
{
    /**
     * Clears only those items from the pool that have the specified tag.
     */
    public function clearByTag($tag);
}

interface TaggableItemInterface extends Psr\Cache\ItemInterface
{
    public function setTags(array $tags);
}

/**
 * Caches a widget with tags.
 */
function set_widget(TaggablePoolInterface $pool, Widget $widget)
{
    $key = 'widget.' . $widget->id();
    $item = $pool->getItem($key);
    $item->setTags($widget->tags());
    $item->set($widget);
    $pool->save($item);
}
```
### 4.2 Alternative: "Weak item" approach
A variety of earlier drafts took a simpler "key value with expiration" approach,
also known as a "weak item" approach. In this model, the "Cache Item" object
was really just a dumb array-with-methods object. Users would instantiate it
directly, then pass it to a cache pool. While more familiar, that approach
effectively prevented any meaningful extension of the Cache Item. It made
the Cache Item's constructor part of the implicit interface, and thus
severely curtailed extensibility or the ability to have the cache item be where

@samdark

samdark Sep 7, 2013

Contributor

I doubt any intelligence should be attached to the cached item itself.

the intelligence lives.
In a poll conducted in June 2013, most participants showed a clear preference for
the more robust if less conventional "Strong item" / repository approach, which
was adopted as the way forward.
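For illustration, a weak-item design might have looked roughly like the following sketch. The `WeakCacheItem` class, its methods, and its constructor signature are hypothetical; none of this was ever part of the specification.

```php
// Hypothetical sketch of the rejected "weak item" model.
class WeakCacheItem
{
    protected $key;
    protected $value;
    protected $ttl;

    // Calling code constructs items directly, so this constructor becomes
    // part of the implicit interface that every implementation must honor.
    public function __construct($key, $value, $ttl = null)
    {
        $this->key = $key;
        $this->value = $value;
        $this->ttl = $ttl;
    }

    public function getKey()
    {
        return $this->key;
    }

    public function getValue()
    {
        return $this->value;
    }
}

// Usage: the user builds the item, then hands it to a pool, e.g.:
// $pool->save(new WeakCacheItem('widget_list', $list, 3600));
```

Because user code calls `new WeakCacheItem(...)` itself, an implementation cannot substitute a smarter item class without breaking that code, which is the extensibility problem described above.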
Pros:
* More traditional approach.

@samdark

samdark Sep 7, 2013

Contributor

More performance, simpler to learn and understand.

@samdark

samdark Sep 7, 2013

Contributor

Doesn't mix data to be cached with caching logic.

Cons:
* Less extensible or flexible.
### 4.3 Alternative: "Naked value" approach

@samdark

samdark Sep 7, 2013

Contributor

While the problem with null-value is valid, as a developer it's easy to learn one rule that null shouldn't be stored to cache.

This approach gives best possible performance and is simpler than any of the other ones. Caching rules (expiry time, dependencies etc.) could be set separately.

Some of the earliest discussions of the Cache spec suggested skipping the Cache
Item concept altogether and just reading/writing raw values to be cached.
While simpler, it was pointed out that this made it impossible to tell the
difference between a cache miss and whatever raw value was selected to
represent a cache miss. That is, if a cache lookup returned NULL, it's
impossible to tell whether there was no cached value or whether NULL was the
value that had been cached. (NULL is a legitimate value to cache in many cases.)
Most of the more robust caching implementations we reviewed -- in particular the
Stash caching library and the home-grown cache system used by Drupal -- use some
sort of structured object on `get`, at least to avoid confusion between a miss
and a sentinel value. Based on that prior experience, FIG decided against a
naked value on `get`.
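The ambiguity can be demonstrated with a small sketch; `get_from_cache()` and its backing array are hypothetical stand-ins for a naked-value API, not anything in this specification.

```php
// NULL was legitimately cached under key 'a'.
$cache = array('a' => null);

function get_from_cache($key)
{
    global $cache;
    // Returns the cached value, or NULL on a miss.
    return array_key_exists($key, $cache) ? $cache[$key] : null;
}

$hit  = get_from_cache('a');       // NULL: a hit whose cached value is NULL
$miss = get_from_cache('missing'); // NULL: a genuine miss
// $hit === $miss, so the caller cannot distinguish the two cases. A
// structured item with a separate hit/miss flag avoids this ambiguity.
```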
### 4.4 Alternative: ArrayAccess Pool
There was a suggestion to make a Pool implement ArrayAccess, which would allow
for cache get/set operations to use array syntax. That was rejected due to
limited interest, limited flexibility of that approach (trivial get and set with
default control information is all that's possible), and because it's trivial
for a particular implementation to include as an add-on should it desire to
do so.
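As a sketch of what such an add-on might look like in a particular implementation (the `ArrayCachePool` class is hypothetical and not part of this specification):

```php
// Sketch of ArrayAccess layered onto a pool as an optional add-on.
class ArrayCachePool implements ArrayAccess
{
    protected $data = array();

    public function offsetExists($key): bool
    {
        return array_key_exists($key, $this->data);
    }

    // Only trivial get/set with default control information (no TTL, no
    // deferral) can be expressed through array syntax.
    public function offsetGet($key): mixed
    {
        return isset($this->data[$key]) ? $this->data[$key] : null;
    }

    public function offsetSet($key, $value): void
    {
        $this->data[$key] = $value;
    }

    public function offsetUnset($key): void
    {
        unset($this->data[$key]);
    }
}

$pool = new ArrayCachePool();
$pool['widget_list'] = array('foo', 'bar'); // set via array syntax
$list = $pool['widget_list'];               // get via array syntax
```

Note that this sketch inherits the naked-value ambiguity discussed in 4.3: a missing key and a cached NULL both read back as NULL.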
5. People
---------
### 5.1 Editor
* Larry Garfield
### 5.2 Sponsors
* Pádraic Brady (Coordinator)
* John Mertic
### 5.3 Contributors
* Paul Dragoonis
* Robert Hafner
6. Votes
--------
7. Relevant Links
-----------------
_**Note:** Order descending chronologically._
* [Survey of existing cache implementations][1], by @dragoonis
* [Strong vs. Weak informal poll][2], by @Crell
* [Implementation details informal poll][3], by @Crell
[1]: https://docs.google.com/spreadsheet/ccc?key=0Ak2JdGialLildEM2UjlOdnA4ekg3R1Bfeng5eGlZc1E#gid=0
[2]: https://docs.google.com/spreadsheet/ccc?key=0AsMrMKNHL1uGdDdVd2llN1kxczZQejZaa3JHcXA3b0E#gid=0
[3]: https://docs.google.com/spreadsheet/ccc?key=0AsMrMKNHL1uGdEE3SU8zclNtdTNobWxpZnFyR0llSXc#gid=1