Yet another cache proposal #63

Closed
wants to merge 5 commits into
from

dlsniper commented Dec 1, 2012

Problem to be solved

Unify the caching interface across libraries and frameworks in a simple yet
extensible way.

Current implementations

Doctrine 2 has a stable implementation going on.

Zend Framework 2 has something as well.

Stash provides a good standalone library.

Symfony2 has none and I haven't checked the other frameworks / libraries.

Inspiration and credits

This is based on the work of @evert from here:
https://github.com/evert/fig-standards/blob/master/proposed/objectcache.md

It also incorporates feedback from the Symfony2 community here:
symfony/symfony#3211

It was started as a proposal for a way to implement caching here:
symfony/symfony#5902

Also @c-datculescu has been very helpful in identifying possible use cases and
problems as well as bouncing back ideas or helping out with thoughts on the
various issues.

This was later on merged with the solution from @evert while talking on IRC
about our proposals and commonality of approaches.

Proposed solution

You can view it rendered here:
https://github.com/dlsniper/fig-standards/blob/cache-proposal/proposed/psr-cache.md

This solution addresses the basic needs for a driver library since most of the
time caching is done using key/value systems that do not support advanced
features like tags, namespaces, transactions or locking.

Taking this into consideration, the libraries should be kept as simple as
possible and should not reinvent the wheel, while guidance for advanced use
cases or problems should be pointed out.

Lack of certain features: tagging, namespaces and locking

Tagging and namespaces

Regarding the lack of tags/namespace support in the proposed solution,
consider the following exchange, which originated in a discussion with a
friend of mine.

- What if the driver doesn't support tagging?

- Simple, it's the proxy's job to know what the driver supports and in this
case emulate support for the missing feature.

- Ok, but how would you handle, say, 5 tags per item with a length of about 10
characters each and about 15,000 items stored?

- Well, we'd need to implement something that gets the maximum permitted
storage size for the given driver, then create some special, unique entries
where the information is stored.

- Great, but now you have another problem: concurrency/traffic and time spent
in the proxy emulating this feature. In a highly concurrent environment, what
would happen to the tag pool and the changes you want to perform on it?

- Without locking it would introduce a problem, as one client could update the
same tag as another but with different results. One would then have to check
the tag pool for the desired result and redo the operation if it failed.

So you can see where this is leading.

Same goes for namespaces.
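
To make the concern concrete, here is a minimal sketch of tag emulation on top of a plain key/value store. The `ArrayStore` class, the `tagItem()` helper and the `tag:` key prefix are hypothetical names invented for this illustration and are not part of the proposal. The sketch only shows that adding an item to a tag is a read-modify-write on a shared index entry, so two clients that read the index before either writes will lose one of the updates.

```php
<?php
// Hypothetical in-memory stand-in for a key/value driver without tag support.
class ArrayStore
{
    private $data = array();

    public function get($key)
    {
        return isset($this->data[$key]) ? $this->data[$key] : null;
    }

    public function set($key, $value)
    {
        $this->data[$key] = $value;
    }
}

// Emulated tagging: each tag is stored as an index entry listing item keys.
function tagItem(ArrayStore $store, $tag, $itemKey)
{
    $indexKey = 'tag:' . $tag;
    $members = $store->get($indexKey);   // read
    if (null === $members) {
        $members = array();
    }
    $members[$itemKey] = true;           // modify
    $store->set($indexKey, $members);    // write back
}

$store = new ArrayStore();
tagItem($store, 'news', 'article.1');
tagItem($store, 'news', 'article.2');

// Simulate two clients that both read the index before either one writes.
$seenByA = $store->get('tag:news');
$seenByB = $store->get('tag:news');
$seenByA['article.3'] = true;
$seenByB['article.4'] = true;
$store->set('tag:news', $seenByA);
$store->set('tag:news', $seenByB); // overwrites the update made by client A

$final = array_keys($store->get('tag:news'));
```

After the last write, the index no longer contains `article.3`, which is the kind of silent loss that would force the proxy to verify and retry.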

Locking

Locking is something that could be done easily, but I currently don't see many
cases where it would be useful. Locking would help in a highly concurrent
environment, but in that case it would mean that you rely on users, for
example, to generate the cache, which leaves one user unhappy because their
request performs a potentially heavy operation. That is not good by design.

For situations where locking would be required, think about the resource
implications: users waiting, PHP processes blocked until the initial process
finishes the cache update, and the need to detect when that process has
finished or crashed.

Reason for skipping the mentioned features from the current proposal

The common approach to caching is:


$cache = new ApcDriver();

$cacheItem = $cache->get($key);

if (null === $cacheItem) {
    // Cache miss: perform the heavy operation and store its result
    $cacheValue = $resultFromOperation;
    $cache->set($key, $cacheValue, 300);
} else {
    // Cache hit: unwrap the stored value
    $cacheValue = $cacheItem->getValue();
}

A better approach would be to have multiple caching layers and just go from
one to another if there's a problem retrieving the data from one layer until
a cronjob makes it available again.


$apcCache = new ApcDriver();

$cacheItem = $apcCache->get($key);

if (null === $cacheItem) {
    $memcacheCache = new MemcachedDriver();
    $cacheItem = $memcacheCache->get($key);

    if (null === $cacheItem) {
        $dbCache = new DbCache();
        $cacheItem = $dbCache->get($key);

        if (null === $cacheItem) {
            // make heavy operation
            $cacheValue = $resultFromOperation;
            $apcCache->set($key, $cacheValue, 300);
        } else {
            $cacheValue = $cacheItem->getValue();
        }
    } else {
        $cacheValue = $cacheItem->getValue();
    }
} else {
    $cacheValue = $cacheItem->getValue();
}

Note: a nicer version of this can be written with the approach @schmittjoh
suggests in his comment on this pull request.
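
As an illustration only, the nested lookups above can also be flattened into a loop over the layers. This sketch assumes, like the snippets above, that `get($key)` returns `null` on a miss and an item exposing `getValue()` on a hit. The `CacheItem` and `ArrayDriver` classes and the `fetchThroughLayers()` function are hypothetical stand-ins written for this example, not part of the proposal.

```php
<?php
// Hypothetical item wrapper, mirroring the getValue() call used above.
class CacheItem
{
    private $value;

    public function __construct($value)
    {
        $this->value = $value;
    }

    public function getValue()
    {
        return $this->value;
    }
}

// Hypothetical in-memory driver standing in for APC, Memcached or a database.
class ArrayDriver
{
    private $items = array();

    public function get($key)
    {
        return isset($this->items[$key]) ? $this->items[$key] : null;
    }

    public function set($key, $value, $ttl = 0)
    {
        // The TTL is ignored in this sketch.
        $this->items[$key] = new CacheItem($value);
    }
}

// Try each layer in order; on a full miss, compute the value and store it
// in the first layer only, as the nested example does.
function fetchThroughLayers(array $layers, $key, $compute, $ttl = 300)
{
    foreach ($layers as $layer) {
        $item = $layer->get($key);
        if (null !== $item) {
            return $item->getValue();
        }
    }

    $value = call_user_func($compute);
    $layers[0]->set($key, $value, $ttl);

    return $value;
}

$apc = new ArrayDriver();
$memcached = new ArrayDriver();
$db = new ArrayDriver();
$db->set('report', 'from-db');

$layers = array($apc, $memcached, $db);
$heavy = function () {
    return 'computed'; // placeholder for the heavy operation
};

$fromDb = fetchThroughLayers($layers, 'report', $heavy);
$computed = fetchThroughLayers($layers, 'other', $heavy);
```

Here `report` is served by the third layer without running the heavy operation, while `other` misses everywhere, is computed once and is then stored in the first layer.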

Also, in most cases, the APC driver should be a good choice. But if you need a
distributed solution, Memcached, Redis or even MongoDB (though I would not
necessarily recommend it for caching) make great distributed systems which can
be clustered, have redundancy and so on. Solutions like putting an HAProxy
server in front of a cluster to balance out requests when servers are down are
available in such situations.

For lower-traffic websites that just need a simple solution, concurrency
should not be an issue, so they can use fewer caching levels and even save
the object during the user request.

The ideal solution would have a cronjob that's able to populate the cache
before the items in it expire.

It is true that we can provide emulation for all of these features, but
having the library provide them would mean an additional level of abstraction
standing between the user and the desired functionality, and this is what
enterprise users don't agree with, for good reasons.

It is my opinion that we also need to teach users how to design better
applications rather than just provide them with tools that magically do
more than they are supposed to do.

This way, people will be aware of the limitations, and instead of having
a library/framework do the workaround they would either create better
logic in the application or help improve the tools/applications that are
supposed to be handling the issue.

If the above train of thought isn't enough to satisfy the needs of the
community, both current and future, I've also added some 'advanced'
interfaces that are designed to tackle the functionality missing from the
initial proposal. You can find them here:
https://github.com/dlsniper/fig-standards/blob/extended-cache-proposal/proposed/psr-extended-cache.md

I've also created a dedicated topic on the Google Group for this
https://groups.google.com/forum/?fromgroups=#!topic/php-fig/VRUEzicwjb8
where you can leave your feedback.

Thank you!

Best regards.

schmittjoh Dec 2, 2012

I'd like to suggest returning an Option object instead:

$item = $apcDriver->get('foo')
           ->orElse($memcache->get('foo'))
           ->orElse($dbCache->get('foo'))
           ->getOrCall(function() {
                // heavy
           });

The overhead of that is in the nanosecond range (i.e. should not be relevant here), but the API is a lot more elegant (see https://github.com/schmittjoh/php-option).
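
For readers unfamiliar with the pattern, the chain above can be reproduced with a very small Option type. The `Option`, `Some` and `None` classes and the `OptionDriver` below are a simplified stand-in written for this illustration, not the php-option library itself; only the method names `orElse` and `getOrCall` are taken from the snippet above.

```php
<?php
// Simplified stand-in for an Option type; not the php-option library itself.
abstract class Option
{
    abstract public function orElse(Option $else);

    abstract public function getOrCall($callable);
}

final class Some extends Option
{
    private $value;

    public function __construct($value)
    {
        $this->value = $value;
    }

    public function orElse(Option $else)
    {
        return $this; // a value is present, so the alternative is ignored
    }

    public function getOrCall($callable)
    {
        return $this->value;
    }
}

final class None extends Option
{
    public function orElse(Option $else)
    {
        return $else; // nothing here, so fall through to the alternative
    }

    public function getOrCall($callable)
    {
        return call_user_func($callable);
    }
}

// Hypothetical driver whose get() returns an Option instead of null.
class OptionDriver
{
    private $items;

    public function __construct(array $items = array())
    {
        $this->items = $items;
    }

    public function get($key)
    {
        return array_key_exists($key, $this->items)
            ? new Some($this->items[$key])
            : new None();
    }
}

$apcDriver = new OptionDriver();
$memcache = new OptionDriver(array('foo' => 'from-memcache'));
$dbCache = new OptionDriver(array('foo' => 'from-db'));

$item = $apcDriver->get('foo')
    ->orElse($memcache->get('foo'))
    ->orElse($dbCache->get('foo'))
    ->getOrCall(function () {
        return 'heavy'; // only runs when every layer misses
    });

$missing = $apcDriver->get('bar')
    ->orElse($memcache->get('bar'))
    ->getOrCall(function () {
        return 'heavy';
    });
```

One design point worth noting: because PHP evaluates method arguments eagerly, every `get()` passed to `orElse()` in this sketch is executed even when an earlier layer already had the value; only the final callback is lazy.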

dlsniper Dec 2, 2012

@schmittjoh Thanks for your feedback. I think we could have the CacheProxy return null instead of a CacheItem in order for chaining to work as you described.

Another approach that could benefit from your solution would be to make the CacheProxy not have the driver set in the constructor but instead have a method called addCacheDriver($driver, $priority), which would allow more than one cache driver to be used by the proxy. At this point, we could have the default get and set methods only use the highest priority driver, for consistency, and either add a new parameter to them, $fromFirstAvailable, which defaults to false, or create new methods for these operations. This way, the repeated code will be moved into one place and the 'userland' code will be cleaner.
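
A rough sketch of what such a proxy could look like follows. Only the names `addCacheDriver`, `$priority` and `$fromFirstAvailable` come from the comment above; the `ArrayDriver` class, the fluent return values and the exact fallback behaviour are assumptions made for this illustration.

```php
<?php
// Hypothetical in-memory driver; get() returns null on a miss.
class ArrayDriver
{
    private $items = array();

    public function get($key)
    {
        return isset($this->items[$key]) ? $this->items[$key] : null;
    }

    public function set($key, $value)
    {
        $this->items[$key] = $value;
    }
}

// Sketch of a proxy holding several drivers ordered by priority.
class CacheProxy
{
    private $drivers = array();

    public function addCacheDriver($driver, $priority)
    {
        $this->drivers[] = array('driver' => $driver, 'priority' => $priority);

        // Keep the list sorted with the highest priority first.
        usort($this->drivers, function ($a, $b) {
            return $b['priority'] - $a['priority'];
        });

        return $this;
    }

    public function get($key, $fromFirstAvailable = false)
    {
        foreach ($this->drivers as $entry) {
            $value = $entry['driver']->get($key);

            // By default only the highest priority driver is consulted;
            // with the flag set, keep going until some driver has the key.
            if (null !== $value || !$fromFirstAvailable) {
                return $value;
            }
        }

        return null;
    }

    public function set($key, $value)
    {
        // For consistency, writes only go to the highest priority driver.
        if (!empty($this->drivers)) {
            $this->drivers[0]['driver']->set($key, $value);
        }

        return $this;
    }
}

$fast = new ArrayDriver();
$slow = new ArrayDriver();
$slow->set('foo', 'from-slow');

$proxy = new CacheProxy();
$proxy->addCacheDriver($slow, 1)->addCacheDriver($fast, 10);

$default = $proxy->get('foo');        // only asks $fast, which misses
$fallback = $proxy->get('foo', true); // falls through to $slow

$proxy->set('bar', 'x');              // stored in $fast only
```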

nfx Dec 6, 2012

c-datculescu Dec 6, 2012

@nfx I am pretty sure CAS was taken into consideration, but I really think CAS cannot help in this case. Even more, CAS can produce problems. Basically, the entire purpose of CAS is to check at the moment of writing whether the data you read is still untouched, which in highly concurrent environments will often not be the case. Let's not forget that a caching mechanism is not intended to be used as some sort of transactional mechanism. In the current case, CAS can actually cause a lot more retries if you are not careful.

Also, please remember that the fact that CAS is implemented in memcached (and only in the memcached extension, not in the memcache one) does not mean that APC, for example, provides even a similar mechanism. Nor do other fast k/v storages.
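
The retry behaviour described above can be sketched with a toy store. `CasStore` and its `gets()`/`cas()` methods are hypothetical and written only for this illustration; they are not the API of any real driver or extension. The loop shows that a single concurrent write between the read and the write already costs one extra round trip.

```php
<?php
// Hypothetical in-memory store with compare-and-swap; not a real driver API.
class CasStore
{
    private $values = array();
    private $tokens = array();

    // Returns array(value, token); the token changes on every write.
    public function gets($key)
    {
        $value = isset($this->values[$key]) ? $this->values[$key] : null;
        $token = isset($this->tokens[$key]) ? $this->tokens[$key] : 0;

        return array($value, $token);
    }

    public function set($key, $value)
    {
        $previous = isset($this->tokens[$key]) ? $this->tokens[$key] : 0;
        $this->values[$key] = $value;
        $this->tokens[$key] = $previous + 1;
    }

    // Writes only if nobody has written since $token was read.
    public function cas($token, $key, $value)
    {
        list(, $current) = $this->gets($key);
        if ($current !== $token) {
            return false;
        }
        $this->set($key, $value);

        return true;
    }
}

$store = new CasStore();
$store->set('counter', 0);

$attempts = 0;
do {
    $attempts++;
    list($value, $token) = $store->gets('counter');

    if (1 === $attempts) {
        // Simulate another client writing between our read and our write.
        $store->set('counter', 100);
    }

    $ok = $store->cas($token, 'counter', $value + 1);
} while (!$ok);

list($finalValue) = $store->gets('counter');
```

With many writers touching the same key, each of them may go around this loop several times, which is the retry amplification the comment warns about.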

dlsniper Dec 7, 2012

@nfx I agree with everything @c-datculescu said above.
Also, you could have a memcached driver that uses CAS for the set operation just fine, alongside the other custom drivers, should you need such an operation. This is an implementation detail for the driver vendor.
The example I gave shows how the fallback is currently done, either at the logic level or at the implementation level.

proposed/cache.md
+There are two types of cache proxies that can be used.
+The simple one, which provides the basic functionality like get/set/remove.
+
+Cache proxies are reponsible for sending the right data to the drivers, be it

stof Dec 12, 2012

typo here. Missing s

proposed/cache.md
+ *
+ * @param CacheDriverInterface $cacheDriver
+ */
+ public function __construct(CacheDriverInterface $cacheDriver);

stof Dec 12, 2012

I don't like the fact that you add the constructor in the interface. The constructor is part of the way you instantiate the class, which is tied to the implementation anyway so I don't see the reason to forbid implementors to have other required dependencies.

vimishor Dec 12, 2012

As far as I know, an interface cannot be instantiated; interfaces are used just to specify how to pass data. So what is the purpose of the constructor in there?

If you need some sort of initialization, maybe an abstract class can serve this purpose better than interface.

sumpygump Dec 12, 2012

@vimishor Because this constructor method has a type hint, it is a way of enforcing that a class that implements this interface must use a CacheDriverInterface as the parameter for its construction. But stof does have a point that the construction of an object doesn't need to be enforced at the interface level.
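
To illustrate both sides of this point, here is a small sketch. `CacheDriverInterface` is the name used in the diff above; `LockedProxyInterface`, `OpenProxyInterface`, `NullDriver` and `PrefixedProxy` are hypothetical names made up for this example. The first proxy interface pins the constructor signature, while the second leaves construction to the implementation.

```php
<?php
interface CacheDriverInterface
{
    public function get($key);
}

// Declaring the constructor here forces every implementation to accept
// a compatible constructor signature.
interface LockedProxyInterface
{
    public function __construct(CacheDriverInterface $cacheDriver);

    public function get($key);
}

// Without a constructor in the interface, implementations are free to
// choose their own dependencies.
interface OpenProxyInterface
{
    public function get($key);
}

class NullDriver implements CacheDriverInterface
{
    public function get($key)
    {
        return null; // always a miss
    }
}

// Hypothetical proxy that also needs a key prefix. This is fine under
// OpenProxyInterface because the constructor is not part of the contract.
class PrefixedProxy implements OpenProxyInterface
{
    private $driver;
    private $prefix;

    public function __construct(CacheDriverInterface $driver, $prefix)
    {
        $this->driver = $driver;
        $this->prefix = $prefix;
    }

    public function get($key)
    {
        return $this->driver->get($this->prefix . $key);
    }
}

$proxy = new PrefixedProxy(new NullDriver(), 'app.');
$result = $proxy->get('foo');
```

If `PrefixedProxy` implemented `LockedProxyInterface` instead, the extra required `$prefix` parameter would make its constructor incompatible with the interface declaration, and PHP would be expected to refuse the class.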

vimishor Dec 12, 2012

@sumpygump Ah, now I got the idea behind that constructor. Thanks for clarification.

dlsniper Dec 12, 2012

I agree that having the constructor locked is not such a good idea, I'll make the changes for it, thanks for the suggestion @stof.

dlsniper Dec 13, 2012

I've updated the proposal with some things inspired by the LoggerInterface proposal, as well as some rewording / enhancements to make it a bit clearer.

proposed/psr-cache.md
+ *
+ * @return CacheProxyInterface
+ */
+ public function setCacheDriver(DriverInterface $cacheDriver);

andrerom Jan 4, 2013

Should this be left up to the implementation, so we can support proxies with several drivers for several layers of cache?
In such a case you would not have to consider returning an Option object as suggested by @schmittjoh, because the proxy implementation handles it for you.

dlsniper Jan 4, 2013

Hi @andrerom, please add your comment on the mailing list so that others can see it and I'll reply to it asap.

Thanks!

dlsniper Jan 30, 2013

@bschussek thanks for grammar check! 👍

proposed/psr-cache.md
+/**
+ * Interface for a cache driver that supports TTLs
+ */
+interface CacheInterface extends CacheInterface

marc-mabe Feb 25, 2013

There is a typo in name

dlsniper Mar 5, 2013

@dragoonis will come with the final proposal soonish.

@dlsniper dlsniper closed this Mar 5, 2013

@dlsniper dlsniper reopened this Mar 6, 2013

dragoonis Mar 6, 2013

Member

@dlsniper Some personal issues came up, so the proposal from me will have to wait until Friday. Thanks.

dlsniper Mar 7, 2013

@dragoonis thanks, I've just reopened it so it is easier to spot for the time being.
Take your time and solve your issues, there's no rush.
Thanks for your work on this.

dragoonis Mar 7, 2013

Member

Thanks Florin! :-)

dragoonis Mar 12, 2013

Member

Florin has just asked me to close this off. Doing so.

@dragoonis dragoonis closed this Mar 12, 2013
