PerformanceCacheOptimization

haschek edited this page Nov 29, 2011 · 1 revision

Performance & Cache Optimizations

PubwichFork implements two performance strategies:

  • Caching: a cache transparently stores data so that future requests for that data can be served faster. Pubwich already cached the data aggregated from the configured web services. PubwichFork adds a cache for the final output. You can configure how long cached data and output stay in use (the cache invalidation time).
  • Post-output processing: the application keeps working after the response has been sent to the browser. PubwichFork can therefore answer requests immediately from an output cache and then keep running to update the data caches.

Caching

PubwichFork uses two constants to define global cache invalidation times:

  • CACHE_LIMIT: time (in seconds) after which the data cache is updated
  • OUTPUT_CACHE_LIMIT: time (in seconds) after which the output cache is updated

You must define a valid cache folder with the constant CACHE_LOCATION, or caching won't work!

Besides the global cache invalidation times, you can optionally define a time limit for each individual service class with a config parameter:

  • cache_limit: number of seconds the cache stays valid for this service
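The fallback from the per-service parameter to the global constant can be sketched as follows. This is a minimal illustration in Python, not PubwichFork's actual code: the function `effective_cache_limit`, the dict-based service configs, and the service names are assumptions made for the example.

```python
# Hypothetical model: each service is a plain dict of its config parameters.
# CACHE_LIMIT mirrors the global constant; the value (6 hours) is an example.
CACHE_LIMIT = 6 * 60 * 60  # seconds


def effective_cache_limit(service_config, global_limit=CACHE_LIMIT):
    """Return the service's own cache_limit if set, else the global limit."""
    return service_config.get("cache_limit", global_limit)


# Example services (names are illustrative only).
services = {
    "twitter": {"cache_limit": 60 * 60},       # 1 hour
    "weblog": {"cache_limit": 24 * 60 * 60},   # 24 hours
    "bookmarks": {},                           # no override: uses CACHE_LIMIT
}

limits = {name: effective_cache_limit(cfg) for name, cfg in services.items()}
```

Only services that set `cache_limit` deviate from the global value, so the global constant stays the single default.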

If many data streams share the same cache invalidation time, all of those services will probably be updated in the same application run, within the same user request. PubwichFork can use a displacement factor to shift the invalidation time randomly within a defined range. For example, with an invalidation time of 1 hour, a displacement factor of 0.5 leads to a cache limit that is calculated randomly (on every request) between 30 and 90 minutes. This spreads the aggregation requests evenly over time, while the average invalidation time is still 1 hour.

  • CACHE_DISPLACEMENT: factor between 0 and 1. Values between 0.1 and 0.5 are recommended. Currently this is only used for data caches.
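The displacement arithmetic can be sketched as below. This Python example assumes the shifted limit is drawn uniformly from limit × (1 ± factor), which matches the 30 to 90 minute range given above for 1 hour and a factor of 0.5. The function name `displaced_cache_limit` is hypothetical, and the exact formula PubwichFork uses may differ.

```python
import random


def displaced_cache_limit(limit_seconds, displacement, rng=random):
    """Shift a cache limit randomly within limit * (1 +/- displacement).

    Assumption: a uniform distribution, so the mean stays at limit_seconds.
    """
    if not 0 <= displacement <= 1:
        raise ValueError("displacement must be between 0 and 1")
    offset = rng.uniform(-displacement, displacement)
    return limit_seconds * (1 + offset)


# 1 hour with a displacement factor of 0.5: values fall between 1800 and 5400 s.
rng = random.Random(42)  # seeded only to make the example repeatable
samples = [displaced_cache_limit(3600, 0.5, rng) for _ in range(10_000)]
average = sum(samples) / len(samples)
```

Because the offset is symmetric around zero, the average over many requests stays close to the configured limit.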

Post-output processing

In addition to the output cache, PubwichFork can be configured to use output caches even after they have been invalidated. This matters when requests rarely find a valid cache. For example, 20 visitors per day, evenly distributed, means one request roughly every 72 minutes. With an invalidation time of 15 minutes, almost every request would find the last rendered cache too old, and every visitor would have to wait a long time for the response. In this case it is a good strategy (at least for non-critical resources) to answer quickly from the invalid cache and update everything afterwards.

  • ENABLE_INVALID_CACHE: set it to true to enable this behaviour. Currently it applies only to the output cache.

Note: even when invalid output caches are not used, PubwichFork updates the data caches after sending back a valid output cache. If you do not want this post-output processing, you must disable the output cache by setting OUTPUT_CACHE_LIMIT to 0 seconds.
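The request handling described in this section can be summarised as a small decision function. The Python sketch below is an interpretation of the text, not PubwichFork's implementation: `plan_request`, its parameters, and the returned labels are hypothetical.

```python
def plan_request(cache_age, output_cache_limit, enable_invalid_cache):
    """Decide how to answer a request.

    cache_age: seconds since the output cache was written, or None if
               no output cache exists yet.
    Returns (response_source, update_after_response).
    """
    if output_cache_limit <= 0 or cache_age is None:
        # Output cache disabled (limit 0) or missing: render before responding.
        return ("fresh", False)
    if cache_age <= output_cache_limit:
        # Valid cache: respond at once, then update data caches afterwards.
        return ("cache", True)
    if enable_invalid_cache:
        # Invalidated cache allowed: respond at once, then update everything.
        return ("stale_cache", True)
    # Invalidated cache not allowed: the visitor waits for a fresh render.
    return ("fresh", False)


# A visitor arriving 72 minutes after the last render, 15-minute limit:
with_stale = plan_request(72 * 60, 15 * 60, enable_invalid_cache=True)
without_stale = plan_request(72 * 60, 15 * 60, enable_invalid_cache=False)
```

With ENABLE_INVALID_CACHE set, the slow aggregation work moves behind the response, so the visitor who triggers the update does not wait for it.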

Performance recommendations

  • Always match cache invalidation times to how often you actually post updates! E.g. if you update the social web services you use only 2 or 3 times on an average day, a CACHE_LIMIT of 6 hours is good enough. Use cache_limit to adjust individual data streams, e.g. 1 hour for Twitter and 24 hours for your weblog.
  • Use an output cache limit that is half of your shortest data cache limit.
  • Enable the usage of invalid output caches, especially if you have only a few visitors per day! Be honest with yourself :)
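The second recommendation is simple arithmetic, sketched here in Python. The helper `recommended_output_cache_limit` is hypothetical, and the limits are the example values from the first recommendation.

```python
def recommended_output_cache_limit(data_cache_limits):
    """Return half of the shortest data cache limit, in whole seconds."""
    return min(data_cache_limits) // 2


# Example: Twitter 1 hour, weblog 24 hours, global CACHE_LIMIT 6 hours.
data_limits = [60 * 60, 24 * 60 * 60, 6 * 60 * 60]
output_limit = recommended_output_cache_limit(data_limits)  # 1800 s = 30 min
```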