Merge pull request #45 from NextThought/cache-docs
Update zeo-client-cache-tracing to use accurate script names.
jamadden committed Jul 12, 2016
2 parents 6b39090 + 4fb51de commit edf0bf0
Showing 1 changed file with 13 additions and 12 deletions.
25 changes: 13 additions & 12 deletions doc/zeo-client-cache-tracing.txt
@@ -34,11 +34,12 @@ name or IP address) are logged.
Analyzing a Cache Trace
-----------------------

The stats.py command-line tool is the first-line tool to analyze a cache
trace. Its default output consists of two parts: a one-line summary of
essential statistics for each segment of 15 minutes, interspersed with lines
indicating client restarts, followed by a more detailed summary of overall
statistics.
The cache_stats.py command-line tool (``python -m
ZEO.scripts.cache_stats``) is the first-line tool to analyze a cache
trace. Its default output consists of two parts: a one-line summary of
essential statistics for each segment of 15 minutes, interspersed with
lines indicating client restarts, followed by a more detailed summary
of overall statistics.

The most important statistic is the "hit rate", a percentage indicating how
many requests to load an object could be satisfied from the cache. Hit rates
@@ -48,7 +49,7 @@ server's performance) by increasing the ZEO cache size. This is normally
configured using key ``cache_size`` in the ``zeoclient`` section of your
configuration file. The default cache size is 20 MB, which is small.

The stats.py tool shows its command line syntax when invoked without
The cache_stats.py tool shows its command line syntax when invoked without
arguments. The tracefile argument can be a gzipped file if it has a .gz
extension. It will be read from stdin (assuming uncompressed data) if the
tracefile argument is '-'.
@@ -57,7 +58,7 @@ Simulating Different Cache Sizes
--------------------------------

Based on a cache trace file, you can make a prediction of how well the cache
might do with a different cache size. The simul.py tool runs a simulation of
might do with a different cache size. The cache_simul.py tool runs a simulation of
the ZEO client cache implementation based upon the events read from a trace
file. A new simulation is started each time the trace file records a client
restart event; if a trace file contains more than one restart event, a
@@ -66,7 +67,7 @@ statistics is added at the end.

Example, assuming the trace file is in /tmp/cachetrace.log::

$ python simul.py -s 4 /tmp/cachetrace.log
$ python -m ZEO.scripts.cache_simul -s 4 /tmp/cachetrace.log
CircularCacheSimulation, cache size 4,194,304 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 1429329 24046 41517 44.4% 40776 99.8
@@ -80,7 +81,7 @@ by object eviction and not yet reused to hold another object's state).

Let's try this again with an 8 MB cache::

$ python simul.py -s 8 /tmp/cachetrace.log
$ python -m ZEO.scripts.cache_simul -s 8 /tmp/cachetrace.log
CircularCacheSimulation, cache size 8,388,608 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 2182722 31315 41517 67.8% 40016 100.0
@@ -89,15 +90,15 @@ That's a huge improvement in hit rate, which isn't surprising since these are
very small cache sizes. The default cache size is 20 MB, which is still on
the small side::

$ python simul.py /tmp/cachetrace.log
$ python -m ZEO.scripts.cache_simul /tmp/cachetrace.log
CircularCacheSimulation, cache size 20,971,520 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 2982589 37922 41517 92.7% 37761 99.9

Again a very nice improvement in hit rate, and there's not a lot of room left
for improvement. Let's try 100 MB::

$ python simul.py -s 100 /tmp/cachetrace.log
$ python -m ZEO.scripts.cache_simul -s 100 /tmp/cachetrace.log
CircularCacheSimulation, cache size 104,857,600 bytes
START TIME DURATION LOADS HITS INVALS WRITES HITRATE EVICTS INUSE
Jul 22 22:22 39:09 3218856 3218741 39572 41517 100.0% 22778 100.0
@@ -115,7 +116,7 @@ never loaded again. If, for example, a third of the objects are loaded only
once, it's quite possible for the theoretical maximum hit rate to be 67%, no
matter how large the cache.

The simul.py script also contains code to simulate different cache
The cache_simul.py script also contains code to simulate different cache
strategies. Since none of these are implemented, and only the default cache
strategy's code has been updated to be aware of MVCC, these are not further
documented here.
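
A note for readers of the simulation output quoted in this diff: the figures are consistent with HITRATE being HITS divided by LOADS, and with the ``-s`` argument being a size in units of 1,048,576 bytes. The sketch below only re-derives the printed numbers from the quoted rows. The helper names ``cache_size_bytes`` and ``hit_rate`` are hypothetical and are not part of ZEO, and the formulas are inferred from the output rather than taken from the cache_simul source.

```python
# Figures copied from the simulated output quoted in the diff.
# Each entry: (cache size in MB, LOADS, HITS).
# The 20 MB row comes from the run without -s (the default size).
RUNS = [
    (4, 3218856, 1429329),
    (8, 3218856, 2182722),
    (20, 3218856, 2982589),
    (100, 3218856, 3218741),
]


def cache_size_bytes(size_mb):
    """Assumed conversion: one MB is 1024 * 1024 bytes."""
    return size_mb * 1024 * 1024


def hit_rate(loads, hits):
    """Assumed definition: share of loads served from the cache, in percent."""
    return "%.1f%%" % (100.0 * hits / loads)


if __name__ == "__main__":
    for size_mb, loads, hits in RUNS:
        print("%4d MB  %11d bytes  hit rate %s"
              % (size_mb, cache_size_bytes(size_mb), hit_rate(loads, hits)))
```

Running it reproduces the byte counts and hit rates printed in the four quoted runs, which is a quick sanity check when comparing simulated cache sizes by hand.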