python_memcached with consistent hashing
Python
Switch branches/tags
Nothing to show
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
consistent_memcache
scripts
tests
.gitignore
README
setup.py

README

WARNGING: Still experimental

Python Consistent Memcached

What is this?

There was no pure python memcached client that supports consistent
hashing.  python_consistent_memcached extends python_memcached to
support high performance consistent hashing with minimal code changes.

What is consistent hashing?

The traditional approach to distributing keys across memcached servers
is to take the hash value of the key and mod it by the number of
servers.  This method is fast but very fragile in the face of
deployment changes.

Consistent hash schemes try to keep keys assigned to the same servers
across server changes - cache misses typically only apply to those
mapped to different servers following a deployment change.


Installation

pip install python_consistent_memcached 

Usage

Instead of importing memcached

    from memcached import Client

write this instead 

    from consistent_memcached import Client


That's all.  

Details

A small module to bring consistent hashing to the python-memcached
module.  Hashing consistency is the tendency for the mapping from keys
to servers to remain somewhat the same when small changes are made to
the number of deployed servers.  

Modulo hashing, a naive sharding algorithm where nodes are distributed
on the basis of their hash modulus the number of servers has poor
consistency performance when the number of servers changes - moving
from 2 to 3 servers will typically result in a 2/3 cache miss.

Consistent hashing algorithms tie the key/server mapping to the hash
of the server name rather than its position in a list.  This results
in a significantly higher hit rate at the expense of some complexity.

Some memcached libraries support consistent hashing; for python the
libc-memcached supports naive modulus and the continuum algorithm.
The continuum algorithm has excellent consistency, but it requires
O(lg(n)) time per key look-up.   

python-memcached however only supports modulus hashing - it offers no
continuous hashing option.  This library implements a form of modulus
hashing called "rehashing" that seamlessly operates with modulus hash
based sharing and offers better algorithmic run-time performance than
continuum.