This repository has been archived by the owner on Jan 9, 2024. It is now read-only.

Merge branch 'unstable'
Grokzen committed Oct 27, 2015
2 parents 5883200 + 85d5ec6 commit 92810eb
Showing 35 changed files with 1,258 additions and 469 deletions.
21 changes: 19 additions & 2 deletions .travis.yml
@@ -1,13 +1,30 @@
sudo: false
language: python
python:
- "2.7"
- "3.2"
- "3.3"
- "3.4"
- "3.5"
- "nightly"
services:
- redis-server
install:
- make redis-install
- pip install coverage python-coveralls tox
- pip install -r dev-requirements.txt
- pip install -e .
- "if [[ $TEST_HIREDIS == '1' ]]; then pip install hiredis; fi"
env:
- TEST_HIREDIS=0
- TEST_HIREDIS=1
script:
- make test
- make start
- coverage erase
- coverage run --source rediscluster -p -m py.test
- make stop
after_success:
- coverage combine
- coveralls
matrix:
allow_failures:
- python: "nightly"
17 changes: 15 additions & 2 deletions CHANGES
@@ -1,3 +1,16 @@
* 1.1.0
* Refactored exception handling and exception classes.
* Added READONLY mode support, scales reads using slave nodes.
* Fix __repr__ for ClusterConnectionPool and ClusterReadOnlyConnectionPool
* Add max_connections_per_node parameter to ClusterConnectionPool so that max_connections parameter is calculated per-node rather than across the whole cluster.
* Improve thread safety of get_connection_by_slot and get_connection_by_node methods (iandyh)
* Improved error handling when sending commands to all nodes, e.g. info. Now the connection takes retry_on_timeout as an option and retries once when there is a timeout. (iandyh)
* Added support for SCRIPT LOAD, SCRIPT FLUSH, SCRIPT EXISTS and EVALSHA commands. (alisaifee)
* Improve thread safety to avoid exceptions when running one client object inside multiple threads and doing resharding of the
cluster at the same time.
* Fix ASKING error handling so now it really sends ASKING to next node during a reshard operation. This improvement was also made to pipelined commands.
* Improved thread safety in pipelined commands, along with a better explanation of the logic inside pipelining with code comments.

* 1.0.0
* No change to anything just a bump to 1.0.0 because the lib is now considered stable/production ready.

@@ -12,9 +25,9 @@

* 0.2.0
* Moved pipeline code into new file.
* Code now uses a proper cluster connection pool class that handles
all nodes and connections similar to how redis-py does.
* Better support for pubsub. All clients will now talk to the same server because
pubsub commands do not work reliably if they talk to a random server in the cluster.
* Better result callbacks and node routing support. No more ugly decorators.
* Fix keyslot command when using non ascii characters.
82 changes: 43 additions & 39 deletions README.md
@@ -1,45 +1,20 @@
# redis-py-cluster

Redis cluster client in python for the official cluster support targeted for redis 3.0.
This client provides a working client for redis cluster that was added in redis 3.0.

This project is a port of `redis-rb-cluster` by antirez, with a lot of added functionality. The original source can be found at https://github.com/antirez/redis-rb-cluster

[![Build Status](https://travis-ci.org/Grokzen/redis-py-cluster.svg?branch=master)](https://travis-ci.org/Grokzen/redis-py-cluster) [![Coverage Status](https://coveralls.io/repos/Grokzen/redis-py-cluster/badge.png)](https://coveralls.io/r/Grokzen/redis-py-cluster) [![PyPI version](https://badge.fury.io/py/redis-py-cluster.svg)](http://badge.fury.io/py/redis-py-cluster) [![Gitter](https://badges.gitter.im/Join Chat.svg)](https://gitter.im/Grokzen/redis-py-cluster?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) [![Code Health](https://landscape.io/github/Grokzen/redis-py-cluster/unstable/landscape.svg)](https://landscape.io/github/Grokzen/redis-py-cluster/unstable)
[![Build Status](https://travis-ci.org/Grokzen/redis-py-cluster.svg?branch=master)](https://travis-ci.org/Grokzen/redis-py-cluster) [![Coverage Status](https://coveralls.io/repos/Grokzen/redis-py-cluster/badge.png)](https://coveralls.io/r/Grokzen/redis-py-cluster) [![PyPI version](https://badge.fury.io/py/redis-py-cluster.svg)](http://badge.fury.io/py/redis-py-cluster) [![Code Health](https://landscape.io/github/Grokzen/redis-py-cluster/unstable/landscape.svg)](https://landscape.io/github/Grokzen/redis-py-cluster/unstable)



# Project status

The project is not dead but not much new development is done right now. I do awnser issue reports and pull requests as soon as possible and if you have a problem you can ping me inside the gitter channel that you can find here [![Gitter](https://badges.gitter.im/Join Chat.svg)](https://gitter.im/Grokzen/redis-py-cluster?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) and i will help you out with problems or usage of this lib.
The project is not dead, but not much new development is done right now. I do answer issue reports and pull requests as soon as possible. If you have a problem with the code, you can ping me in the gitter channel [![Gitter](https://badges.gitter.im/Join Chat.svg)](https://gitter.im/Grokzen/redis-py-cluster?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) and I will help you out with problems or usage of this lib.

As of release `0.3.0` this project will be considered stable and usable in production. Just remember that if you are going to use redis cluster to please reda up on the documentation that you can find in the bottom of this Readme. It will contain usage examples and descriptions of what is implemented and what is not implemented and why things are the way they are.
As of release `0.3.0` this project will be considered stable and usable in production. If you are going to use redis cluster in your project, you should read up on all documentation that you can find in the bottom of this Readme file. It will contain usage examples and descriptions of what is and what is not implemented. It will also describe how and why things work the way they do in this client.

On the topic about porting/moving this code into `redis-py` there is currently work over here https://github.com/andymccurdy/redis-py/pull/604 that will bring cluster uspport based on this code. But my suggestion is that until that work is completed that you should use this lib.



## Upgrading instructions

Please read the [following](docs/Upgrading.md) documentation that will go through all changes that is required when upgrading `redis-py-cluster` between versions.



## Dependencies & supported python versions

- Python: redis >= `2.10.2` is required
- Redis server >= `3.0.0` is required
- Optional Python: hiredis >= `0.1.3`

Hiredis is tested and supported on all supported python versions.

Supported python versions, all minor releases in each major version should be supported unless otherwise stated here:

- 2.7.x
- 3.2.x
- 3.3.x
- 3.4.1+

Python 3.4.0 do not not work with pubsub because of segfault issues (Same as redis-py has). If rediscluster is runned on 3.4.0 it will raise RuntimeError exception and exit. If you get this error locally when running tox, consider using `pyenv` to fix this problem.
On the topic of porting/moving this code into `redis-py`, there is currently work in progress at https://github.com/andymccurdy/redis-py/pull/604 that will bring cluster support based on this code. Until that work is completed, my suggestion is that you use this lib.



@@ -61,26 +36,54 @@ $ python setup.py install

## Usage example

Small sample script that show how to get started with RedisCluster. `decode_responses=True` is required to have when running on python3.
Small sample script that shows how to get started with RedisCluster. It can also be found in [examples/basic.py](examples/basic.py)

```python
>>> from rediscluster import StrictRedisCluster

>>> startup_nodes = [{"host": "127.0.0.1", "port": "7000"}]

>>> # Note: decode_responses must be set to True when used with python3
>>> rc = StrictRedisCluster(startup_nodes=startup_nodes, decode_responses=True)

>>> rc.set("foo", "bar")
True
>>> print(rc.get("foo"))
bar
```

The following classes can be imported from the `rediscluster` package.

- `StrictRedisCluster`
- `RedisCluster`
- `StrictClusterPipeline`
- `ClusterPubSub`

`StrictRedisCluster` is based on `redis.StrictRedis` and `RedisCluster` has the same functionality as `redis.Redis` even if it is not directly based on it.
## Upgrading instructions

Please read the [following](docs/Upgrading.md) documentation that goes through all changes that are required when upgrading `redis-py-cluster` between versions.



## Dependencies & supported python versions

- Python: redis >= `2.10.2` is required
- Redis server >= `3.0.0` is required
- Optional Python: hiredis >= `0.1.3`

Hiredis is tested on all supported python versions.

List of all supported python versions.

- 2.7
- 3.2
- 3.3
- 3.4.1+
- 3.5

Experimental:

- Python 3.6.0a0 - Currently broken because `coverage` is not yet compatible with python 3.6


### Python 3.4.0

A segfault was found when running `redis-py` on python `3.4.0`, caused by a bug introduced in python `3.4.0` itself. Because of this, neither `redis-py` nor `redis-py-cluster` works on `3.4.0`. This lib blocks execution on `3.4.0`, and you will get an exception when trying to import the code. The only solution is to use python `3.4.1` or a later release in the `3.4` series.
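
An import-time guard of the kind described above could look roughly like this. It is a minimal sketch and not the library's actual code: the function name `ensure_supported_python` is hypothetical, and `RuntimeError` is assumed as the exception type.

```python
import sys


def ensure_supported_python(version_info=None):
    """Raise RuntimeError when running on the broken CPython 3.4.0 release.

    Hypothetical helper for illustration; the real check may differ.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor, micro) so that 3.4.1 and later pass.
    if tuple(version_info[:3]) == (3, 4, 0):
        raise RuntimeError("Python 3.4.0 is not supported, use 3.4.1 or later")


ensure_supported_python()  # does nothing on any interpreter other than 3.4.0
```

Running the check at import time means the failure shows up immediately, rather than as a segfault later on.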



@@ -107,6 +110,7 @@ More detailed documentation can be found in `docs` folder.
- [Pipelines](docs/Pipelines.md)
- [Threaded Pipeline support](docs/Threads.md)
- [Cluster Management class](docs/ClusterMgt.md)
- [READONLY mode](docs/Readonly_mode.md)
- [Authors](docs/Authors)


@@ -115,7 +119,7 @@ More detailed documentation can be found in `docs` folder.

Both Redis cluster and redis-py-cluster are considered stable and production ready.

But this depends on what you are going to use clustering for. In the simple use cases with SET/GET and other single key functions there is not issues. If you require multi key functinoality or pipelines then you must be very carefull when developing because they work slightly different from the normal redis server.
But this depends on what you are going to use clustering for. In the simple use cases with SET/GET and other single key functions there are no issues. If you require multi key functionality or pipelines then you must be very careful when developing because they work slightly differently from the normal redis server.

If you require advanced features like pubsub or scripting, this lib and redis do not handle that kind of use case very well. You either need to develop a custom solution yourself or use a non clustered redis server for that.

4 changes: 3 additions & 1 deletion benchmarks/simple.py
@@ -65,7 +65,9 @@ def timeit_pipeline(rc, itterations=50000):
p.execute()

t1 = time.time() - t0
print("{}k SET/GET operations inside pipelines took: {} seconds... {} operations per second".format((itterations / 1000) * 2, t1, (itterations / t1) * 2))
print("{}k SET/GET operations inside pipelines took: {} seconds... {} operations per second".format(
(itterations / 1000) * 2, t1, (itterations / t1) * 2)
)


if __name__ == "__main__":
7 changes: 4 additions & 3 deletions dev-requirements.txt
@@ -1,8 +1,9 @@
-r requirements.txt

coverage >= 3.7.1
hiredis >= 0.1.3
pytest >= 2.5.0
coverage >= 3.7.1,< 4.0.0
-e git://github.com/pytest-dev/pytest.git@master#egg=pytest
testfixtures >= 4.0.1
mock == 1.0.1
docopt == 0.6.2
tox
python-coveralls
3 changes: 3 additions & 0 deletions docs/Authors
@@ -18,3 +18,6 @@ Authors who contributed code or testing:
- 72squared - https://github.com/72squared
- Neuron Teckid - https://github.com/neuront
- iandyh - https://github.com/iandyh
- mumumu - https://github.com/mumumu
- awestendorf - https://github.com/awestendorf
- Ali-Akber Saifee - https://github.com/alisaifee
17 changes: 10 additions & 7 deletions docs/Commands.md
@@ -26,7 +26,6 @@ The following commands will send the same request to all nodes in the cluster. R
- lastsave
- ping
- save
- script_flush
- slowlog_get
- slowlog_len
- slowlog_reset
@@ -42,13 +41,21 @@ The following commands will only be send to the master nodes in the cluster. Res
- flushdb
- scan


This command will be sent to a random node in the cluster.

- publish

This command will be sent to the server that matches the first key.
The following commands will be sent to the server that matches the first key.

- eval
- evalsha

The following commands will be sent to the master nodes in the cluster.

- script load - the result is the hash of loaded script
- script flush - the result is `True` if the command succeeds on all master nodes, else `False`
- script exists - the result is an array of booleans. An entry is `True` only if the script exists on all the master nodes.
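
The way the per-node replies are combined for `script exists` and `script flush` can be illustrated as follows. This is a sketch under assumptions, not the library's implementation: the function names and the `replies_by_node` input shape (node name mapped to that node's reply) are hypothetical.

```python
def merge_script_exists(replies_by_node):
    """Combine per-master SCRIPT EXISTS replies into one list of booleans.

    replies_by_node maps a node name to the list of flags that node returned,
    one flag per requested script hash (hypothetical input shape).
    """
    per_node = list(replies_by_node.values())
    if not per_node:
        return []
    # zip(*...) regroups the flags per script; all() requires every master to have it.
    return [all(flags) for flags in zip(*per_node)]


def merge_script_flush(replies_by_node):
    """SCRIPT FLUSH counts as successful only if every master reported success."""
    return all(replies_by_node.values())


replies = {"master-1": [True, True, False], "master-2": [True, False, False]}
print(merge_script_exists(replies))  # [True, False, False]
```

A script counts as present only when every master has it, since `evalsha` is routed to whichever node matches the first key.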

The following commands will be sent to the server that matches the specified key.

@@ -70,13 +77,9 @@ Either because they do not work, there is no working implementation or it is not

- bitop - Currently too hard to implement a solution in python space
- client_setname - Not yet implemented
- evalsha - Lua scripting is not yet implemented
- move - It is not possible to move a key from one db to another in cluster mode
- register_script - Lua scripting is not yet implemented
- restore
- script_exists - Lua scripting is not yet implemented
- script_kill - Lua scripting is not yet implemented
- script_load - Lua scripting is not yet implemented
- script_kill - Not yet implemented
- sentinel
- sentinel_get_master_addr_by_name
- sentinel_master
19 changes: 19 additions & 0 deletions docs/Pipelines.md
@@ -1,3 +1,22 @@
# How pipelining works

Just like in redis-py, redis-py-cluster queues up all the commands inside the client until execute is called. But, once execute is called, redis-py-cluster internals work slightly differently. It still packs the commands to efficiently
transmit multiple commands across the network. But since different keys may be mapped to different nodes, redis-py-cluster must first map each key to the expected node. It then packs all the commands destined for each node in the cluster into its own packed sequence of commands. It uses the redis-py library to communicate with each node in the cluster.
Ideally all the commands should be sent to each node in the cluster in parallel so that all the commands can be processed as fast as possible. The naive approach is to iterate through each node and send each batch of commands sequentially to each node. If redis-py supported some sort of non-blocking i/o we could send the network requests first
and multiplex the socket responses from each node. Instead, we use threads to send the requests in parallel so that the total execution time only equals the amount of time for the slowest round trip to and from the given set of nodes in the cluster needed to process the commands.
In previous versions of the library there were some bugs associated with threaded operations and pipelining. We were freeing connections back into the connection pool prior to reading the responses from each thread and it caused all kinds of problems. But these issues should now be addressed. If you are worried, you can disable the threaded behavior with a flag so that it only sends the commands to each node sequentially.
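
The step of mapping each key to its node and batching commands per node can be sketched as below. This is an illustration under stated assumptions, not the code inside redis-py-cluster: `keyslot` uses the standard library's `binascii.crc_hqx` for the CRC16 that Redis Cluster defines, and `group_by_node` together with the `node_for_slot` callback are hypothetical names.

```python
import binascii
from collections import defaultdict

HASH_SLOTS = 16384  # number of hash slots in a Redis cluster


def keyslot(key):
    """Return the cluster hash slot for a key (CRC16/XMODEM mod 16384)."""
    if isinstance(key, str):
        key = key.encode("utf-8")
    # Honour hash tags: only the text inside the first non-empty {...} is hashed.
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


def group_by_node(commands, node_for_slot):
    """Bucket (name, key, *args) commands by the node that owns each key's slot."""
    batches = defaultdict(list)
    for command in commands:
        batches[node_for_slot(keyslot(command[1]))].append(command)
    return dict(batches)


# Hypothetical two-node layout: the lower half of the slots lives on node-a.
owner = lambda slot: "node-a" if slot < HASH_SLOTS // 2 else "node-b"
print(group_by_node([("SET", "foo", "1"), ("GET", "foo")], owner))
```

Each batch can then be packed and sent to its node, either sequentially or from one thread per node.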


# Connection Error handling

The other way pipelines differ in redis-py-cluster from redis-py is in error handling and retries. With the normal redis-py client, if you hit a connection error during a pipeline command it raises the error right there. But we expect redis-cluster to be more resilient to failures. If you hit a connection problem with one of the nodes in the cluster, most likely a stand-by slave will take over for the down master pretty quickly. In this case, we retry the commands bound for that particular node against another random node. The other random node will not just blindly accept these commands. It only accepts them if the keys referenced in those commands actually map to that node in the cluster configuration. So most likely it will respond with a MOVED error telling the client the new master for those commands. Our code handles these MOVED errors according to the redis cluster specification and re-issues the commands to the correct server transparently inside of pipeline.execute(). You can disable this behavior if you'd like as well.


# ASK and MOVED errors

The other tricky part of the redis-cluster specification is that if any command response comes back with an ASK or MOVED error, the command is to be retried against the specified node. In previous versions of redis-py-cluster we were treating ASK and MOVED errors the same, but they really need to be handled differently. A MOVED error means that the client can safely update its own representation of the slots table to point to a new node for all future commands bound for that slot. But an ASK error means the slot is only partially migrated and that the client can only successfully issue that command to the new server if it prefixes the request with an ASKING command first. This lets the new node taking over that slot know that the original server said it was okay to run that command for the given key against the new node even though the slot is not yet completely migrated. Our current implementation now handles this case correctly.
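
The difference between the two redirections can be sketched in a few lines. This is not the library's code: `parse_redirect`, `plan_retry` and the plain-dict `slot_table` are hypothetical names used for illustration. The error format (`MOVED <slot> <host>:<port>`) is the one defined by the Redis Cluster specification.

```python
def parse_redirect(message):
    """Split 'MOVED 3999 127.0.0.1:6381' into (kind, slot, host, port).

    Returns None when the error text is not a cluster redirection.
    """
    parts = message.split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    host, _, port = parts[2].rpartition(":")
    return parts[0], int(parts[1]), host, int(port)


def plan_retry(message, slot_table):
    """Decide where to retry a redirected command and what to send first.

    slot_table is a dict of slot -> (host, port) standing in for the client's
    slot cache; it is only updated on MOVED.
    """
    redirect = parse_redirect(message)
    if redirect is None:
        return None
    kind, slot, host, port = redirect
    if kind == "MOVED":
        slot_table[slot] = (host, port)  # permanent: remember the new owner
        return {"node": (host, port), "prefix": []}
    # ASK: one-off redirect, slot table untouched, ASKING must precede the retry.
    return {"node": (host, port), "prefix": ["ASKING"]}


table = {3999: ("127.0.0.1", 6380)}
print(plan_retry("ASK 3999 127.0.0.1:6381", table))
print(table)  # unchanged after ASK
```

Leaving the slot table untouched on ASK is the important part: the next command for that slot still goes to the old node until a MOVED reply says the migration is complete.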


# The philosophy on pipelines

After playing around with pipelines and thinking about possible solutions that could be used in a cluster setting this document will describe how pipelines work, strengths and weaknesses with the implementation that was chosen.
18 changes: 17 additions & 1 deletion docs/Pubsub.md
@@ -2,7 +2,7 @@

After testing pubsub in cluster mode one big problem was discovered with the `PUBLISH` command.

According to the current official redis documentation on `PUBLISH`, the reply is "Integer reply: the number of clients that received the message." It was initially assumed that if we had clients connected to different nodes in the cluster it would still report back the correct number of clients that received the message.

However after some testing of this command it was discovered that it would only report the number of clients that have subscribed on the same server the `PUBLISH` command was executed on.

@@ -18,6 +18,22 @@ Discussion on this topic can be found here: https://groups.google.com/forum/?hl=



# Scalability issues

The following part is from this discussion https://groups.google.com/forum/?hl=sv#!topic/redis-db/B0_fvfDWLGM and it describes the scalability issue that pubsub has and the performance that goes with it when used in a cluster environment.

according to [1] and [2] PubSub works by broadcasting every publish to every other
Redis Cluster node. This limits the PubSub throughput to the bisection bandwidth
of the underlying network infrastructure divided by the number of nodes times
message size. So if a typical message has 1KB, the cluster has 10 nodes and
bandwidth is 1 GBit/s, throughput is already limited to 12.5K RPS. If we increase
the message size to 5 KB and the number of nodes to 50, we only get 500 RPS
much less than a single Redis instance could service (>100K RPS), while putting
maximum pressure on the network. PubSub thus scales linearly w.r.t. the cluster size,
but in the negative direction!
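
The arithmetic in the quoted passage can be reproduced with a short calculation. The helper name `pubsub_rps_limit` is made up for this example, and it assumes decimal units (1 GBit/s = 10^9 bits per second, 1 KB = 1000 bytes), which is what the quoted figures imply.

```python
def pubsub_rps_limit(bandwidth_bits_per_s, nodes, message_bytes):
    """Upper bound on publishes per second when every publish reaches every node."""
    bytes_per_s = bandwidth_bits_per_s / 8
    return bytes_per_s / (nodes * message_bytes)


print(pubsub_rps_limit(1e9, nodes=10, message_bytes=1000))  # 12500.0
print(pubsub_rps_limit(1e9, nodes=50, message_bytes=5000))  # 500.0
```

Both results match the 12.5K RPS and 500 RPS figures quoted above.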



# How pubsub works in StrictRedisCluster

In `0.2.0` a first solution to the pubsub problem was implemented, but it has some limitations.
