# Redis

Redis (REmote DIctionary Server) is an in-memory key-value database.

- [Official docs](https://redis.io/documentation)
- [Use cases](https://redislabs.com/solutions/use-cases/)
- More about [redis-py](https://github.com/andymccurdy/redis-py)


## Concepts

Conceptually, Redis is a very simple database. From a programmer's perspective, it is as if you could persist simple values, dictionaries, sets, lists, and priority queues so that they are usable from other programs, possibly running on other computers. The API is simple to use, and because Redis keeps its data in memory, it is extremely fast.

More advanced concepts:

- Pipelines
- Expiring values
- Publish-subscribe model

## Connect to database

In [None]:
import redis

#### Providing access information

It is common to keep access configuration for services such as a database or cloud platform in a local file. Here we use YAML.

**Note**: This file MUST be listed in `.gitignore` - otherwise anyone with access to your repository knows your password!

In [None]:
%%file redis_auth_config.yaml
# This would normally live on disk and not be in a notebook!

host: 'localhost'
port: 6379
password: 

In [None]:
import yaml

with open('redis_auth_config.yaml') as f:
    auth = yaml.safe_load(f)
auth
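Because the `password:` entry above is empty, it loads as `None`. The following is a minimal sketch of how the loaded mapping could be normalized before it is passed to `redis.Redis`; `redis_kwargs` is a hypothetical helper name, and the defaults (`localhost`, `6379`) are assumptions that mirror the config file above.

```python
def redis_kwargs(auth):
    """Build keyword arguments for redis.Redis from a loaded config mapping."""
    auth = auth or {}  # an empty YAML file loads as None
    return {
        'host': auth.get('host') or 'localhost',
        'port': int(auth.get('port') or 6379),
        # An empty `password:` entry loads as None, meaning "no password".
        'password': auth.get('password') or None,
    }
```

With this helper, the connection could be created as `redis.Redis(**redis_kwargs(auth))`.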

In [None]:
r = redis.Redis(
    host = auth['host'],
    port = auth['port'],
    password = auth['password']
)

In [None]:
r.ping()

## Clear database

In [None]:
r.flushdb()

## Simple data types

#### Set and get a single value

In [None]:
r.set('a', 'adenosine')

In [None]:
r.get('a')

#### Set and get multiple values

In [None]:
r.mset(dict(c='cytosine', t='thymidine', g='guanosine'))

In [None]:
r.mget(list('tcga'))

#### Deletion

In [None]:
r.delete('a')

In [None]:
r.keys()

In [None]:
r.delete('c', 't', 'g')

In [None]:
r.keys()

### Transactions

Transactions are achieved by creating and executing a pipeline. A pipeline queues commands on the client and sends them together when `execute()` is called, which is useful not just for atomicity but also for reducing communication costs.

In [None]:
pipe = r.pipeline()
(
pipe.set('a', 0).
    incr('a').
    incr('a').
    incr('a').
    execute()
)

In [None]:
r.get('a')
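The chained pipeline above can also be written with a `with` block, which resets the pipeline when the block exits. This is a sketch only: `count_up` is a hypothetical helper, and `client` is assumed to be a `redis.Redis` instance such as `r`.

```python
def count_up(client, key, n):
    """Set `key` to 0 and increment it `n` times in a single pipeline."""
    # transaction=True (the redis-py default) wraps the queued commands in MULTI/EXEC.
    with client.pipeline(transaction=True) as pipe:
        pipe.set(key, 0)
        for _ in range(n):
            pipe.incr(key)
        # Nothing is sent until execute(); it returns one result per queued command.
        return pipe.execute()
```

Calling `count_up(r, 'a', 3)` should leave the same value in `a` as the chained version.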

### Expiring values

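A key can be given a lifetime in seconds, either when it is set (`setex`) or afterwards (`expire`); once the time is up, the key is deleted. You can also find the time to expiry with `ttl` (time-to-live) and convert a key from volatile to permanent with `persist`.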

In [None]:
import time

In [None]:
r.setex('foo', 3, 'bar')
print('get', r.get('foo'))
time.sleep(1)
print('ttl', r.ttl('foo'))
time.sleep(1)
print('ttl', r.ttl('foo'))
time.sleep(1)
print('ttl', r.ttl('foo'))
time.sleep(1)
print('get', r.get('foo'))
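The integer returned by `ttl` has two special values: the Redis `TTL` command returns `-2` when the key does not exist and `-1` when the key exists but has no expiry. The sketch below turns that integer into a readable status; `describe_ttl` is a hypothetical helper, and it assumes a redis-py version that passes these values through unchanged.

```python
def describe_ttl(ttl):
    """Translate the integer returned by the TTL command into a readable status."""
    if ttl == -2:
        return 'key does not exist (or has already expired)'
    if ttl == -1:
        return 'key exists and has no expiry'
    return f'key expires in {ttl} s'
```

For example, `describe_ttl(r.ttl('foo'))` could replace the bare `ttl` prints above.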

#### Alternative

In [None]:
r.set('foo', 'bar')
r.expire('foo', 3)
print(r.get('foo'))
time.sleep(3)
print(r.get('foo'))

## Complex data types

In [None]:
import warnings

warnings.simplefilter('ignore', DeprecationWarning)

### Hash

In [None]:
r.hset('nuc', mapping=dict(a='adenosine', c='cytosine', t='thymidine', g='guanosine'))

In [None]:
r.hget('nuc', 'a')

In [None]:
r.hmget('nuc', list('ctg'))

In [None]:
r.hkeys('nuc')

In [None]:
r.hvals('nuc')
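By default, redis-py returns keys and values as `bytes`. A minimal sketch of converting a whole hash to strings is shown here; `decode_hash` is a hypothetical helper, and UTF-8 is an assumed encoding. Passing `decode_responses=True` to `redis.Redis` is an alternative that decodes every reply.

```python
def decode_hash(raw, encoding='utf-8'):
    """Convert a {bytes: bytes} mapping, as returned by hgetall, to {str: str}."""
    return {k.decode(encoding): v.decode(encoding) for k, v in raw.items()}
```

A typical call would be `decode_hash(r.hgetall('nuc'))`.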

### List

In [None]:
r.rpush('xs', 1, 2, 3)

In [None]:
r.lpush('xs', 4, 5, 6)

In [None]:
r.llen('xs')

In [None]:
r.lrange('xs', 0, r.llen('xs'))

In [None]:
r.lrange('xs', 0, -1)

#### Using list as a queue

In [None]:
r.lpush('q', 1, 2, 3)

In [None]:
while r.llen('q'):
    print(r.rpop('q'))

#### Using list as a stack

In [None]:
r.lpush('q', 1, 2, 3)

In [None]:
while r.llen('q'):
    print(r.lpop('q'))
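The two loops above differ only in which end of the list is popped: items pushed with `lpush` come back in insertion order with `rpop` (a queue) and in reverse order with `lpop` (a stack). The sketch below captures both in one function; `drain` is a hypothetical helper and `client` is assumed to be a `redis.Redis` instance.

```python
def drain(client, key, fifo=True):
    """Pop every item from the list at `key` and return them in pop order.

    Assumes items were added with LPUSH (at the head), as in the cells above.
    """
    pop = client.rpop if fifo else client.lpop
    items = []
    # rpop/lpop return None once the list is empty, so no separate llen call is needed.
    while (item := pop(key)) is not None:
        items.append(item)
    return items
```

With a real server the items come back as `bytes` unless the client was created with `decode_responses=True`.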

#### Transferring values across lists

In [None]:
r.lpush('l1', 1,2,3)

In [None]:
while r.llen('l1'):
    r.rpoplpush('l1', 'l2')
r.llen('l1'), r.llen('l2')

In [None]:
for key in r.scan_iter('l2'):
    print(key)

In [None]:
r.lpush('l1', 1,2,3)

### Sets

In [None]:
r.sadd('s1', 1,2,3)

In [None]:
r.sadd('s1', 2,3,4)

In [None]:
r.smembers('s1')

In [None]:
r.sadd('s2', 4,5,6)

In [None]:
r.sdiff(['s1', 's2'])

In [None]:
r.sinter(['s1', 's2'])

In [None]:
r.sunion(['s1', 's2'])

### Sorted sets

A sorted set keeps its members ordered by a numeric score, so it can be used as a priority queue.

In [None]:
r.zadd('jobs', 
       dict(job1=3, 
            job2=7, 
            job3=1, 
            job4=2,
            job5=6)
      )

In [None]:
r.zincrby('jobs', 2, 'job5')

In [None]:
r.zrange('jobs', 0, -1, withscores=True)

In [None]:
r.zrevrange('jobs', 0, -1, withscores=True)
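To consume a sorted set as a priority queue, members can be removed one at a time in score order. The sketch below uses `zpopmin`, which needs Redis server 5.0 or later; `pop_in_priority_order` is a hypothetical helper, `client` is assumed to be a `redis.Redis` instance, and the lowest score is assumed to mean the highest priority.

```python
def pop_in_priority_order(client, key):
    """Yield (member, score) pairs from lowest to highest score, removing each one."""
    while True:
        popped = client.zpopmin(key)  # list with at most one (member, score) tuple
        if not popped:
            return
        yield popped[0]
```

Iterating with `for job, score in pop_in_priority_order(r, 'jobs'): ...` empties the `jobs` set as it goes.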

#### Union and intersection store

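`zunionstore` and `zinterstore` create new sorted sets from the union and intersection, respectively, of existing ones. By default, the scores of members that appear in more than one input set are summed.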

In [None]:
s1 = 'time flies like an arrow'
s2 = 'fruit flies like a banana'

In [None]:
from collections import Counter

In [None]:
c1 = Counter(s1.split())

In [None]:
c2 = Counter(s2.split())

In [None]:
r.zadd('c1', c1)

In [None]:
r.zadd('c2', c2)

In [None]:
r.zrange('c1', 0, -1, withscores=True)

In [None]:
r.zrange('c2', 0, -1, withscores=True)

In [None]:
r.zunionstore('c3', ['c1', 'c2'])

In [None]:
r.zrange('c3', 0, -1, withscores=True)

In [None]:
r.zinterstore('c4', ['c1', 'c2'])

In [None]:
r.zrange('c4', 0, -1, withscores=True)
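The stored scores can be cross-checked in plain Python, without a Redis server. This sketch assumes the default `SUM` aggregation and reuses the two sentences from above; `expected_union` and `expected_inter` are illustrative names.

```python
from collections import Counter

c1 = Counter('time flies like an arrow'.split())
c2 = Counter('fruit flies like a banana'.split())

# Union with SUM aggregation: every member appears, scores are added.
expected_union = dict(c1 + c2)

# Intersection with SUM aggregation: only shared members, scores are added.
expected_inter = {w: c1[w] + c2[w] for w in c1.keys() & c2.keys()}
```

Here `expected_union` corresponds to the contents of `c3` and `expected_inter` to those of `c4`, up to the `bytes` keys and float scores that redis-py returns.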

### Publisher/Subscriber

![](https://making.pusher.com/images/2017-03-01-redis-pubsub-under-the-hood/clients.svg)

Source: https://making.pusher.com/redis-pubsub-under-the-hood/

In [None]:
help(r.pubsub)

In [None]:
p = r.pubsub()

#### Channels

In [None]:
p.subscribe('python', 'perl', 'sql')

In [None]:
m = p.get_message()
while m:
    print(m)
    m = p.get_message()

In [None]:
p.channels

In [None]:
p2 = r.pubsub()

In [None]:
p2.psubscribe('p*')

In [None]:
p2.patterns

#### Messages

From [redis-py](https://github.com/andymccurdy/redis-py)

Every message read from a PubSub instance will be a dictionary with the following keys.

- type: One of the following: 'subscribe', 'unsubscribe', 'psubscribe', 'punsubscribe', 'message', 'pmessage'
- channel: The channel [un]subscribed to or the channel a message was published to
- pattern: The pattern that matched a published message's channel. Will be None in all cases except for 'pmessage' types.
- data: The message data. With [un]subscribe messages, this value will be the number of channels and patterns the connection is currently subscribed to. With [p]message messages, this value will be the actual published message.
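Given the message structure described above, subscription confirmations can be separated from published messages by checking the `type` key. This is a minimal sketch: `published_payloads` is a hypothetical helper, and `pubsub` is assumed to be an object returned by `r.pubsub()`.

```python
def published_payloads(pubsub):
    """Drain pending messages and return (channel, data) for published messages only."""
    payloads = []
    # get_message() returns None when nothing is waiting.
    while (m := pubsub.get_message()) is not None:
        # Skip subscribe/unsubscribe confirmations, whose data is a subscription count.
        if m['type'] in ('message', 'pmessage'):
            payloads.append((m['channel'], m['data']))
    return payloads
```

Calling `published_payloads(p)` after publishing could replace the `while m:` loops used in the following cells when only the payloads are of interest.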

In [None]:
r.publish('python', 'use blank spaces')
r.publish('python', 'no semi-colons')
r.publish('perl', 'use spaceship operator')
r.publish('sql', 'select this')
r.publish('haskell', 'functional is cool')

In [None]:
m = p.get_message()
while m:
    print(m)
    m = p.get_message()

In [None]:
p.unsubscribe('python')

In [None]:
p.channels

In [None]:
r.publish('python', 'use blank spaces 2')
r.publish('python', 'no semi-colons 2')
r.publish('perl', 'use spaceship operator 2')
r.publish('sql', 'select this 2')
r.publish('haskell', 'functional is cool 2')

In [None]:
m = p.get_message()
while m:
    print(m)
    m = p.get_message()

In [None]:
m = p2.get_message()
while m:
    print(m)
    m = p2.get_message()

### Multiple databases

In [None]:
r2 = redis.Redis(host=auth['host'], port=auth['port'], password=auth['password'], db=1)
r2.flushdb()

In [None]:
for c in ['c1', 'c2', 'c3', 'c4']:
    r.move(c, 1)

In [None]:
for key in r2.scan_iter('c?'):
    print(r2.zrange(key, 0, -1, withscores=True))

### Clean up

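There is no need to close connections explicitly when using the `Redis()` object. Each command borrows a connection from the client's connection pool and releases it when done, as this excerpt from the redis-py source shows: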


```python
def execute_command(self, *args, **options):
    "Execute a command and return a parsed response"
    pool = self.connection_pool
    command_name = args[0]
    connection = pool.get_connection(command_name, **options)
    try: 
        connection.send_command(*args)
        return self.parse_response(connection, command_name, **options)
    except (ConnectionError, TimeoutError) as e:
        connection.disconnect()
        if not connection.retry_on_timeout and isinstance(e, TimeoutError):
            raise
        connection.send_command(*args)
        return self.parse_response(connection, command_name, **options)
    finally:
        pool.release(connection)
```

#### Benchmark redis

In [None]:
%%bash 

redis-benchmark -q -n 10000 -c 50