# Introduction:

### Redis:
   <a href="https://redis.io/">Redis</a> is an open source (BSD licensed), in-memory data structure store used as a database, cache, and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes with radius queries, and streams.<br/>
   
<img src="https://www.zend.com/sites/zend/files/image/2019-09/logo-redis.jpg" width ="250" >


#### <a href='https://redislabs.com/redis-enterprise/data-structures/'>Redis Data Structures</a>
* Redis is not a plain key-value store; it is a data structures server that supports different kinds of values.
* An introduction to Redis data types and abstractions: https://redis.io/topics/data-types-intro
* Redis keys are always strings.


<img src='https://redislabs.com/wp-content/uploads/2020/06/key-value-data-stores-2-v2-920x612.png' >

### How To Query Redis!

- Redis provides commands for each data type that cover common access patterns, including bulk operations and partial transaction support.

### PreLab

#### 1. Install Redis on Windows
- Redis is a cross-platform database; it can be installed on Linux, Windows, and other platforms.
- There are two ways to install Redis on Windows:
    - Download the latest Redis .msi file from https://github.com/MSOpenTech/redis/r... and install it.

    - Or download a release from either of these sources:
        - https://github.com/microsoftarchive/redis/releases or
        - https://github.com/rgl/redis/downloads

- Personally, I preferred the first option.
- Download Redis-x64-2.8.2104.zip
- Extract the zip to a prepared directory
- Run redis-server.exe
- Then run redis-cli.exe
- For more info, follow this setup video tutorial (https://www.youtube.com/watch?v=188Fy-oCw4w)


#### Linux and Debian 

- Even quicker and dirtier instructions for Debian-based Linux distributions are as follows:
    - Download Redis from http://redis.io/download
    - Extract the archive, then run `make && sudo make install`
    - Then run `sudo python -m pip install redis hiredis` (hiredis is an optional performance-improving C library; `pip` replaces the deprecated `easy_install`).

#### 2. Install the Python Package ("<a href='https://pypi.org/project/redis/'>redis</a>") to connect to Redis 
- Use the command ```pip install redis``` in your command line.


#### (Extra) Accessing Redis from Command Line:
- Add the Redis installation "/home" and "/bin" directories to the environment variables.
- Start the Redis server in one command window (CMD, PowerShell, etc.) using the command ```redis-server```.
- In another command window, start your Redis client using the command ```redis-cli```.
- Now you have the Redis client shell connected to the default <b>db0</b> DB. 

In [None]:
! pip install redis

In [97]:
import redis
from pprint import pprint
import pandas as pd
import warnings
warnings.filterwarnings('ignore')

##### Get a connection to the Redis server using the host, the port, and the DB index

In [2]:
r = redis.Redis(host='localhost', port=6379, db=0)

### First Steps in Redis

#### Demo the string keys

In [3]:
from time import sleep
r.set('language','Python')

True

##### Key expiration (e.g., think about session management)


- By default keys are retained, but we can make our keys (data) vanish after a specified time.
- This can be set up while creating the key, or for already existing keys.
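The next cell sets the expiry on a key that already exists. To attach the expiry at creation time instead, redis-py's `set` accepts an `ex` argument (a lifetime in seconds). The following is a minimal sketch, not part of the original lab: `store_session` and the `session:` prefix are hypothetical names chosen for illustration, and `client` is assumed to be a `redis.Redis` connection such as the `r` created above.

```python
def store_session(client, session_id: str, payload: str, ttl_seconds: int = 1800) -> str:
    """Store a session value that Redis removes after ttl_seconds."""
    # Hypothetical key layout: 'session:<id>'
    key = f"session:{session_id}"
    # ex= sets the expiry (in seconds) in the same SET command that creates the key
    client.set(key, payload, ex=ttl_seconds)
    return key


# Usage, assuming the connection `r` from this notebook:
#   store_session(r, "abc123", "user=42", ttl_seconds=10)
#   r.ttl("session:abc123")   # remaining lifetime in seconds
```

Setting the value and its lifetime in one command avoids a window in which the key exists without an expiry.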

In [4]:
print(r.expire('language', 10))
print(r.ttl('language'))
sleep(3)
print(r.ttl('language'))

True
10
7


#### Setting multiple string keys and values

In [5]:
r.mset({"Croatia": "Zagreb", "Bahamas": "Nassau"})

True

In [6]:
r.get("Croatia")

b'Zagreb'

#### Demo The Sets 

In [7]:
### demo the sets

r.sadd('pythonlist', "value1", "value2", "value3", "value4")
r.sadd('powershelllist', "value4", "value5", "value6", "value7")

# Intersection
print(r.sinter('pythonlist', 'powershelllist'))
# Union
print(r.sunion('pythonlist', 'powershelllist'))

# Cardinality (number of members) of each set
print(r.scard('pythonlist'))
print(r.scard('powershelllist'))

{b'value4'}
{b'value3', b'value5', b'value4', b'value7', b'value2', b'value1', b'value6'}
4
4


#### Demo The Hashes

In [8]:
### demo the hashes

r.hset('Hero','Name','Drow Ranger')
r.hset('Hero','Health','600')
r.hset('Hero','Mana','200')

print(r.hgetall('Hero'))

{b'Name': b'Drow Ranger', b'Health': b'600', b'Mana': b'200'}


### Fuller Example (Hats-Shop Website Scenario)

It’s time to break out a fuller example. Let’s pretend we’ve decided to start a lucrative website that sells hats, and hired you to build the site.

* You’ll use Redis to handle some of the product catalog, inventorying, and bot traffic detection for PyHats.com.

* It’s day one for the site, and we’re going to be selling three limited-edition hats. 
* Each hat gets held in a Redis hash of field-value pairs, and the hash has a key that is a prefixed random integer, such as hat:56854717. Using the hat: prefix is a Redis convention for creating a sort of namespace within a Redis database:
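The `hat:` prefix convention can be captured in two small helpers. This is only a sketch: `make_key` and `split_key` are hypothetical names introduced here for illustration and are not part of redis-py.

```python
from typing import Tuple


def make_key(prefix: str, ident: int) -> str:
    """Build a namespaced key such as 'hat:56854717'."""
    return f"{prefix}:{ident}"


def split_key(key: bytes) -> Tuple[str, str]:
    """Split a key as returned by Redis (bytes) into (prefix, identifier)."""
    prefix, ident = key.decode().split(":", 1)
    return prefix, ident


print(make_key("hat", 56854717))
print(split_key(b"hat:56854717"))
```

Keeping key construction in one place makes it harder to mistype a prefix when the same namespace is used from several cells.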

In [216]:
import random

random.seed(444)
hats = {f"hat:{random.getrandbits(32)}": i for i in (
    {
        "color": "black",
        "price": 49.99,
        "style": "fitted",
        "quantity": 1000,
        "npurchased": 0,
    },
    {
        "color": "maroon",
        "price": 59.99,
        "style": "hipster",
        "quantity": 500,
        "npurchased": 0,
    },
    {
        "color": "green",
        "price": 99.99,
        "style": "baseball",
        "quantity": 200,
        "npurchased": 0,
    })
}


In [217]:
with r.pipeline() as pipe:
    for h_id, hat in hats.items():
        pipe.hmset(h_id, hat)
        
    pipe.execute()

### Query by Hat id (GetAll fields)

In [218]:
pprint(r.hgetall("hat:56854717"))

{b'color': b'green',
 b'npurchased': b'0',
 b'price': b'99.99',
 b'quantity': b'200',
 b'style': b'baseball'}


### Query by Hat id (Get Specific Fields)

In [220]:
pprint(r.hmget("hat:56854717","color", "style", "price"))

[b'green', b'baseball', b'99.99']


### Check Existence of Specific Hat

In [221]:
r.exists("hat:56854717")

1

In [199]:
r.keys()

[b'hat:1236154736', b'hat:56854717', b'hat:1326692461']

In [213]:
r.hincrby("hat:56854717", "quantity", -1)

199

In [201]:
r.hget("hat:56854717", "quantity")

b'199'

In [202]:
r.hincrby("hat:56854717", "npurchased", 1)

1


Note: HINCRBY still operates on a hash value that is a string, but it tries to interpret the string as a base-10 64-bit signed integer to execute the operation.

This also applies to the integer increment and decrement commands for other data structures, such as INCR and INCRBY: you’ll get an error if the stored string can’t be represented as an integer. The floating-point variants (INCRBYFLOAT, HINCRBYFLOAT, and ZINCRBY) work with floating-point numbers instead.
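Because hash fields and values come back from redis-py as byte strings (unless the client is created with `decode_responses=True`), application code usually converts them before doing arithmetic. A minimal sketch, assuming the hat fields used in this notebook; `decode_hat` is a hypothetical helper, not a redis-py function.

```python
from typing import Dict, Union


def decode_hat(raw: Dict[bytes, bytes]) -> Dict[str, Union[str, int, float]]:
    """Convert the raw reply of HGETALL into native Python types."""
    hat: Dict[str, Union[str, int, float]] = {}
    for field, value in raw.items():
        name = field.decode()
        text = value.decode()
        if name in ("quantity", "npurchased"):
            hat[name] = int(text)    # counters are whole numbers
        elif name == "price":
            hat[name] = float(text)  # prices carry decimals
        else:
            hat[name] = text         # color, style, ...
    return hat


# Sample reply in the shape that r.hgetall("hat:56854717") returns
raw = {b"color": b"green", b"price": b"99.99", b"style": b"baseball",
       b"quantity": b"199", b"npurchased": b"1"}
print(decode_hat(raw))
```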


In [203]:
for key in r.keys("hat*"):
    print(r.hgetall(key))

{b'color': b'maroon', b'price': b'59.99', b'style': b'hipster', b'quantity': b'500', b'npurchased': b'0'}
{b'color': b'green', b'price': b'99.99', b'style': b'baseball', b'quantity': b'199', b'npurchased': b'1'}
{b'color': b'black', b'price': b'49.99', b'style': b'fitted', b'quantity': b'1000', b'npurchased': b'0'}


#### Insert one more hat (a new hash)

In [150]:
r.hmset(f"hat:{random.getrandbits(32)}",{
        "color": "black",
        "price": 60.99,
        "style": "fedora",
        "quantity": 50,
        "npurchased": 0,
    })

True

#### Check that the new hat was added

In [151]:
for key in r.keys("hat*"):
    print(r.hgetall(key))

{b'color': b'black', b'price': b'60.99', b'style': b'fedora', b'quantity': b'50', b'npurchased': b'0'}
{b'color': b'maroon', b'price': b'59.99', b'style': b'hipster', b'quantity': b'500', b'npurchased': b'0'}
{b'color': b'green', b'price': b'99.99', b'style': b'baseball', b'quantity': b'199', b'npurchased': b'1'}
{b'color': b'black', b'price': b'49.99', b'style': b'fitted', b'quantity': b'1000', b'npurchased': b'0'}


In [152]:
r.hget("hat:56854717","price")

b'99.99'

###  Count Of Hats 

In [204]:
len(r.keys("hat*"))

4

###  Filtering >> Keep only hats with prices above 60

In [154]:
for key in r.keys("hat*"):
    # Read the field by name instead of relying on the field order of HVALS
    price = float(r.hget(key, "price"))
    if price > 60:
        print(r.hgetall(key))

{b'color': b'black', b'price': b'60.99', b'style': b'fedora', b'quantity': b'50', b'npurchased': b'0'}
{b'color': b'green', b'price': b'99.99', b'style': b'baseball', b'quantity': b'199', b'npurchased': b'1'}


#### Transactions and Keeping Atomicity in Redis

Redis allows the execution of a group of commands in a single step, with two important guarantees:
* All the commands in a transaction are serialized and executed sequentially. It can never happen that a request issued by another client is served in the middle of the execution of a Redis transaction. This guarantees that the commands are executed as a single isolated operation.

* Either all of the commands or none are processed, so a Redis transaction is also atomic.

In redis-py, Pipeline is a transactional pipeline class by default. This means that, even though the class is actually named for something else (pipelining), it can be used to create a transaction block also.

In Redis, a transaction starts with <b>MULTI</b> and ends with <b>EXEC</b>:
* Everything in between is executed as one all-or-nothing buffered sequence of commands.
* Methods that you call on pipe effectively buffer all of the commands into one, and then send them to the server in a single request.
    
 

In [206]:
import logging


class OutOfStockError(Exception):
    """Raised when someone tries to buy a hat that is sold out."""


def buyitem(r: redis.Redis, itemid: str) -> None:
    with r.pipeline() as pipe:
        error_count = 0
        while True:
            try:
                # Get available inventory, watching for changes
                # related to this itemid before the transaction
                pipe.watch(itemid)

                nleft: bytes = r.hget(itemid, "quantity")
                # Compare as an integer; raw bytes would compare lexicographically
                if nleft is not None and int(nleft) > 0:
                    pipe.multi()
                    pipe.hincrby(itemid, "quantity", -1)
                    pipe.hincrby(itemid, "npurchased", 1)
                    pipe.execute()
                    break
                else:
                    # Stop watching the itemid and raise to break out
                    pipe.unwatch()
                    raise OutOfStockError(f"Sorry, {itemid} is out of stock!")

            except redis.WatchError:
                # Log total num. of errors by this user to buy this item,
                # then try the same process again of WATCH/HGET/MULTI/EXEC
                error_count += 1
                logging.warning("WatchError #%d: %s; retrying", error_count, itemid)

    return None

In [207]:
buyitem(r, "hat:56854717")
buyitem(r, "hat:56854717")
buyitem(r, "hat:56854717")

In [211]:
r.hmget("hat:56854717", "quantity", "npurchased")

[b'196', b'4']

In [210]:
# Buy remaining 196 hats for item 56854717 and deplete stock to 0
for _ in range(196):
    buyitem(r, "hat:56854717")

r.hmget("hat:56854717", "quantity", "npurchased")

[b'0', b'200']

Now, when some poor user is late to the game, they should be met with an <b>"OutOfStockError"</b> that tells our application to render an error message page on the frontend.

In [212]:
buyitem(r, "hat:56854717")

OutOfStockError: Sorry, hat:56854717 is out of stock!

### Delete elements (Delete hats with 'black' color)

In [155]:
for key in r.keys("hat*"):
    # Read the field by name; the order of HVALS is not something to rely on
    if r.hget(key, "color") == b'black':
        # Deleting the key removes the whole hash in one call
        r.delete(key)

In [156]:
for key in r.keys("hat*"):
    print(r.hgetall(key))

{b'color': b'maroon', b'price': b'59.99', b'style': b'hipster', b'quantity': b'500', b'npurchased': b'0'}
{b'color': b'green', b'price': b'99.99', b'style': b'baseball', b'quantity': b'199', b'npurchased': b'1'}


### Delete elements (Delete hats with quantity less than 500)

In [158]:
for key in r.keys("hat*"):
    qty = int(r.hget(key, "quantity"))
    if qty < 500:
        r.delete(key)

In [161]:
for key in r.keys("hat*"):
    print(r.hgetall(key))

{b'color': b'maroon', b'price': b'59.99', b'style': b'hipster', b'quantity': b'500', b'npurchased': b'0'}


In [163]:
updatedValue = {
    "color": "brown",
    "price": 75.90,
    "style": "hipster",
    "quantity": 500,
    "npurchased": 0,
    "updated": 1,
}
r.hmset('hat:1236154736', updatedValue)

True

In [164]:
for key in r.keys("hat*"):
    print(r.hgetall(key))

{b'color': b'brown', b'price': b'75.9', b'style': b'hipster', b'quantity': b'500', b'npurchased': b'0', b'updated': b'1'}


In [271]:
r.keys()

[b'test-result']

#### PUB/SUB in REDIS

- <b>"Pub/Sub"</b>, short for Publish-Subscribe, is a messaging pattern with three main components: a **sender** (publisher), a **receiver** (subscriber), and a **broker**. 
- The broker handles the communication: publishers send messages to it, and it delivers them to the subscribers of the matching channels.


#### Example of Consumer
- The following consumer subscribes to two kinds of channels:
    - the channel 'TartuUniv',
    - and any channel that matches the pattern 'DE_Students:*'.
- If a publisher publishes something to one of these channels, the message is delivered to this consumer.
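To see which published channels would reach this consumer, the matching rule can be imitated in plain Python. This is only an approximation: it uses `fnmatchcase` from the standard library to stand in for Redis's glob-style patterns, which is adequate for a simple trailing `*` but is not Redis's own matcher. The channel names are the ones used in the listener code, and `is_delivered` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Subscriptions of the consumer: one exact channel and one pattern
EXACT_CHANNELS = {"TartuUniv"}
PATTERNS = ["DE_Students:*"]


def is_delivered(channel: str) -> bool:
    """Return True if a message published on `channel` would reach the consumer."""
    if channel in EXACT_CHANNELS:
        return True
    return any(fnmatchcase(channel, pattern) for pattern in PATTERNS)


for channel in ["TartuUniv", "DE_Students:1415", "DE_Students:jane", "Tallin"]:
    print(channel, "->", is_delivered(channel))
```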

In [282]:
# inspired by: https://gist.github.com/jobliz/2596594

import redis
import threading

class Listener(threading.Thread):
    def __init__(self, r):
        super().__init__()
        self.redis = r
        self.pubsub = self.redis.pubsub()

        # One exact channel plus every channel matching the pattern
        self.pubsub.subscribe(['TartuUniv'])
        self.pubsub.psubscribe(['DE_Students:*'])

    def work(self, item):
        print(item['channel'], ":", item['data'])

    def run(self):
        for item in self.pubsub.listen():
            # Payloads arrive as bytes, so compare against b"KILL"
            if item['data'] == b"KILL":
                self.pubsub.unsubscribe()
                self.pubsub.punsubscribe()
                print("Listener unsubscribed and finished")
                break
            else:
                self.work(item)

if __name__ == '__main__':
    r = redis.Redis('127.0.0.1')
    client = Listener(r)
    client.start()

b'TartuUniv' : 1
b'DE_Students:*' : 2
b'DE_Students:1415' : b'John Doe'
b'DE_Students:jane' : b'Jane Kidman'
Listener unsubscribed and finished

#### Start Publishing Some data/messages to different Channels
- In a new command window, run the following commands:
    - (`redis-cli` opens a REPL; all subsequent commands should show something in the first terminal)
```
redis-cli 
> publish TartuUniv "new_course_created DataEngineering" 
> publish TartuUniv "new_job_posted Senior_Researcher"
> publish DE_Students:1415 "John Doe"
> publish DE_Students:jane "Jane Kidman"
> publish Tallin "nobody listens"
> publish TartuUniv KILL # terminate (listener.py)
```


##### Write down what you notice:

In [278]:
# Your answer here!

##### Another Example of Redis Pub/Sub

Let me explain it with an example. Let’s assume Joe is the owner of a music shop where he sells music of different genres. Alice is a musician who publishes/sells her music at Joe’s shop, and Bob is Joe’s customer who buys music from Joe. Joe keeps a list of his customers and their interests, hence he knows Bob likes classical music. Whenever Alice publishes a classical music album at Joe’s shop, Joe delivers it to Bob. The interesting part is that Bob doesn’t have to know who created the music, and Alice doesn’t have to know who listens to it. In Pub/Sub terms, Alice is the publisher, Bob is the subscriber, and Joe is the broker.

In [None]:
import redis

channel = 'classic'

client = redis.Redis(host='127.0.0.1', port=6379)

p = client.pubsub()
p.subscribe(channel)

while True:
    # Wait up to one second for a message instead of spinning in a busy loop,
    # and skip the subscribe confirmation message
    message = p.get_message(ignore_subscribe_messages=True, timeout=1.0)
    if message:
        # You can do any kind of computation on the incoming messages.
        # Here we just split the message into a song and a singer.
        text = message['data'].decode('utf-8')
        song, singer = text.split(':', 1)
        print("SONG: ", song)
        print("SINGER: ", singer)
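The subscriber above only prints something once a message arrives on the `classic` channel in the `song:singer` format it expects. A minimal sketch of the publishing side follows; `publish_song` is a hypothetical helper, the song and singer values are made up, and `client` is assumed to be a `redis.Redis` connection.

```python
def publish_song(client, song: str, singer: str, channel: str = "classic") -> int:
    """Publish 'song:singer' on the channel.

    Returns whatever client.publish returns, which for redis-py is the
    number of subscribers that received the message.
    """
    message = f"{song}:{singer}"
    return client.publish(channel, message)


# Usage, from a separate process while the subscriber loop is running:
#   import redis
#   client = redis.Redis(host='127.0.0.1', port=6379)
#   publish_song(client, "Moonlight Sonata", "Alice")
```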

### Exercise:
#### Create a simple Redis DB out of this relational model

- <b>Note:</b> 
- Redis was not designed for storing relational data and posing structured queries, but for fast data access. 
- We try it here only to stay consistent with our running example of how to represent the relational model in different NoSQL backends.


<b>Hints</b>:
* It’s quite straightforward to map your tables to Redis data structures.
* Hash, Sorted Set, and Set are the most useful data structures in this effort.
* Relationships are typically represented by sets.
* A set can be used to represent a one-way relationship, so you need one set per object to represent a many-to-many relationship.
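The hints above can be sketched without a server by computing which sets a many-to-many relationship needs. The key naming scheme (`movie:<id>:actors`, `person:<id>:movies`, `person:<id>:movie:<id>:roles`) is one possible convention and `relationship_sets` is a hypothetical helper; each resulting entry could then be written to Redis with SADD.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def relationship_sets(roles: List[Tuple[int, int, str]]) -> Dict[str, Set[str]]:
    """Map each set key to its members for (person_id, movie_id, role) rows."""
    sets: Dict[str, Set[str]] = defaultdict(set)
    for person_id, movie_id, role in roles:
        # One set per movie (its actors) and one per person (their movies)
        sets[f"movie:{movie_id}:actors"].add(f"person:{person_id}")
        sets[f"person:{person_id}:movies"].add(f"movie:{movie_id}")
        # The attribute of the relationship itself: the role(s) played
        sets[f"person:{person_id}:movie:{movie_id}:roles"].add(role)
    return dict(sets)


roles = [(1, 1, "Bud Fox"), (2, 1, "Gordon Gekko"), (2, 2, "President Andrew Shepherd")]
for key, members in sorted(relationship_sets(roles).items()):
    print(key, sorted(members))
```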

#### The DataBase Model: 


This is a toy DB about movies and the actors who played roles in them. The DB consists of:

- A "Person" table that has a unique id and a name field.

- A "Movie" table that has a unique id, a title, the country where it was made, and the year when it was released.

- There is an (m-n) or "many-to-many" relationship between these two tables (i.e., an actor can act in many movies, and a movie includes many actors).
- Therefore, we use the "Roles" table, from which we can deduce which person has acted in which movie and what role(s) they played.

<img src="RDBSchema.png" alt="3" border="0">

* Notice that we will switch to database 1 (<b>db1</b>).

    - By default there are <b>16</b> databases (indexed from 0 to 15), and you can navigate between them using the <b>SELECT</b> command.
    - The number of databases can be changed in the <b>redis config</b> file with the databases setting.

    - By default, a client connects to <b>database 0</b>. 
    - To select a specific one, use <b>"redis-cli -n 2"</b>, or use <b>"SELECT 2"</b> if you are already in the command line with one DB and want to switch to DB 2. 

### CREATE && INSERT in REDIS

#### 1. Creating hashes for the movies

In [98]:
redis1= redis.Redis(host="127.0.0.1",port=6379, db=1)

In [99]:
# Function that stores multiple field-value pairs in the hash at key `mkey`
# (hmset is deprecated in recent redis-py releases in favor of hset(key, mapping=...))
def setHash(mkey, mval):
    redis1.hmset(mkey, mval)

##### Movies

In [79]:
# id,title,country,year
# 1,Wall Street,USA,1987
# 2,The American President,USA,1995
# 3,The Shawshank Redemption,USA,1994

moviesLst=[
    
    ( "movie:1",{ 'title': 'Wall Street' , 'country':'USA', 'year':'1987' } ),
    ( "movie:2",{ 'title': 'The American President' , 'country':'USA', 'year':'1995' } ),
    ( "movie:3",{ 'title': 'Shawshank Redemption' , 'country':'USA', 'year':'1994' } )
]

for (key,val) in moviesLst:
    setHash(key,val)

##### Persons

In [80]:
# id,name
# 1,Charlie Sheen
# 2,Michael Douglas
# 3,Martin Sheen
# 4,Morgan Freeman

personsLst=[
    
    ( "person:1",{ 'name': 'Charlie Sheen' } ),
    ( "person:2",{ 'name': 'Michael Douglas' } ),
    ( "person:3",{ 'name': 'Martin Sheen' } ),
    ( "person:4",{ 'name': 'Morgan Freeman' } )
]

for (key,val) in personsLst:
    setHash(key,val)

#### Maintaining the relationships in Redis

In [81]:
# Function that creates a set from the given values

def addSet(mkey, mvals):
    redis1.sadd(mkey, *mvals)

In [82]:
# Let's establish the many-to-many relationship 'roles'

# For each movie, we keep a set of references to its persons
movie_persons_list=[("movie:1:actors", ["person:1", "person:2", "person:3"] ),
                    ("movie:2:actors", ["person:2", "person:3" ]) ,
                    ("movie:3:actors", ["person:4" ])
                   ]
             
for (mkey,mvals) in movie_persons_list:
    addSet(mkey,mvals)
    
    
    
# For each person, we keep a set of references to their movies

person_movies_list = [("person:1:movies", ["movie:1"] ),
                      ("person:2:movies", ["movie:1", "movie:2"] ),
                      ("person:3:movies", ["movie:1", "movie:2"] ),
                      ("person:4:movies", ["movie:3"] ),
                     ]

for (mkey,mvals) in person_movies_list:
    addSet(mkey,mvals)


# For each (person, movie) pair, we keep a set of the roles played

person_movies_roles =[("person:1:movie:1:roles", ["Bud Fox"]),
                      ("person:2:movie:1:roles", ["Gordon Gekko"]),
                      ("person:2:movie:2:roles", ["President Andrew Shepherd"]),
                      ("person:3:movie:1:roles", ["Carl Fox"]),
                      ("person:3:movie:2:roles", ["A.J. MacInerney"]),
                      ("person:4:movie:3:roles", ["Ellis Boyd 'Red' Redding"])
]


for (mkey,mvals) in person_movies_roles:
    addSet(mkey,mvals)


### Querying our Data

#### <font color= 'red'>Important to remember:</font>
- The only way to query Redis is through the keys, so we mainly depend on the programming capabilities of Python and the data science capabilities of Pandas to do the data science part :)
- One of the cons of Redis is that you can't query on values!


#### Get Persons in your Redis DB

In [100]:
for person in redis1.keys("person:?"):
    print(redis1.hgetall(person))

{b'name': b'Martin Sheen'}
{b'name': b'Charlie Sheen'}
{b'name': b'Michael Douglas'}
{b'name': b'Morgan Freeman'}


#### We can use pandas for showing the results in a better way !

In [127]:
data = []
for key in redis1.keys("person:?"):
    # Read the field by name instead of relying on the order of HVALS
    name = redis1.hget(key, "name").decode()
    data.append({'name': name})

df = pd.DataFrame(data)

display(df)

|   | name |
|---|---|
| 0 | Martin Sheen |
| 1 | Charlie Sheen |
| 2 | Michael Douglas |
| 3 | Morgan Freeman |


#### Get persons whose names start with the letter 'C'

In [132]:
for key in redis1.keys("person:?"):
    name = redis1.hget(key, "name").decode()
    if name.startswith('C'):
        print(redis1.hgetall(key))

{b'name': b'Charlie Sheen'}


In [133]:
# Again, if you want to use Pandas to show it in a better way
data = []
for key in redis1.keys("person:?"):
    name = redis1.hget(key, "name").decode()
    if name.startswith('C'):
        data.append({'name': name})
df = pd.DataFrame(data)

display(df)

|   | name |
|---|---|
| 0 | Charlie Sheen |


### Sorting the Results
#### Get All Movies , sorted from recent to old

In [141]:
movieData = []
for key in redis1.keys("movie:?"):
    # Read the fields by name instead of relying on the order of HVALS
    title = redis1.hget(key, "title").decode()
    year = int(redis1.hget(key, "year"))
    movieData.append({'title': title, 'year': year})

# Use sort_values to sort the DataFrame from the most recent year to the oldest
df = pd.DataFrame(movieData).sort_values(by=['year'], ascending=False)

display(df)

|   | title | year |
|---|---|---|
| 2 | Wall Street | 2000 |
| 1 | The American President | 1995 |
| 0 | Shawshank Redemption | 1994 |


### Filtering the Results
#### Get All Movies released in the 90s (from 1990 up to, but not including, 2000)

In [85]:
for key in redis1.keys("movie:?"):
    year = int(redis1.hget(key, "year"))
    if 1990 <= year < 2000:
        print(redis1.hgetall(key))

{b'title': b'Shawshank Redemption', b'country': b'USA', b'year': b'1994'}
{b'title': b'The American President', b'country': b'USA', b'year': b'1995'}


In [144]:
#### using pandas DFs

movieData = []
for key in redis1.keys("movie:?"):
    title = redis1.hget(key, "title").decode()
    year = int(redis1.hget(key, "year"))
    movieData.append({'title': title, 'year': year})

df = pd.DataFrame(movieData)
filteredDF = df[(df.year >= 1990) & (df.year < 2000)]

display(filteredDF)

|   | title | year |
|---|---|---|
| 0 | Shawshank Redemption | 1994 |
| 1 | The American President | 1995 |


### Querying from multiple tables

#### Get Movies and Actors from the DB


In [145]:
for key in redis1.keys("person:?:movies"):
    # Strip the trailing ':movies' to get the person key ('person:1', 'person:2', ...)
    personKey = key.decode().rsplit(':', 1)[0]
    # Get the person from the hashes using the above key
    person = redis1.hgetall(personKey)
    # SMEMBERS returns the movie keys stored in the set at 'key'
    personMoviesKeys = redis1.smembers(key)
    for movKey in personMoviesKeys:
        print(person.get(b'name').decode(), ":   ", redis1.hget(movKey, 'title').decode())

Charlie Sheen :    Wall Street
Michael Douglas :    Wall Street
Michael Douglas :    The American President
Morgan Freeman :    Shawshank Redemption
Martin Sheen :    Wall Street
Martin Sheen :    The American President


In [153]:
# Using Pandas DataFrames
person_movies = []

for key in redis1.keys("person:?:movies"):
    # Strip the trailing ':movies' to get the person key
    personKey = key.decode().rsplit(':', 1)[0]

    person = redis1.hgetall(personKey)

    personMoviesKeys = redis1.smembers(key)
    for movKey in personMoviesKeys:
        person_movies.append({"Actor": person.get(b'name').decode(),
                              "title": redis1.hget(movKey, 'title').decode()})

df1 = pd.DataFrame(person_movies)
display(df1)


|   | Actor | title |
|---|---|---|
| 0 | Charlie Sheen | Wall Street |
| 1 | Michael Douglas | Wall Street |
| 2 | Michael Douglas | The American President |
| 3 | Morgan Freeman | Shawshank Redemption |
| 4 | Martin Sheen | Wall Street |
| 5 | Martin Sheen | The American President |


### Aggregations in Redis

#### Get count of "Movies" in your DB

- Note: The best way is to store the count as a separate key, and to update it whenever you add/remove a value from your set/hash/zset.
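The note above can be sketched as follows. This is an illustration under stated assumptions rather than the lab's own method: `add_movie`, `count_movies`, and the `movie:count` key are hypothetical names, `client` is assumed to be a `redis.Redis` connection, and `hset(..., mapping=...)` assumes redis-py 3.5 or newer.

```python
def add_movie(client, movie_key: str, fields: dict, counter_key: str = "movie:count") -> None:
    """Store a new movie hash and bump the running count together."""
    # Assumes movie_key does not exist yet; re-adding an existing key
    # would inflate the counter.
    with client.pipeline() as pipe:  # transactional by default in redis-py
        pipe.hset(movie_key, mapping=fields)
        pipe.incr(counter_key)
        pipe.execute()


def count_movies(client, counter_key: str = "movie:count") -> int:
    """Read the stored count; a missing counter means zero movies."""
    value = client.get(counter_key)
    return int(value) if value is not None else 0


# Usage, assuming the `redis1` connection from this notebook:
#   add_movie(redis1, "movie:4", {"title": "Example", "country": "USA", "year": "1999"})
#   count_movies(redis1)
```

Reading one counter key avoids scanning the whole keyspace with KEYS every time the count is needed, at the cost of keeping the counter in sync on every insert and delete.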

In [87]:
print("Count Of Movies: ",len(redis1.keys("movie:?")) )

Count Of Movies:  3


#### In this DB, for every "Actor" get the number of movies they played in

In [94]:
for key in redis1.keys("person:?:movies"):
    # Strip the trailing ':movies' to get the person key ('person:1', 'person:2', ...)
    personKey = key.decode().rsplit(':', 1)[0]
    # Get the person from the hashes using the above key
    person = redis1.hgetall(personKey)
    # Get the cardinality of the set of movies for each person
    personMoviesCount = redis1.scard(key)

    print(person.get(b'name').decode(), ":(", str(personMoviesCount) + ")")
    

Charlie Sheen :( 1)
Michael Douglas :( 2)
Morgan Freeman :( 1)
Martin Sheen :( 2)


In [155]:
## Using Pandas DataFrames
person_countmovies = []
for key in redis1.keys("person:?:movies"):
    # Strip the trailing ':movies' to get the person key ('person:1', 'person:2', ...)
    personKey = key.decode().rsplit(':', 1)[0]
    # Get the person from the hashes using the above key
    person = redis1.hgetall(personKey)
    # Get the cardinality of the set of movies for each person
    personMoviesCount = redis1.scard(key)

    person_countmovies.append({"Actor": person.get(b'name').decode(),
                               "Movies_Count": str(personMoviesCount)})

display(pd.DataFrame(person_countmovies))

|   | Actor | Movies_Count |
|---|---|---|
| 0 | Charlie Sheen | 1 |
| 1 | Michael Douglas | 2 |
| 2 | Morgan Freeman | 1 |
| 3 | Martin Sheen | 2 |


#### In this DB, list the movies that every actor played in

In [95]:
for key in redis1.keys("person:?:movies"):
    # Strip the trailing ':movies' to get the person key ('person:1', 'person:2', ...)
    personKey = key.decode().rsplit(':', 1)[0]
    # Get the person from the hashes using the above key
    person = redis1.hgetall(personKey)
    # Get the movie keys of each person from the set stored at 'key'
    personMoviesKeys = redis1.smembers(key)

    personMovieLst = []
    for movKey in personMoviesKeys:
        personMovieLst.append(redis1.hget(movKey, 'title').decode())

    print(person.get(b'name').decode(), ": ", personMovieLst)

Charlie Sheen :  ['Wall Street']
Michael Douglas :  ['Wall Street', 'The American President']
Morgan Freeman :  ['Shawshank Redemption']
Martin Sheen :  ['Wall Street', 'The American President']


In [270]:
##using Pandas, SQL
import pandasql as ps
import re 

person_movies_data=[]
persons_data=[]
movies_data=[]

for movieKey in redis1.keys("movie:?"):
    movies_data.append({"movie_key":movieKey.decode(), "title":redis1.hgetall(movieKey).get(b'title').decode()})

movies_DF= pd.DataFrame(movies_data)    

for key in redis1.keys("person:?:movies"):

    personKey=key.decode().split(':')[0]+ ":" +key.decode().split(':')[1]
    person=redis1.hgetall(personKey)
    persons_data.append({"personKEY":personKey, "name": person.get(b'name').decode() })

    person_movies_data.append({"personKEY": personKey, "movies":  str(redis1.smembers(key) )})
    
    
persons_DF=pd.DataFrame(persons_data, columns=['personKEY','name'])
person_movies_DF=pd.DataFrame(person_movies_data,columns=['personKEY','movies'])

#display(person_movies_DF)
#display(persons_DF)


query="SELECT name, movies FROM  persons_DF JOIN person_movies_DF ON  persons_DF.personKEY=person_movies_DF.personKEY"
#query="SELECT personKEY,movies FROM person_movies_DF"
result= ps.sqldf(query, locals())

movieLST=[]
for row in result.values:
    movies= row[1].split(',')
    intermediateLst=[]
    for mov in movies:
        # Pattern matching to get the keys out of the string, e.g. {b'movie:1'} -> 'movie:1'
        movK = re.search(r'movie:\d+', mov).group(0)
        intermediateLst.append(redis1.hgetall(movK).get(b'title').decode())
    movieLST.append(intermediateLst)

result.movies=(movieLST)
display(result)

|   | name | movies |
|---|---|---|
| 0 | Charlie Sheen | [Wall Street] |
| 1 | Michael Douglas | [Wall Street, The American President] |
| 2 | Morgan Freeman | [Shawshank Redemption] |
| 3 | Martin Sheen | [Wall Street, The American President] |


### Updating Redis Data
* If the field already exists in the hash, its value is overwritten; otherwise the field is created.


In [96]:

print("'Wall Street' movie  Before updating: ")
wallStreet=redis1.hgetall("movie:1")
pprint(wallStreet)

# Update the release year of 'Wall Street' to 2000, which is not true BTW :)
updatedWallStreet = redis1.hset("movie:1", "year", "2000")

print("\n 'Wall Street' movie  After updating the document: ")
wallStreet=redis1.hgetall("movie:1")
pprint(wallStreet)


'Wall Street' movie  Before updating: 
{b'country': b'USA', b'title': b'Wall Street', b'year': b'1987'}

 'Wall Street' movie  After updating the document: 
{b'country': b'USA', b'title': b'Wall Street', b'year': b'2000'}


### Deletions in Redis

####  Delete all the persons whose names start with the letter 'M'.

In [74]:
for personKey in redis1.keys("person:?"):
    name = redis1.hget(personKey, "name").decode()
    if name.startswith('M'):
        redis1.delete(personKey)