# Week 3: Key-Value Stores (Redis)
### Student ID: B96323
### Subtasks Done: [1,2,3,4]

# Introduction:

### Redis:
   <a href="https://redis.io/">Redis</a> is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes with radius queries and streams.<br/>
   
<img src="https://www.zend.com/sites/zend/files/image/2019-09/logo-redis.jpg" width ="250" >


#### <a href='https://redislabs.com/redis-enterprise/data-structures/'>Redis Data Structures</a>
* Redis is not a plain key-value store; it is a data structures server that supports different kinds of values.
* An introduction to Redis data types and abstractions https://redis.io/topics/data-types-intro
* Redis keys are always strings.


<img src='https://redislabs.com/wp-content/uploads/2020/06/key-value-data-stores-2-v2-920x612.png' width='500' >

### How To Query Redis!

- Commands for each data type for common access patterns, with bulk operations, and partial transaction support.

### PreLab

#### 1. Install Redis on Windows
- Redis is a cross-platform DB; we can install it on Linux, Windows, etc.
- There are two ways to install Redis under Windows
    - Download the latest Redis .msi file from https://github.com/MSOpenTech/redis/r... and install it. 
    
    - You can choose either from these sources
        - https://github.com/microsoftarchive/redis/releases or
        - https://github.com/rgl/redis/downloads

- Personally, I preferred the first option
- Download Redis-x64-2.8.2104.zip
- Extract the zip to the prepared directory
- Run redis-server.exe
- Run redis-cli.exe
- For more info follow this setup-video tutorial (https://www.youtube.com/watch?v=188Fy-oCw4w)


#### Linux and Debian 

- Even quicker and dirtier instructions for Debian-based Linux distributions are as follows:
    - download Redis from http://redis.io/download 
    - extract, run make && sudo make install
    - Then run sudo python -m easy_install redis hiredis (hiredis is an optional performance-improving C library).

#### 2. Install the Python Package ("<a href='https://pypi.org/project/redis/'>redis</a>") to connect to Redis 
- Use the command ```pip install redis``` in your command line.


#### (more) Accessing Redis from Command Line:
- Add the Redis installation "/home" and "/bin" directories to the environment variables.
- Start the Redis server in one command window (CMD, PowerShell, etc.) using the command ```redis-server```.
- In another command window, start your Redis Client using the command ```redis-cli```
- Now you have the Redis Client Shell connected to the default <b>db0</b> DB. 

In [None]:
! pip install redis

In [1]:
import redis
from pprint import pprint
import pandas as pd
from time import sleep

import warnings
warnings.filterwarnings('ignore')

##### Get a client connection to the Redis server, using the host, the port, and the DB index

In [2]:
r = redis.Redis(host='localhost', port=6379, db=0)

## Task 0: First Steps in Redis

### Demo the string keys

In [20]:
r.set('language','Python')

True

In [21]:
r.get('language')

b'Python'

##### Key expiration (e.g., think about session management)


- By default, keys are retained, but we can make our keys (data) vanish after a specified time.
- This can be set while creating the key, or for already existing keys.

In [4]:
## use expire(key,time in_secs), after this time key will vanish
print(r.expire('language', 6))
#ttl(key) time_to_live, checks remaining time to live! 
print(r.ttl('language'))
sleep(3)
print(r.ttl('language'))

True
6
3


#### Check if the key already expired!
- Use exists(key) function

In [8]:
###YOUR CODE HERE
# exists() takes the key name ('language'), not its value ('Python')
print(r.exists('language'))

0


#### Setting multiple String Keys, values 

In [10]:
r.mset({"Croatia": "Zagreb", "Bahamas": "Nassau"})

True

#### Get the value of the key 'Croatia'

In [17]:
###YOUR CODE HERE
r.get('Croatia')

b'Zagreb'

#### Set String as JSON value 

In [18]:
r.set('myJsonData' , '{"name": "Ragab", "age": 40}')

True

#### Get the previous JSON value

In [22]:
###YOUR CODE HERE
r.get('myJsonData')

b'{"name": "Ragab", "age": 40}'
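The value comes back as raw bytes, so it still has to be parsed before the fields can be used. A minimal sketch of that step in plain Python (no Redis connection needed; the `raw` literal is simply the bytes value shown in the output above):

```python
import json

# The bytes value as returned by r.get('myJsonData') above
raw = b'{"name": "Ragab", "age": 40}'

# json.loads accepts bytes directly and returns a regular dict
person = json.loads(raw)
print(person["name"], person["age"])

# Going the other way: serialize a dict to a string before calling r.set(...)
payload = json.dumps({"name": "Ragab", "age": 40})
print(payload)
```

Storing JSON as a string keeps the whole object under one key, but Redis cannot update a single field inside it; a Hash is the usual choice when fields change independently.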

#### Rename the key 'myJsonData' to 'myJsonInfo'

In [23]:
r.rename('myJsonData', 'myJsonInfo')

True

#### Delete 'myJsonInfo' key-value pair.

In [24]:
###YOUR CODE HERE
r.delete('myJsonInfo')

1

#### Check if it's deleted already !

In [25]:
# we try to get the value of that key!
print(r.get('myJsonInfo'))

#or we can check if it's not existing any more !
print(r.exists ('myJsonInfo'))

None
0


### Demo The Lists

- Think of Lists as an ordered sequence of strings, like a Java ArrayList, a JavaScript array, or a Python list.
- We can use lists to implement stacks and queues.
    - If you need a **Queue**, just use **RPUSH** and **LPOP**.
    - If you need a **Stack**, just use **RPUSH** and **RPOP**.
- Lists can **accept duplicates**.
- A single Redis List can hold over **4B** entries!!
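The queue and stack recipes can be pictured with a local analogy. The sketch below uses Python's `collections.deque` as a stand-in for a Redis List (it is only an illustration of the push/pop order, not Redis itself): appending on the right plays the role of RPUSH, `popleft()` the role of LPOP, and `pop()` the role of RPOP.

```python
from collections import deque

items = deque()

# RPUSH analogue: add elements on the right
for name in ("a", "b", "c"):
    items.append(name)

# Queue (FIFO): RPUSH + LPOP, so the oldest element leaves first
first_out = items.popleft()

# Stack (LIFO): RPUSH + RPOP, so the newest element leaves first
last_out = items.pop()

print(first_out, last_out, list(items))
```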


#### Create a List of customers, and add elements to it!

In [26]:
r.lpush('customers','Ragab')

1

This will return "1" meaning the list now contains only one element

In [27]:
r.lpush('customers','Riccardo')

2

In [28]:
r.lpush('customers','Riccardo')

3

#### Get current members of the list
- Hint: use <code>lrange</code> function

In [30]:
###YOUR CODE HERE
r.lrange('customers', 0, -1)

[b'Riccardo', b'Riccardo', b'Ragab']

* We can clearly notice that **LPUSH** function/command adds elements to the left of the list.
* and that Lists also **accept duplicate** items.

#### Adding element to the Right of the customers list
- Add customer 'Kim' to the right of the list
- Use <code>rpush</code> function

In [31]:
###YOUR CODE HERE
r.rpush('customers','Kim')

4

In [32]:
#check the list elements again
r.lrange('customers',0,-1)

[b'Riccardo', b'Riccardo', b'Ragab', b'Kim']

#### Insert 'Jan' between Riccardo and Ragab

In [33]:
r.linsert('customers','BEFORE','Ragab','Jan')

5

In [34]:
r.lrange('customers',0,-1)

[b'Riccardo', b'Riccardo', b'Jan', b'Ragab', b'Kim']

#### Get only the first element 

In [35]:
###YOUR CODE HERE
r.lrange('customers',0,0)

[b'Riccardo']

#### Get only the first 3 elements  

In [36]:
r.lrange('customers',0,2)

[b'Riccardo', b'Riccardo', b'Jan']

#### Get the Length of the List
- Use <code>llen</code> function

In [37]:
###YOUR CODE HERE
r.llen('customers')

5

#### Delete the first element on the left

In [38]:
r.lpop('customers')

b'Riccardo'

In [39]:
#check the list elements again
r.lrange('customers',0,-1)

[b'Riccardo', b'Jan', b'Ragab', b'Kim']

#### Delete the first element on the right

In [40]:
###YOUR CODE HERE
r.rpop('customers')

b'Kim'

In [41]:
#check the list elements again
r.lrange('customers',0,-1)

[b'Riccardo', b'Jan', b'Ragab']

#### Notes on the performance of Lists:
- **LPOP**, **LPUSH**, and **LLEN** commands are all **O(1)** constant time operations. 
    - Their performance is independent of the length of the list.
- **LRANGE** is **O(s+n)**, such that **s** is the distance of the start offset from the head, and **n** is the number of elements in the specified range.
    - Thus, we need to be careful with LRANGE, especially with extra long lists, or when we retrieve thousands or more elements!
<img src='ListsPerformance.JPG' width= '200'>

### Demo The Sets

- **Unordered** collection of strings.
- Contains **no duplicates**.
    - This makes Sets a natural option for **de-duplication** applications.
- Questions that we can Answer using Sets:
    - **Did I see this IP address in the last hour?**
    - **Is this user online?**
    - **Has this URL been blacklisted?**
- All of these questions can be answered in **O(1)** time.
- Sets support standard operations:
    - **Intersection** 
    - **Difference**
    - **Union**

#### Create a Set of online players with the key "players:online" ["Riccardo", and "Ragab"]

In [73]:
r.sadd('players:online',"Riccardo","Ragab")

0

#### Try to add another Online-player "Ragab" to the Set
- Write down what you noticed.

*The Set does not accept the duplicate value "Ragab": `sadd` returns 0 and the Set stays unchanged.*

In [74]:
###YOUR CODE HERE
r.sadd('players:online',"Ragab")

0

#### Check if the Set of Online Players contain the player "Riccardo"

In [75]:
r.sismember('players:online',"Riccardo")

True

#### Check if the Set of Online Players contain the player "Fabiano"

In [76]:
r.sismember('players:online',"Fabiano")

False

#### Create Another Set with the key-name ("friends") that has ["Riccardo", "Fabiano", "Hassan"]

In [77]:
###YOUR CODE HERE
r.sadd('friends',"Riccardo","Fabiano","Hassan")

1

#### Get the members of the two Sets

In [78]:
print ("FirstSet:" ,r.smembers('players:online'))
# Key names are case-sensitive: the Set was created as 'friends'
print ("SecondSet:",r.smembers('friends'))

FirstSet: {b'Ragab', b'Riccardo'}
SecondSet: {b'Hassan', b'Fabiano', b'Riccardo'}


#### Get the intersection of these two sets 

In [79]:
#Intersction
print(r.sinter('players:online','friends'))

{b'Riccardo'}


#### Get the Union of these two sets 

In [80]:
###YOUR CODE HERE
print(r.sunion('players:online','friends'))

{b'Hassan', b'Ragab', b'Riccardo', b'Fabiano'}


#### Get the Length of the two Sets 

In [81]:
#Length
print(r.scard('friends'))
print(r.scard('players:online'))

3
2


#### Move "Fabiano" to the Online Players Set

In [82]:
# The destination must be the existing key 'players:online' (not 'players')
r.smove('friends', 'players:online','Fabiano')

True

#### Get the Length of the two Sets After this move!

In [83]:
###YOUR CODE HERE
#Length
print(r.scard('friends'))
print(r.scard('players:online'))

2
3


#### Remove "Ragab" from the players Set, and show the players Set after this removal

In [84]:
r.srem('players:online',"Ragab")
r.smembers('players:online')

{b'Fabiano', b'Riccardo'}

### Demo The SORTED SETS
- Redis sorted sets are **ordered** collections of unique members.
- These members are ordered according to their **associated score**.
- Whenever you add to the sorted set, you specify a **member** and a **score**.
- Sorted Sets keep everything sorted from the beginning.
- Sorted sets are a good choice for:
    - **priority queues**
    - **Low-latency leaderboards**
    - **Secondary indexing**
- Questions that we can Answer using Sorted Sets (e.g., in an online game):
    - **Who are the top 10 players?**
    - **What is the rank of a specific player?**
    - **What is the current score of the player?**

#### Let's Create our Leaderboard
- In the scenario of online game, each player will have a score of '**experience**' achieving some tasks/goals,..etc.

#### Initially, Let's give a score of 0 experience to all of our players:
- We have three Players ("Ragab", "Fabiano", and "Riccardo")

In [85]:
r.zadd('players:exp',{'Ragab':0})
r.zadd('players:exp',{'Riccardo':0})
r.zadd('players:exp',{'Fabiano':0})

1

#### Increment the experience score of our players
Let's pretend that our players have completed some missions and earned 40, 60, and 80 experience points for "Ragab", "Riccardo", and "Fabiano" respectively

In [86]:
print(r.zincrby('players:exp',40,'Ragab'))
print(r.zincrby('players:exp',60,'Riccardo'))
print(r.zincrby('players:exp',80,'Fabiano'))

40.0
60.0
80.0


#### Let's Punish one of the players penalizing him with 5 points of experience

In [87]:
###YOUR CODE HERE
print(r.zincrby('players:exp', -5, 'Ragab'))

35.0


#### Get the Top 3 players in our game

In [88]:
r.zrevrange('players:exp',0,2)

[b'Fabiano', b'Riccardo', b'Ragab']

#### GET the Top 3 players in our game showing their scores

In [89]:
###YOUR CODE HERE
r.zrevrange('players:exp',0,2,withscores=True)

[(b'Fabiano', 80.0), (b'Riccardo', 60.0), (b'Ragab', 35.0)]

#### Get the Rank of the players "Ragab" and "Fabiano"
- Look at the difference between **zrank** and **zrevrank**

In [90]:
print(r.zrevrank('players:exp', 'Ragab'))
print(r.zrevrank('players:exp', 'Fabiano'))

2
0
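To make the difference between the two rank commands concrete, here is a small plain-Python model (not Redis itself; `zrank_model` and `zrevrank_model` are hypothetical helper names). It assumes the scores shown above and orders members by score, breaking ties by member name: `zrank` counts from the lowest score, `zrevrank` from the highest, and both are 0-based.

```python
# Scores as they stand after the increments and the penalty above
scores = {"Ragab": 35.0, "Riccardo": 60.0, "Fabiano": 80.0}

def zrank_model(scores, member):
    """0-based position when members are sorted by ascending score."""
    ordered = sorted(scores, key=lambda m: (scores[m], m))
    return ordered.index(member)

def zrevrank_model(scores, member):
    """0-based position when members are sorted by descending score."""
    ordered = sorted(scores, key=lambda m: (scores[m], m), reverse=True)
    return ordered.index(member)

for player in ("Ragab", "Fabiano"):
    print(player, zrank_model(scores, player), zrevrank_model(scores, player))
```

For a leaderboard, the reversed rank is the one that matters: the top player has `zrevrank` 0.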


#### Get the score of the player "Riccardo"

In [None]:
print (r.zscore('players:exp', 'Riccardo'))

### Demo the HASHES 
- Hashes are one of the most useful Redis data structures.
- Hashes are collections of field-value pairs.

#### Let's Create a Hash of Players in an online game

- Each player has the following fields:
    - NAME
    - RACE
    - LEVEL
    - HEALTH
    - GOLD

#### Let's Create our first player ('player:101')

In [91]:
r.hset('player:101','name','Cyclops')
r.hset('player:101','race','Elf')
r.hset('player:101','level',4)
r.hset('player:101','health',20)
r.hset('player:101','gold',500)

1

#### Adding another player (player:102)
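- We can use <code>HMSET</code> to add multiple field-value pairs to the Hash in one call. (In redis-py 3.5 and later, <code>hmset()</code> is deprecated in favor of <code>hset(name, mapping=...)</code>.)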

In [92]:
player2 = {"name":"Wolverine", 
 "race":"Elf", 
 "level":6, 
 "health":200, 
 "gold":4000}
#We use HMSET for adding multiple field-value pairs to the Hash
r.hmset('player:102', player2)

True

#### Get the information of the player Hashes ('player:101', 'player:102' )

In [93]:
pprint(r.hgetall('player:101'))
print("\n")
pprint(r.hgetall('player:102'))

{b'gold': b'500',
 b'health': b'20',
 b'level': b'4',
 b'name': b'Cyclops',
 b'race': b'Elf'}


{b'gold': b'4000',
 b'health': b'200',
 b'level': b'6',
 b'name': b'Wolverine',
 b'race': b'Elf'}
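Both field names and values come back as bytes, including the numeric ones. A minimal sketch of turning such a reply into a regular Python dict (plain Python; `decode_hash` and its `int_fields` parameter are hypothetical names, and the sample dict is the `player:102` output shown above):

```python
def decode_hash(raw, int_fields=()):
    """Decode a bytes-to-bytes mapping, casting the named fields to int."""
    decoded = {}
    for field, value in raw.items():
        name = field.decode("utf-8")
        text = value.decode("utf-8")
        decoded[name] = int(text) if name in int_fields else text
    return decoded

# Same shape as the reply of r.hgetall('player:102')
raw_player = {
    b"gold": b"4000",
    b"health": b"200",
    b"level": b"6",
    b"name": b"Wolverine",
    b"race": b"Elf",
}

player = decode_hash(raw_player, int_fields=("gold", "health", "level"))
print(player)
```

Alternatively, creating the client with `decode_responses=True` makes redis-py return `str` instead of `bytes`; the numeric casts are still up to the caller.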


#### Get the **name** of the second player
- Use <code>hget(Hash_key,field)</code>

In [95]:
###YOUR CODE HERE
r.hget('player:102','name')

b'Wolverine'

#### Get the name, level and the race of the second player
- Use <code>hmget(Hash_key,field1,field2,..)</code>

In [96]:
###YOUR CODE HERE
r.hmget('player:102','name','level','race')

[b'Wolverine', b'6', b'Elf']

#### updating the Hash with adding a new field
- For player ('player:101'), add the **status** as '**Killed**' 

In [97]:
###YOUR CODE HERE
r.hset('player:101','status','Killed')

1

#### Check if added field 'status' to the first player ('player:101')

In [98]:
pprint(r.hgetall('player:101'))

{b'gold': b'500',
 b'health': b'20',
 b'level': b'4',
 b'name': b'Cyclops',
 b'race': b'Elf',
 b'status': b'Killed'}


#### updating the Hash with deleting the  'status' field for 'player:101'

In [99]:
r.hdel('player:101','status')

1

#### Check if the  field 'status' is deleted from the first player ('player:101')

In [100]:
pprint(r.hgetall('player:101'))

{b'gold': b'500',
 b'health': b'20',
 b'level': b'4',
 b'name': b'Cyclops',
 b'race': b'Elf'}


#### In such games, players receive gold points after completing objectives or defeating enemies.
- Let's add some gold points to player:102 "Wolverine": increase his gold by 25 points. 

In [101]:
print("Gold Before: ")
pprint(r.hget('player:102', 'gold'))

r.hincrby('player:102', 'gold', 25)

print("\nAfter: ")
pprint(r.hget('player:102', 'gold'))

Gold Before: 
b'4000'

After: 
b'4025'


**Notes on incrementing & decrementing Hash values:** 
- HINCRBY still operates on a hash value that is a string, but it tries to interpret the string as a **base-10 64-bit signed integer** to execute the operation.

- The same parse-then-update behavior applies to the other incrementing and decrementing commands: **INCR** and **INCRBY** also parse the value as an integer, while **INCRBYFLOAT** and **HINCRBYFLOAT** parse it as a floating-point number, and **ZINCRBY** adds to a floating-point score.

- You’ll get an error if the stored string can’t be parsed as the numeric type the command expects.


#### Notes on the performance of Redis Hashes:
- HGET, HSET, HINCRBY, HDEL are O(1) constant time operations regardless of the size of the Hash.
- Whereas HGETALL is O(n), with n being the number of fields in the Hash.
    - In big Hashes with thousands of fields, it's usually more efficient to specify the fields you want, rather than retrieving all of the fields.
    
<img src='HashPerformance.JPG' width= '200'>

## Task 1: PUB/SUB in REDIS

- <b>"Pub/Sub"</b> aka Publish-Subscribe pattern is a pattern in which there are three main components, **sender**, **receiver** & **broker**.
- It is mainly characterized by **listeners subscribing to channels**, with **publishers** sending **binary string** messages to channels.
- The communication is handled by the broker: it accepts the information that the sender (publisher) publishes and delivers it to the receiver (subscriber).


#### Example of Consumer

- The following Consumer subscribes to two kinds of channels:
    - the channel 'tartuuniv', and
    - any channel whose name matches the pattern 'destudents:*'
- If the Producer publishes something to these channels, the message is delivered to this consumer.

In [110]:
# inspired by: https://gist.github.com/jobliz/2596594

import threading

class Listener(threading.Thread):
    def __init__(self, r):
        threading.Thread.__init__(self)
        self.redis = r
        self.pubsub = self.redis.pubsub()
        
        #Subscribe to these channels
        
        #YOUR CODE LINE HERE to SUBSCRIBE to the 'tartuuniv' channel
        self.pubsub.subscribe(['tartuuniv'])
        
        #YOUR CODE LINE HERE to SUBSCRIBE to channels that match the pattern 'destudents:*'
        self.pubsub.psubscribe(['destudents:*'])

    def work(self, item):
        print (item['channel'], ":", item['data'])

    def run(self):
        for item in self.pubsub.listen():
            # Payloads arrive as bytes, so compare with b"KILL" (a str never matches)
            if item['data'] == b"KILL":
                self.pubsub.unsubscribe()
                print (self, "unsubscribed and finished")
                break
            else:
                self.work(item)

if __name__ == '__main__':
    r = redis.Redis('localhost')
    client = Listener(r)
    client.start()

b'tartuuniv' : 1
b'destudents:*' : 2
b'tartuuniv'b'tartuuniv'b'tartuuniv'b'tartuuniv' : b'new_course_created DataEngineering2'
 : b'new_course_created DataEngineering2'
 : b'new_course_created DataEngineering2'
 : b'new_course_created DataEngineering2'


#### Start Publishing Some data/messages to different Channels
- In a new command window, run the following commands:
    - (# opens a repl, all subsequent commands should show something in the first terminal)
```
redis-cli 
> publish tartuuniv "new_course_created DataEngineering" 
> publish tartuuniv "new_job_posted Senior_Researcher"
> publish destudents:1415 "John Doe"
> publish destudents:jane "Jane Kidman"
> publish Tallin "nobody listens"
> publish tartuuniv KILL   # this should terminate (listener.py)
```

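##### Write down what you notice from the previous example:

- `publish tartuuniv <msg>` and `publish destudents:<suffix> <msg>` (for example `destudents:1415`) are delivered to the listener, because it subscribed to the channel `tartuuniv` and to the pattern `destudents:*`.
- `publish Tallin <msg>` succeeds as a command, but nobody receives the message, because no client is subscribed to the channel `Tallin`; `PUBLISH` reports 0 receivers.
- `publish tartuuniv KILL` tells the listener to unsubscribe and stop.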
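The first two lines of the listener output (`b'tartuuniv' : 1` and `b'destudents:*' : 2`) are subscription confirmations, not published messages. A minimal sketch of telling the two apart, assuming the dictionary layout that redis-py uses for Pub/Sub items (`type`, `pattern`, `channel`, `data`); `handle_message` is a hypothetical helper and the sample dictionaries are hand-written stand-ins, so no Redis connection is needed:

```python
def handle_message(item):
    """Return (channel, text) for published messages, None for anything else."""
    if item["type"] not in ("message", "pmessage"):
        # subscribe/psubscribe confirmations carry a subscription count in 'data'
        return None
    return item["channel"].decode("utf-8"), item["data"].decode("utf-8")

confirmation = {"type": "subscribe", "pattern": None,
                "channel": b"tartuuniv", "data": 1}
direct = {"type": "message", "pattern": None,
          "channel": b"tartuuniv", "data": b"new_job_posted Senior_Researcher"}
by_pattern = {"type": "pmessage", "pattern": b"destudents:*",
              "channel": b"destudents:1415", "data": b"John Doe"}

for item in (confirmation, direct, by_pattern):
    print(handle_message(item))
```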

##### Another Example of Redis Pub/Sub 

Let me explain it with an example. Let’s assume Joe is the owner of a music shop where he sells music of different genres. Alice is a musician who publishes/sells her music at Joe’s shop. And Bob is Joe’s customer, who buys music from Joe. Joe keeps a list of his customers and their interests, hence he knows Bob likes classical music. Whenever Alice publishes a classical music album in Joe’s shop, Joe delivers it to Bob. The interesting part is that Bob doesn’t necessarily have to know who created the music, and Alice doesn’t have to know who listens to her music. In Pub/Sub terms, Alice is the publisher, Joe is the broker, and Bob is the subscriber. 

In [18]:
import redis
from time import sleep

#YOUR CODE LINES HERE to SUBSCRIBE to 'classic' channel 
channel='classic'
client = redis.Redis(host='127.0.0.1', port = 6379)

p = client.pubsub()
p.subscribe(channel)

while True:
    message=p.get_message()
    # Skip the subscribe confirmation; only 'message' items carry a published payload
    if message and message['type']=='message':
        # You can do any kind of computation on the incoming messages
        text=message['data'].decode('utf-8')
        # Here we just split the message into song and singer
        song,singer=text.split(':')
        print("SONG: ", song)
        print("SINGER: ", singer)
    # Pause briefly so the loop does not spin the CPU while no message is waiting
    sleep(0.01)

SONG:  New Light
SINGER:  John Mayer
SONG:  Supermarket Flower
SINGER:  Ed Sheeran


KeyboardInterrupt: 

## Task 3: Hats-Shop Website Scenario: Example

It’s time to break out a fuller example. Let’s pretend we’ve decided to start a lucrative website that sells hats, and hired you to build the site.

* You’ll use Redis to handle some of the product catalog, inventory, and bot traffic detection for our website.
* It’s day one for the site, and we’re going to be selling **three** limited-edition hats. 
* Each hat gets held in a Redis hash of field-value pairs, and the hash has a key that is a prefixed random integer, such as **hat:56854717**. 
* Using the **hat:** prefix is a Redis convention for creating a sort of **namespace** within a Redis database:

In [19]:
import random

# Seed the generator so that the random hat IDs are reproducible
random.seed(444)
hats = {f"hat:{random.getrandbits(32)}": i for i in (
    {
        "color": "black",
        "price": 49.99,
        "style": "fitted",
        "quantity": 1000,
        "npurchased": 0,
    },
    {
        "color": "maroon",
        "price": 59.99,
        "style": "hipster",
        "quantity": 500,
        "npurchased": 0,
    },
    {
        "color": "green",
        "price": 99.99,
        "style": "baseball",
        "quantity": 200,
        "npurchased": 0,
    })
}

#### Writing Hats data to Redis & Pipelining
- To do an initial write of this data into Redis, we can use .hmset() (hash multi-set), calling it for each dictionary.
- The code block below also introduces the concept of Redis pipelining, which is a way to cut down the number of round trips that you need to write or read data from your Redis server.

In [20]:
with r.pipeline() as pipe:
    for h_id, hat in hats.items():
        pipe.hmset(h_id, hat)      
    print(pipe.execute())

[True, True, True]


- With a pipeline, all the commands are buffered on the client side: each pipe.hmset() call in Line 3 only queues a command, and nothing is sent until pipe.execute() in Line 4 ships them all in one fell swoop.
- This is why the three True responses are all returned at once, when you call pipe.execute() in Line 4.
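The same buffering pattern can be wrapped in a small helper. This is only a sketch: `write_hashes` is a hypothetical name, and it assumes redis-py 3.5 or later, where `hset(name, mapping=...)` is the replacement for the deprecated `hmset()`. With `hset`, each reply is the number of fields that were newly added rather than `True`.

```python
def write_hashes(client, items):
    """Queue one HSET per item and send them together; returns the replies."""
    with client.pipeline() as pipe:
        for key, fields in items.items():
            # Buffered on the client side; nothing is sent yet
            pipe.hset(key, mapping=fields)
        # All queued commands travel to the server in one round trip
        return pipe.execute()

# Usage (needs a running Redis server), with the names from this notebook:
#   write_hashes(r, hats)
```

Because `client.pipeline()` is transactional by default in redis-py, the queued commands are also wrapped in MULTI/EXEC.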

#### Check Existence of Specific Hat with key('hat:56854717')

In [21]:
#YOUR CODE HERE
r.exists("hat:56854717")

1

### Query by Hat id (GetAll fields)
- Get all fields of the hat ('hat:56854717')

In [22]:
#YOUR CODE HERE
pprint(r.hgetall("hat:56854717"))

{b'color': b'green',
 b'npurchased': b'0',
 b'price': b'99.99',
 b'quantity': b'200',
 b'style': b'baseball'}


### Query by Hat id (Get Specific Fields)
- Get the color, style, and price of ('hat:56854717')

In [23]:
#YOUR CODE HERE
pprint(r.hmget("hat:56854717","color","style","price"))

[b'green', b'baseball', b'99.99']


#### Get all the Hats in your DB
- Hint: use the pattern '**hat***' as the parameter of the **r.keys()** function to select only the hat keys.

In [24]:
#YOUR CODE HERE
for key in r.keys("hat*"):
    pprint(r.hgetall(key))
    print("\n")

{b'color': b'black',
 b'npurchased': b'0',
 b'price': b'49.99',
 b'quantity': b'1000',
 b'style': b'fitted'}


{b'color': b'green',
 b'npurchased': b'0',
 b'price': b'99.99',
 b'quantity': b'200',
 b'style': b'baseball'}


{b'color': b'maroon',
 b'npurchased': b'0',
 b'price': b'59.99',
 b'quantity': b'500',
 b'style': b'hipster'}
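`KEYS` walks the whole keyspace in a single blocking call, which is fine for a toy DB but can stall a busy server. A hedged alternative is to iterate with `SCAN`, which redis-py exposes as `scan_iter(match=...)`. In the sketch below, `iter_hashes` is a hypothetical helper name, and the stricter pattern `hat:*` is an assumption that avoids matching unrelated keys that merely start with "hat".

```python
def iter_hashes(client, pattern="hat:*"):
    """Yield (key, fields) for every key matching the pattern, using SCAN."""
    for key in client.scan_iter(match=pattern):
        yield key, client.hgetall(key)

# Usage (needs a running Redis server), with the client from this notebook:
#   for key, fields in iter_hashes(r):
#       pprint(fields)
```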




#### Insert one more Item (Hash) next to the existing hats
- Key: "hat:" followed by a random integer, generated as before
- Fields: {"color": "black", "price": 60.99, "style": "fedora", "quantity": 50, "npurchased": 0}

In [25]:
r.hmset(f"hat:{random.getrandbits(32)}",{
        "color": "black",
        "price": 60.99,
        "style": "fedora",
        "quantity": 50,
        "npurchased": 0
    })

True

#### Check if the new item ("Hat") is added !
- You can get all the hats again!

In [37]:
#YOUR CODE HERE
for key in r.keys("hat*"):
    pprint(r.hgetall(key))
    print("\n")

{b'color': b'black',
 b'npurchased': b'0',
 b'price': b'49.99',
 b'quantity': b'1000',
 b'style': b'fitted'}


{b'color': b'green',
 b'npurchased': b'0',
 b'price': b'99.99',
 b'quantity': b'200',
 b'style': b'baseball'}


{b'color': b'maroon',
 b'npurchased': b'0',
 b'price': b'59.99',
 b'quantity': b'500',
 b'style': b'hipster'}


{b'color': b'black',
 b'npurchased': b'0',
 b'price': b'60.99',
 b'quantity': b'50',
 b'style': b'fedora'}




####  Get the count of Hats in your DB 

In [47]:
#YOUR CODE HERE
len(r.keys('hat*'))

4

In [51]:
r.keys('hat*')

[b'hat:1326692461', b'hat:56854717', b'hat:1236154736', b'hat:1327727452']

#### Show only the Hats with prices less than 60

In [109]:
for key in r.keys("hat*"):
    if float(r.hget(key, 'price').decode()) < 60.00:
        pprint(r.hgetall(key))
#         print(r.hget(key, 'price'))
        print("\n")

{b'color': b'black',
 b'npurchased': b'0',
 b'price': b'49.99',
 b'quantity': b'1000',
 b'style': b'fitted'}


{b'color': b'maroon',
 b'npurchased': b'0',
 b'price': b'59.99',
 b'quantity': b'500',
 b'style': b'hipster'}




### Transactions and Keeping Atomicity in Redis

Redis allows the execution of a group of commands in a single step, with two important guarantees:
* All the commands in a transaction are serialized and executed sequentially. It can never happen that a request issued by another client is served in the middle of the execution of a Redis transaction. This guarantees that the commands are executed as a single isolated operation.

* Either all of the commands or none are processed, so a Redis transaction is also atomic.

In redis-py, Pipeline is a transactional pipeline class by default. This means that, even though the class is actually named for something else (pipelining), it can be used to create a transaction block also.

In Redis, a transaction starts with <b>MULTI</b> and ends with <b>EXEC</b>:

* Everything in between is executed as one all-or-nothing buffered sequence of commands.
* Methods that you call on pipe effectively buffer all of the commands into one, and then send them to the server in a single request.
    
 

In [110]:
import logging

class OutOfStockError(Exception):
    """Raised when an item has no inventory left."""

def buyitem(r: redis.Redis, itemid: str) -> None:
    with r.pipeline() as pipe:
        error_count = 0
        while True:
            try:
                # Get available inventory, watching for changes
                # related to this itemid before the transaction
                pipe.watch(itemid)

                # After WATCH the pipeline executes commands immediately,
                # so this HGET returns the current value
                nleft = pipe.hget(itemid, "quantity")
                # Compare as an integer; a missing field counts as out of stock
                if nleft is not None and int(nleft) > 0:
                    pipe.multi()
                    pipe.hincrby(itemid, "quantity", -1)
                    pipe.hincrby(itemid, "npurchased", 1)
                    pipe.execute()
                    break
                else:
                    # Stop watching the itemid and raise to break out
                    pipe.unwatch()
                    raise OutOfStockError(f"Sorry, {itemid} is out of stock!")
            
            except redis.WatchError:
                # Log total num. of errors by this user to buy this item,
                # then try the same process again for WATCH/HGET/MULTI/EXEC

                error_count += 1

                logging.warning("WatchError #%d: %s; retrying",error_count, itemid)

    # Reached only after a successful purchase (the loop retries on WatchError)
    return None
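One detail worth noting when checking stock: redis-py returns the `quantity` field as bytes, and comparing bytes is lexicographic, not numeric. A minimal illustration in plain Python (no Redis connection needed; the value `b"10"` is just an example) of why converting with `int()` before comparing is the safer habit:

```python
nleft = b"10"

# Byte-wise comparison looks at the first differing byte: b"1" sorts before b"9"
bytewise = nleft > b"9"

# Numeric comparison after conversion; int() accepts ASCII digits in bytes
numeric = int(nleft) > 9

print(bytewise, numeric)
```

A missing field is returned as `None`, so a check such as `nleft is not None` before the conversion avoids a `TypeError`.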

#### Call the function buyitem, and buy three hats of the hat 'hat:56854717'

In [111]:
#YOUR CODE HERE
for _ in range(3):
    buyitem(r, "hat:56854717")

#### Check the quantity and npurchased fields of the Hat hash ('hat:56854717')

In [115]:
r.hmget("hat:56854717", "quantity", "npurchased")

[b'197', b'3']

Now, when some poor user is late to the game, they should be met with an <b>"OutOfStockError"</b> that tells our application to render an error message page on the frontend
- Buy the remaining 197 hats for item hat:56854717, and the stock will be 0!

In [116]:
for _ in range(197):
    buyitem(r, "hat:56854717")

r.hmget("hat:56854717", "quantity", "npurchased")

[b'0', b'200']

#### Write down what will happen when you try to buy one more hat of the same key.

In [117]:
try:
    buyitem(r, "hat:56854717")
except Exception:
    print("OutofStockError")

OutofStockError


<font color='red'>Answer:</font>
- The call raises an exception, so the exception handler runs, because `hat:56854717` has no stock left

#### Delete elements (delete hats with 'black' color)

In [120]:
#YOUR CODE HERE
for key in r.keys("hat*"):
    if r.hget(key, 'color').decode() == 'black':
        r.delete(key)
        

#### Check if the 'black' hats are already deleted

In [121]:
for key in r.keys("hat*"):
    pprint(r.hgetall(key))
    print("\n")

{b'color': b'green',
 b'npurchased': b'200',
 b'price': b'99.99',
 b'quantity': b'0',
 b'style': b'baseball'}


{b'color': b'maroon',
 b'npurchased': b'0',
 b'price': b'59.99',
 b'quantity': b'500',
 b'style': b'hipster'}




#### Delete elements (Delete hats with quantity less than 500)

In [122]:
#YOUR CODE HERE
for key in r.keys("hat*"):
    if int(r.hget(key, 'quantity').decode()) < 500:
        r.delete(key)

#### Check if the hats with quantity less than 500 are already deleted

In [130]:
for key in r.keys("hat*"):
    pprint(r.hgetall(key))

{b'color': b'maroon',
 b'npurchased': b'0',
 b'price': b'59.99',
 b'quantity': b'500',
 b'style': b'hipster'}


#### Update the values of 'hat:1236154736' , and show the remaining hats before and after this update
- By changing its color to '**brown**'
- and by adding '**updated**' flag/field to its hash values.

In [131]:
#YOUR CODE HERE
r.hset('hat:1236154736','color','brown')

0

In [132]:
for key in r.keys("hat*"):
    pprint(r.hgetall(key))

{b'color': b'brown',
 b'npurchased': b'0',
 b'price': b'59.99',
 b'quantity': b'500',
 b'style': b'hipster'}


## Task 4:

#### Create a simple Redis DB out of this relational model

- <b>Notes before starting this task:</b> 
- Redis was not designed for keeping relational data and posing structured queries, but for fast access to data. 
- But just to stay consistent with our running example of how to represent the relational model in different NoSQL backends, we will try this out.


<b>Hints</b>:
* It’s quite straightforward to map your relational table into Redis data structures.
* **Hash**, **Sorted Set** and **Set** are the most useful data structures in this effort.
* This means that relationships are typically represented by **sets**.
* A set can be used to represent a **one-way relationship**, so you need one set per object to represent a **many-to-many** relationship.

#### The DataBase Model: 


This is a toy DB about movies and the actors who played roles in these movies. This DB consists of:

- A "Person" table, which has a unique id and a name field.

- A "Movie" table, which has a unique id, a title, the country where it was made, and the year when it was released.

- There is an (m-n) or "many-to-many" relationship between these two tables (i.e., an actor can act in many movies, and a movie includes many actors).
- Therefore, we use the "Roles" table, from which we can deduce which person has acted in which movie, and what role(s) they played.

<img src="RDBSchema.png" alt="3" border="0">

* **Notice** that we will change the Redis DB to "**db(1)**"

    - By default there are <b>16</b> databases (indexed from 0 to 15), and you can navigate between them using <b>select</b> command.
    - Number of databases can be changed in the <b>redis config</b> file with databases setting.

    - By default, it selects the <b> database 0 </b>. 
    - To select a specific one, use <b> "redis-cli -n 1 " </b>  or use <b>   ("SELECT 1") </b> if you are already in the command line with one DB and want to switch to DB 1. 

In [133]:
redis1= redis.Redis(host="localhost",port=6379, db=1)

### CREATE && INSERT in REDIS

#### 1. Creating a Hash for each of the movies

In [134]:
# Helper function that writes a dict of field-value pairs into the Hash stored at mkey
def setHash(mkey,mval):
    redis1.hmset(mkey,mval)

##### Movies

In [140]:
moviesLst=[
    
    ( "movie:1",{ 'title': 'Wall Street' , 'country':'USA', 'year':'1987' } ),
    ( "movie:2",{ 'title': 'The American President' , 'country':'USA', 'year':'1995' } ),
    ( "movie:3",{ 'title': 'Shawshank Redemption' , 'country':'USA', 'year':'1994' } )
]

#YOUR CODE LINES HERE to ADD this movie Hashes to REDIS.
for key, val in moviesLst:
    setHash(key,val)

#### Persons

In [141]:
personsLst=[
    
    ( "person:1",{ 'name': 'Charlie Sheen' } ),
    ( "person:2",{ 'name': 'Michael Douglas' } ),
    ( "person:3",{ 'name': 'Martin Sheen' } ),
    ( "person:4",{ 'name': 'Morgan Freeman' } )
]

#YOUR CODE LINES HERE to add these person hashes to Redis.
for key, val in personsLst:
    setHash(key,val)

#### Maintaining the relationships in Redis

In [142]:
# Helper function that adds the given values to a set
def addSet(mkey,mvals):
    redis1.sadd(mkey,*mvals)

In [143]:
# Let's establish the many-to-many relationship 'roles'

# 1. For each movie, we keep a set of references to its persons (actors)
movie_persons_list=[("movie:1:actors", ["person:1", "person:2", "person:3"] ),
                    ("movie:2:actors", ["person:2", "person:3" ]) ,
                    ("movie:3:actors", ["person:4" ])
                   ]
             
for (mkey,mvals) in movie_persons_list:
    addSet(mkey,mvals)
    
    
    
# 2. For each person, we keep a set of references to their movies

#ADD YOUR CODE LINES HERE 
person_movies_list=[("person:1:movies", ["movie:1"]),
                    ("person:2:movies", ["movie:1", "movie:2"]),
                    ("person:3:movies", ["movie:1", "movie:2"]),
                    ("person:4:movies", ["movie:3"])
                   ]

for (mkey,mvals) in person_movies_list:
    addSet(mkey,mvals)

# 3. For each person-movie (acted-in) relationship, we keep a set of the roles the actor played in that movie

#ADD YOUR CODE LINES HERE 
# person_movies_list=[("person:1:movies", ["movie:1"]),
#                     ("person:2:movies", ["movie:1", "movie:2"]),
#                     ("person:3:movies", ["movie:1", "movie:2"]),
#                     ("person:4:movies", ["movie:3"])
#                    ]

# for (mkey,mvals) in person_movies_list:
#     addSet(mkey,mvals)
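
Step 3 is left unfinished above. One possible layout, sketched here under assumptions, keeps one set of roles per (person, movie) pair. The key pattern `person:<id>:movie:<id>:roles` and the helper `role_key` are hypothetical, and the character names are taken from the films' credits rather than from the notebook's data. Each entry has the same `(key, values)` shape as `person_movies_list`, so it could be passed to `addSet` unchanged.

```python
def role_key(person_id: int, movie_id: int) -> str:
    # Hypothetical key pattern: one set of roles per (person, movie) pair.
    return f"person:{person_id}:movie:{movie_id}:roles"


# Illustrative role names (from the films' credits, not from the notebook).
person_movie_roles_list = [
    (role_key(1, 1), ["Bud Fox"]),
    (role_key(2, 1), ["Gordon Gekko"]),
    (role_key(3, 1), ["Carl Fox"]),
    (role_key(2, 2), ["Andrew Shepherd"]),
    (role_key(3, 2), ["A.J. MacInerney"]),
    (role_key(4, 3), ["Ellis Boyd 'Red' Redding"]),
]

for mkey, mvals in person_movie_roles_list:
    # In the notebook this line would be: addSet(mkey, mvals)
    print(mkey, "->", mvals)
```

These keys do not match the `person:?` or `person:?:movies` patterns used in the queries, so adding them would not disturb those lookups.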

### Querying our REDIS Data

#### <font color= 'red'>Important to remember:</font>
- The only way to query Redis is by **"Keys"**, so we mainly depend on the programming capabilities of Python and the data science capabilities of **Pandas** to do the data science part :)
- One of the cons of Redis is that you can't query on **values**!
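
A common workaround for the missing value queries is to maintain your own index under a separate key. The sketch below records each movie's year as the score of a sorted set, so a range of years becomes a key-based lookup. The key name `movies:by_year`, both helper functions, and the `MiniSortedSets` stand-in are assumptions for this example; the stand-in only mimics redis-py's `zadd` and `zrangebyscore` so the code runs without a server.

```python
class MiniSortedSets:
    """In-memory stand-in for the two redis-py calls used below (zadd,
    zrangebyscore). Unlike redis-py, it returns str members, not bytes."""

    def __init__(self):
        self._zsets = {}

    def zadd(self, name, mapping):
        self._zsets.setdefault(name, {}).update(mapping)

    def zrangebyscore(self, name, min, max):
        members = self._zsets.get(name, {})
        hits = sorted((score, m) for m, score in members.items() if min <= score <= max)
        return [m for _score, m in hits]


YEAR_INDEX = "movies:by_year"  # hypothetical index key


def index_movie_year(r, movie_key, year):
    """Record the movie under its release year in a sorted set."""
    r.zadd(YEAR_INDEX, {movie_key: year})


def movies_between(r, first_year, last_year):
    """Return movie keys whose year lies in [first_year, last_year]."""
    return r.zrangebyscore(YEAR_INDEX, first_year, last_year)


# With a running server this could instead be the notebook's redis1 client.
r = MiniSortedSets()
for movie_key, year in [("movie:1", 1987), ("movie:2", 1995), ("movie:3", 1994)]:
    index_movie_year(r, movie_key, year)

print(movies_between(r, 1990, 1999))
```

The cost of this design is that the index has to be updated every time a movie's year is written, otherwise it drifts out of sync with the hashes.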

#### Get Persons in your Redis DB

In [165]:
#YOUR CODE HERE
for key in redis1.keys("person:?"):
    pprint(redis1.hgetall(key))

{b'name': b'Michael Douglas'}
{b'name': b'Charlie Sheen'}
{b'name': b'Martin Sheen'}
{b'name': b'Morgan Freeman'}
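
The `b'...'` prefixes in this output appear because redis-py returns raw bytes by default. A minimal sketch of decoding an `hgetall` result by hand follows; `decode_hash` is a hypothetical helper name. Alternatively, passing `decode_responses=True` when creating the client makes redis-py return `str` values directly.

```python
def decode_hash(raw: dict, encoding: str = "utf-8") -> dict:
    """Turn {b'field': b'value'} (as returned by hgetall) into {str: str}."""
    return {k.decode(encoding): v.decode(encoding) for k, v in raw.items()}


# Shape of one hgetall() result from the cell above.
raw_person = {b"name": b"Michael Douglas"}
print(decode_hash(raw_person))
```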


#### We can use pandas to show the results more clearly!

In [166]:
data=[]
for key in redis1.keys("person:?"):
    name = redis1.hget(key, 'name').decode()
    data.append({'name': name})
    
df=pd.DataFrame(data)
display(df)

Unnamed: 0,name
0,Michael Douglas
1,Charlie Sheen
2,Martin Sheen
3,Morgan Freeman


#### Get persons whose names start with the letter 'C'

In [175]:
redis1.hget(key, 'name').decode()

'Charlie Sheen'

In [179]:
#YOUR CODE HERE
for key in redis1.keys("person:?"):
    name = redis1.hget(key, 'name')  # raw bytes, e.g. b'Charlie Sheen'
    if name.decode().startswith('C'):
        print(name)

b'Charlie Sheen'


#### Use Pandas to show the result in a better way!

In [180]:
#YOUR CODE HERE
data=[]
for key in redis1.keys("person:?"):
    name = redis1.hget(key, 'name').decode()
    if name.startswith('C'):
        data.append({'name': name})
    
df=pd.DataFrame(data)
display(df)

Unnamed: 0,name
0,Charlie Sheen


#### Get all movies, sorted from most recent to oldest

In [192]:
#YOUR CODE HERE
data=[]
for key in redis1.keys("movie:?"):
    # Fetch the fields by name rather than relying on the order HVALS returns them in
    title, country, year = redis1.hmget(key, ['title', 'country', 'year'])
    data.append({'name': title.decode(), 'country': country.decode(), 'year': int(year.decode())})
    
df = pd.DataFrame(data)
# Most recent first, as the heading asks
df = df.sort_values('year', ascending=False).reset_index(drop=True)
display(df)

Unnamed: 0,name,country,year
0,The American President,USA,1995
1,Shawshank Redemption,USA,1994
2,Wall Street,USA,1987


#### Get all movies released in the 90s (1990 through 1999)

In [205]:
#YOUR CODE HERE
data=[]
for key in redis1.keys("movie:?"):
    # Fetch the fields by name rather than relying on the order HVALS returns them in
    title, country, year = redis1.hmget(key, ['title', 'country', 'year'])
    data.append({'name': title.decode(), 'country': country.decode(), 'year': int(year.decode())})
    
df = pd.DataFrame(data)
# between() is inclusive on both ends, so 1990 and 1999 both count as the 90s
df = df[df['year'].between(1990, 1999)]
display(df)

Unnamed: 0,name,country,year
1,The American President,USA,1995
2,Shawshank Redemption,USA,1994


### Querying from multiple tables

#### Get Movies and Actors from the DB


In [206]:
for key in redis1.keys("person:?:movies"):
    # Strip the trailing ':movies' from the key to get 'person:1', 'person:2', ...
    personKey=key.decode().rsplit(':', 1)[0]
    # Fetch the person's hash using the key above
    person=redis1.hgetall(personKey)
    # SMEMBERS returns the movie keys stored in this person's set
    personMoviesKeys=redis1.smembers(key)
    for movKey in personMoviesKeys:
        print(person.get(b'name').decode(),":   ", redis1.hgetall(movKey).get(b'title').decode())  

Martin Sheen :    The American President
Martin Sheen :    Wall Street
Morgan Freeman :    Shawshank Redemption
Michael Douglas :    The American President
Michael Douglas :    Wall Street
Charlie Sheen :    Wall Street


In [220]:
#or using the Pandas DFs

#YOUR CODE HERE
data=[]
for key in redis1.keys("person:?:movies"):
    # Strip the trailing ':movies' from the key to get 'person:1', 'person:2', ...
    personKey=key.decode().rsplit(':', 1)[0]
    # Fetch the person's hash using the key above
    person=redis1.hgetall(personKey)
    # SMEMBERS returns the movie keys stored in this person's set
    personMoviesKeys=redis1.smembers(key)
    for movKey in personMoviesKeys:
        name = person.get(b'name').decode()
        movie = redis1.hgetall(movKey).get(b'title').decode()
        data.append({'name': name, 'movie': movie})

df = pd.DataFrame(data)
display(df)

Unnamed: 0,name,movie
0,Martin Sheen,The American President
1,Martin Sheen,Wall Street
2,Morgan Freeman,Shawshank Redemption
3,Michael Douglas,The American President
4,Michael Douglas,Wall Street
5,Charlie Sheen,Wall Street


#### Get count of "Movies" in your DB

- Note: The best way is to store the count in a separate key and update it whenever you add/remove a value from your set/hash/zset.
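
The note above can be sketched as a counter key that is adjusted on every insert and delete. The key name `counter:movies`, the three helper functions, and the `MiniStore` stand-in are assumptions for this example; the stand-in mimics only the redis-py calls used here (`exists`, `hset`, `delete`, `incr`, `decr`, `get`) so the code runs without a server.

```python
class MiniStore:
    """In-memory stand-in exposing only the calls used below (not redis-py itself)."""

    def __init__(self):
        self.data = {}

    def exists(self, key):
        return int(key in self.data)

    def hset(self, key, mapping):
        self.data.setdefault(key, {}).update(mapping)

    def delete(self, key):
        return int(self.data.pop(key, None) is not None)

    def incr(self, key):
        self.data[key] = int(self.data.get(key, 0)) + 1
        return self.data[key]

    def decr(self, key):
        self.data[key] = int(self.data.get(key, 0)) - 1
        return self.data[key]

    def get(self, key):
        return self.data.get(key)


COUNTER_KEY = "counter:movies"  # hypothetical key name


def add_movie(r, movie_key, fields):
    """Write the hash and bump the counter only when the movie is new."""
    is_new = not r.exists(movie_key)
    r.hset(movie_key, mapping=fields)
    if is_new:
        r.incr(COUNTER_KEY)


def remove_movie(r, movie_key):
    """Delete the hash and lower the counter only if something was deleted."""
    if r.delete(movie_key):
        r.decr(COUNTER_KEY)


def movie_count(r):
    return int(r.get(COUNTER_KEY) or 0)


r = MiniStore()
add_movie(r, "movie:1", {"title": "Wall Street", "country": "USA", "year": "1987"})
add_movie(r, "movie:2", {"title": "The American President", "country": "USA", "year": "1995"})
add_movie(r, "movie:3", {"title": "Shawshank Redemption", "country": "USA", "year": "1994"})
add_movie(r, "movie:1", {"year": "2000"})  # an update, so it is not counted again
print(movie_count(r))
remove_movie(r, "movie:3")
print(movie_count(r))
```

The check-then-increment in `add_movie` is not atomic; against a real server with concurrent writers it would likely need to run inside a transaction.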

In [212]:
#YOUR CODE HERE
data=[]
for key in redis1.keys("movie:?"):
    # Fetch the fields by name rather than relying on the order HVALS returns them in
    title, country, year = redis1.hmget(key, ['title', 'country', 'year'])
    data.append({'name': title.decode(), 'country': country.decode(), 'year': int(year.decode())})
    
df = pd.DataFrame(data)
# display(df)
print('movie count: ' + str(len(df)))

movie count: 3


#### For every "Actor" in this DB, get the number of movies they played in

In [222]:
#YOUR CODE HERE

In [243]:
#YOUR CODE HERE    
#or using the Pandas DFs

data=[]
for key in redis1.keys("person:?:movies"):
    # Strip the trailing ':movies' from the key to get 'person:1', 'person:2', ...
    personKey=key.decode().rsplit(':', 1)[0]
    # Fetch the person's hash using the key above
    person=redis1.hgetall(personKey)
    # SMEMBERS returns the movie keys stored in this person's set
    personMoviesKeys=redis1.smembers(key)
    for movKey in personMoviesKeys:
        name = person.get(b'name').decode()
        movie = redis1.hgetall(movKey).get(b'title').decode()
        data.append({'name': name, 'movie': movie})

df = pd.DataFrame(data)
df = df.groupby('name')['movie'].agg(['count']).reset_index()
df = df.rename(columns = {'count':'total_movies_played'})
display(df)


Unnamed: 0,name,total_movies_played
0,Charlie Sheen,1
1,Martin Sheen,2
2,Michael Douglas,2
3,Morgan Freeman,1
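
The same per-actor totals can be sketched without a DataFrame by counting the `(name, movie)` pairs with `collections.Counter`. The `rows` literal below copies the pairs printed by the join earlier in this section. On the Redis side, the `SCARD` command (`scard` in redis-py) returns the size of a set such as `person:2:movies` without fetching its members, which avoids the inner loop altogether.

```python
from collections import Counter

# (actor name, movie title) pairs, as produced by the join above.
rows = [
    ("Martin Sheen", "The American President"),
    ("Martin Sheen", "Wall Street"),
    ("Morgan Freeman", "Shawshank Redemption"),
    ("Michael Douglas", "The American President"),
    ("Michael Douglas", "Wall Street"),
    ("Charlie Sheen", "Wall Street"),
]

# Count how many rows each actor appears in.
totals = Counter(name for name, _movie in rows)

for name in sorted(totals):
    print(name, totals[name])
```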


#### List the movies that each actor in this DB played in

In [245]:
#YOUR CODE HERE
data=[]
for key in redis1.keys("person:?:movies"):
    # Strip the trailing ':movies' from the key to get 'person:1', 'person:2', ...
    personKey=key.decode().rsplit(':', 1)[0]
    # Fetch the person's hash using the key above
    person=redis1.hgetall(personKey)
    # SMEMBERS returns the movie keys stored in this person's set
    personMoviesKeys=redis1.smembers(key)
    for movKey in personMoviesKeys:
        name = person.get(b'name').decode()
        movie = redis1.hgetall(movKey).get(b'title').decode()
        data.append({'name': name, 'movie': movie})

df = pd.DataFrame(data)
display(df)

Unnamed: 0,name,movie
0,Martin Sheen,The American President
1,Martin Sheen,Wall Street
2,Morgan Freeman,Shawshank Redemption
3,Michael Douglas,The American President
4,Michael Douglas,Wall Street
5,Charlie Sheen,Wall Street


In [244]:
# or using pandas and SQL
import pandasql as ps
import re 

#YOUR CODE HERE
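
The cell above is left empty. As a sketch of the SQL route: pandasql runs its queries through SQLite by default, so the query itself can be tried with Python's built-in `sqlite3` module and no extra packages. The table name `acted_in` and its column names are assumptions for this example; with pandasql, a similar `SELECT` would be passed to `ps.sqldf` and run against the DataFrame instead.

```python
import sqlite3

# (actor name, movie title) pairs, as produced by the join above.
rows = [
    ("Martin Sheen", "The American President"),
    ("Martin Sheen", "Wall Street"),
    ("Morgan Freeman", "Shawshank Redemption"),
    ("Michael Douglas", "The American President"),
    ("Michael Douglas", "Wall Street"),
    ("Charlie Sheen", "Wall Street"),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE acted_in (name TEXT, movie TEXT)")  # hypothetical table
con.executemany("INSERT INTO acted_in VALUES (?, ?)", rows)

# One row per actor: how many movies, and which ones.
query = """
    SELECT name,
           COUNT(*) AS total_movies,
           GROUP_CONCAT(movie, '; ') AS movies
    FROM acted_in
    GROUP BY name
    ORDER BY name
"""
result = con.execute(query).fetchall()
con.close()

for name, total, movies in result:
    print(name, total, movies)
```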

### Updating Redis Data
- If a field already exists in the hash, HSET overwrites its value. The return value is the number of fields newly added, so 0 means an existing field was updated.
- Update the release year of the movie "Wall Street" to 2000, which is not true BTW :)
- Show that movie before and after updating it.

In [256]:
#YOUR CODE HERE
#BEFORE
data=[]
for key in redis1.keys("movie:?"):
    # Fetch the fields by name rather than relying on the order HVALS returns them in
    title, country, year = redis1.hmget(key, ['title', 'country', 'year'])
    data.append({'name': title.decode(), 'country': country.decode(), 'year': int(year.decode())})
    
df = pd.DataFrame(data)
display(df)

Unnamed: 0,name,country,year
0,Wall Street,USA,1987
1,The American President,USA,1995
2,Shawshank Redemption,USA,1994


In [258]:
#YOUR CODE HERE
redis1.hset('movie:1','year',2000)

0

In [259]:
#AFTER
data=[]
for key in redis1.keys("movie:?"):
    # Fetch the fields by name rather than relying on the order HVALS returns them in
    title, country, year = redis1.hmget(key, ['title', 'country', 'year'])
    data.append({'name': title.decode(), 'country': country.decode(), 'year': int(year.decode())})
    
df = pd.DataFrame(data)
display(df)

Unnamed: 0,name,country,year
0,Wall Street,USA,2000
1,The American President,USA,1995
2,Shawshank Redemption,USA,1994


#### Delete all the persons whose names start with the letter 'M'.

In [270]:
#YOUR CODE HERE
data=[]
for key in redis1.keys("person:?:movies"):
    # Strip the trailing ':movies' from the key to get 'person:1', 'person:2', ...
    personKey=key.decode().rsplit(':', 1)[0]
    # Fetch the person's hash using the key above
    person=redis1.hgetall(personKey)
    # SMEMBERS returns the movie keys stored in this person's set
    personMoviesKeys=redis1.smembers(key)
    for movKey in personMoviesKeys:
        name = person.get(b'name').decode()
        movie = redis1.hgetall(movKey).get(b'title').decode()
        
        # Keep only persons whose names do not start with 'M'
        # (this filters the rows shown; nothing is removed from Redis)
        if not name.startswith('M'):
            data.append({'name': name, 'movie': movie})

df = pd.DataFrame(data)
display(df)

Unnamed: 0,name,movie
0,Charlie Sheen,Wall Street


In [269]:
#YOUR CODE HERE
# `del` is a reserved word in Python, so redis-py exposes the DEL command as delete():
# redis1.delete('person:2')
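
The filter in the previous cell only hides the 'M' rows in the DataFrame; the keys stay in Redis. A sketch of an actual deletion follows. It removes each matching person's hash, their `person:N:movies` set, and their entry in every `movie:N:actors` set. The function name `delete_persons_by_initial` and the `MiniRedis` stand-in are assumptions for this example; the stand-in mimics only `keys`, `hget`, `smembers`, `srem` and `delete`, and works with `str` as a client created with `decode_responses=True` would.

```python
from fnmatch import fnmatchcase


class MiniRedis:
    """In-memory stand-in with just the calls used below; it handles str
    where a default redis-py client would return bytes."""

    def __init__(self):
        self.hashes, self.sets = {}, {}

    def keys(self, pattern):
        return [k for k in list(self.hashes) + list(self.sets) if fnmatchcase(k, pattern)]

    def hget(self, key, field):
        return self.hashes.get(key, {}).get(field)

    def smembers(self, key):
        return set(self.sets.get(key, set()))

    def srem(self, key, member):
        self.sets.get(key, set()).discard(member)

    def delete(self, *keys):
        for k in keys:
            self.hashes.pop(k, None)
            self.sets.pop(k, None)


def delete_persons_by_initial(r, letter):
    """Delete every person whose name starts with `letter`; return their keys."""
    deleted = []
    for person_key in r.keys("person:?"):
        name = r.hget(person_key, "name")
        if not name or not name.startswith(letter):
            continue
        movies_key = f"{person_key}:movies"
        # Remove the back-reference from each movie's actor set first.
        for movie_key in r.smembers(movies_key):
            r.srem(f"{movie_key}:actors", person_key)
        r.delete(person_key, movies_key)
        deleted.append(person_key)
    return deleted


r = MiniRedis()
r.hashes = {
    "person:1": {"name": "Charlie Sheen"},
    "person:2": {"name": "Michael Douglas"},
    "person:3": {"name": "Martin Sheen"},
    "person:4": {"name": "Morgan Freeman"},
}
r.sets = {
    "person:1:movies": {"movie:1"},
    "person:2:movies": {"movie:1", "movie:2"},
    "person:3:movies": {"movie:1", "movie:2"},
    "person:4:movies": {"movie:3"},
    "movie:1:actors": {"person:1", "person:2", "person:3"},
    "movie:2:actors": {"person:2", "person:3"},
    "movie:3:actors": {"person:4"},
}

deleted = delete_persons_by_initial(r, "M")
print(sorted(deleted))
print(r.keys("person:?"))
```

Cleaning up both sides of the relationship matters here: deleting only the person hash would leave `movie:N:actors` sets pointing at keys that no longer exist.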

## How long did it take you to solve the homework?
 
Please answer as precisely as you can. It does not affect your points or grade in any way. It is okay if it took 0.5 hours or 24 hours. The collected information will be used to improve future homework.

<font color="red"><b>Answer:</b></font>
5 hours

**<center> <font color='red'>THANK YOU FOR YOUR EFFORT!</font></center>**