## Project #2: Redis/Key-Value Stores
**Implementation Report**

Big Data Management Systems (Associate Prof. Damianos Chatziantoniou)\
Theodoros Plessas, 8160192 (t8160192@aueb.gr)

## 1. Introduction
Redis is an in-memory key-value data structure store, widely used for its high performance compared to alternative storage solutions (e.g. on-disk relational databases).

In this report I demonstrate proficiency in using Redis by developing, in Python, a basic backend for a hypothetical teleconferencing application. Redis can persist its in-memory state to disk through periodic snapshots, so in this project it stands in for a traditional database. In a real application this would be far from ideal: Redis keeps the whole dataset in memory (the most expensive part of a server), so it would more likely be used as a regional or per-customer cache, loading data from a traditional database according to current needs.

## 2. Storing data in Redis: conventions
**User**: A hash with multiple fields. The users' e-mail addresses are also stored in a secondary hash to enable backreferencing.

The following commands would be used to store a user in Redis:

    HSET user:{userID} name {name} age {age} gender {gender} email {email}
    HSET users:by:email {email} {userID}
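
The secondary hash makes it possible to go from an e-mail address back to the full user record in two lookups. A minimal sketch of that lookup, assuming `r` is a redis-py client created with `decode_responses=True`; the function name `find_user_by_email` is hypothetical and not part of the project code:

```python
def find_user_by_email(r, email):
    """Return the user's hash as a dict, or None if the e-mail is unknown.

    `r` is assumed to be a redis-py client (decode_responses=True).
    `find_user_by_email` is an illustrative name, not part of the project.
    """
    # secondary hash: e-mail -> userID
    user_id = r.hget("users:by:email", email)
    if user_id is None:
        return None
    # primary hash: all fields of the user
    return r.hgetall("user:" + user_id)
```

With the example data stored in section 3, `find_user_by_email(r, "mike@mmylonas.gr")` should return the fields of `user:1`.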
    
**Meeting**: A hash with multiple fields. For non-public meetings, an audience member's e-mail address is added to a separate set when the user joins an active instance and removed when the user leaves. The ispublic field takes the value 0 (not public) or 1 (public).

The following command would be used to store a meeting in Redis:

    HSET meeting:{meetingID} title {title} description {description} ispublic {isPublic}
    
When a user joins a non-public meeting this is stored in Redis using the following commands:

    # user joins private meeting
    # get user e-mail using HGET user:{userID} email
    SADD meeting:{meetingID}:audience {email}
    
**Meeting instance**: A hash with multiple fields. Active instances are stored with their ID in a set.

The following command would be used to store a meeting instance in Redis:

    HSET instance:{orderID} meeting {meetingID} from {fromdatetime} to {todatetime}
    
The following commands would be used to designate an instance as active and to add references from the instance to its meeting and vice versa:

    SADD instances:active {orderID}
    HSET instances:active:meetings {orderID} {meetingID}
    HSET meetings:active:instances {meetingID} {orderID}
    
**Event log**: Each event is a hash with multiple fields, its ID determined by an incrementing counter. After an event is stored, its ID is left-pushed to a global list. Additionally, each user has a personal event log list. For system actions (not initiated by a user) the userID "0" is used.

The following commands would be used to store an event in Redis:

    # get event_id using INCR eventid
    HSET event:{event_id} user {userID} type {event_type} time {timestamp}
    LPUSH events {event_id}
    LPUSH user:{userID}:events {event_id}
    
event_type is a string of the form "{activate/endMeeting/joinMeeting/leaveMeeting}\_{orderID/meetingID}". Only activate events carry an orderID; all others carry a meetingID.
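
The same three-step logging pattern (INCR, HSET, two LPUSHes) recurs in every function of this report, so it could be factored into one helper. The following is a sketch under assumptions rather than the code actually used below: `r` is taken to be a redis-py client, and the names `log_event`, `parse_event_type` and `TIME_FORMAT` are hypothetical.

```python
from datetime import datetime

TIME_FORMAT = "%d/%m/%Y, %H:%M:%S"  # timestamp format used throughout the report


def log_event(r, user_id, action, target_id, now=None):
    """Store one event hash and push its ID to the global and per-user logs.

    `r` is assumed to be a redis-py client; `action` is e.g. "joinMeeting".
    Returns the new event ID as a string. (Hypothetical helper.)
    """
    now = now or datetime.now()
    event_id = str(r.incr("eventid"))  # incrementing counter gives the ID
    r.hset("event:" + event_id, mapping={
        "user": user_id,
        "type": action + "_" + target_id,
        "time": now.strftime(TIME_FORMAT),
    })
    r.lpush("events", event_id)                       # global log
    r.lpush("user:" + user_id + ":events", event_id)  # personal log
    return event_id


def parse_event_type(event_type):
    """Split an event_type such as "joinMeeting_2" into (action, target ID)."""
    action, _, target_id = event_type.rpartition("_")
    return action, target_id
```

Passing `now` explicitly keeps the helper deterministic, which is convenient when testing.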

## 3. Setup
This Python program was developed and tested on a Fedora 33 machine running [Redis 6.2.3](https://download.redis.io/releases/redis-6.2.3.tar.gz) in standalone mode. All functions have been written to reflect the specification as closely as possible.

Interfacing with the local Redis server is enabled by the [redis-py](https://github.com/andymccurdy/redis-py) client, which may need to be installed before execution (e.g. with `pip install redis`).

Make sure to run FLUSHDB on database 0 before and after running!

In [254]:
# imports
from datetime import datetime, timedelta
import redis
import sys

In [255]:
# establish connection to local redis server
# default port number used, enter your own if different
# decode_responses=True used to get strings instead of bytestrings
r = redis.Redis(host='localhost', port=6379, db=0, decode_responses=True)

For testing purposes some example objects have to be stored in Redis.

In [333]:
# user objects

user0 = { # system user
    "name" : "SYSTEM",
    "age" : "1621016560",
    "gender" : "AArch64",
    "email" : "127.0.0.1:993"
}

r.hset("user:0", mapping=user0)
r.hset("users:by:email", user0.get("email"), "0")

user1 = {
    "name" : "Mylonas, Michail",
    "age" : "23",
    "gender" : "Male",
    "email" : "mike@mmylonas.gr"
}

r.hset("user:1", mapping=user1)
r.hset("users:by:email", user1.get("email"), "1")

user2 = {
    "name" : "Feigl, Felicia",
    "age" : "21",
    "gender" : "Female",
    "email" : "ffeigl@fritzmail.de"
}

r.hset("user:2", mapping=user2)
r.hset("users:by:email", user2.get("email"), "2")

user3 = {
    "name" : "Vader, Darth",
    "age" : "44",
    "gender" : "Other",
    "email" : "darthy@yahoo.co.jp"
}

r.hset("user:3", mapping=user3)
r.hset("users:by:email", user3.get("email"), "3")

print(r.hgetall("user:1"))
print(r.hget("users:by:email", "mike@mmylonas.gr"))
print(r.hgetall("user:2"))
print(r.hget("users:by:email", "ffeigl@fritzmail.de"))
print(r.hgetall("user:3"))
print(r.hget("users:by:email", "darthy@yahoo.co.jp"))

{'name': 'Mylonas, Michail', 'age': '23', 'gender': 'Male', 'email': 'mike@mmylonas.gr'}
1
{'name': 'Feigl, Felicia', 'age': '21', 'gender': 'Female', 'email': 'ffeigl@fritzmail.de'}
2
{'name': 'Vader, Darth', 'age': '44', 'gender': 'Other', 'email': 'darthy@yahoo.co.jp'}
3


In [334]:
# meeting objects
meeting1 = {
    "title" : "Cooking with Paula",
    "description" : "British cuisine made easy! Drop the spices, grab the butter.",
    "ispublic" : "1"
}

r.hset("meeting:1", mapping=meeting1)

meeting2 = {
    "title" : "SuperMegaCorp Investors Keynote Series",
    "description" : "Please check your e-mail for the schedule.",
    "ispublic" : "0"
}

r.hset("meeting:2", mapping=meeting2)

meeting3 = { # meeting without instances - guaranteed to be inactive for testing
    "title" : "Pod Bay Door Opening Live Stream",
    "description" : "I'm sorry, Dave. I'm afraid I can't do that.",
    "ispublic" : "0"
}

r.hset("meeting:3", mapping=meeting3)

print(r.hgetall("meeting:1"))
print(r.hgetall("meeting:2"))
print(r.hgetall("meeting:3"))

{'title': 'Cooking with Paula', 'description': 'British cuisine made easy! Drop the spices, grab the butter.', 'ispublic': '1'}
{'title': 'SuperMegaCorp Investors Keynote Series', 'description': 'Please check your e-mail for the schedule.', 'ispublic': '0'}
{'title': 'Pod Bay Door Opening Live Stream', 'description': "I'm sorry, Dave. I'm afraid I can't do that.", 'ispublic': '0'}


In [335]:
# instance objects
    
instance1 = {
    "meeting" : "1",
    "from" : datetime.now().strftime("%d/%m/%Y, %H:%M:%S"),
    "to" : (datetime.now() + timedelta(hours=2)).strftime("%d/%m/%Y, %H:%M:%S")
}

r.hset("instance:1", mapping=instance1)
print(r.hgetall("instance:1"))

instance2 = {
    "meeting" : "2",
    "from" : "15/06/2021, 18:00:00",
    "to" : "15/06/2021, 22:00:00"
}

r.hset("instance:2", mapping=instance2)
print(r.hgetall("instance:2"))

instance3 = {
    "meeting" : "2",
    "from" : datetime.now().strftime("%d/%m/%Y, %H:%M:%S"),
    "to" : (datetime.now() + timedelta(hours=4)).strftime("%d/%m/%Y, %H:%M:%S")
}

r.hset("instance:3", mapping=instance3)
print(r.hgetall("instance:3"))

{'meeting': '1', 'from': '14/05/2021, 22:54:27', 'to': '15/05/2021, 00:54:27'}
{'meeting': '2', 'from': '15/06/2021, 18:00:00', 'to': '15/06/2021, 22:00:00'}
{'meeting': '2', 'from': '14/05/2021, 22:54:27', 'to': '15/05/2021, 02:54:27'}


## 4. The activateInstance() function

In [336]:
# Function to activate a meeting instance.
# Checks if, for given orderID, current datetime is between the from-to datetimes.
# Input: a valid orderID
# Effect: Instance becomes active, action is logged
# Errors: orderID invalid, instance is in past/future, instance already active.
def activateInstance(orderid):
    # get current timestamp
    now = datetime.now()
    
    # get from and to datetimes, convert from string
    # report an error if orderID does not exist
    begin = r.hget("instance:" + orderid, "from")
    if not begin:
        sys.stderr.write("Invalid orderID\n")
        return
    begin = datetime.strptime(begin, "%d/%m/%Y, %H:%M:%S")
    end = r.hget("instance:" + orderid, "to")
    end = datetime.strptime(end, "%d/%m/%Y, %H:%M:%S")
    
    # check if instance is in past/future
    if now < begin:
        sys.stderr.write("Instance in future\n")
        return
    if now > end:
        sys.stderr.write("Instance in past\n")
        return
    
    # report an error if instance is already active
    if r.sismember("instances:active", orderid):
        sys.stderr.write("Instance already active\n")
        return
    
    # add to active instances set
    r.sadd("instances:active", orderid)
    # create backreference to meetingid
    meetingid = r.hget("instance:" + orderid, "meeting")
    r.hset("instances:active:meetings", orderid, meetingid)
    # create meetingid reference to active instance
    r.hset("meetings:active:instances", meetingid, orderid)
    
    # log in main and system event logs
    eventid = str(r.incr("eventid"))
    event = {
        "user" : "0",
        "type" : "activate_" + orderid,
        "time" : now.strftime("%d/%m/%Y, %H:%M:%S")
    }
    r.hset("event:" + eventid, mapping=event)
    r.lpush("events", eventid)
    r.lpush("user:0:events", eventid)
    
    return

# Successful calls
activateInstance("1")
activateInstance("3")

# View effects of calls
print(r.smembers("instances:active"))
print(r.hget("instances:active:meetings", "1"))
print(r.hget("instances:active:meetings", "3"))
print(r.lrange("events", 0, -1))
userevents = r.lrange("user:0:events", 0, -1)
print(userevents)
for event in userevents:
    print(r.hgetall("event:" + event))

# Exception producing calls
activateInstance("2")
activateInstance("5")
activateInstance("1")

{'3', '1'}
1
2
['2', '1']
['2', '1']
{'user': '0', 'type': 'activate_3', 'time': '14/05/2021, 22:54:27'}
{'user': '0', 'type': 'activate_1', 'time': '14/05/2021, 22:54:27'}


Instance in future
Invalid orderID
Instance already active
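
The time-window check performed by activateInstance() does not depend on Redis and can be isolated as a pure function, which makes it easy to test with fixed timestamps. A minimal sketch; the names `instance_status` and `TIME_FORMAT` are hypothetical:

```python
from datetime import datetime

TIME_FORMAT = "%d/%m/%Y, %H:%M:%S"  # format of the "from"/"to" fields


def instance_status(begin_str, end_str, now):
    """Classify `now` relative to an instance's from/to strings.

    Returns "future" if the instance has not started, "past" if it has
    ended, and "current" otherwise. (Hypothetical helper.)
    """
    begin = datetime.strptime(begin_str, TIME_FORMAT)
    end = datetime.strptime(end_str, TIME_FORMAT)
    if now < begin:
        return "future"
    if now > end:
        return "past"
    return "current"
```

Only an instance classified as "current" would be eligible for activation.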


## 5. The joinInstance() function

In [337]:
# Function for user to join active instance
# Checks set of active instances for given orderid.
# If active and non-public, user e-mail address is added to audience.
# Input: valid userID, valid orderID
# Effects: if meeting is private, user e-mail is added to audience
# and action is logged (nothing is recorded for public meetings)
# Errors: Invalid userid, invalid orderid, meeting not active, user already joined.
def joinInstance(userid, orderid):
    # get user email and meeting to be joined
    # report an error if one of them does not exist
    email = r.hget("user:" + userid, "email")
    if not email:
        sys.stderr.write("Invalid userID\n")
        return
    meeting = r.hget("instance:" + orderid, "meeting")
    if not meeting:
        sys.stderr.write("Invalid orderID\n")
        return
    
    # report an error if instance exists but is inactive
    if not r.sismember("instances:active", orderid):
        sys.stderr.write("Inactive instance\n")
        return
    
    # check if instance is private, then check if user is already in audience
    # if not add user email to meeting audience
    ispublic = r.hget("meeting:" + meeting, "ispublic")
    if ispublic == "0":
        if r.sismember("meeting:" + meeting + ":audience", email):
            sys.stderr.write("User already in meeting\n")
            return
        
        r.sadd("meeting:" + meeting + ":audience", email)
        
        # log in main and user event logs
        now = datetime.now() # get current timestamp
        
        eventid = str(r.incr("eventid"))
        joinmeeting = {
            "user" : userid,
            "type" : "joinMeeting_" + meeting,
            "time" : now.strftime("%d/%m/%Y, %H:%M:%S")
        }
        r.hset("event:" + eventid, mapping=joinmeeting)
        r.lpush("events", eventid)
        r.lpush("user:" + userid + ":events", eventid)
        
        # Print to view effects
        print(r.hgetall("event:" + eventid))
        print(r.lrange("events", 0, -1))
        print(r.lrange("user:" + userid + ":events", 0, -1))
    
    return
    
# Successful calls
joinInstance("1", "1")
joinInstance("1", "3") # orderID = 3; meetingID = 2
joinInstance("2", "3")
joinInstance("3", "3")

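# View effects of calls
# showParticipants() is defined in section 7 and must have been run first
print(showParticipants("1")) # prints None - meeting is public, attendance not logged
print(showParticipants("2"))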

# Exception producing calls
joinInstance("4", "3")
joinInstance("3", "5")
joinInstance("1", "2")

{'user': '1', 'type': 'joinMeeting_2', 'time': '14/05/2021, 22:54:28'}
['3', '2', '1']
['3']
{'user': '2', 'type': 'joinMeeting_2', 'time': '14/05/2021, 22:54:28'}
['4', '3', '2', '1']
['4']
{'user': '3', 'type': 'joinMeeting_2', 'time': '14/05/2021, 22:54:28'}
['5', '4', '3', '2', '1']
['5']
None
{'mike@mmylonas.gr', 'ffeigl@fritzmail.de', 'darthy@yahoo.co.jp'}


Public meeting - attendance not logged
Invalid userID
Invalid orderID
Inactive instance
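
joinInstance() first checks whether the user is already in the audience and then adds the e-mail with a separate SADD. Because SADD itself reports how many members were actually added, the check and the insertion can be folded into a single command, which also avoids a race between two clients joining at the same time. A sketch of this alternative, assuming `r` is a redis-py client; `join_audience` is a hypothetical name:

```python
def join_audience(r, meeting_id, email):
    """Add `email` to the audience set of a private meeting.

    Returns True if the user was added, False if already present.
    `r` is assumed to be a redis-py client. (Hypothetical helper.)
    """
    # SADD returns the number of members that were newly added (0 or 1 here)
    added = r.sadd("meeting:" + meeting_id + ":audience", email)
    return added == 1
```

A caller would write "User already in meeting" to stderr when the function returns False.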


## 6. The leaveMeeting() function

In [338]:
# Function for user to leave active instance
# Checks attendees for email of given userID.
# If active and non-public, user e-mail address is removed.
# Input: valid userID, valid meetingID.
# Effects: user e-mail removed from audience,
# action is logged
# Errors: Invalid userid, invalid meeting, public meeting, meeting not active, user already left.
def leaveMeeting(userid, meetingid):
    # get user email and meeting to be left
    # report an error if one of them does not exist
    email = r.hget("user:" + userid, "email")
    if not email:
        sys.stderr.write("Invalid userID\n")
        return
    ispublic = r.hget("meeting:" + meetingid, "ispublic")
    if not ispublic:
        sys.stderr.write("Invalid meetingID\n")
        return
    
    # public meetings keep no audience list
    if ispublic == "1":
        sys.stderr.write("Public meeting - attendance not logged\n")
        return
    
    # check if meeting is active by checking audience set
    # an empty (non-existent) set is treated as an inactive meeting
    audience = r.smembers("meeting:" + meetingid + ":audience")
    if not audience:
        sys.stderr.write("Inactive meeting\n")
        return
    
    # report an error if user is not in audience
    if email not in audience:
        sys.stderr.write("User not in meeting\n")
        return
    
    # remove user from audience
    r.srem("meeting:" + meetingid + ":audience", email)
    
    # log in main and user event logs
    now = datetime.now() # get current timestamp
    
    eventid = str(r.incr("eventid"))
    leavemeeting = {
        "user" : userid,
        "type" : "leaveMeeting_" + meetingid,
        "time" : now.strftime("%d/%m/%Y, %H:%M:%S")
    }
    r.hset("event:" + eventid, mapping=leavemeeting)
    r.lpush("events", eventid)
    r.lpush("user:" + userid + ":events", eventid)
    
    # Print to view effects
    print(r.hgetall("event:" + eventid))
    print(r.lrange("events", 0, -1))
    print(r.lrange("user:" + userid + ":events", 0, -1))
    
    return

# Successful call
leaveMeeting("1", "2")

# View effect of call
print(showParticipants("2"))

# Exception producing calls
leaveMeeting("5", "2")
leaveMeeting("2", "4")
leaveMeeting("1", "1")
leaveMeeting("1", "3")
leaveMeeting("1", "2")

{'user': '1', 'type': 'leaveMeeting_2', 'time': '14/05/2021, 22:54:28'}
['6', '5', '4', '3', '2', '1']
['6', '3']
{'darthy@yahoo.co.jp', 'ffeigl@fritzmail.de'}


Invalid userID
Invalid meetingID
Public meeting - attendance not logged
Inactive meeting
User not in meeting


## 7. The showParticipants() function

In [339]:
# Function to show active meeting participants
# Input: valid meetingID.
# Output: audience set of meetingID
# Errors: invalid meetingID, public meeting, meeting not active
def showParticipants(meetingid):
    # check if meeting exists
    # if not report an error
    ispublic = r.hget("meeting:" + meetingid, "ispublic")
    if not ispublic:
        sys.stderr.write("Invalid meetingID\n")
        return
    
    # check if meeting is private (attendance logged)
    # if not report an error
    if ispublic == "1":
        sys.stderr.write("Public meeting - attendance not logged\n")
        return
    
    # check if meeting is active by checking audience set
    # an empty (non-existent) set is treated as an inactive meeting
    audience = r.smembers("meeting:" + meetingid + ":audience")
    if not audience:
        sys.stderr.write("Inactive meeting\n")
        return
    
    return audience

# Successful calls
print(showParticipants("2"))

# Exception producing calls
showParticipants("5")
showParticipants("1")
showParticipants("3")

{'darthy@yahoo.co.jp', 'ffeigl@fritzmail.de'}


Invalid meetingID
Public meeting - attendance not logged
Inactive meeting
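
showParticipants() infers that a meeting is inactive from an empty audience set, so an active private meeting that nobody has joined yet is also reported as inactive. A possible alternative is to consult the `meetings:active:instances` hash from section 2 instead. This is a sketch, assuming `r` is a redis-py client and that activateInstance() and endMeeting() keep that hash up to date; `is_meeting_active` is a hypothetical name:

```python
def is_meeting_active(r, meeting_id):
    """Return True if the meeting currently has an active instance.

    Relies on the meetingID -> orderID hash maintained on activation
    and removal. `r` is assumed to be a redis-py client.
    (Hypothetical helper.)
    """
    # HEXISTS tests for the field without fetching its value
    return bool(r.hexists("meetings:active:instances", meeting_id))
```

With such a check, an empty audience could be returned as an empty set rather than treated as an error.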


## 8. The showActiveMeetings() function

In [340]:
# Function to show active meetings using backreferencing
# Input: none
# Output: set of active meetings
def showActiveMeetings():
    active = r.smembers("instances:active")
    meetings = set()
    for instance in active:
        meetings.add(r.hget("instances:active:meetings", instance))
        
    return meetings

print(showActiveMeetings())

{'2', '1'}
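
showActiveMeetings() issues one HGET per active instance. Since the `instances:active:meetings` hash only ever contains active instances, its values could also be read in a single round trip with HVALS. A sketch of that variant, assuming `r` is a redis-py client and that the hash and the `instances:active` set are kept in sync; `show_active_meetings` is a hypothetical name:

```python
def show_active_meetings(r):
    """Return the set of meetingIDs that have an active instance.

    Reads all values of the orderID -> meetingID hash with one HVALS call.
    `r` is assumed to be a redis-py client. (Hypothetical helper.)
    """
    # a set removes duplicates if several instances share a meeting
    return set(r.hvals("instances:active:meetings"))
```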


## 9. The endMeeting() function

In [341]:
# Function to end the active instance of a meeting
# Input: valid meetingID of a meeting with an active instance
# Effects: remaining audience members leave (logged per user), audience set is deleted,
# instance is no longer active, action is logged
# Errors: meeting inactive or non-existent
def endMeeting(meetingid):
    # get orderid of the active instance through the meeting backreference
    # a missing entry means the meeting is inactive or does not exist
    orderid = r.hget("meetings:active:instances", meetingid)
    if not orderid:
        sys.stderr.write("Inactive meeting\n")
        return

    # private meetings keep an audience set: log an exit for each remaining member
    # the set is read directly, as it may be empty while the meeting is active
    ispublic = r.hget("meeting:" + meetingid, "ispublic")
    if ispublic == "0":
        audience = r.smembers("meeting:" + meetingid + ":audience")
        for email in audience:
            userid = r.hget("users:by:email", email)
            # log exit in main and user event logs
            now = datetime.now() # get current timestamp

            eventid = str(r.incr("eventid"))
            leavemeeting = {
                "user" : userid,
                "type" : "leaveMeeting_" + meetingid,
                "time" : now.strftime("%d/%m/%Y, %H:%M:%S")
            }
            r.hset("event:" + eventid, mapping=leavemeeting)
            r.lpush("events", eventid)
            r.lpush("user:" + userid + ":events", eventid)

            # Print to view effects
            print(r.hgetall("event:" + eventid))
            print(r.lrange("events", 0, -1))
            print(r.lrange("user:" + userid + ":events", 0, -1))

        r.delete("meeting:" + meetingid + ":audience")

    # deactivate the instance and remove both backreferences
    r.srem("instances:active", orderid)
    r.hdel("instances:active:meetings", orderid)
    r.hdel("meetings:active:instances", meetingid)

    # log meeting end in main and system event logs
    now = datetime.now() # get current timestamp

    eventid = str(r.incr("eventid"))
    endmeeting = {
        "user" : "0",
        "type" : "endMeeting_" + meetingid,
        "time" : now.strftime("%d/%m/%Y, %H:%M:%S")
    }
    r.hset("event:" + eventid, mapping=endmeeting)
    r.lpush("events", eventid)
    r.lpush("user:0:events", eventid)

    # Print to view effects
    print(r.hgetall("event:" + eventid))
    print(r.lrange("events", 0, -1))
    print(r.lrange("user:0:events", 0, -1))
    return

# Successful calls
endMeeting("1")
endMeeting("2")

# View effect of call
print(r.smembers("instances:active"))

# Exception producing calls
endMeeting("1")
endMeeting("3")
endMeeting("4")

{'user': '0', 'type': 'endMeeting_1', 'time': '14/05/2021, 22:54:36'}
['7', '6', '5', '4', '3', '2', '1']
['7', '2', '1']
{'user': '3', 'type': 'leaveMeeting_2', 'time': '14/05/2021, 22:54:36'}
['8', '7', '6', '5', '4', '3', '2', '1']
['8', '5']
{'user': '2', 'type': 'leaveMeeting_2', 'time': '14/05/2021, 22:54:36'}
['9', '8', '7', '6', '5', '4', '3', '2', '1']
['9', '4']
{'user': '0', 'type': 'endMeeting_2', 'time': '14/05/2021, 22:54:36'}
['10', '9', '8', '7', '6', '5', '4', '3', '2', '1']
['10', '7', '2', '1']
set()


Inactive meeting
Inactive meeting
Inactive meeting


## 10. Conclusion

Through this assignment I acquired first-hand experience in using Redis, integrating it into software and managing data using its storage model.

Due to unforeseen circumstances, and the lack of time caused by them, I was unfortunately not able to successfully implement all functions in the specification.

Had I implemented the chat functionality, I would have used a left-pushed list named "meeting:{meetingID}:chat" for each meeting. The list would hold messageIDs referencing "message:{messageID}" hashes with "userID", "content", and "timestamp" fields, which would make the relevant functions easy to implement.
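
The chat design outlined above could look roughly as follows. This is an untested sketch of an unimplemented feature, not part of the submitted program: `r` is assumed to be a redis-py client, and the counter key `messageid` as well as the names `post_chat_message`, `show_chat` and `TIME_FORMAT` are hypothetical.

```python
from datetime import datetime

TIME_FORMAT = "%d/%m/%Y, %H:%M:%S"  # same timestamp format as the event log


def post_chat_message(r, meeting_id, user_id, content, now=None):
    """Store a chat message hash and left-push its ID to the meeting's chat.

    `r` is assumed to be a redis-py client. Returns the new message ID.
    (Hypothetical helper.)
    """
    now = now or datetime.now()
    message_id = str(r.incr("messageid"))  # hypothetical counter key
    r.hset("message:" + message_id, mapping={
        "userID": user_id,
        "content": content,
        "timestamp": now.strftime(TIME_FORMAT),
    })
    r.lpush("meeting:" + meeting_id + ":chat", message_id)
    return message_id


def show_chat(r, meeting_id):
    """Return the meeting's messages in chronological order (oldest first)."""
    # LPUSH keeps the newest ID at index 0, so the ID list is reversed
    message_ids = r.lrange("meeting:" + meeting_id + ":chat", 0, -1)
    return [r.hgetall("message:" + mid) for mid in reversed(message_ids)]
```

Filtering the result of `show_chat` by the "userID" field would give the messages a single user posted in a meeting.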

Meanwhile, showing the time at which participants joined an active meeting would be implemented by looking up the audience list and, for each member, the user event log where this information is stored, stopping at the first matching join entry (as a user might have produced multiple join/leave entries). A different implementation would keep this information in a separate per-meeting key (e.g. a set).
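
The event-log lookup described above might be sketched as follows for a single user. This is an illustration of an unimplemented feature: `r` is assumed to be a redis-py client created with `decode_responses=True`, and `last_join_time` is a hypothetical name.

```python
def last_join_time(r, user_id, meeting_id):
    """Return the timestamp string of the user's most recent join, or None.

    Scans the user's personal event log for a joinMeeting event of the
    given meeting. `r` is assumed to be a redis-py client.
    (Hypothetical helper.)
    """
    wanted = "joinMeeting_" + meeting_id
    # the log is left-pushed, so index 0 holds the newest event
    for event_id in r.lrange("user:" + user_id + ":events", 0, -1):
        event = r.hgetall("event:" + event_id)
        if event.get("type") == wanted:
            return event.get("time")  # stop at the first (newest) match
    return None
```

Calling it for every e-mail in the audience set, after resolving each e-mail to a userID through `users:by:email`, would produce the full list of join times.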

## References

`[1]` [The Little Redis Book](https://github.com/karlseguin/the-little-redis-book)  
`[2]` [Redis Quick Start](https://redis.io/topics/quickstart)  
`[3]` [Redis command list](https://redis.io/commands)  
`[4]` [How to Use Redis With Python](https://realpython.com/python-redis/)