## Redis Cache with MongoDB

#### This project is a caching and rate-limiting layer for customer data stored in a MongoDB database. Redis caches frequently accessed records, and a rate limiter caps how often the lookup function can be called. The program also checks MongoDB for changed documents and invalidates the matching cache entries, so that cached data stays consistent with the database. Together, caching and rate limiting speed up customer lookups and reduce the risk of overwhelming the database with requests.

When running the provided code, you can try the following experiments to understand and test its behavior:

Testing Rate Limiting:

Call `get_customer_data` (for example, by running `main()`) more than 5 times within 60 seconds in the same Python session. The first 5 calls should succeed, and the 6th should be rejected with a rate limit exceeded message.

Cache Expiry Testing:

Shorten the cache expiry and observe how quickly cached data expires. The expiry is the `expiry_seconds` argument passed to `set_data_to_redis_with_expiry` from `get_customer_data` (60 seconds in this notebook). Lower it, then fetch the same customer several times to see when the entry expires and the data is fetched from MongoDB again.

Database Change Monitoring:

Change the MongoDB database and observe how cache invalidation responds. Manually insert, update, or delete documents in the MongoDB 'customers' collection, then check whether the matching cache entries are invalidated. Note that `monitor_database_changes` detects changes through each document's `timestamp` field, so a change is only picked up if it sets a newer `timestamp`.
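
The timestamp-based detection can be tried without a running database. The sketch below is an in-memory stand-in, not the notebook's code: `poll_changes`, the sample documents, and their timestamps are hypothetical, a plain dict plays the role of Redis, and a list plays the role of the 'customers' collection. It assumes every document carries a numeric `timestamp` field that is raised on each write.

```python
def poll_changes(documents, cache, last_seen):
    """Invalidate cache entries for documents changed after last_seen.

    Returns (invalidated_keys, new_last_seen).
    """
    # Equivalent in spirit to find({'timestamp': {'$gt': last_seen}}).sort('timestamp')
    changed = sorted(
        (doc for doc in documents if doc["timestamp"] > last_seen),
        key=lambda doc: doc["timestamp"],
    )
    invalidated = []
    for doc in changed:
        key = str(doc["customer_id"])
        cache.pop(key, None)             # drop the stale cache entry, if any
        invalidated.append(key)
        last_seen = doc["timestamp"]     # advance the high-water mark
    return invalidated, last_seen


# Hypothetical sample data (made up for this illustration).
documents = [
    {"customer_id": 1001, "timestamp": 10.0},
    {"customer_id": 1003, "timestamp": 20.0},
]
cache = {"1001": "cached-1001", "1003": "cached-1003"}

first_pass, mark = poll_changes(documents, cache, last_seen=0)

# Simulate a later read that re-caches 1003, followed by an update to it.
cache["1003"] = "cached-1003"
documents[1]["timestamp"] = 30.0
second_pass, mark = poll_changes(documents, cache, mark)

# Nothing changed since the last pass, so nothing is invalidated.
third_pass, mark = poll_changes(documents, cache, mark)
```

One limitation of this approach is worth keeping in mind when testing deletes: a deleted document no longer appears in any query result, so polling by timestamp alone cannot notice it.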

Additional Features:

The functionality can be extended with features such as authentication, data validation, pagination, or additional caching strategies (e.g., caching at different levels, such as at the application level or in a distributed cache).
By experimenting with these functionalities, you can gain a better understanding of how the code behaves under different scenarios and identify potential issues or areas for improvement.

#### These lines import the required libraries and modules: time, json, pymongo (the MongoDB driver), redis (the Redis client), functools, ratelimit (for rate limiting), json_util from bson (the Binary JSON package that ships with pymongo), and threading.


In [None]:
import time
import json
import pymongo
import redis
import functools
from ratelimit import limits, RateLimitException
from bson import json_util
import threading


#### This section initializes connections to MongoDB and Redis, and specifies the database and collection to be used.

In [3]:
# MongoDB and Redis setup
client = pymongo.MongoClient('mongodb://mongodb:27017/') 
redis_client = redis.StrictRedis(host='redis', port=6379, db=0) 
db = client['shaunakdatabase']
collection = db['customers']



#### This function inserts data into the MongoDB collection.


In [4]:
def insert_data_to_mongodb(data):
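    # Use the 'customers' collection object defined above;
    # db.collection would refer to a collection literally named "collection"
    collection.insert_many(data)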

#### This is a decorator function that applies rate limiting to other functions. It wraps the decorated function and limits the number of calls within a specified period.

In [5]:
def rate_limited(max_calls=5, period=60):
    def decorator(func):
        # Apply the limit to an inner callable so that the RateLimitException
        # raised by `limits` can be caught in the wrapper below
        limited = limits(calls=max_calls, period=period)(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return limited(*args, **kwargs)
            except RateLimitException:
                print("Too many calls. Please try again later.")
                return {"error": "Rate limit exceeded. Please try again later."}, 429
        return wrapper
    return decorator
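
Waiting a full minute to watch the limit reset is slow, so the behavior can be explored with a stand-in. The sketch below is not the `ratelimit` package: `fixed_window_limit`, `with_error_response`, `RateLimitExceeded`, and `lookup` are hypothetical names, and the limiter is a simple fixed-window counter with an injectable clock so the example does not have to sleep. It also illustrates why the exception has to be caught outside the limited function: the limiter raises before the wrapped function body ever runs.

```python
import functools
import time


class RateLimitExceeded(Exception):
    """Raised when the call budget for the current window is used up."""


def fixed_window_limit(max_calls=5, period=60, clock=time.monotonic):
    """Allow at most max_calls per period seconds (illustrative stand-in)."""
    def decorator(func):
        state = {"window_start": clock(), "calls": 0}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            now = clock()
            # Start a fresh window once the current one has elapsed.
            if now - state["window_start"] >= period:
                state["window_start"] = now
                state["calls"] = 0
            state["calls"] += 1
            if state["calls"] > max_calls:
                raise RateLimitExceeded("too many calls")
            return func(*args, **kwargs)
        return wrapper
    return decorator


def with_error_response(func):
    """Turn RateLimitExceeded into an (error body, 429) tuple."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except RateLimitExceeded:
            return {"error": "Rate limit exceeded. Please try again later."}, 429
    return wrapper


fake_now = [0.0]  # mutable fake clock, in seconds


@with_error_response
@fixed_window_limit(max_calls=5, period=60, clock=lambda: fake_now[0])
def lookup(customer_id):
    return {"customer_id": customer_id}


results = [lookup(1003) for _ in range(6)]  # five successes, then one rejection
fake_now[0] = 61.0                          # move past the 60-second window
after_reset = lookup(1003)                  # allowed again
```

Splitting the limiter and the error handling into two decorators keeps each one small; the notebook's `rate_limited` combines both roles in a single decorator.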


#### This function sets data to Redis with an expiration time (expiry_seconds). It is used to cache data retrieved from MongoDB with a specified expiry time.

In [6]:
def set_data_to_redis_with_expiry(key, data, expiry_seconds):
    redis_client.setex(key, expiry_seconds, data)
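
To get a feel for expiry without a Redis server, the following sketch mimics the set-with-expiry and get behavior in memory. `TTLCache` is a hypothetical stand-in, not a Redis client, and it only approximates expiry by checking a deadline on read; the fake clock is an assumption made so that the example runs instantly.

```python
import time


class TTLCache:
    """In-memory stand-in for a key-value store with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (value, expires_at)

    def setex(self, key, ttl_seconds, value):
        # Same argument order as the notebook's setex call: key, expiry, value.
        self._store[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # lazily evict the expired entry
            return None
        return value


fake_now = [0.0]  # mutable fake clock, in seconds
cache = TTLCache(clock=lambda: fake_now[0])

cache.setex("1003", 5, '{"customer_id": 1003}')
before_expiry = cache.get("1003")  # still cached

fake_now[0] = 5.0                  # five seconds later
after_expiry = cache.get("1003")   # expired, so the caller would go to MongoDB
```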


#### The functions below invalidate the cache and monitor MongoDB for changes. invalidate_cache deletes the entry for a given customer_id from Redis, removing outdated data from the cache. monitor_database_changes reads the timestamp of the last processed change from Redis, queries the database for documents changed since that timestamp, and invalidates the corresponding cache entries.

In [7]:
def invalidate_cache(customer_id):
    redis_client.delete(customer_id)

def monitor_database_changes():
    try:
        # Retrieve the most recent timestamp of the last change processed
        last_change_timestamp = redis_client.get('last_change_timestamp')
        if last_change_timestamp is None:
            last_change_timestamp = 0
        else:
            last_change_timestamp = float(last_change_timestamp)

        # Query the database for changes since the last processed change
        changes = db['customers'].find(
            {'timestamp': {'$gt': last_change_timestamp}}
        ).sort([('timestamp', pymongo.ASCENDING)])

        # Count the number of changes
        num_changes = db['customers'].count_documents({'timestamp': {'$gt': last_change_timestamp}})

        # If updates are found, print a message and list the Redis IDs that will be deleted
        if num_changes > 0:
            print(f"{num_changes} updates found in the MongoDB database.")
            print("Redis IDs to be deleted:")
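            for change in changes:
                # Cache entries are keyed by customer_id (see get_customer_data), not by _id
                redis_id = change['customer_id']
                print(redis_id)
                invalidate_cache(redis_id)
                # Results are sorted ascending, so the last document seen is the newest
                last_change_timestamp = change['timestamp']

            # Update the last change timestamp in Redis
            redis_client.set('last_change_timestamp', last_change_timestamp)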
        else:
            print("No updates found in the MongoDB database.")

    except Exception as e:
        print(f"Error in monitoring database changes: {e}")

def start_cache_invalidation():
    # Start cache invalidation process
    monitor_database_changes()
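
As written, `start_cache_invalidation` checks for changes once. One possible way to keep checking while the program runs is a background thread, sketched below. This is an assumption about how the imported `threading` module might be used, not something the notebook does: `start_polling` and `fake_poll` are hypothetical names, and `fake_poll` merely stands in for `monitor_database_changes`.

```python
import threading


def start_polling(poll, interval_seconds, stop_event):
    """Call poll() repeatedly on a daemon thread until stop_event is set."""
    def loop():
        while not stop_event.is_set():
            poll()
            # wait() acts as a sleep that ends early when stop_event is set.
            stop_event.wait(interval_seconds)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread


calls = []
three_polls_done = threading.Event()
stop = threading.Event()


def fake_poll():
    # Placeholder for monitor_database_changes(); it only records each call.
    calls.append(len(calls) + 1)
    if len(calls) >= 3:
        three_polls_done.set()


worker = start_polling(fake_poll, interval_seconds=0.01, stop_event=stop)
three_polls_done.wait(timeout=2)  # let a few polls happen
stop.set()                        # ask the loop to finish
worker.join(timeout=2)
```

Using an `Event` for both the stop signal and the delay means the thread shuts down promptly instead of finishing a full sleep interval first.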


#### The below function retrieves data from MongoDB based on the given customer_id. It is used to fetch data from MongoDB when it's not found in the cache. 

In [8]:
# MongoDB data retrieval
def get_data_from_mongodb(customer_id):
    # Query the 'customers' collection object, not a collection named "collection"
    document = collection.find_one({"customer_id": customer_id})
    if document:
        document["_id"] = str(document["_id"])
        return json_util.dumps(document)
    else:
        return None


#### The next function retrieves customer data, first checking the Redis cache. If data is not found in the cache, it fetches it from MongoDB, sets it in the cache, and returns the data. It is rate-limited to prevent abuse.

In [None]:
@rate_limited(max_calls=5, period=60)
def get_customer_data(customer_id):
    # The rate limit is enforced by the decorator before this body runs,
    # so RateLimitException cannot be caught here
    data = redis_client.get(customer_id)
    if data:
        print("Data found in Redis cache.")
        # Redis returns bytes; decode back to the JSON string that was cached
        return data.decode("utf-8")
    print("Data not found in Redis cache. Fetching from MongoDB.")
    data = get_data_from_mongodb(customer_id)
    if data:
        # data is already a JSON string, so cache it as-is for 60 seconds
        set_data_to_redis_with_expiry(customer_id, data, 60)
    return data
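
The lookup order described above (cache first, then database, then populate the cache) is often called the cache-aside pattern. The sketch below shows it in isolation with in-memory stand-ins: `get_with_cache`, `load_from_db`, and the sample record are hypothetical, a dict plays the role of Redis, and another dict plays the role of MongoDB. It serializes the record once on the way into the cache and parses it once on the way out.

```python
import json


def get_with_cache(key, cache, load_from_db):
    """Return (record, source), where source is "cache" or "database"."""
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached), "cache"
    record = load_from_db(key)
    if record is not None:
        cache[key] = json.dumps(record)  # populate the cache for later reads
    return record, "database"


# Hypothetical sample record (made up for this illustration).
database = {1003: {"customer_id": 1003, "name": "example"}}
cache = {}
db_reads = []


def load_from_db(customer_id):
    db_reads.append(customer_id)  # record every trip to the "database"
    return database.get(customer_id)


first = get_with_cache(1003, cache, load_from_db)    # miss: loaded and cached
second = get_with_cache(1003, cache, load_from_db)   # hit: served from the cache
missing = get_with_cache(9999, cache, load_from_db)  # unknown id: nothing cached
```

In this sketch a missing record is not cached, so repeated lookups for an unknown id reach the database every time; the notebook's `get_customer_data` behaves the same way.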


#### The main() function serves as the entry point of the program. It checks if the MongoDB collection exists, drops it if necessary, inserts data into the collection, starts the cache invalidation process, and retrieves customer data with rate limiting and caching.

In [9]:
def main():
    try:
        # Check if the collection exists and drop it if necessary
        if collection.name in db.list_collection_names():
            db.drop_collection(collection)
            print(f"Collection '{collection}' dropped successfully.")
        else:
            print('Collection not found in database')
            print('Making the database collection')
        # Insert data into MongoDB collection
        with open('shaunak_ecommerce.json', 'r') as file:
            json_data = json.load(file)
        insert_data_to_mongodb(json_data)
        print("Data inserted into MongoDB collection 'customers'.")

        # Start cache invalidation process
        start_cache_invalidation()

        # Retrieve customer data with rate limiting and caching
        customer_id = 1003
        data = get_customer_data(customer_id)
        if data:
            print("Customer Data for customer_id", customer_id, ":", data)
        else:
            print("No data found for customer_id:", customer_id)
    except RateLimitException:
        print("Too many calls. Please try again later.")


In [11]:
if __name__ == "__main__":
    main()

Collection 'Collection(Database(MongoClient(host=['mongodb:27017'], document_class=dict, tz_aware=False, connect=True), 'shaunakdatabase'), 'customers')' dropped successfully.
Data inserted into MongoDB collection 'customers'.
No updates found in the MongoDB database.
Data found in Redis cache.
Customer Data for customer_id 1003 : {"_id": "65fdd0b51186834fcf07c4cf", "customer_id": 1003, "customer_name": {"first_name": "Niharika", "last_name": "Dhande", "nickname": "Niha"}, "contact_details": {"email": "niharikadhande@gmail.com", "phone": {"home": "+1-9922753866", "work": "+1-9960646333"}}, "address": {"address_line1": "5431 Hartwick road", "address_line2": "Near North Campus Commons", "city": {"city_name": "College Park", "pincode": "20743", "country": "United States"}}, "billing_info": {"credit_card": {"card_number": "**** **** **** 4644", "expiration_date": "11/28", "cvv": "***"}}, "interactions": {"viewed_products": {"electronics": {"smartphones": [{"product_name": "Google Pixel 2",