# Case Study of Scaler Programming Platform: Scalable System Design, Load Balancing, Caching, and Sharding

These notes cover the key concepts, mechanisms, and examples involved in designing a scalable system, using a programming competition at Scaler Neovarsity as the running scenario. The content is organized into sections, each covering one part of the design.

---

## Scenario Overview

When a student, such as **Sainadh**, logs into the Scaler Neovarsity programming competition, he sees a web interface that lists the programming questions and the students who are currently online.

### Web Interface Design

Here’s a structured visualization of the interface:

```
+-------------------------------------------------------+
|                     Web Interface                      |
+-------------------------------------------------------+
|                     Hi "Sainadh"                      |
|                                                       |
|  +-----------------------------------------------+    |
|  |               Programming Questions           |    |
|  | (1) Question Text                             |    |
|  | (2) Question Text                             |    |
|  | (3) Question Text                             |    |
|  |                                               |    |
|  +-----------------------------------------------+    |
|                                                       |
|         Currently Online Students:                    |
|  +-----------------------------------------------+    |
|  | * Alok                                        |    |
|  | * Prakash                                     |    |
|  | * Krishna                                     |    |
|  | * Prasad                                      |    |
|  | * John                                        |    |
|  | * Ali                                         |    |
|  | * Rishab                                      |    |
|  +-----------------------------------------------+    |
+-------------------------------------------------------+
```

---

## Design of Container for Currently Online Students

To effectively track the online status of students, we will utilize two SQL tables:

### 1. Students Table

This table contains unique identifiers and details for each student.

| Column Name     | Data Type   | Description                               |
|------------------|-------------|-------------------------------------------|
| roll_number      | INT         | Unique integer identifier for the student |
| student_name     | VARCHAR     | Name of the student                       |
| department        | VARCHAR     | Department to which the student belongs   |

### 2. Logged In Table

This table tracks whether students are online or offline.

| Column Name      | Data Type   | Description                               |
|------------------|-------------|-------------------------------------------|
| roll_number      | INT         | Foreign Key referencing Students table    |
| is_online        | BOOLEAN     | Indicates whether the student is logged in |
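
The two tables above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, assuming an in-memory SQLite database in place of the production RDBMS; the table names `students` and `logged_in`, the helper `online_students`, and the sample rows are chosen only for the example.

```python
import sqlite3

# In-memory SQLite database, used here only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (
        roll_number  INTEGER PRIMARY KEY,
        student_name VARCHAR(100) NOT NULL,
        department   VARCHAR(100)
    );
    CREATE TABLE logged_in (
        roll_number INTEGER PRIMARY KEY REFERENCES students(roll_number),
        is_online   BOOLEAN NOT NULL DEFAULT 0
    );
""")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "Alok", "CSE"), (2, "Prakash", "ECE"), (3, "Krishna", "CSE")],
)
conn.executemany(
    "INSERT INTO logged_in VALUES (?, ?)",
    [(1, 1), (2, 0), (3, 1)],
)

def online_students(connection):
    """Return the names of students currently marked online."""
    rows = connection.execute(
        """
        SELECT s.student_name
        FROM students AS s
        JOIN logged_in AS l ON l.roll_number = s.roll_number
        WHERE l.is_online = 1
        ORDER BY s.student_name
        """
    )
    return [name for (name,) in rows]

print(online_students(conn))
```

The join is what the "Currently Online Students" panel would run: it reads names from the Students table but filters on the flag kept in the Logged In table.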

### Storage Estimation

- Estimated size for the logged-in table:
    - Each user: 4 bytes (for roll number) + 4 bytes (for online status) = **8 bytes**. This is a conservative figure, since a BOOLEAN column typically needs only 1 byte.
    - Total for **1 billion users**:
$8 \, \text{bytes/user} \times 10^9 \, \text{users} = 8 \, \text{GB}$


### Do We Need Horizontal Scaling for This?
Given a total storage requirement of **8 GB** against an assumed single-machine disk capacity of **1 TB**, the table fits comfortably on one machine, so **horizontal scaling is not required** at this point.

### Flowchart to Visualize the Overall Scenario

```mermaid
flowchart TD
    A[User Login] --> B[Welcome Message]
    B --> C[Display Programming Questions]
    B --> D[Display Currently Online Users]
    D --> E[Fetch User Data from SQL Tables]
```

---

## Monitoring User Status

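A login flag alone can misrepresent user presence: if a student closes the browser or loses connectivity without logging out, the session goes stale and the student still appears online. Hence, we should implement a **Heartbeat Mechanism**.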

### Heartbeat Mechanism Explained

- **Concept**: Each user's client sends a ping (heartbeat) signal to the server at a fixed interval (e.g., every 5 seconds).
- **Data Storage**: Heartbeat data can be stored in a dedicated Heartbeat Table:

| Column Name      | Data Type   | Description                               |
|------------------|-------------|-------------------------------------------|
| user_id          | INT         | Roll number of the user                   |
| heartbeat        | TIMESTAMP   | Last ping time from the user              |

**Online Status Evaluation**: If the difference between the current time and the last heartbeat is greater than a predefined threshold (e.g., 30 seconds), the user will be marked as offline.
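
The evaluation rule above can be sketched as a small function. This is a minimal illustration, assuming the example values from these notes (5-second pings, 30-second threshold); the names `is_online`, `HEARTBEAT_INTERVAL`, and `OFFLINE_THRESHOLD` are hypothetical.

```python
from datetime import datetime, timedelta

HEARTBEAT_INTERVAL = timedelta(seconds=5)   # how often clients ping
OFFLINE_THRESHOLD = timedelta(seconds=30)   # silence tolerated before "offline"

def is_online(last_heartbeat, now, threshold=OFFLINE_THRESHOLD):
    """A user is online if the last heartbeat is no older than the threshold."""
    return (now - last_heartbeat) <= threshold

now = datetime(2024, 1, 1, 12, 0, 0)
print(is_online(now - timedelta(seconds=12), now))  # two missed pings: still online
print(is_online(now - timedelta(seconds=45), now))  # silent past the threshold: offline
```

Because the threshold is several times the ping interval, a user who misses one or two pings is not immediately shown as offline.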

### Real-World Example
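A comparable pattern appears on sports websites like ESPN Cricinfo, where the page refreshes scores at regular intervals so users stay informed. A heartbeat applies the same periodic-request idea in the other direction: the client regularly reports its presence to the server.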

### Internet Glitches Handling
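If a user briefly loses connectivity, a few heartbeats will be missed. Because the offline threshold (e.g., 30 seconds) spans several heartbeat intervals (e.g., 5 seconds), the user is marked offline only after several consecutive missed pings, not after the first one.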

---

## Storing Heartbeat Data Using Redis

Utilizing **Redis** offers speed and efficiency for storing user heartbeat signals.

### What is Redis?
Redis (REmote DIctionary Server) is a high-performance, in-memory key-value store widely used for caching and real-time analytics.

### Advantages of Using Redis:
- **Speed**: Faster read/write operations than disk-based databases, because data is served from memory.
- **Versatility**: Supports strings, lists, sets, sorted sets, hashes, and more.
- **Persistence**: Configurable persistence options enable data to be saved on disk if required.
- **Replication and Clustering**: Supports high availability through primary-replica (master-slave) replication, and sharding for handling larger datasets.

### Example Code for Using Redis in Python

Make sure to have Redis and the redis-py client installed.

```bash
pip install redis
```

```python
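import redis

# Connect to a local Redis server (assumes one is running on the default port).
# decode_responses=True makes get() return str instead of bytes.
r = redis.Redis(host='localhost', port=6379, db=0, decode_responses=True)

# Record a heartbeat with a TTL: the key expires on its own
# unless another heartbeat refreshes it within 60 seconds
r.setex('heartbeat:Sainadh', 60, 'online')

# Read the status back; get() returns None once the key has expired
status = r.get('heartbeat:Sainadh')
print(f"Sainadh is currently {status or 'offline'}")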
```

---

### What Happens If Redis Runs Out of RAM?

When Redis approaches its memory limit, a few strategies can help manage memory pressure:

1. **Eviction Policies**: Configure a memory limit (`maxmemory`) and an eviction policy (`maxmemory-policy`, e.g., `allkeys-lru` for Least Recently Used) so that Redis automatically removes less important entries when memory is full.
2. **Horizontal Scaling**: Distributing data across several Redis instances can alleviate memory pressure.
3. **Data Sharding**: By partitioning data among different Redis servers, applications can manage scale better without saturating a single instance.
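
To make the LRU eviction idea from item 1 concrete, here is a minimal sketch of a fixed-capacity cache that drops the least recently used key when full. It illustrates the policy only and is not how Redis implements it (Redis uses an approximation of LRU); the class name `LRUCache` is hypothetical.

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-capacity key-value store that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()  # ordered from least to most recently used

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def set(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used key

cache = LRUCache(capacity=2)
cache.set("heartbeat:Sainadh", "online")
cache.set("heartbeat:Alok", "online")
cache.get("heartbeat:Sainadh")            # Sainadh is now the most recently used
cache.set("heartbeat:Krishna", "online")  # capacity exceeded: Alok is evicted
print(cache.get("heartbeat:Alok"))        # prints None
```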

---

### Handling High Volume of Heartbeat Ping Requests

When every online user pings the server every few seconds, the heartbeat path must be designed so the application remains responsive:

#### Managing High Volume

1. **Blocking Queues**:
   - A data structure that lets a thread block until data becomes available. It buffers incoming heartbeats and hands them to workers in arrival order.

   **Use Case**: In a multi-threaded application, request threads can enqueue heartbeats while a worker thread blocks until work arrives, so pings are buffered during bursts instead of being dropped.

2. **TCP Connection Management**:
   - Every heartbeat travels over a TCP connection. Reusing long-lived (keep-alive) connections avoids paying the connection-setup cost on each ping and keeps communication between users and the server reliable.

#### Example Code for a Heartbeat Handler

```python
import threading
from queue import Queue

# Blocking queue: get() waits until a heartbeat is available
heartbeat_queue = Queue()

def heartbeat_handler():
    """Consume heartbeats one at a time, blocking while the queue is empty."""
    while True:
        user_id = heartbeat_queue.get()
        print(f"Heartbeat received from user: {user_id}")
        heartbeat_queue.task_done()

# Daemon thread, so the endless loop does not keep the program alive at exit
thread = threading.Thread(target=heartbeat_handler, daemon=True)
thread.start()

def send_heartbeat(user_id):
    """Simulate a client ping by enqueueing the user's ID."""
    heartbeat_queue.put(user_id)
    print(f"User {user_id} sent a heartbeat.")

# Simulate heartbeat requests
send_heartbeat('Sainadh')
send_heartbeat('Alok')

# Wait until every queued heartbeat has been processed
heartbeat_queue.join()
```

---

### Connection Pooling in Databases

**Definition**: Connection pooling maintains a set of open database connections that requests borrow and return, instead of opening a new connection for every request. This removes the overhead of establishing a connection each time.

#### Advantages of Connection Pooling:
1. **Performance Improvement**: Reduces latency of database calls by keeping connections open.
2. **Resource Management**: Controls the maximum number of concurrent connections, preventing overload on the database.
3. **Scalability**: Efficient connection handling allows the system to scale better under heavy loads.

**Example Code for Connection Pooling in Python (using SQLAlchemy)**:

```python
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

# Create a database engine with connection pooling
# (placeholder credentials; pool_size is the number of connections kept open)
engine = create_engine('mysql+pymysql://user:password@localhost/dbname', pool_size=10)

# Create a configured "Session" class
Session = sessionmaker(bind=engine)

# The context manager closes the session and returns its connection to the pool
with Session() as session:
    # SQLAlchemy 2.0 requires raw SQL strings to be wrapped in text()
    result = session.execute(text("SELECT * FROM Students"))
    for row in result:
        print(row)
```
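
To show what a pool does internally, independent of any library, here is a minimal sketch built on `queue.Queue` and SQLite. It is an illustration under simplifying assumptions (no health checks, no overflow connections), not how SQLAlchemy implements its pool; the class name `ConnectionPool` is hypothetical.

```python
import queue
import sqlite3
from contextlib import contextmanager

class ConnectionPool:
    """Keeps a fixed number of open connections and hands them out on demand."""

    def __init__(self, size, database=":memory:"):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a pooled connection move between threads
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # blocks while all connections are busy
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return the connection instead of closing it

    def idle_count(self):
        return self._idle.qsize()

pool = ConnectionPool(size=2)
with pool.connection() as first:
    with pool.connection() as second:
        print(first.execute("SELECT 1").fetchone())
        print(pool.idle_count())  # both connections are checked out here
with pool.connection() as reused:
    print(reused is first or reused is second)  # an existing connection is reused
```

The pool size caps the number of concurrent connections: a third caller would wait in `get()` until one of the two connections is returned.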

---

### Choosing Between RDBMS and In-Memory DB

Selecting the appropriate database type should consider:

- **Access Patterns**: 
  - For frequently accessed, speed-critical operations, in-memory databases like Redis are ideal.
  - For data requiring ACID transactions and relational integrity, RDBMS systems such as MySQL or PostgreSQL are preferable.

- **Data Volume**:
  - For large datasets with complex queries, RDBMS is typically more suitable.
  - When working with smaller datasets but needing high velocity, in-memory databases fit best.

#### Decision Factors Table

| Decision Criteria          | RDBMS                      | In-Memory DB               |
|----------------------------|---------------------------|----------------------------|
| Speed                      | Generally slower          | Extremely fast             |
| Data Volume                | High capacity             | Limited by RAM             |
| Complexity of Data Queries | Complex queries supported  | Simple key-value operations |
| Consistency Requirement     | ACID compliance            | Eventual consistency possible |

---

### Types of Database Storage Explained

1. **Relational Databases**: 
   - Examples: MySQL, PostgreSQL.
   - Strengths: Structured data storage, complex querying through SQL.

2. **NoSQL Databases**:
   - Examples: MongoDB, Cassandra.
   - Strengths: Flexible schema, highly scalable for unstructured data.

3. **Blob Storage**:
   - Services like Amazon S3 are tailored for large binary objects, allowing for efficient storage of large files—ideal for multimedia files.

#### What is a Blob?
**Blob (Binary Large Object)** is a collection of binary data stored as a single entity in a database, often used for images, videos, audio, and other large files.

### Storing Multimedia Files
Multimedia files like images and videos can be stored as blobs in a database, but at large scale they are typically kept in specialized object storage such as Amazon S3.

---

### What is Sharding?

**Sharding** refers to the process of breaking a large database into smaller, more manageable pieces called "shards." Each shard is held on a different database server, ensuring that the workload is distributed.

#### Benefits of Sharding:
- Increased Performance: Reads and writes can occur simultaneously across multiple shards.
- Enhanced Scalability: Easily add more shards as the volume of data grows.
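
One common way to decide which shard holds a record is to hash a key such as the user ID. The following is a minimal sketch of hash-based routing, assuming four shards and using plain dictionaries in place of real database servers; `NUM_SHARDS`, `shard_for`, `save_user`, and `load_user` are hypothetical names.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count

def shard_for(user_id, num_shards=NUM_SHARDS):
    """Map a user ID to a shard index with a stable hash."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# One dict per shard stands in for a separate database server
shards = {i: {} for i in range(NUM_SHARDS)}

def save_user(user_id, record):
    shards[shard_for(user_id)][user_id] = record

def load_user(user_id):
    return shards[shard_for(user_id)].get(user_id)

for roll_number in range(1, 1001):
    save_user(roll_number, {"roll_number": roll_number})

print([len(shards[i]) for i in range(NUM_SHARDS)])  # records per shard
print(load_user(42))
```

A caveat of the simple modulo scheme: changing the number of shards remaps most keys, which is why schemes such as consistent hashing are often considered when shards are added frequently.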

#### Flowchart Illustrating Sharding Process

```mermaid
flowchart TD
    A[User Request] --> B[Load Balancer]
    B --> C[Sharding Logic]
    C -->|Store/Fetch Data| D[Shard 1: Database 1]
    C -->|Store/Fetch Data| E[Shard 2: Database 2]
    D --> F[Replication for Recovery]
    E --> F
```

### Cost Comparison: Hard Disk vs. RAM
- **1 TB Hard Disk**: Low cost per gigabyte and large capacity, but much slower access times.
- **16 GB RAM**: High-speed access, but considerably more expensive per gigabyte.

---

### NGINX Load Balancers

#### Definition
NGINX serves as a web server but is exceptionally effective as a reverse proxy and load balancer. It can distribute traffic effectively across multiple servers, ensuring optimized resource use and improved redundancy.

#### Example NGINX Configuration Code

```nginx
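# An events block is required in a standalone nginx.conf
events {}

http {
   # Requests are distributed round-robin across these servers by default
   upstream backend {
      server backend1.example.com;
      server backend2.example.com;
   }

   server {
      listen 80;

      location / {
         proxy_pass http://backend;
         # Pass the original host and client address on to the backend
         proxy_set_header Host $host;
         proxy_set_header X-Real-IP $remote_addr;
         proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      }
   }
}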
```
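
With an `upstream` block like the one above, NGINX distributes requests round-robin by default. The selection logic can be sketched in a few lines of Python; this is an illustration of the round-robin idea only, and the class name `RoundRobinBalancer` is hypothetical.

```python
import itertools

class RoundRobinBalancer:
    """Hands out backends in rotation, one per request."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(list(backends))

    def next_backend(self):
        return next(self._cycle)

balancer = RoundRobinBalancer(["backend1.example.com", "backend2.example.com"])
for request_id in range(4):
    print(request_id, "->", balancer.next_backend())
```

Each backend receives every second request, so load is spread evenly as long as requests cost roughly the same.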

#### Learning Resources
For more information and tutorials:
- Search for NGINX tutorials on YouTube channels like "Academind" and "NetworkChuck".

---

### Microservices Architecture Challenges

#### Monolithic vs. Microservices

**Monolithic Architecture**:
- **Pros**: Easier deployment, simpler operational management.
- **Cons**: Difficulties in scaling, requires full redeployments for small changes.

**Microservices Architecture**:
- **Pros**: Services are independent; teams can deploy changes without affecting the whole system.
- **Cons**: Higher complexity, potential latency due to inter-service communication.

### Example Flowchart for Monolithic and Microservices

```mermaid
flowchart TD
    A[Monolithic Architecture] --> B[Single Deployment]
    B --> C[All Features Included]

    D[Microservices Architecture] --> E[Service A: Order Taking]
    D --> F[Service B: Payment Processing]
    D --> G[Service C: Notification]
```

---

### Load Balancer Configuration and Costs

A load balancer typically exposes a public IP address that clients connect to and carries its own configuration (backend servers, routing rules, health checks). Though load balancers add operational cost, the benefits of scalability and fault tolerance generally justify the expense.

### Queue Systems in Processing Requests

Queues are essential when requests cannot be processed immediately. For instance, on platforms like Amazon, confirmation emails may arrive later during peak times because they wait in a backlog until a worker picks them up.

### Example of Queue in Application
In a coding platform like Scaler, when a student submits code for execution, it may enter a queue before results are processed and returned, ensuring orderly and efficient operations.

#### Example Code for Task Queue

```python
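from queue import Queue, Empty
import threading

task_queue = Queue()

def worker():
    """Process tasks until the queue is drained."""
    while True:
        try:
            # get_nowait() avoids the race between empty() and a blocking get()
            task = task_queue.get_nowait()
        except Empty:
            break  # nothing left: let the thread exit
        print(f"Processing task: {task}")
        task_queue.task_done()

# Enqueue all tasks before the worker starts
for i in range(5):
    task_queue.put(f"Task {i}")

thread = threading.Thread(target=worker)
thread.start()
thread.join()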
```

---

### Designing the Scaler Masterclass Webpage Scenario

Let’s consider designing the Scaler masterclass webpage, where users can enroll in courses such as Data Science, Machine Learning, and AI.

### Design Overview
- **Components**:
  - User authentication mechanisms.
  - A microservices architecture with separate services for course viewing, payment processing, and course material access.
  - Load balancers for handling incoming traffic.

### Flowchart Integrating Components

```mermaid
flowchart TD
    A[User Clicks Ad] --> B[Masterclass Webpage]
    B --> C[Authentication Service]
    C --> D[Course List Service]
    C --> E[Payment Service]
    E --> F[Confirmation Service]
    F --> G[Course Materials]
```

---

### System Design Solution for Scaler Masterclass Webpage

This system design aims to create a scalable, reliable, and efficient platform where users can enroll in programming courses like Data Science, Machine Learning, and Artificial Intelligence. The platform needs to handle user authentication, multiple microservices, payment processing, course material access, and high traffic loads.

---

### **Requirements**
#### Functional Requirements:
1. User authentication and registration.
2. Viewing course catalogs and details.
3. Processing payments securely.
4. Granting access to course materials after successful payment.
5. Confirmation notifications (email/SMS).

#### Non-functional Requirements:
1. Scalability to handle high traffic (e.g., during promotions).
2. High availability and reliability.
3. Secure data handling and transactions.
4. Low latency for user interactions.

---

### **High-Level Architecture**
The architecture follows a **microservices-based approach** for modularity and scalability. Each component is isolated and can scale independently.

#### Key Components:
1. **User Authentication Service**:
   - Handles login, registration, and session management.
   - Uses OAuth 2.0 for secure authentication.
   - Employs JWT tokens for stateless session management.

2. **Course Catalog Service**:
   - Manages course data like descriptions, pricing, and categories.
   - Fetches data from a database optimized for reads (e.g., NoSQL for quick access).

3. **Payment Service**:
   - Integrates with payment gateways (Stripe, Razorpay, etc.).
   - Ensures secure transactions via encryption (e.g., HTTPS, TLS).
   - Manages retries and transaction logs.

4. **Course Material Service**:
   - Provides access to course materials post-purchase.
   - Uses a CDN for low-latency file delivery.
   - Employs access control based on user roles.

5. **Notification Service**:
   - Sends confirmation emails/SMS.
   - Utilizes message queues for asynchronous processing.

6. **Load Balancer**:
   - Distributes incoming traffic across multiple servers.
   - Uses health checks to route traffic only to healthy instances.

7. **Cache Layer (Redis)**:
   - Caches frequently accessed data like course lists to reduce database load.
   - Stores user sessions temporarily.

8. **Database Layer**:
   - Relational database (e.g., PostgreSQL) for transactional data like payments and user info.
   - NoSQL database (e.g., MongoDB) for course content and metadata.

9. **Queue System (e.g., RabbitMQ, Kafka)**:
   - Manages asynchronous tasks like sending emails or processing payment confirmations.

10. **Monitoring and Logging**:
    - Tools like Prometheus and Grafana for performance monitoring.
    - Centralized logging with ELK stack (Elasticsearch, Logstash, Kibana).

---

### **Detailed Workflow**
1. **User Visits the Webpage**:
   - The load balancer routes the user request to a web server hosting the frontend.
   - The frontend is served via a CDN for faster load times.

2. **User Authentication**:
   - The user logs in or registers.
   - The request goes to the Authentication Service, which verifies credentials against the user database.
   - Upon successful login, a JWT token is generated and returned.

3. **Viewing Courses**:
   - The frontend queries the Course Catalog Service to fetch available courses.
   - Frequently accessed data is served from the Redis cache; otherwise, it fetches from the NoSQL database.

4. **Payment Processing**:
   - The user selects a course and proceeds to payment.
   - The Payment Service initiates a secure transaction with the payment gateway.
   - On success, the Payment Service writes the transaction details to the database.

5. **Access to Course Materials**:
   - After payment confirmation, the Course Material Service provides access.
   - Materials are delivered via a CDN for fast and reliable access.

6. **Notification**:
   - The Notification Service sends an email/SMS confirmation.
   - This task is handled asynchronously using a message queue.
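
Step 3 of the workflow (serve from the cache, fall back to the database on a miss) is often called the cache-aside pattern. Here is a minimal sketch, assuming a tiny in-process TTL cache in place of Redis and a stub in place of the NoSQL query; `TTLCache`, `fetch_courses_from_db`, and `get_course_list` are hypothetical names.

```python
import time

class TTLCache:
    """Tiny in-process stand-in for Redis: values expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)

db_reads = 0

def fetch_courses_from_db():
    """Stand-in for the database query; counts how often the database is hit."""
    global db_reads
    db_reads += 1
    return ["Data Science", "Machine Learning", "Artificial Intelligence"]

cache = TTLCache(ttl_seconds=60)

def get_course_list():
    courses = cache.get("courses:all")
    if courses is None:  # cache miss: read the database, then populate the cache
        courses = fetch_courses_from_db()
        cache.set("courses:all", courses)
    return courses

get_course_list()
get_course_list()
print(db_reads)  # prints 1: the second call was served from the cache
```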

---

### **Flowchart**
```mermaid
graph TD
    A[User Visits Webpage] --> B[Load Balancer]
    B --> C[Frontend Service]
    C --> D[Authentication Service]
    D --> D1[User Database]
    C --> E[Course Catalog Service]
    E --> F[Redis Cache]
    F --> G[Course Database]
    C --> H[Payment Service]
    H --> I[Payment Gateway]
    H --> J[Transaction Database]
    H --> K[Queue System]
    K --> L[Notification Service]
    L --> M[Email/SMS Provider]
    C --> N[Course Material Service]
    N --> O[CDN]
    O --> P[Course Files]
```

---

### **Capacity Planning and Guesstimations**
#### Assumptions:
- 1 million users visit the site monthly.
- 10% conversion rate (100,000 purchases/month).
- Average course size: 1GB.
- Peak traffic: 10,000 concurrent users.

#### Backend Sizing:
1. **Authentication Service**:
   - Handles up to 10,000 login requests/sec at peak (the worst case in which all 10,000 concurrent users log in within the same second).
   - Assuming 200ms processing time and 300 RPS per instance, this requires 10,000 / 300 ≈ 34 instances.

2. **Course Catalog Service**:
   - Redis cache hit rate: 90%.
   - NoSQL DB handles ~10% of 10,000 RPS = 1,000 RPS.
   - Database instances: 5 (each handling ~200 RPS).

3. **Payment Service**:
   - Peak transactions: 10,000/day ≈ 7 per minute (~0.12 TPS).
   - Requires 2-3 instances for high availability.

4. **Course Material Service**:
   - Average transfer volume: 1GB * 100,000 downloads/month = 100TB/month ≈ 3.3TB/day.
   - Use a CDN with sufficient edge servers to handle bandwidth.

5. **Notification Service**:
   - 100,000 emails/SMS daily ≈ 70 messages/minute on average.
   - A queue that processes ~1,200 messages/minute leaves ample headroom for bursts.

#### Storage:
- **User Data**: 1 million users * 1KB/user = ~1GB.
- **Course Content**: 100 courses * 1GB = ~100GB.
- **Transaction Logs**: 100,000 transactions/month * 1KB = ~100MB/month.
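
Several of the estimates above can be reproduced with a few lines of arithmetic. This is a back-of-the-envelope sketch using the figures stated in this section, assuming decimal units (1 TB = 1,000 GB) and a 30-day month; the variable names are hypothetical.

```python
import math

# Figures taken from the assumptions and sizing notes above
MONTHLY_USERS = 1_000_000
MONTHLY_PURCHASES = 100_000
COURSE_SIZE_GB = 1
COURSE_COUNT = 100
PEAK_CATALOG_RPS = 10_000
CACHE_HIT_PERCENT = 90
DB_RPS_PER_INSTANCE = 200
BYTES_PER_RECORD = 1_000  # 1 KB per user record or transaction log entry

# Course Catalog Service: only cache misses reach the database
db_rps = PEAK_CATALOG_RPS * (100 - CACHE_HIT_PERCENT) // 100
db_instances = math.ceil(db_rps / DB_RPS_PER_INSTANCE)

# Course Material Service: monthly download volume spread over 30 days
monthly_download_tb = MONTHLY_PURCHASES * COURSE_SIZE_GB / 1_000
daily_download_tb = monthly_download_tb / 30

# Storage
user_data_gb = MONTHLY_USERS * BYTES_PER_RECORD / 1e9
course_content_gb = COURSE_COUNT * COURSE_SIZE_GB
transaction_log_mb = MONTHLY_PURCHASES * BYTES_PER_RECORD / 1e6

print(f"Catalog DB: {db_rps} RPS -> {db_instances} instances")
print(f"Downloads: {monthly_download_tb:.0f} TB/month, {daily_download_tb:.1f} TB/day")
print(f"Storage: {user_data_gb:.0f} GB users, {course_content_gb} GB content, "
      f"{transaction_log_mb:.0f} MB/month logs")
```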

---

### **Scalability Strategies**
1. **Horizontal Scaling**:
   - Scale microservices independently based on traffic (e.g., more instances for Authentication Service during login surges).

2. **Sharding**:
   - Shard user and transaction data based on user ID to reduce database contention.

3. **Caching**:
   - Use Redis for frequently accessed data like course lists and user sessions.

4. **CDN**:
   - Serve static assets and course materials via a CDN to minimize latency.

5. **Asynchronous Processing**:
   - Use message queues for tasks like sending notifications or processing payments.

---

### **Fault Tolerance**
1. **Retry Mechanisms**:
   - Implement retries for failed API calls (e.g., payment gateway).

2. **Redundancy**:
   - Use multiple instances for all services.

3. **Database Replication**:
   - Replicate databases for high availability and disaster recovery.

4. **Health Checks**:
   - Load balancer monitors service health and redirects traffic to healthy instances.
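
The retry mechanism from item 1 is commonly combined with exponential backoff, so that a struggling dependency is not hit again immediately. Here is a minimal sketch, assuming failures surface as `ConnectionError`; the names `retry` and `flaky_gateway_call` are hypothetical, and the gateway call is simulated.

```python
import random
import time

def retry(operation, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Call operation(); on failure wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries

calls = {"count": 0}

def flaky_gateway_call():
    """Simulated payment-gateway call that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("gateway timeout")
    return "payment accepted"

# sleep is replaced with a no-op so the example runs instantly
print(retry(flaky_gateway_call, sleep=lambda seconds: None))
```

For payments in particular, a retried request should be safe to repeat (for example by attaching a unique request ID), so that a retry after a lost response does not charge the user twice.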

---

### **Monitoring Tools**
- **Prometheus**: Tracks metrics like CPU usage, response times, and request counts.
- **Grafana**: Visualizes system performance and alerts on anomalies.
- **ELK Stack**: Centralized logging for debugging and auditing.

---

This system design ensures scalability, reliability, and security while maintaining low latency for a smooth user experience.



---

### **API Servers: What Are They?**

#### 1. **Definition:**
   - An API server is a software component (code) that listens for incoming HTTP/HTTPS requests, processes them, and sends responses. It typically manages:
     - Routing to the appropriate microservices.
     - Rate limiting (controlling how many requests a user or client can make).
     - Authentication and authorization (validating users).
     - Data transformation or aggregation before returning results to clients.

#### 2. **Is It Code?**
   - Yes, it is code, often written with languages, runtimes, and frameworks such as:
     - Python (Flask, FastAPI)
     - Node.js
     - Java (Spring Boot)
     - Go
   - The API server implements business logic and handles tasks related to APIs.

#### 3. **Does It Need a Hard Disk and RAM?**
   - Yes, like any other application, an API server requires hardware resources to run. These resources depend on the load and complexity of the API.

---

### **Hard Disk and RAM Requirements**

#### 1. **Hard Disk:**
   - The hard disk is required for storing the server's code, logs, temporary files, or configurations.
   - If the API server is **stateless**, it doesn’t store much persistent data; the data typically resides in external databases or services.
   - **Example**: If you deploy your API in a Docker container, the image and logs need disk space.

#### 2. **RAM:**
   - RAM is essential for in-memory processing of requests. It temporarily stores:
     - Incoming requests.
     - Authentication tokens.
     - Cached data (if used).
     - Responses being sent to the client.
   - A server with low RAM might fail to handle high-concurrency workloads, causing slower responses or errors.

---

### **Example Scenario (With Rate Limiting)**

Let’s consider an API server for your ed-tech competition platform:

- A **Rate Limiter** limits each user to, say, 100 requests per minute.
- The server tracks per-user request counts in a fast in-memory store (such as Redis).
- The API server forwards valid requests to specific microservices:
  - **Leaderboard Service** to show live scores.
  - **Authentication Service** to validate login tokens.
  - **Competition Logic Service** to handle answering questions.
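
The rate limiter in this scenario can be sketched as a fixed-window counter. This is a minimal in-process illustration, assuming counts are kept in a local dictionary; in a multi-server deployment the counters would live in a shared store such as Redis. The class name `FixedWindowRateLimiter` is hypothetical.

```python
import time
from collections import defaultdict

class FixedWindowRateLimiter:
    """Allows at most `limit` requests per user in each `window_seconds` window."""

    def __init__(self, limit=100, window_seconds=60, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # (user_id, window index) -> request count; old windows are never
        # cleaned up in this sketch, a real store would expire them
        self._counts = defaultdict(int)

    def allow(self, user_id):
        window = int(self.clock() // self.window_seconds)
        key = (user_id, window)
        if self._counts[key] >= self.limit:
            return False  # over the limit for this window: reject
        self._counts[key] += 1
        return True

# A fake clock makes the example deterministic
fake_now = [0.0]
limiter = FixedWindowRateLimiter(limit=3, window_seconds=60, clock=lambda: fake_now[0])
print([limiter.allow("Sainadh") for _ in range(4)])  # the fourth request is rejected
fake_now[0] = 61.0
print(limiter.allow("Sainadh"))  # a new window has started, so this is allowed
```

A fixed window is the simplest scheme; it can let through a burst of up to twice the limit around a window boundary, which sliding-window or token-bucket limiters are designed to smooth out.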

---

### **Does an API Server Store Data?**

#### 1. **Stateless:**
   - Most modern API servers are **stateless** and do not store data directly. They interact with databases, caches, or other microservices to fetch or update data.
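   - A **stateless** server holds request data in RAM only while that request is being processed; nothing is retained between requests, so any instance can serve any request.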
   - **Example**: A server handling API calls to Redis for roll numbers or heartbeats doesn’t store the data but manages the requests.

#### 2. **Stateful (less common):**
   - Some API servers may store temporary session data or user-specific details in memory or even locally, especially if they are tightly coupled with the backend logic.

---

### **Summary:**
   - API servers are primarily **code** and are essential for managing microservices, rate-limiting, authentication, and request routing.
   - They require **RAM** for handling in-memory operations and **hard disk space** for storage of the codebase, logs, and configuration files.
   - Modern API servers are typically **stateless** and depend on external storage like databases or caches for data persistence.

---

## Conclusion

These notes cover the fundamental concepts needed to understand and design large-scale systems: heartbeat mechanisms, connection pooling, Redis as an in-memory database, sharding, load balancing, and queue management.