# Apache Thrift - Library Management Example

This notebook demonstrates Apache Thrift concepts using a **Library Management System** example. We'll build a system where librarians can manage books and patrons can borrow them.

## Key Learning Objectives
- Understand Thrift **struct** (data containers)
- Understand Thrift **service** (remote interfaces)
- See how **field tags** work in binary encoding
- Practice **RPC** (Remote Procedure Calls) between client and server

## Thrift Data Types Recap

Thrift's most commonly used types are listed below (it also provides `byte`, `i16`, and `binary`):
- **bool**: true/false values
- **i32**: 32-bit signed integers
- **i64**: 64-bit signed integers
- **double**: 64-bit floating-point numbers
- **string**: UTF-8 text strings
- **`list<T>`**: ordered collections
- **`set<T>`**: unordered collections of unique elements
- **`map<K,V>`**: key-value pairs

**Important**: Field tags (like `1:`, `2:`, `3:`) are used for binary encoding instead of field names!

## Our Example: Library Management System

Let's model this JSON data using Thrift:

```json
{
  "bookId": "ISBN-12345",
  "title": "The Data Science Handbook",
  "author": "Dr. Jane Smith",
  "publishedYear": 2023,
  "genres": ["Technology", "Education"],
  "isAvailable": true
}
```

**Key Insight**: Unlike JSON, which repeats every field name in each record, Thrift's binary encoding writes only numeric field tags, which keeps messages compact!
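
To make the size difference concrete, here is a minimal sketch that hand-encodes the sample record in the layout used by Thrift's binary protocol (a 1-byte type ID and 2-byte tag per field, length-prefixed strings, and a stop byte at the end). It assumes the fields get tags 1 to 6 in the order shown in the JSON, which matches the schema defined in Step 1. The helper names (`field_header`, `encode_string`, `encode_book`) are made up for illustration; real code would let a Thrift library do this.

```python
import json
import struct

# Thrift binary-protocol type IDs for the types used in this record
T_STOP, T_BOOL, T_I32, T_STRING, T_LIST = 0, 2, 8, 11, 15

def field_header(ttype: int, tag: int) -> bytes:
    """1-byte type ID followed by a 2-byte big-endian field tag."""
    return struct.pack("!bh", ttype, tag)

def encode_string(value: str) -> bytes:
    """4-byte length prefix followed by the UTF-8 bytes."""
    raw = value.encode("utf-8")
    return struct.pack("!i", len(raw)) + raw

def encode_book(book: dict) -> bytes:
    """Hand-encode the record; note that no field name is ever written."""
    out = field_header(T_STRING, 1) + encode_string(book["bookId"])
    out += field_header(T_STRING, 2) + encode_string(book["title"])
    out += field_header(T_STRING, 3) + encode_string(book["author"])
    out += field_header(T_I32, 4) + struct.pack("!i", book["publishedYear"])
    # A list is: element type (1 byte), element count (4 bytes), then elements
    out += field_header(T_LIST, 5) + struct.pack("!bi", T_STRING, len(book["genres"]))
    for genre in book["genres"]:
        out += encode_string(genre)
    out += field_header(T_BOOL, 6) + struct.pack("!?", book["isAvailable"])
    return out + bytes([T_STOP])  # stop byte marks the end of the struct

sample = {
    "bookId": "ISBN-12345",
    "title": "The Data Science Handbook",
    "author": "Dr. Jane Smith",
    "publishedYear": 2023,
    "genres": ["Technology", "Education"],
    "isAvailable": True,
}

as_json = json.dumps(sample, separators=(",", ":")).encode("utf-8")
as_binary = encode_book(sample)
print(f"compact JSON: {len(as_json)} bytes, binary sketch: {len(as_binary)} bytes")
```

For this record the sketch produces 117 bytes against 161 bytes of compact JSON. The saving comes almost entirely from replacing each quoted field name with a 3-byte header.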

## Step 1: Define the Thrift Schema

We'll create a `.thrift` file that defines our data structures and services:

In [1]:
%%writefile ../schema/library.thrift

# Book data structure
struct Book {
  1: required string bookId,
  2: required string title,
  3: required string author,
  4: optional i32 publishedYear,
  5: optional list<string> genres,
  6: optional bool isAvailable
}

# Library service with remote methods
service LibraryService {
    Book borrowBook(1: required Book book, 2: required string patronName),
    Book returnBook(1: required Book book),
    list<Book> searchBooks(1: required string keyword)
}

Writing ../schema/library.thrift


### Understanding the Schema

**struct Book:**
- `1: required string bookId` → Field tag 1, must be provided
- `4: optional i32 publishedYear` → Field tag 4, can be omitted
- **Why field tags?** Binary encoding uses numbers instead of field names for efficiency

**service LibraryService:**
- `borrowBook()` → Takes a Book and patron name, returns updated Book
- `returnBook()` → Takes a Book, marks it available, returns updated Book
- `searchBooks()` → Takes keyword, returns list of matching Books

**Key Point**: Services define what operations can be called remotely!
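
As a mental model only, the schema can be read as a plain Python data class plus an interface. The sketch below is not what thriftpy2 generates; `LibraryServiceInterface` and `InMemoryLibrary` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Protocol

@dataclass
class Book:
    # "required" fields in the schema: no default, must be passed
    bookId: str
    title: str
    author: str
    # "optional" fields in the schema: None when omitted
    publishedYear: Optional[int] = None
    genres: Optional[List[str]] = None
    isAvailable: Optional[bool] = None

class LibraryServiceInterface(Protocol):
    """The three operations the service exposes to remote callers."""
    def borrowBook(self, book: Book, patronName: str) -> Book: ...
    def returnBook(self, book: Book) -> Book: ...
    def searchBooks(self, keyword: str) -> List[Book]: ...

class InMemoryLibrary:
    """A toy implementation that satisfies the interface."""
    def __init__(self, books: List[Book]):
        self.books = books

    def borrowBook(self, book: Book, patronName: str) -> Book:
        return replace(book, isAvailable=False)

    def returnBook(self, book: Book) -> Book:
        return replace(book, isAvailable=True)

    def searchBooks(self, keyword: str) -> List[Book]:
        return [b for b in self.books if keyword.lower() in b.title.lower()]

library: LibraryServiceInterface = InMemoryLibrary(
    [Book(bookId="ISBN-001", title="Data Science Fundamentals", author="Dr. Alice Johnson")]
)
print(library.searchBooks("data"))
```

The difference with Thrift is that the caller and the implementation can live in separate processes, possibly written in different languages, while sharing the same contract.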

## Step 2: Implement the Server

The server contains the actual business logic:

In [2]:
%%writefile ../library_thrift_server.py
import thriftpy2
library_thrift = thriftpy2.load("./schema/library.thrift", module_name="library_thrift")

from thriftpy2.rpc import make_server

class LibraryService(object):
    def __init__(self):
        # Simple in-memory database
        self.book_database = {}
        
    def borrowBook(self, book, patronName):
        print(f"📚 {patronName} is borrowing '{book.title}' by {book.author}")
        book.isAvailable = False
        # In real system: record who borrowed it, due date, etc.
        return book
        
    def returnBook(self, book):
        print(f"📚 Returned: '{book.title}' is now available again")
        book.isAvailable = True
        return book
        
    def searchBooks(self, keyword):
        print(f"🔍 Searching for books with keyword: '{keyword}'")
        # Mock search results
        mock_results = [
            library_thrift.Book(
                bookId="ISBN-001",
                title="Data Science Fundamentals", 
                author="Dr. Alice Johnson",
                publishedYear=2022,
                genres=["Technology", "Data Science"],
                isAvailable=True
            ),
            library_thrift.Book(
                bookId="ISBN-002",
                title="Machine Learning Patterns",
                author="Prof. Bob Wilson", 
                publishedYear=2023,
                genres=["Technology", "AI"],
                isAvailable=True
            )
        ]
        return [book for book in mock_results if keyword.lower() in book.title.lower()]

# Start the server
server = make_server(library_thrift.LibraryService, LibraryService(), client_timeout=None)
print("🚀 Library server starting on default port...")
server.serve()

Writing ../library_thrift_server.py


### Server Implementation Notes

**Key Points:**
1. **Handler implements the service**: the `LibraryService` class provides the methods declared in `service LibraryService`. The class name itself is a free choice; incoming calls are matched to handler methods by method name
2. **Method signatures match**: `borrowBook(self, book, patronName)` matches schema
3. **Business logic**: Server modifies book availability, logs actions
4. **Return types**: Must return what the schema promises (`Book` or `list<Book>`)

**Real-world consideration**: In production, you'd use a real database instead of in-memory storage!
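
Matching method names matters because an RPC processor typically looks up the handler method by the name carried in the request. The following is a simplified sketch of that idea, not thriftpy2's actual code; `dispatch` and `DemoHandler` are hypothetical names, and books are plain dicts to keep the example self-contained.

```python
class DemoHandler:
    """Hypothetical handler; the class name is arbitrary in this sketch."""

    def borrowBook(self, book, patronName):
        return {**book, "isAvailable": False}

def dispatch(handler, method_name, args):
    """Route a decoded call to the handler method with the same name."""
    method = getattr(handler, method_name, None)
    if not callable(method):
        raise AttributeError(f"handler has no method {method_name!r}")
    return method(*args)

result = dispatch(
    DemoHandler(),
    "borrowBook",
    [{"title": "Data Science Fundamentals", "isAvailable": True}, "Alice Cooper"],
)
print(result)
```

If the handler lacks a method declared in the schema, the problem only surfaces when a client calls it, so it is worth checking the handler against the `.thrift` file.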

## Step 3: Run the Server

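To start the server, run this in a terminal from the project root (the directory that contains `schema/`, since the script loads `./schema/library.thrift` relative to the working directory):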

```bash
python library_thrift_server.py
```

You should see: `🚀 Library server starting on default port...`

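The server then blocks, waiting for client connections and handling RPC requests. Because no host or port is passed to `make_server`, thriftpy2 falls back to its defaults (`localhost:9090`).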
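
Before moving on to the client, it can help to confirm that something is listening. A small sketch using only the standard library follows; the helper name `is_port_open` is made up, and the port `9090` is an assumption based on thriftpy2's default, so adjust it if you pass a different port to `make_server`.

```python
import socket

def is_port_open(host: str = "localhost", port: int = 9090, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable
        return False

print("server reachable:", is_port_open())
```

This only shows that a TCP listener exists on that port; it does not prove the listener is the library server.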

## Step 4: Create Client Connection

Now let's create a client that connects to our server:

In [3]:
import thriftpy2
library_thrift = thriftpy2.load("../schema/library.thrift", module_name="library_thrift")

from thriftpy2.rpc import make_client

# Connect to the server
library_client = make_client(library_thrift.LibraryService, timeout=None)
print("✅ Connected to Library server!")

✅ Connected to Library server!


## Step 5: Create and Use Book Objects

Let's create some Book objects and interact with the library:

In [4]:
# Create a book object
my_book = library_thrift.Book(
    bookId="ISBN-12345",
    title="The Data Science Handbook",
    author="Dr. Jane Smith",
    publishedYear=2023,
    genres=["Technology", "Education"],
    isAvailable=True
)

print("📖 Created book:")
print(f"   Title: {my_book.title}")
print(f"   Author: {my_book.author}")
print(f"   Available: {my_book.isAvailable}")
print(f"   Genres: {my_book.genres}")

📖 Created book:
   Title: The Data Science Handbook
   Author: Dr. Jane Smith
   Available: True
   Genres: ['Technology', 'Education']


### Understanding Object Creation

**Key Points:**
- `library_thrift.Book()` creates a Book instance using the generated class
- **Field names** are used in Python (title, author, etc.)
- **Field tags** are used in binary encoding (1, 2, 3, etc.)
- **Required fields** must be set before the object is sent; **optional fields** can be omitted and are then left as `None`
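
Whether a missing required field is rejected when the object is built or only when it is serialized depends on the Thrift library, so an explicit check can be useful. Below is a minimal sketch of such a check. `BOOK_SPEC` and `missing_required` are hypothetical names; the table simply mirrors the tags, names, and requiredness from the `Book` struct.

```python
# tag -> (field name, is_required), mirroring the Book struct in the schema
BOOK_SPEC = {
    1: ("bookId", True),
    2: ("title", True),
    3: ("author", True),
    4: ("publishedYear", False),
    5: ("genres", False),
    6: ("isAvailable", False),
}

def missing_required(values: dict) -> list:
    """Return the names of required fields that are absent or None, in tag order."""
    return [
        name
        for _tag, (name, required) in sorted(BOOK_SPEC.items())
        if required and values.get(name) is None
    ]

incomplete = {"bookId": "ISBN-12345", "title": "The Data Science Handbook"}
print(missing_required(incomplete))
```

Here the check reports `author` as missing, while leaving out `publishedYear`, `genres`, or `isAvailable` is accepted.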

## Step 6: Make Remote Procedure Calls

Now let's call the remote methods on our server:

In [5]:
# RPC Call 1: Borrow a book
print("\n🔄 Making RPC call: borrowBook()")
borrowed_book = library_client.borrowBook(my_book, "Alice Cooper")
print(f"Result: Book available = {borrowed_book.isAvailable}")


🔄 Making RPC call: borrowBook()
Result: Book available = False


In [6]:
# RPC Call 2: Return the book
print("\n🔄 Making RPC call: returnBook()")
returned_book = library_client.returnBook(borrowed_book)
print(f"Result: Book available = {returned_book.isAvailable}")


🔄 Making RPC call: returnBook()
Result: Book available = True


In [7]:
# RPC Call 3: Search for books
print("\n🔄 Making RPC call: searchBooks()")
search_results = library_client.searchBooks("Data")
print(f"Found {len(search_results)} books:")
for book in search_results:
    print(f"   - {book.title} by {book.author} ({book.publishedYear})")


🔄 Making RPC call: searchBooks()
Found 1 books:
   - Data Science Fundamentals by Dr. Alice Johnson (2022)


### Understanding RPC Calls

**What happens during an RPC call:**

1. **Client side**: `library_client.borrowBook(my_book, "Alice Cooper")`
2. **Serialization**: Book object → binary data using field tags
3. **Network**: Binary data sent to server
4. **Server side**: Binary data → Book object → method execution
5. **Response**: Modified Book object → binary data → back to client
6. **Deserialization**: Binary data → Book object on client

**Key insight**: The client and server communicate in binary, but you work with Python objects!
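
The six steps can be traced in a single process. The sketch below is a deliberately reduced model: it encodes only two `Book` fields (`title`, tag 2, and `isAvailable`, tag 6) and replaces the network with a plain function call. A real Thrift call also wraps the arguments in a message envelope carrying the method name and a sequence ID, which is omitted here. All function names are illustrative.

```python
import struct

T_STOP, T_BOOL, T_STRING = 0, 2, 11

def encode(book: dict) -> bytes:
    """Python dict -> bytes, identifying fields by tag."""
    raw = book["title"].encode("utf-8")
    out = struct.pack("!bhi", T_STRING, 2, len(raw)) + raw
    out += struct.pack("!bh?", T_BOOL, 6, book["isAvailable"])
    return out + bytes([T_STOP])

def decode(data: bytes) -> dict:
    """bytes -> Python dict, reading fields until the stop byte."""
    book, pos = {}, 0
    while data[pos] != T_STOP:
        ttype, tag = struct.unpack_from("!bh", data, pos)
        pos += 3
        if ttype == T_STRING and tag == 2:
            (length,) = struct.unpack_from("!i", data, pos)
            book["title"] = data[pos + 4:pos + 4 + length].decode("utf-8")
            pos += 4 + length
        elif ttype == T_BOOL and tag == 6:
            book["isAvailable"] = data[pos] != 0
            pos += 1
        else:
            raise ValueError(f"unexpected field: type={ttype}, tag={tag}")
    return book

def server_borrow(request_bytes: bytes) -> bytes:
    """Steps 4 and 5: deserialize, run the business logic, serialize the reply."""
    book = decode(request_bytes)
    book["isAvailable"] = False
    return encode(book)

# Steps 1-2: the client serializes its object
request = encode({"title": "The Data Science Handbook", "isAvailable": True})
# Step 3: the bytes would cross the network here
response = server_borrow(request)
# Step 6: the client deserializes the reply
borrowed = decode(response)
print(borrowed)
```

Only `request` and `response` would ever travel over the wire; both sides work with ordinary objects before and after.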

## Step 7: Complete Client Application

Let's create a complete client script that demonstrates the full workflow:

In [None]:
%%writefile ../library_thrift_client.py
import thriftpy2
library_thrift = thriftpy2.load("./schema/library.thrift", module_name="library_thrift")

from thriftpy2.rpc import make_client

def main():
    # Connect to server
    print("🔌 Connecting to Library server...")
    client = make_client(library_thrift.LibraryService, timeout=None)
    
    # Create some books
    book1 = library_thrift.Book(
        bookId="ISBN-001",
        title="Python for Data Analysis",
        author="Wes McKinney",
        publishedYear=2022,
        genres=["Programming", "Data Science"],
        isAvailable=True
    )
    
    book2 = library_thrift.Book(
        bookId="ISBN-002", 
        title="Machine Learning Yearning",
        author="Andrew Ng",
        publishedYear=2018,
        genres=["AI", "Machine Learning"],
        isAvailable=True
    )
    
    # Demonstrate library operations
    print("\n📚 Library Management Demo")
    print("=" * 40)
    
    # 1. Search for books
    print("\n🔍 Searching for books containing 'Machine'...")
    results = client.searchBooks("Machine")
    for book in results:
        print(f"   Found: {book.title} by {book.author}")
    
    # 2. Borrow books
    print("\n📖 John is borrowing books...")
    book1 = client.borrowBook(book1, "John Doe")
    book2 = client.borrowBook(book2, "John Doe")
    
    # 3. Return a book
    print("\n📚 John is returning one book...")
    book1 = client.returnBook(book1)
    
    print("\n✅ Demo completed successfully!")

if __name__ == "__main__":
    main()

## How to Test the Complete System

1. **Run the server** (in terminal 1):
   ```bash
   python library_thrift_server.py
   ```

2. **Run the client** (in terminal 2):
   ```bash
   python library_thrift_client.py
   ```

3. **Expected output**:
   - Server shows: RPC calls being processed
   - Client shows: Search results, borrow/return confirmations

## Key Learning Points

### 1. **Schema vs Implementation**
- **`.thrift` file**: Defines the contract (what data looks like, what methods exist)
- **Server class**: Implements the actual business logic
- **Client code**: Uses the contract to make remote calls

### 2. **Field Tags vs Field Names**
- **Field tags (1:, 2:, 3:)**: Used in binary encoding for efficiency
- **Field names (bookId, title)**: Used in your Python code for readability
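- **Why this matters**: Renaming a field does not change the bytes on the wire, so old and new peers stay compatible as long as the tag and type are unchanged (code that refers to the old name still has to be updated)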
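
A small sketch of the idea: the same bytes are decoded with two different tag-to-name tables. It handles string fields only, and the renamed fields (`isbn`, `bookTitle`) are hypothetical.

```python
import struct

T_STOP, T_STRING = 0, 11

def encode_strings(fields: dict) -> bytes:
    """Encode a {tag: text} mapping as string fields followed by a stop byte."""
    out = b""
    for tag, value in fields.items():
        raw = value.encode("utf-8")
        out += struct.pack("!bhi", T_STRING, tag, len(raw)) + raw
    return out + bytes([T_STOP])

def decode_strings(data: bytes, names: dict) -> dict:
    """Decode string fields, naming each one via the reader's own tag table."""
    result, pos = {}, 0
    while data[pos] != T_STOP:
        _ttype, tag, length = struct.unpack_from("!bhi", data, pos)
        pos += 7
        result[names[tag]] = data[pos:pos + length].decode("utf-8")
        pos += length
    return result

wire = encode_strings({1: "ISBN-12345", 2: "The Data Science Handbook"})

old_names = {1: "bookId", 2: "title"}
new_names = {1: "isbn", 2: "bookTitle"}  # hypothetical renames, same tags

print(decode_strings(wire, old_names))
print(decode_strings(wire, new_names))
```

Both readers recover the same values because the names exist only in each side's schema, never in the encoded data.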

### 3. **RPC Flow**
```
Client Python Object → Binary Data → Network → Binary Data → Server Python Object
Server Python Object → Binary Data → Network → Binary Data → Client Python Object
```

### 4. **Required vs Optional**
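- **required**: Must always be present; adding or removing a required field later breaks compatibility with peers that still use the old schema
- **optional**: Can be omitted; safe to add or remove later, provided a retired tag number is never reused for a different field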
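
Adding an optional field is safe because each field header carries the value's type, which lets a reader skip tags it does not recognize. The sketch below illustrates this with an "old" reader that knows only `title` (tag 2) receiving data from a "new" writer that also sends a hypothetical `pageCount` (tag 7). It covers just three types and is not a full protocol implementation.

```python
import struct

T_STOP, T_BOOL, T_I32, T_STRING = 0, 2, 8, 11

def skip_value(data: bytes, pos: int, ttype: int) -> int:
    """Return the position just past a value the reader does not understand."""
    if ttype == T_BOOL:
        return pos + 1
    if ttype == T_I32:
        return pos + 4
    if ttype == T_STRING:
        (length,) = struct.unpack_from("!i", data, pos)
        return pos + 4 + length
    raise ValueError(f"this sketch cannot skip type {ttype}")

def read_with_old_schema(data: bytes) -> dict:
    """An 'old' reader that only knows tag 2 (title)."""
    book, pos = {}, 0
    while data[pos] != T_STOP:
        ttype, tag = struct.unpack_from("!bh", data, pos)
        pos += 3
        if tag == 2 and ttype == T_STRING:
            (length,) = struct.unpack_from("!i", data, pos)
            book["title"] = data[pos + 4:pos + 4 + length].decode("utf-8")
            pos += 4 + length
        else:
            pos = skip_value(data, pos, ttype)  # unknown tag: skip it, do not fail
    return book

# A 'new' writer sends title (tag 2) plus the hypothetical pageCount (tag 7, i32)
title = "Data Science Fundamentals".encode("utf-8")
wire = (
    struct.pack("!bhi", T_STRING, 2, len(title)) + title
    + struct.pack("!bhi", T_I32, 7, 412)
    + bytes([T_STOP])
)

print(read_with_old_schema(wire))
```

The old reader returns only the title and silently ignores tag 7. A required field offers no such leeway, since a reader that expects it has nothing to fall back on when it is absent.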

## Practice Exercises

Try extending this example:

1. **Add a `Patron` struct** with fields like `patronId`, `name`, `email`, `borrowedBooks`

2. **Add new service methods**:
   - `registerPatron(Patron patron)` → returns registered Patron
   - `getPatronHistory(string patronId)` → returns `list<Book>`

3. **Handle edge cases**:
   - What happens if someone tries to borrow an unavailable book?
   - How do you track due dates?

4. **Compare with JSON**:
   - How much smaller is the Thrift binary vs equivalent JSON?
   - What happens when a client built from the old schema receives a struct that contains a newly added field?

These exercises will deepen your understanding of Thrift's schema evolution, binary efficiency, and RPC patterns!