# Protocol Buffers (gRPC) - Online Bookstore Example

This notebook demonstrates **Protocol Buffers and gRPC** concepts using an **Online Bookstore System**. We'll build a system where customers can browse books, place orders, and track purchases.

## Key Learning Objectives
- Understand Protobuf **messages** (data containers)
- Understand gRPC **services** (remote interfaces)  
- See how **field numbers** work in binary encoding
- Practice **RPC** (Remote Procedure Calls) between client and server
- Compare with the Apache Thrift approach

## Protocol Buffers Data Types Recap

Protobuf provides these fundamental types:
- **bool**: true/false values
- **int32/int64**: signed integers with variable-length encoding
- **uint32/uint64**: unsigned integers  
- **float/double**: floating-point numbers
- **string**: UTF-8 text strings
- **bytes**: raw byte sequences
- **repeated T**: ordered lists (like arrays)
- **map<K,V>**: key-value pairs
- **message**: nested structures

**Important**: Field numbers (like `= 1`, `= 2`, `= 3`) are used for binary encoding instead of field names!
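
To make the role of field numbers concrete, here is a minimal sketch of how one integer field is laid out on the wire. The helper names (`encode_varint`, `encode_tag`, `encode_int_field`) are illustrative and not part of the protobuf library, and the sketch only handles non-negative integers.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 data bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_bits)         # high bit clear: last byte
            return bytes(out)


WIRE_VARINT = 0  # wire type used for int32, int64, uint32, uint64 and bool


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A field's tag packs the field number and the wire type into one varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Tag followed by the varint-encoded value."""
    return encode_tag(field_number, WIRE_VARINT) + encode_varint(value)


# A field numbered 7 carrying the value 25 takes two bytes; no name is stored.
encoded = encode_int_field(7, 25)
print(encoded.hex())  # prints: 3819
```

The first byte (`0x38`) is the tag for field 7 with wire type 0, and the second (`0x19`) is the value 25.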

## Our Example: Online Bookstore System

Let's model this JSON data using Protocol Buffers:

```json
{
  "bookId": "BOOK-67890",
  "title": "Advanced Python Programming", 
  "author": "Dr. Sarah Chen",
  "price": 49.99,
  "categories": ["Programming", "Python", "Advanced"],
  "inStock": true,
  "stockCount": 25
}
```

**Key Insight**: JSON repeats every field name as text in every record, while Protobuf writes only a small numeric tag per field, which makes the binary encoding considerably more compact!
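
As a rough illustration of the size difference, the sketch below hand-encodes the book above and compares it with compact JSON. It is not the protobuf library: the helpers (`varint`, `tag`, `string_field`, `double_field`, `varint_field`) are hypothetical, cover only the types this record needs, and assume field numbers 1 to 7 in the order the JSON keys appear, matching the schema defined in Step 1.

```python
import json
import struct


def varint(value: int) -> bytes:
    """Base-128 varint for non-negative integers."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def tag(field_number: int, wire_type: int) -> bytes:
    return varint((field_number << 3) | wire_type)


def string_field(field_number: int, text: str) -> bytes:
    data = text.encode("utf-8")
    return tag(field_number, 2) + varint(len(data)) + data  # wire type 2: length-delimited


def double_field(field_number: int, value: float) -> bytes:
    return tag(field_number, 1) + struct.pack("<d", value)  # wire type 1: 8 bytes, little-endian


def varint_field(field_number: int, value: int) -> bytes:
    return tag(field_number, 0) + varint(value)  # wire type 0: varint


book = {
    "bookId": "BOOK-67890",
    "title": "Advanced Python Programming",
    "author": "Dr. Sarah Chen",
    "price": 49.99,
    "categories": ["Programming", "Python", "Advanced"],
    "inStock": True,
    "stockCount": 25,
}

proto_bytes = b"".join([
    string_field(1, book["bookId"]),
    string_field(2, book["title"]),
    string_field(3, book["author"]),
    double_field(4, book["price"]),
    *(string_field(5, category) for category in book["categories"]),  # repeated: one entry each
    varint_field(6, int(book["inStock"])),
    varint_field(7, book["stockCount"]),
])
json_bytes = json.dumps(book, separators=(",", ":")).encode("utf-8")

print(f"JSON: {len(json_bytes)} bytes, hand-encoded Protobuf: {len(proto_bytes)} bytes")
```

The gap narrows for records dominated by long strings, because string contents are stored byte-for-byte in both formats; the savings come from field names, punctuation, and numbers.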

## Step 1: Define the Protocol Buffers Schema

We'll create a `.proto` file that defines our data structures and gRPC services:

In [None]:
%%writefile ../schema/bookstore.proto
syntax = "proto3";

// Book data structure
message Book {
  string book_id = 1;
  string title = 2;
  string author = 3;
  double price = 4;
  repeated string categories = 5;
  bool in_stock = 6;
  int32 stock_count = 7;
}

// Order request structure
message OrderRequest {
  Book book = 1;
  string customer_name = 2;
  int32 quantity = 3;
}

// Order response structure  
message OrderResponse {
  string order_id = 1;
  Book book = 2;
  string customer_name = 3;
  int32 quantity = 4;
  double total_price = 5;
  string status = 6;
}

// Search request
message SearchRequest {
  string keyword = 1;
}

// Search response
message SearchResponse {
  repeated Book books = 1;
}

// Bookstore service with remote methods
service BookstoreService {
  rpc PlaceOrder(OrderRequest) returns (OrderResponse);
  rpc CancelOrder(OrderRequest) returns (OrderResponse);  
  rpc SearchBooks(SearchRequest) returns (SearchResponse);
}

### Understanding the Schema

**message Book:**
- `string book_id = 1` → Field number 1, book identifier
- `double price = 4` → Field number 4, book price  
- `repeated string categories = 5` → Field number 5, list of categories
- **Why field numbers?** The binary encoding identifies each field by its number, so field names never appear on the wire and messages stay small

**service BookstoreService:**
- `PlaceOrder()` → Takes OrderRequest, returns OrderResponse with order details
- `CancelOrder()` → Takes OrderRequest, returns OrderResponse with cancellation status
- `SearchBooks()` → Takes SearchRequest, returns SearchResponse with matching books

**Key Point**: Services define what operations can be called remotely across the network!

## Step 2: Generate Python Code from Schema

First, we need to generate Python code from our `.proto` file:

In [None]:
import subprocess
import sys

# Generate Python code from the .proto file.
# Paths are relative to the notebook's working directory, matching the
# %%writefile cell above; sys.executable reuses the notebook's Python environment.
try:
    result = subprocess.run([
        sys.executable, '-m', 'grpc_tools.protoc',
        '-I../schema',
        '--python_out=../.',
        '--grpc_python_out=../.',
        '../schema/bookstore.proto'
    ], capture_output=True, text=True)
    
    if result.returncode == 0:
        print("✅ Successfully generated Python code from bookstore.proto")
        print("📄 Generated files: bookstore_pb2.py, bookstore_pb2_grpc.py")
    else:
        print("❌ Error generating code:")
        print(result.stderr)
except Exception as e:
    print(f"❌ Error running protoc: {e}")
    print("💡 Make sure grpcio-tools is installed: pip install grpcio-tools")

### Code Generation Explained

**What just happened:**
- `protoc` compiler read our `bookstore.proto` schema
- Generated `bookstore_pb2.py` → Contains message classes (Book, OrderRequest, etc.)
- Generated `bookstore_pb2_grpc.py` → Contains service classes and stubs
- **Why generate?** Protocol Buffers compiles schemas to optimized code for each language

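**Compare with Thrift**: The Thrift example loaded its schema at runtime (a feature of the `thriftpy2` library), whereas Protobuf schemas are compiled ahead of time. Apache Thrift's own `thrift` compiler can also generate code ahead of time.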
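
If you want the generation step to be reusable outside this notebook, one option is to build the `protoc` argument list from paths instead of hard-coding it. The sketch below is only an illustration: `build_protoc_command` is a hypothetical helper, and the example paths assume the notebook runs one level below the project root, as the `../schema` paths above imply.

```python
import sys
from pathlib import Path
from typing import List


def build_protoc_command(proto_file: Path, out_dir: Path) -> List[str]:
    """Return the argument list for grpc_tools.protoc without running it."""
    return [
        sys.executable, "-m", "grpc_tools.protoc",
        f"-I{proto_file.parent}",         # where protoc looks for .proto files
        f"--python_out={out_dir}",        # message classes (*_pb2.py)
        f"--grpc_python_out={out_dir}",   # service stubs (*_pb2_grpc.py)
        str(proto_file),
    ]


command = build_protoc_command(Path("../schema/bookstore.proto"), Path(".."))
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, capture_output=True, text=True)` exactly as in the cell above.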

## Step 3: Implement the Server

The server contains the actual business logic:

In [None]:
%%writefile ../bookstore_protobuf_server.py
from concurrent import futures
import grpc
import sys
import os

# Add current directory to path for imports
sys.path.append('.')

import bookstore_pb2
import bookstore_pb2_grpc
import uuid
import time

class BookstoreService(bookstore_pb2_grpc.BookstoreServiceServicer):
    def __init__(self):
        # Simple in-memory database
        self.inventory = {}
        self.orders = {}
        
        # Pre-populate with some books
        self._initialize_inventory()
        
    def _initialize_inventory(self):
        """Initialize the bookstore with some sample books"""
        sample_books = [
            bookstore_pb2.Book(
                book_id="BOOK-001",
                title="Python Data Science Handbook",
                author="Jake VanderPlas", 
                price=39.99,
                categories=["Programming", "Data Science", "Python"],
                in_stock=True,
                stock_count=15
            ),
            bookstore_pb2.Book(
                book_id="BOOK-002", 
                title="Machine Learning Engineering",
                author="Andriy Burkov",
                price=45.50,
                categories=["AI", "Machine Learning", "Engineering"],
                in_stock=True,
                stock_count=8
            ),
            bookstore_pb2.Book(
                book_id="BOOK-003",
                title="System Design Interview",
                author="Alex Xu",
                price=35.00,
                categories=["System Design", "Interviews", "Software"],
                in_stock=True,
                stock_count=12
            )
        ]
        
        for book in sample_books:
            self.inventory[book.book_id] = book

    def PlaceOrder(self, request, context):
        print(f"🛒 Processing order: {request.customer_name} wants {request.quantity}x '{request.book.title}'")
        
        # Check if book is in inventory
        book_id = request.book.book_id
        if book_id not in self.inventory:
            return bookstore_pb2.OrderResponse(
                order_id="",
                book=request.book,
                customer_name=request.customer_name,
                quantity=request.quantity,
                total_price=0.0,
                status="FAILED: Book not found"
            )
        
        # Check stock
        current_book = self.inventory[book_id]
        if not current_book.in_stock or current_book.stock_count < request.quantity:
            return bookstore_pb2.OrderResponse(
                order_id="",
                book=current_book,
                customer_name=request.customer_name,
                quantity=request.quantity, 
                total_price=0.0,
                status="FAILED: Insufficient stock"
            )
        
        # Process order
        order_id = str(uuid.uuid4())[:8]
        total_price = current_book.price * request.quantity
        
        # Update inventory
        current_book.stock_count -= request.quantity
        if current_book.stock_count == 0:
            current_book.in_stock = False
            
        # Store order
        self.orders[order_id] = {
            'book': current_book,
            'customer': request.customer_name,
            'quantity': request.quantity,
            'total': total_price
        }
        
        print(f"✅ Order {order_id} placed successfully! Total: ${total_price:.2f}")
        
        return bookstore_pb2.OrderResponse(
            order_id=order_id,
            book=current_book,
            customer_name=request.customer_name,
            quantity=request.quantity,
            total_price=total_price,
            status="SUCCESS: Order placed"
        )

    def CancelOrder(self, request, context):
        print(f"❌ Canceling order for: {request.customer_name}")
        
        # In a real system, you'd look up by order ID
        # For demo, we'll just return a cancellation response
        return bookstore_pb2.OrderResponse(
            order_id="CANCELLED",
            book=request.book,
            customer_name=request.customer_name,
            quantity=request.quantity,
            total_price=0.0,
            status="SUCCESS: Order cancelled"
        )

    def SearchBooks(self, request, context):
        keyword = request.keyword.lower()
        print(f"🔍 Searching inventory for: '{keyword}'")
        
        matching_books = []
        for book in self.inventory.values():
            # Search in title, author, and categories
            if (keyword in book.title.lower() or 
                keyword in book.author.lower() or
                any(keyword in cat.lower() for cat in book.categories)):
                matching_books.append(book)
        
        print(f"📚 Found {len(matching_books)} matching books")
        
        return bookstore_pb2.SearchResponse(books=matching_books)

def serve():
    print("🚀 Starting Bookstore gRPC server...")
    
    # Create server with thread pool
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    
    # Add our service to the server
    bookstore_pb2_grpc.add_BookstoreServiceServicer_to_server(
        BookstoreService(), server
    )
    
    # Start server on port 50051
    server.add_insecure_port('[::]:50051')
    server.start()
    
    print("📖 Bookstore server listening on port 50051...")
    print("🛑 Press Ctrl+C to stop the server")
    
    try:
        server.wait_for_termination()
    except KeyboardInterrupt:
        print("\n🛑 Server stopped")

if __name__ == '__main__':
    serve()

### Server Implementation Notes

**Key Points:**
1. **Class inheritance**: `BookstoreService` extends `BookstoreServiceServicer` (auto-generated)
2. **Method signatures match**: `PlaceOrder(self, request, context)` matches schema
3. **Business logic**: Server manages inventory, processes orders, handles search
4. **Context parameter**: gRPC provides request context (metadata, auth info, etc.)
5. **Return types**: Must return exactly what the schema promises

**Compare with Thrift**: gRPC servicers inherit from a generated base class, while `thriftpy2` accepts any object whose method names match the service definition (duck typing)!
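
One thing to keep in mind: the server runs handlers on a `ThreadPoolExecutor` with 10 workers, so two `PlaceOrder` calls can run at the same time, and the check-then-decrement on `stock_count` is not atomic. A minimal sketch of one way to guard that logic with a lock follows; `StockLedger` is a hypothetical helper, not part of the server above, and it tracks only counts rather than full `Book` messages.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Dict


class StockLedger:
    """Hypothetical helper: keeps stock counts behind a lock."""

    def __init__(self, initial_stock: Dict[str, int]):
        self._stock = dict(initial_stock)
        self._lock = threading.Lock()

    def reserve(self, book_id: str, quantity: int) -> bool:
        """Check and decrement stock as one atomic step; True on success."""
        with self._lock:
            available = self._stock.get(book_id, 0)
            if quantity <= 0 or available < quantity:
                return False
            self._stock[book_id] = available - quantity
            return True

    def remaining(self, book_id: str) -> int:
        with self._lock:
            return self._stock.get(book_id, 0)


ledger = StockLedger({"BOOK-002": 8})

# Simulate 20 concurrent single-copy orders against 8 copies in stock.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(lambda _: ledger.reserve("BOOK-002", 1), range(20)))

print(sum(results), ledger.remaining("BOOK-002"))  # prints: 8 0
```

Exactly 8 reservations succeed no matter how the threads interleave, because the comparison and the decrement happen under the same lock.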

## Step 4: Run the Server

To start the server, run this in a terminal:

```bash
python bookstore_protobuf_server.py
```

You should see: `🚀 Starting Bookstore gRPC server...`

The server will wait for client connections and handle gRPC requests.

## Step 5: Create Client Connection

Now let's create a client that connects to our gRPC server:

In [None]:
import sys
sys.path.append('..')

import grpc
import bookstore_pb2
import bookstore_pb2_grpc

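# Create a gRPC channel to the server. Channels connect lazily, so creating
# one succeeds even if nothing is listening yet.
channel = grpc.insecure_channel('localhost:50051')

# Create a stub (client proxy) for our service
bookstore_stub = bookstore_pb2_grpc.BookstoreServiceStub(channel)

# Block until the channel is actually connected (give up after 5 seconds)
try:
    grpc.channel_ready_future(channel).result(timeout=5)
    print("✅ Connected to Bookstore gRPC server!")
    print("📡 Client ready to make remote procedure calls")
except grpc.FutureTimeoutError:
    print("❌ Could not reach the server on localhost:50051. Is it running?")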

### Understanding gRPC Connection

**Key Points:**
- **Channel**: Network connection to the server (like a phone line)
- **Stub**: Client-side proxy that makes RPC calls look like local method calls
- **Insecure channel**: No encryption (for development only!)
- **Compare with Thrift**: Similar concept, but gRPC uses generated stub classes

## Step 6: Create and Use Message Objects

Let's create some Book objects and interact with the bookstore:

In [None]:
# Create a book object using the generated Book class
my_book = bookstore_pb2.Book(
    book_id="BOOK-999",
    title="gRPC Microservices Guide",
    author="Alex Johnson", 
    price=42.99,
    categories=["Microservices", "gRPC", "Architecture"],
    in_stock=True,
    stock_count=20
)

print("📖 Created book object:")
print(f"   ID: {my_book.book_id}")
print(f"   Title: {my_book.title}")
print(f"   Author: {my_book.author}")
print(f"   Price: ${my_book.price}")
print(f"   Categories: {list(my_book.categories)}")
print(f"   In Stock: {my_book.in_stock} ({my_book.stock_count} copies)")

### Understanding Message Creation

**Key Points:**
- `bookstore_pb2.Book()` creates a Book instance using the generated class
- **Field names** are used in Python (book_id, title, etc.)
- **Field numbers** are used in binary encoding (1, 2, 3, etc.)
- **Repeated fields** (like categories) behave like Python lists
- **Default values**: Proto3 uses sensible defaults (0, "", False, empty list)
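
The point about default values has a practical consequence: for ordinary proto3 fields (those without the `optional` keyword), a field set to its default is not written to the wire at all, so the receiver cannot tell "set to 0" from "never set". The sketch below mimics that rule by hand for two `OrderResponse` fields (`order_id = 1`, `quantity = 4`); `encode_order_summary` is a hypothetical helper, not generated code.

```python
def varint(value: int) -> bytes:
    """Base-128 varint for non-negative integers."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_order_summary(order_id: str, quantity: int) -> bytes:
    """Encode two fields the proto3 way: default values are skipped entirely."""
    out = bytearray()
    if order_id != "":  # field 1, wire type 2 (length-delimited)
        data = order_id.encode("utf-8")
        out += varint((1 << 3) | 2) + varint(len(data)) + data
    if quantity != 0:   # field 4, wire type 0 (varint)
        out += varint((4 << 3) | 0) + varint(quantity)
    return bytes(out)


print(encode_order_summary("", 0))        # prints: b''
print(encode_order_summary("", 2).hex())  # prints: 2002
print(len(encode_order_summary("a1b2c3d4", 2)))  # prints: 12
```

This is why the failed-order responses in the server, which set `order_id=""`, look the same on the wire as responses where `order_id` was never assigned.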

## Step 7: Make gRPC Remote Procedure Calls

Now let's call the remote methods on our server:

In [None]:
# gRPC Call 1: Search for books
print("\n🔄 Making gRPC call: SearchBooks()")
search_request = bookstore_pb2.SearchRequest(keyword="Python")
search_response = bookstore_stub.SearchBooks(search_request)

print(f"📚 Found {len(search_response.books)} books containing 'Python':")
for book in search_response.books:
    print(f"   - {book.title} by {book.author} (${book.price})")

In [None]:
# gRPC Call 2: Place an order
print("\n🔄 Making gRPC call: PlaceOrder()")

# Get the first book from our search results
if search_response.books:
    book_to_order = search_response.books[0]
    
    order_request = bookstore_pb2.OrderRequest(
        book=book_to_order,
        customer_name="Emma Watson",
        quantity=2
    )
    
    order_response = bookstore_stub.PlaceOrder(order_request)
    
    print(f"📦 Order Result: {order_response.status}")
    print(f"📝 Order ID: {order_response.order_id}")
    print(f"💰 Total Price: ${order_response.total_price}")
    print(f"📖 Book: {order_response.book.title}")
    print(f"📊 Remaining Stock: {order_response.book.stock_count}")

In [None]:
# gRPC Call 3: Try to cancel an order
print("\n🔄 Making gRPC call: CancelOrder()")

# Reuse the first book from the search results; skip if the search found nothing
if search_response.books:
    cancel_request = bookstore_pb2.OrderRequest(
        book=search_response.books[0],
        customer_name="Emma Watson",
        quantity=1
    )

    cancel_response = bookstore_stub.CancelOrder(cancel_request)
    print(f"🚫 Cancellation Result: {cancel_response.status}")

### Understanding gRPC RPC Calls

**What happens during a gRPC call:**

1. **Client side**: `bookstore_stub.SearchBooks(search_request)`
2. **Serialization**: SearchRequest object → binary protobuf data using field numbers
3. **Network**: HTTP/2 transport sends binary data to server
4. **Server side**: Binary data → SearchRequest object → method execution
5. **Response**: SearchResponse object → binary data → back to client
6. **Deserialization**: Binary data → SearchResponse object on client
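
The six steps above can be imitated in plain Python to show where each one happens. This is a deliberately simplified sketch, not how the `grpc` library is implemented: `InMemoryChannel` stands in for the HTTP/2 transport, the hand-written `encode_strings` and `decode_strings` helpers stand in for generated protobuf code (single-byte tags and lengths only), and the response carries just titles instead of full `Book` messages.

```python
from typing import Callable, Dict, List


def encode_strings(field_number: int, values: List[str]) -> bytes:
    """Encode strings as repeated length-delimited fields (lengths under 128 only)."""
    out = bytearray()
    for value in values:
        data = value.encode("utf-8")
        if len(data) > 127 or field_number > 15:
            raise ValueError("sketch supports single-byte tags and lengths only")
        out += bytes([(field_number << 3) | 2, len(data)]) + data
    return bytes(out)


def decode_strings(payload: bytes) -> List[str]:
    """Inverse of encode_strings: read tag byte, length byte, then the text."""
    values, pos = [], 0
    while pos < len(payload):
        length = payload[pos + 1]  # payload[pos] is the tag byte
        values.append(payload[pos + 2 : pos + 2 + length].decode("utf-8"))
        pos += 2 + length
    return values


TITLES = [
    "Python Data Science Handbook",
    "Machine Learning Engineering",
    "System Design Interview",
]


def search_books_handler(request_bytes: bytes) -> bytes:
    """Step 4 and 5: bytes -> request -> business logic -> response -> bytes."""
    keyword = decode_strings(request_bytes)[0].lower()
    matches = [title for title in TITLES if keyword in title.lower()]
    return encode_strings(1, matches)


class InMemoryChannel:
    """Stand-in for the network: routes a method path to a server handler."""

    def __init__(self, handlers: Dict[str, Callable[[bytes], bytes]]):
        self._handlers = handlers

    def call(self, method: str, request_bytes: bytes) -> bytes:
        return self._handlers[method](request_bytes)  # step 3: "send" the bytes


class SearchStub:
    """Client-side proxy: makes the remote call look like a local method."""

    def __init__(self, channel: InMemoryChannel):
        self._channel = channel

    def SearchBooks(self, keyword: str) -> List[str]:
        request_bytes = encode_strings(1, [keyword])              # step 2
        response_bytes = self._channel.call(
            "/BookstoreService/SearchBooks", request_bytes)       # steps 3 to 5
        return decode_strings(response_bytes)                     # step 6


channel = InMemoryChannel({"/BookstoreService/SearchBooks": search_books_handler})
stub = SearchStub(channel)
print(stub.SearchBooks("python"))  # prints: ['Python Data Science Handbook']
```

The caller only ever sees Python values; the byte strings exist solely between the stub and the handler, which is the same separation the generated stub and servicer classes provide.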

**Key differences from Thrift:**
- gRPC uses HTTP/2 (more web-friendly)
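- The wire formats differ: Protobuf prefixes each field with a varint tag that packs the field number and wire type, while Thrift's binary protocol writes a one-byte type followed by a 16-bit field ID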
- Generated code structure is different but concepts are similar

## Step 8: Complete Client Application

Let's create a complete client script that demonstrates the full workflow:

In [None]:
%%writefile ../bookstore_protobuf_client.py
import grpc
import bookstore_pb2
import bookstore_pb2_grpc

def main():
    print("🔌 Connecting to Bookstore gRPC server...")
    
    # Create connection and stub
    with grpc.insecure_channel('localhost:50051') as channel:
        stub = bookstore_pb2_grpc.BookstoreServiceStub(channel)
        
        print("📖 Bookstore Demo")
        print("=" * 50)
        
        # 1. Search for books by category
        print("\n🔍 Searching for 'Data Science' books...")
        search_req = bookstore_pb2.SearchRequest(keyword="Data Science")
        search_resp = stub.SearchBooks(search_req)
        
        if search_resp.books:
            for book in search_resp.books:
                print(f"   📚 {book.title} by {book.author} - ${book.price}")
                print(f"      📊 Stock: {book.stock_count}, Categories: {list(book.categories)}")
        
        # 2. Place orders for different customers
        customers = ["Alice Smith", "Bob Jones", "Charlie Brown"]
        
        for i, customer in enumerate(customers):
            if search_resp.books:
                # Cycle through the search results so every customer gets a book
                book = search_resp.books[i % len(search_resp.books)]
                
                print(f"\n🛒 {customer} is ordering '{book.title}'...")
                order_req = bookstore_pb2.OrderRequest(
                    book=book,
                    customer_name=customer,
                    quantity=1
                )
                
                order_resp = stub.PlaceOrder(order_req)
                
                if order_resp.order_id:
                    print(f"   ✅ Order {order_resp.order_id} placed!")
                    print(f"   💰 Total: ${order_resp.total_price}")
                else:
                    print(f"   ❌ Order failed: {order_resp.status}")
        
        # 3. Search for a different category
        print("\n🔍 Searching for 'Machine Learning' books...")
        ml_search = bookstore_pb2.SearchRequest(keyword="Machine Learning")
        ml_resp = stub.SearchBooks(ml_search)
        
        print(f"📚 Found {len(ml_resp.books)} Machine Learning books:")
        for book in ml_resp.books:
            print(f"   - {book.title} (Stock: {book.stock_count})")
        
        print("\n✅ Bookstore demo completed!")

if __name__ == "__main__":
    main()

## How to Test the Complete System

1. **Run the server** (in terminal 1):
   ```bash
   python bookstore_protobuf_server.py
   ```

2. **Run the client** (in terminal 2):
   ```bash
   python bookstore_protobuf_client.py
   ```

3. **Expected output**:
   - Server shows: gRPC calls being processed, inventory updates
   - Client shows: Search results, order confirmations, stock updates

**Note**: Make sure both terminals are in the directory that contains the generated `bookstore_pb2.py` and `bookstore_pb2_grpc.py` files!

## Key Learning Points

### 1. **Schema vs Implementation**
- **`.proto` file**: Defines the contract (message types, service methods)
- **Generated code**: `bookstore_pb2.py` (messages) + `bookstore_pb2_grpc.py` (services)  
- **Server class**: Implements the actual business logic
- **Client stub**: Provides a local interface to remote methods

### 2. **Field Numbers vs Field Names**
- **Field numbers (= 1, = 2, = 3)**: Used in binary encoding for efficiency
- **Field names (book_id, title)**: Used in your Python code for readability  
- **Why this matters**: You can rename fields without breaking wire compatibility!
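
A small sketch of why renaming is safe: the bytes below contain only field numbers, so the same payload decodes under either set of names. `decode_string_fields` is a hypothetical hand-written decoder for length-delimited string fields, and `isbn` is an invented replacement name used purely for illustration.

```python
from typing import Dict


def decode_string_fields(payload: bytes, names: Dict[int, str]) -> Dict[str, str]:
    """Decode length-delimited string fields, labelling each by its field number."""
    result, pos = {}, 0
    while pos < len(payload):
        tag = payload[pos]
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type != 2:
            raise ValueError("sketch handles length-delimited fields only")
        length = payload[pos + 1]
        value = payload[pos + 2 : pos + 2 + length].decode("utf-8")
        result[names.get(field_number, f"unknown_{field_number}")] = value
        pos += 2 + length
    return result


# Field 1 = "BOOK-001", field 2 = "Title"; no names are stored in the bytes.
wire = bytes([0x0A, 8]) + b"BOOK-001" + bytes([0x12, 5]) + b"Title"

old_names = {1: "book_id", 2: "title"}
new_names = {1: "isbn", 2: "title"}  # field 1 renamed, number unchanged

print(decode_string_fields(wire, old_names))  # prints: {'book_id': 'BOOK-001', 'title': 'Title'}
print(decode_string_fields(wire, new_names))  # prints: {'isbn': 'BOOK-001', 'title': 'Title'}
```

The reverse does not hold: changing a field's number, or reusing a number for a different meaning, does break compatibility. Renames also affect the Protobuf JSON mapping, which uses field names.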

### 3. **gRPC vs Thrift Comparison**
| Aspect | gRPC/Protobuf | Apache Thrift |
|--------|---------------|---------------|
| Transport | HTTP/2 | Pluggable (TCP sockets, HTTP, and others) |
| Code Gen | Pre-compile with `protoc` | `thrift` compiler, or runtime loading with `thriftpy2` |
| Schema | `.proto` files | `.thrift` files |
| Services | Inherit from generated base | Implement methods directly (with `thriftpy2`) |
| Web Support | Good (HTTP/2; browsers need gRPC-Web) | Limited |

### 4. **RPC Flow**
```
Client Python Object → Protobuf Binary → HTTP/2 → Protobuf Binary → Server Python Object
Server Python Object → Protobuf Binary → HTTP/2 → Protobuf Binary → Client Python Object
```

### 5. **Message Types**
- **Simple fields**: `string book_id = 1`
- **Repeated fields**: `repeated string categories = 5` (like lists)
- **Nested messages**: `Book book = 1` (objects within objects)
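- **No required fields in Proto3**: Every field may be left unset, in which case it reads back as its default value; add the `optional` keyword when you need to tell "unset" apart from the default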

## Practice Exercises

Try extending this example:

1. **Add a Customer message type** with fields like `customer_id`, `name`, `email`, `order_history`

2. **Add new service methods**:
   - `RegisterCustomer(Customer)` → returns registered Customer
   - `GetCustomerOrders(CustomerId)` → returns list of OrderResponse
   - `UpdateBookPrice(BookId, NewPrice)` → returns updated Book

3. **Handle advanced scenarios**:
   - What happens when you try to order more books than available?
   - How do you implement pagination for search results?
   - How do you add authentication to your gRPC service?

4. **Performance comparison**:
   - Compare binary size of Protobuf vs JSON for the same data
   - Measure RPC call latency vs REST API calls
   - Test with high concurrency (many simultaneous orders)

5. **Schema evolution**:
   - Add a new field to Book message - does old client still work?
   - Remove a field - what happens to old data?
   - Change a field type - what breaks?

These exercises will deepen your understanding of Protobuf's binary efficiency, schema evolution, and gRPC's high-performance RPC patterns!

## Summary: gRPC/Protobuf vs Thrift

Now that you've seen both examples, here are the key takeaways:

### **When to choose gRPC/Protobuf:**
- ✅ Need to fit into HTTP/2-based infrastructure (browser clients work through gRPC-Web)
- ✅ High-performance microservices communication
- ✅ Strong typing and code generation
- ✅ Excellent tooling and ecosystem
- ✅ Streaming RPC support

### **When to choose Apache Thrift:**
- ✅ Simple binary protocols without HTTP overhead
- ✅ More flexible transport options (raw TCP sockets, HTTP, framed or buffered)
- ✅ Runtime schema loading (in Python, via `thriftpy2`)
- ✅ Bindings for a wide range of languages
- ✅ Simpler setup for basic use cases

**Both are excellent choices** for binary serialization and RPC - the choice depends on your specific requirements!