# Orderbook Optimization Workshop
## Focus: SELL Orders (Asks) Only

## Goal
Start with a **working but slow** orderbook for asks, then **optimize it together** to achieve O(1) best ask lookups!

## API: 3 Core Functions
```python
submit(order_id, price, quantity)  # Add a sell order
cancel(order_id)                   # Cancel a sell order
get_best()                         # Get lowest ask price
```

## Your Mission
Optimize to reach these target complexities:

| Operation | Current | Target | Key Optimization |
|-----------|---------|--------|------------------|
| **submit()** | O(N) ❌ | O(log P) ✅ | Use heap + dict |
| **cancel()** | O(N) ❌ | O(1) ✅ | Dict lookup |
| **get_best()** | **O(N)** ❌ | **O(1)** ✅ | **MIN HEAP!** |

Where:
- N = total orders (could be 1000s!)
- P = number of unique price levels (~10-100)
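
To make the difference between N and P concrete, here is a tiny sketch. The `sample_orders` tuples are made-up illustration data, not part of the workshop code.

```python
# Hypothetical sample data: (order_id, price, quantity)
sample_orders = [
    ("A1", 101.0, 50),
    ("A2", 101.0, 30),
    ("A3", 101.5, 20),
    ("A4", 102.0, 40),
    ("A5", 102.0, 25),
]

n_orders = len(sample_orders)                             # N: every order counts
n_levels = len({price for _, price, _ in sample_orders})  # P: distinct prices only

print(f"N = {n_orders}, P = {n_levels}")
```

Several orders usually share a price, so P grows much more slowly than N. That is why O(log P) is a better bound than O(log N).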

## Setup

In [None]:
from enum import Enum
from typing import Optional

class OrderStatus(Enum):
    OPEN = "OPEN"
    CANCELLED = "CANCELLED"

class Order:
    def __init__(self, order_id: str, price: float, quantity: int):
        self.order_id = order_id
        self.price = price
        self.quantity = quantity
        self.status = OrderStatus.OPEN
    
    def __repr__(self):
        return f"Order({self.order_id}, ${self.price}, qty={self.quantity}, {self.status.value})"

print("✓ Setup complete!")

## Suboptimal Orderbook (Asks Only)

### Current Data Structure: Simple List ❌
```python
orders = [order1, order2, order3, ...]  # Just a flat list!
```

**Problems:**
- `get_best()` scans entire list for min price: **O(N)** 🐌
- `cancel()` requires linear search: **O(N)** 🐌
- `submit()` must check for duplicates: **O(N)** 🐌

**🎯 Target Structure (for later):**
```python
asks = {price: [orders at that price]}  # Dict of lists
ask_heap = [101, 102, 103]              # MIN HEAP (regular heap!)
orders = {order_id: Order}              # Dict for O(1) lookup
```

### The Solution: Min Heap! 🧠
Python's `heapq` is a **min heap** by default - perfect for finding lowest ask!

```python
import heapq
ask_heap = []
heapq.heappush(ask_heap, 101.0)
heapq.heappush(ask_heap, 102.0)
heapq.heappush(ask_heap, 103.0)

# ask_heap = [101.0, 102.0, 103.0]
# Min heap puts 101.0 at top (lowest price)

best_ask = ask_heap[0]  # Just peek at top: $101 ✓
```

**No negation needed!** Just use the heap directly! 🎉

In [None]:
class SuboptimalOrderBook:
    """A working but inefficient orderbook for asks."""
    
    def __init__(self):
        # SUBOPTIMAL: Just a flat list of all sell orders!
        self.orders = []
    
    def submit(self, order_id: str, price: float, quantity: int) -> bool:
        """
        Submit a new sell order (ask).
        
        Current: O(N) - scan all orders to check for duplicate ID
        Target: O(log P) - dict for O(1) check + heap push O(log P)
        
        Optimization: Use dict to store orders by ID
        """
        # Check for duplicate (O(N) scan - SLOW!)
        for order in self.orders:
            if order.order_id == order_id:
                return False
        
        # Create and add order
        order = Order(order_id, price, quantity)
        self.orders.append(order)
        return True
    
    def cancel(self, order_id: str) -> bool:
        """
        Cancel a sell order by ID.
        
        Current: O(N) - linear search
        Target: O(1) - dict lookup
        
        Optimization: orders_dict[order_id] for instant access
        """
        # Linear search (SLOW!)
        for order in self.orders:
            if order.order_id == order_id:
                if order.status == OrderStatus.CANCELLED:
                    return False
                order.status = OrderStatus.CANCELLED
                return True
        return False
    
    def get_best(self) -> Optional[float]:
        """
        Get the lowest ask price.
        
        Current: O(N) - scan ALL orders to find min
        Target: O(1) ⚡⚡⚡
        
        Optimization: Use a MIN HEAP!
        - Python's heapq is min heap by default
        - Store prices: [101, 102, 103]
        - heap[0] gives lowest price instantly!
        
        No negation needed - min heap is what we want! ✓
        """
        best_price = None
        
        # SUPER SLOW: Check every single order!
        for order in self.orders:
            if order.status == OrderStatus.OPEN:
                if best_price is None or order.price < best_price:
                    best_price = order.price
        
        return best_price
    
    def __repr__(self):
        """Display the orderbook."""
        lines = ["=" * 60, "ASK ORDERBOOK (Suboptimal)", "=" * 60]
        
        # Group by price (inefficient!)
        prices = {}
        for order in self.orders:
            if order.status == OrderStatus.OPEN:
                prices[order.price] = prices.get(order.price, 0) + order.quantity
        
        # Show asks (low to high)
        for price in sorted(prices.keys()):
            lines.append(f"ASK: ${price:>8.2f} | Qty: {prices[price]}")
        
        lines.append("=" * 60)
        best = self.get_best()
        if best is not None:
            lines.append(f"Best Ask (Lowest): ${best:.2f}")
        lines.append(f"Total orders: {len(self.orders)}")
        
        return "\n".join(lines)

print("✓ Suboptimal OrderBook created!")
print("✓ Try: ob = SuboptimalOrderBook()")

## Test the Suboptimal Implementation

It works... but watch it slow down!

In [None]:
# Test: Basic functionality
print("Test: Submit sell orders (asks)")
ob = SuboptimalOrderBook()

ob.submit("A1", 101.0, 50)
ob.submit("A2", 101.5, 30)
ob.submit("A3", 102.0, 20)
ob.submit("A4", 102.5, 40)
ob.submit("A5", 103.0, 25)

print(ob)

In [None]:
# Test: Cancel
print("Test: Cancel A1 (best ask)")
ob.cancel("A1")
print(f"\nNew best ask: ${ob.get_best()}")
print(ob)

## Performance Test: See the Problem! 🐌

Let's add many orders and measure `get_best()` performance.

In [None]:
import time

print("Performance Test: get_best() with increasing order counts\n")

for n in [10, 50, 100, 500, 1000]:
    ob_perf = SuboptimalOrderBook()
    
    # Add n orders
    for i in range(n):
        ob_perf.submit(f"A{i}", 100.0 + i * 0.01, 10)
    
    # Time 1000 get_best() calls (perf_counter is a monotonic, high-resolution clock)
    start = time.perf_counter()
    for _ in range(1000):
        ob_perf.get_best()
    end = time.perf_counter()
    
    avg_us = (end - start) / 1000 * 1e6
    print(f"{n:5d} orders: {avg_us:6.2f} microseconds per get_best()")

print("\n⚠️  Performance degrades linearly with order count - this is O(N)!")
print("🎯 Goal: O(1) - constant time regardless of order count!")
print("💡 With 1000 orders, we want <1 microsecond, not 100+!")

## Discussion: How to Fix This? 🤔

### Problem 1: get_best() is O(N)
Currently scanning all N orders to find the minimum price.

**Q:** What data structure gives instant access to the minimum?

**A:** A **MIN HEAP**!
- Python's `heapq` is a min heap by default
- Store prices: `[101, 102, 103]`
- `heap[0]` always gives the lowest → O(1)!
- No negation needed (unlike max heap for bids)
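
One subtlety: if `cancel()` only flips a status flag, the heap can still hold a price whose orders are all cancelled. A common way to handle this is lazy deletion, where stale prices are dropped when they reach the top. The sketch below assumes a hypothetical `open_count` dict that tracks open orders per price. Neither that dict nor the `best_ask` helper comes from the workshop code.

```python
import heapq

def best_ask(ask_heap, open_count):
    """Return the lowest price that still has open orders, or None.

    ask_heap   -- heapq-managed list of price levels (may contain stale prices)
    open_count -- hypothetical helper dict: {price: number of OPEN orders}
    """
    # Discard stale levels sitting on top of the heap (lazy deletion).
    while ask_heap and open_count.get(ask_heap[0], 0) == 0:
        heapq.heappop(ask_heap)
    return ask_heap[0] if ask_heap else None

ask_heap = []
open_count = {}
for price in (101.0, 102.0, 103.0):
    heapq.heappush(ask_heap, price)
    open_count[price] = 1

first = best_ask(ask_heap, open_count)

open_count[101.0] = 0                    # simulate cancelling the only order at 101.0
second = best_ask(ask_heap, open_count)  # the stale 101.0 entry is popped here
print(first, second)
```

When the top of the heap is live, the lookup is a plain O(1) peek. Each pushed price can be popped at most once, so the cleanup work is bounded by the number of pushes.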

### Problem 2: cancel() is O(N)
Linear search to find order by ID.

**Q:** What gives O(1) lookup by key?

**A:** **Dictionary!**
```python
orders = {order_id: Order}
order = orders["A1"]  # O(1) lookup!
```
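
A minimal sketch of a dict-based cancel, assuming orders are stored as plain dict records keyed by ID. The `Status` enum is a stand-in for the notebook's `OrderStatus`, and the record layout is an illustration only.

```python
from enum import Enum

class Status(Enum):          # stand-in for the notebook's OrderStatus
    OPEN = "OPEN"
    CANCELLED = "CANCELLED"

def cancel(orders, order_id):
    """Cancel by ID with one dict lookup; False if unknown or already cancelled."""
    order = orders.get(order_id)          # O(1) average-case lookup
    if order is None or order["status"] is Status.CANCELLED:
        return False
    order["status"] = Status.CANCELLED
    return True

# Hypothetical sample book: order_id -> record
orders = {
    "A1": {"price": 101.0, "quantity": 50, "status": Status.OPEN},
    "A2": {"price": 101.5, "quantity": 30, "status": Status.OPEN},
}

results = [cancel(orders, "A1"), cancel(orders, "A1"), cancel(orders, "ZZ")]
print(results)
```

The return values mirror the list-based `cancel()`: `True` on the first cancel, `False` for a repeat or an unknown ID.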

### Problem 3: submit() is O(N)
Must scan to check for duplicate ID.

**Q:** If we use dict, what's the new complexity?

**A:** **O(log P)**
- Check `if order_id in orders` → O(1)
- Push to heap → O(log P) where P = price levels
- Total: O(log P)
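
The two steps above can be sketched as a standalone function. The three containers are passed in explicitly to keep the example self-contained. The tuple stored per order is an assumption for brevity, not the workshop's `Order` class.

```python
import heapq

def submit(orders, asks, ask_heap, order_id, price, quantity):
    """Add a sell order; returns False on a duplicate ID."""
    if order_id in orders:                 # O(1) duplicate check
        return False
    orders[order_id] = (price, quantity)
    if price not in asks:                  # first order at this price level
        asks[price] = []
        heapq.heappush(ask_heap, price)    # O(log P)
    asks[price].append(order_id)
    return True

orders, asks, ask_heap = {}, {}, []
submit(orders, asks, ask_heap, "A1", 102.0, 20)
submit(orders, asks, ask_heap, "A2", 101.0, 50)
submit(orders, asks, ask_heap, "A3", 101.0, 30)   # existing level: no heap push
duplicate = submit(orders, asks, ask_heap, "A1", 99.0, 5)

print(ask_heap[0], len(ask_heap), duplicate)
```

The heap is pushed only when a price level is new, so it holds P entries rather than N. A submit at an existing level costs O(1).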

## Min Heap Explained 🧠

This is **simpler than max heap** - no tricks needed!

### Python's heapq = Min Heap
```python
import heapq
heap = []
heapq.heappush(heap, 5)
heapq.heappush(heap, 3)
heapq.heappush(heap, 8)
# heap[0] = 3 (smallest) ✓
```

### Perfect for Asks (Lowest Price)
```python
# Ask prices: $101, $102, $103
ask_heap = []
heapq.heappush(ask_heap, 101.0)
heapq.heappush(ask_heap, 102.0)
heapq.heappush(ask_heap, 103.0)

# ask_heap = [101.0, 102.0, 103.0]
# Min heap automatically puts 101.0 at top

# To get best ask (lowest):
best_ask = ask_heap[0]  # Just peek: $101 ✓
```

### Visual:
```
Prices: [101, 102, 103]
         ↓
Push to heap: [101, 102, 103]
         ↓
heap[0] = 101  ← Lowest ask! ✓
```

**Result:** O(1) access to lowest ask! 🎉

**No negation, no tricks** - just use heap directly!
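
The examples above push prices in ascending order, so the heap list happens to look sorted. In general a heap only guarantees that index 0 holds the minimum. This short sketch, with arbitrary illustrative prices, pushes out of order to show that.

```python
import heapq

heap = []
for price in (103.0, 105.0, 101.0, 104.0, 102.0):
    heapq.heappush(heap, price)

layout = list(heap)   # internal layout: heap-ordered, not necessarily sorted
print(layout)
print(heap[0])        # the minimum is always at index 0

# To read prices in ascending order, pop repeatedly.
ascending = [heapq.heappop(heap) for _ in range(len(heap))]
print(ascending)
```

So `heap[0]` is safe for the best ask, but iterating over the heap list is not a reliable way to display price levels in order.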

## Next Steps: Let's Optimize Together! 🚀

### Target Structure:
```python
class OptimalOrderBook:
    def __init__(self):
        self.asks = {}         # {price: [orders at that price]}
        self.ask_heap = []     # [101, 102, 103] (regular heap!)
        self.orders = {}       # {order_id: Order}
```

### Implementation Plan:
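1. **submit()**:
   - Check dict for duplicate: O(1)
   - Add to orders dict and to the price level in `asks`: O(1)
   - Push the price to the heap only if the level is new: O(log P)
   - **Total: O(log P)**

2. **cancel()**:
   - Lookup in dict: O(1)
   - Mark as cancelled: O(1)
   - **Total: O(1)** (the price stays in the heap; `get_best()` skips it later)

3. **get_best()**:
   - Peek at heap top: O(1)
   - If no open orders remain at that price, pop it and peek again (each price is popped at most once per push)
   - **Total: O(1)** amortized ⚡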
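
One possible shape for the finished class is sketched below as a starting point, not as the definitive solution. It departs from the target structure in one place. Instead of `asks` as a dict of lists, it keeps a hypothetical `open_count` dict of open orders per price, so that stale heap entries can be detected in O(1). The class name and the list-based order record are also assumptions.

```python
import heapq
from typing import Optional

class OptimalOrderBookSketch:
    """Asks-only book: dict for ID lookup, min heap for price levels."""

    def __init__(self):
        self.orders = {}      # {order_id: [price, quantity, is_open]}
        self.open_count = {}  # {price: number of OPEN orders} (assumed helper)
        self.ask_heap = []    # min heap of price levels; may hold stale entries

    def submit(self, order_id: str, price: float, quantity: int) -> bool:
        if order_id in self.orders:                  # O(1) duplicate check
            return False
        self.orders[order_id] = [price, quantity, True]
        if self.open_count.get(price, 0) == 0:
            # New or previously emptied level; a duplicate heap entry is harmless.
            heapq.heappush(self.ask_heap, price)     # O(log P)
        self.open_count[price] = self.open_count.get(price, 0) + 1
        return True

    def cancel(self, order_id: str) -> bool:
        order = self.orders.get(order_id)            # O(1) lookup
        if order is None or not order[2]:
            return False
        order[2] = False
        self.open_count[order[0]] -= 1               # heap is left untouched
        return True

    def get_best(self) -> Optional[float]:
        # Lazily drop price levels that no longer have open orders.
        while self.ask_heap and self.open_count.get(self.ask_heap[0], 0) == 0:
            heapq.heappop(self.ask_heap)
        return self.ask_heap[0] if self.ask_heap else None

book = OptimalOrderBookSketch()
book.submit("A1", 101.0, 50)
book.submit("A2", 101.5, 30)
book.submit("A3", 102.0, 20)

before = book.get_best()
book.cancel("A1")
after = book.get_best()
print(before, after)
```

Counting open orders rather than summing quantity keeps a zero-quantity order from making a live level look empty. Swapping the counter back to per-level order lists is a reasonable exercise if time-priority matching is added later.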

### Expected Performance:
- From **50+ microseconds** → **<1 microsecond** 🚀
- **50x+ speedup!**
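
To check the expected gap on your own machine, a rough micro-benchmark could look like the sketch below. It compares a linear `min()` scan against a heap peek over the same 1000 prices, using `timeit` from the standard library. Absolute numbers vary by machine, so none are quoted here.

```python
import heapq
import timeit

n = 1000
prices = [100.0 + i * 0.01 for i in range(n)]   # same price grid as the performance test

heap = list(prices)
heapq.heapify(heap)          # build the min heap in O(N)

scan_result = min(prices)    # O(N) linear scan
peek_result = heap[0]        # O(1) heap peek

calls = 10_000
scan_us = timeit.timeit(lambda: min(prices), number=calls) / calls * 1e6
peek_us = timeit.timeit(lambda: heap[0], number=calls) / calls * 1e6

print(f"scan: {scan_us:.3f} us/call, peek: {peek_us:.3f} us/call")
```

The built-in `min()` is faster than the Python-level loop in `SuboptimalOrderBook.get_best()`, so this comparison likely understates the gap for the notebook's class.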

### Key Advantages of Using Asks:
- ✓ No negative price trick needed
- ✓ Simpler to understand
- ✓ Direct heap usage
- ✓ Same O(1) performance!

**Now let's code the optimal version together!** 💪