# Day 01: Limit Order Book (LOB) Basics

## Week 21: Market Microstructure

**Learning Objectives:**
- Understand the structure and mechanics of Limit Order Books
- Learn about bid-ask spread and its implications
- Analyze market depth and liquidity
- Implement LOB simulation and visualization

---

## 1. Introduction to Market Microstructure

**Market microstructure** studies how exchanges operate and how prices are formed at the granular level.

### Key Components:
- **Order Types**: Market orders, limit orders, stop orders
- **Limit Order Book (LOB)**: Central mechanism for price discovery
- **Market Makers**: Provide liquidity by posting quotes
- **Price Formation**: How orders interact to determine prices

### Why Study Market Microstructure?
1. **Execution Quality**: Minimize transaction costs
2. **Alpha Generation**: Extract signals from order flow
3. **Risk Management**: Understand liquidity risk
4. **Regulatory Compliance**: Meet best execution requirements

In [None]:
# Import required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from collections import defaultdict, deque
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple
from enum import Enum
import heapq
import uuid
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')

# Plotting configuration
plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['font.size'] = 11

print("Libraries loaded successfully!")

---

## 2. Limit Order Book Fundamentals

### What is a Limit Order Book?

A **Limit Order Book (LOB)** is an electronic record of all outstanding limit orders organized by price level.

```
         SELL SIDE (Asks)              BUY SIDE (Bids)
    ┌─────────────────────┐      ┌─────────────────────┐
    │  Price  │  Quantity │      │  Price  │  Quantity │
    ├─────────┼───────────┤      ├─────────┼───────────┤
    │  102.50 │    500    │      │  100.00 │    800    │
    │  102.00 │    300    │      │   99.50 │    600    │
    │  101.50 │    200    │      │   99.00 │    400    │
    │  101.00 │    100    │ ←→   │   98.50 │    200    │
    └─────────┴───────────┘      └─────────┴───────────┘
         Best Ask: 101.00         Best Bid: 100.00
                    Spread = 101.00 - 100.00 = 1.00
```
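
The numbers in the diagram can be checked in a few lines of Python. This is a minimal sketch that stores each side as a plain `dict` mapping price to quantity; that layout and the variable names are illustrative assumptions, not the class built later in this notebook.

```python
# Price -> resting quantity, copied from the diagram above
diagram_asks = {102.50: 500, 102.00: 300, 101.50: 200, 101.00: 100}
diagram_bids = {100.00: 800, 99.50: 600, 99.00: 400, 98.50: 200}

diagram_best_ask = min(diagram_asks)   # lowest price a seller will accept
diagram_best_bid = max(diagram_bids)   # highest price a buyer will pay
diagram_spread = diagram_best_ask - diagram_best_bid
diagram_mid = (diagram_best_ask + diagram_best_bid) / 2

print(f"Best bid: {diagram_best_bid:.2f}, best ask: {diagram_best_ask:.2f}")
print(f"Spread: {diagram_spread:.2f}, mid: {diagram_mid:.2f}")
```

Taking `min` over ask prices and `max` over bid prices is all that "best" means here; the quantities play no role in the spread.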

### Order Types

| Order Type | Description | Execution |
|------------|-------------|-----------|
| **Limit Order** | Execute at specified price or better | Passive when it rests in the book (adds liquidity); executes immediately if it crosses the spread |
| **Market Order** | Execute immediately at best available price | Aggressive, removes liquidity |
| **Stop Order** | Triggered when price reaches threshold | Conditional |
| **Iceberg** | Large order that displays only a small portion of its total size | Hidden liquidity, replenished as the visible portion fills |
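
Whether a limit order adds or removes liquidity depends on its price relative to the opposite side of the book. The sketch below uses a hypothetical helper, `is_marketable`, which is not part of the notebook's classes, to classify an incoming limit order. Sides are passed as plain strings so the block stays self-contained.

```python
from typing import Optional

def is_marketable(side: str, limit_price: float,
                  best_bid: Optional[float], best_ask: Optional[float]) -> bool:
    """Return True if a limit order would trade immediately.

    A buy is marketable when its limit is at or above the best ask;
    a sell is marketable when its limit is at or below the best bid.
    """
    if side == "BUY":
        return best_ask is not None and limit_price >= best_ask
    if side == "SELL":
        return best_bid is not None and limit_price <= best_bid
    raise ValueError(f"unknown side: {side!r}")

# Book from the diagram: best bid 100.00, best ask 101.00
print(is_marketable("BUY", 100.50, 100.00, 101.00))   # rests inside the spread
print(is_marketable("BUY", 101.00, 100.00, 101.00))   # crosses, removes liquidity
print(is_marketable("SELL", 100.00, 100.00, 101.00))  # hits the best bid
```

An order that is not marketable rests in the book and waits; an empty opposite side means nothing can trade, so the helper returns `False` in that case.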

In [None]:
# Define Order Side enum
class Side(Enum):
    BUY = "BUY"
    SELL = "SELL"

class OrderType(Enum):
    LIMIT = "LIMIT"
    MARKET = "MARKET"

@dataclass
class Order:
    """Represents a single order in the LOB."""
    order_id: str
    timestamp: datetime
    side: Side
    price: float
    quantity: int
    order_type: OrderType = OrderType.LIMIT
    filled_quantity: int = 0
    
    @property
    def remaining_quantity(self) -> int:
        return self.quantity - self.filled_quantity
    
    @property
    def is_filled(self) -> bool:
        return self.remaining_quantity == 0
    
    def __repr__(self):
        return f"Order({self.side.value}, {self.price:.2f}, {self.remaining_quantity})"

# Example orders
buy_order = Order(
    order_id=str(uuid.uuid4())[:8],
    timestamp=datetime.now(),
    side=Side.BUY,
    price=100.00,
    quantity=500
)

sell_order = Order(
    order_id=str(uuid.uuid4())[:8],
    timestamp=datetime.now(),
    side=Side.SELL,
    price=101.00,
    quantity=300
)

print("Buy Order:", buy_order)
print("Sell Order:", sell_order)

### Price-Time Priority (FIFO)

Most exchanges use **Price-Time Priority** for order matching:

1. **Price Priority**: Better prices get filled first
   - Buy orders: Higher prices have priority
   - Sell orders: Lower prices have priority

2. **Time Priority**: At the same price, earlier orders get filled first (FIFO)
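
The two rules above amount to a sort with a compound key. The following sketch assumes a simplified `RestingOrder` record with an integer arrival sequence number; both the class and the `priority_order` helper are illustrative and separate from the `Order` class used in this notebook.

```python
from dataclasses import dataclass

@dataclass
class RestingOrder:
    order_id: str
    price: float
    arrival_seq: int  # lower value = arrived earlier
    quantity: int

def priority_order(orders, side):
    """Sort resting orders into the sequence in which they would be filled."""
    if side == "BUY":
        # Highest price first, then earliest arrival
        return sorted(orders, key=lambda o: (-o.price, o.arrival_seq))
    # SELL: lowest price first, then earliest arrival
    return sorted(orders, key=lambda o: (o.price, o.arrival_seq))

resting_bids = [
    RestingOrder("A", 100.00, 1, 200),
    RestingOrder("B", 100.50, 2, 100),
    RestingOrder("C", 100.00, 3, 300),
    RestingOrder("D", 100.50, 4, 150),
]
fill_queue = [o.order_id for o in priority_order(resting_bids, "BUY")]
print(fill_queue)  # ['B', 'D', 'A', 'C']
```

Orders B and D jump ahead of A despite arriving later because they bid a higher price; within each price, the earlier arrival goes first.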

In [None]:
@dataclass
class PriceLevel:
    """Represents a single price level with multiple orders."""
    price: float
    orders: List[Order] = field(default_factory=list)
    
    @property
    def total_quantity(self) -> int:
        return sum(order.remaining_quantity for order in self.orders)
    
    @property
    def num_orders(self) -> int:
        return len(self.orders)
    
    def add_order(self, order: Order):
        """Add order to price level (time priority - append to end)."""
        self.orders.append(order)
    
    def remove_order(self, order_id: str) -> Optional[Order]:
        """Remove order by ID."""
        for i, order in enumerate(self.orders):
            if order.order_id == order_id:
                return self.orders.pop(i)
        return None
    
    def __repr__(self):
        return f"PriceLevel({self.price:.2f}, qty={self.total_quantity}, orders={self.num_orders})"

# Example price level
level = PriceLevel(price=100.00)
level.add_order(Order(str(uuid.uuid4())[:8], datetime.now(), Side.BUY, 100.00, 200))
level.add_order(Order(str(uuid.uuid4())[:8], datetime.now(), Side.BUY, 100.00, 300))
level.add_order(Order(str(uuid.uuid4())[:8], datetime.now(), Side.BUY, 100.00, 150))

print(f"Price Level: {level}")
print(f"Orders at this level:")
for order in level.orders:
    print(f"  {order}")

---

## 3. Implementing a Limit Order Book

In [None]:
class LimitOrderBook:
    """A simple Limit Order Book implementation."""
    
    def __init__(self, ticker: str = "STOCK"):
        self.ticker = ticker
        self.bids: Dict[float, PriceLevel] = {}  # Buy orders
        self.asks: Dict[float, PriceLevel] = {}  # Sell orders
        self.order_map: Dict[str, Order] = {}    # Quick order lookup
        self.trades: List[Dict] = []             # Trade history
        self.order_count = 0
        
    @property
    def best_bid(self) -> Optional[float]:
        """Highest buy price."""
        return max(self.bids.keys()) if self.bids else None
    
    @property
    def best_ask(self) -> Optional[float]:
        """Lowest sell price."""
        return min(self.asks.keys()) if self.asks else None
    
    @property
    def mid_price(self) -> Optional[float]:
        """Mid price between best bid and ask."""
        if self.best_bid is not None and self.best_ask is not None:
            return (self.best_bid + self.best_ask) / 2
        return None
    
    @property
    def spread(self) -> Optional[float]:
        """Bid-ask spread in absolute terms."""
        if self.best_bid is not None and self.best_ask is not None:
            return self.best_ask - self.best_bid
        return None
    
    @property
    def spread_bps(self) -> Optional[float]:
        """Bid-ask spread in basis points."""
        if self.spread is not None and self.mid_price:
            return (self.spread / self.mid_price) * 10000
        return None
    
    def add_limit_order(self, side: Side, price: float, quantity: int) -> Order:
        """Add a limit order to the book."""
        self.order_count += 1
        order = Order(
            order_id=f"ORD{self.order_count:06d}",
            timestamp=datetime.now(),
            side=side,
            price=price,
            quantity=quantity,
            order_type=OrderType.LIMIT
        )
        
        # Try to match first
        remaining = self._match_order(order)
        
        # If not fully filled, add to book
        if remaining > 0:
            book = self.bids if side == Side.BUY else self.asks
            if price not in book:
                book[price] = PriceLevel(price)
            book[price].add_order(order)
            self.order_map[order.order_id] = order
        
        return order
    
    def add_market_order(self, side: Side, quantity: int) -> Tuple[int, float]:
        """Execute a market order, returns (filled_qty, avg_price)."""
        self.order_count += 1
        order = Order(
            order_id=f"ORD{self.order_count:06d}",
            timestamp=datetime.now(),
            side=side,
            price=float('inf') if side == Side.BUY else 0.0,
            quantity=quantity,
            order_type=OrderType.MARKET
        )
        
        # Remember where this order's trades start in the trade history
        first_trade_idx = len(self.trades)
        filled_qty = quantity - self._match_order(order)
        
        # Average price over the trades generated by this order only
        own_trades = self.trades[first_trade_idx:]
        total_value = sum(t['price'] * t['quantity'] for t in own_trades)
        total_qty = sum(t['quantity'] for t in own_trades)
        avg_price = total_value / total_qty if total_qty > 0 else 0.0
        
        return filled_qty, avg_price
    
    def _match_order(self, order: Order) -> int:
        """Match incoming order against the book. Returns remaining quantity."""
        if order.side == Side.BUY:
            opposite_book = self.asks
            price_check = lambda p: p <= order.price
            get_best = lambda: min(opposite_book.keys()) if opposite_book else None
        else:
            opposite_book = self.bids
            price_check = lambda p: p >= order.price
            get_best = lambda: max(opposite_book.keys()) if opposite_book else None
        
        remaining = order.remaining_quantity
        
        while remaining > 0:
            best_price = get_best()
            if best_price is None or not price_check(best_price):
                break
            
            level = opposite_book[best_price]
            
            while remaining > 0 and level.orders:
                passive_order = level.orders[0]
                match_qty = min(remaining, passive_order.remaining_quantity)
                
                # Execute trade
                trade = {
                    'timestamp': datetime.now(),
                    'price': best_price,
                    'quantity': match_qty,
                    'aggressor_side': order.side.value,
                    'aggressor_id': order.order_id,
                    'passive_id': passive_order.order_id
                }
                self.trades.append(trade)
                
                # Update quantities
                order.filled_quantity += match_qty
                passive_order.filled_quantity += match_qty
                remaining -= match_qty
                
                # Remove filled orders
                if passive_order.is_filled:
                    level.orders.pop(0)
                    if passive_order.order_id in self.order_map:
                        del self.order_map[passive_order.order_id]
            
            # Remove empty price levels
            if not level.orders:
                del opposite_book[best_price]
        
        return remaining
    
    def cancel_order(self, order_id: str) -> bool:
        """Cancel an order by ID."""
        if order_id not in self.order_map:
            return False
        
        order = self.order_map[order_id]
        book = self.bids if order.side == Side.BUY else self.asks
        
        if order.price in book:
            book[order.price].remove_order(order_id)
            if not book[order.price].orders:
                del book[order.price]
        
        del self.order_map[order_id]
        return True
    
    def get_depth(self, levels: int = 5) -> Dict:
        """Get order book depth for top N levels."""
        bid_prices = sorted(self.bids.keys(), reverse=True)[:levels]
        ask_prices = sorted(self.asks.keys())[:levels]
        
        return {
            'bids': [(p, self.bids[p].total_quantity, self.bids[p].num_orders) 
                     for p in bid_prices],
            'asks': [(p, self.asks[p].total_quantity, self.asks[p].num_orders) 
                     for p in ask_prices]
        }
    
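    def __repr__(self):
        # Either side of the book may be empty, so format None safely
        fmt = lambda x: f"{x:.2f}" if x is not None else "n/a"
        return (f"LOB({self.ticker}) | "
                f"Bid: {fmt(self.best_bid)} | "
                f"Ask: {fmt(self.best_ask)} | "
                f"Spread: {fmt(self.spread)}")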

print("LimitOrderBook class defined!")

In [None]:
# Create and populate a sample LOB
lob = LimitOrderBook("AAPL")

# Add buy orders (bids)
lob.add_limit_order(Side.BUY, 100.00, 500)
lob.add_limit_order(Side.BUY, 100.00, 300)  # Same price, different order
lob.add_limit_order(Side.BUY, 99.50, 400)
lob.add_limit_order(Side.BUY, 99.00, 600)
lob.add_limit_order(Side.BUY, 98.50, 800)
lob.add_limit_order(Side.BUY, 98.00, 1000)

# Add sell orders (asks)
lob.add_limit_order(Side.SELL, 100.50, 400)
lob.add_limit_order(Side.SELL, 100.50, 200)
lob.add_limit_order(Side.SELL, 101.00, 500)
lob.add_limit_order(Side.SELL, 101.50, 700)
lob.add_limit_order(Side.SELL, 102.00, 900)
lob.add_limit_order(Side.SELL, 102.50, 1100)

print(lob)
print(f"\nMid Price: ${lob.mid_price:.2f}")
print(f"Spread: ${lob.spread:.2f} ({lob.spread_bps:.1f} bps)")

---

## 4. Bid-Ask Spread Analysis

### What is the Bid-Ask Spread?

The **bid-ask spread** is the difference between:
- **Best Ask (Offer)**: Lowest price sellers are willing to accept
- **Best Bid**: Highest price buyers are willing to pay

### Spread Measures

| Measure | Formula | Use Case |
|---------|---------|----------|
| Absolute Spread | $Ask - Bid$ | Direct cost measure |
| Relative Spread | $\frac{Ask - Bid}{Mid}$ | Cross-asset comparison |
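| Effective Spread | $2 \times \lvert Price - Mid \rvert$ | Actual execution cost |
| Realized Spread | $2 \times D \times (Price - Mid_{t+\Delta})$, with $D = +1$ for buys and $-1$ for sells | Liquidity provider's revenue net of post-trade price impact |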
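
The realized spread compares the trade price with the mid-price some time after the trade, signed by trade direction. The sketch below assumes the common convention $D = +1$ for buyer-initiated and $D = -1$ for seller-initiated trades; the function name and the example prices are illustrative.

```python
def realized_spread(trade_price: float, future_mid: float, direction: int) -> float:
    """Realized spread = 2 * D * (trade price - mid-price after the trade).

    direction: +1 for a buyer-initiated trade, -1 for a seller-initiated trade.
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    return 2 * direction * (trade_price - future_mid)

# A buy executes at 100.25; some time later the mid is 100.10
rs_buy = realized_spread(100.25, 100.10, +1)
# A sell executes at 99.75; afterwards the mid has fallen to 99.70
rs_sell = realized_spread(99.75, 99.70, -1)

print(f"Realized spread (buy):  {rs_buy:.4f}")
print(f"Realized spread (sell): {rs_sell:.4f}")
```

The first value is positive: the liquidity provider sold above the later mid. The second is negative: the provider bought at 99.75 and the mid then dropped below that price, which is the signature of adverse selection.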

In [None]:
def calculate_spread_metrics(lob: LimitOrderBook) -> Dict:
    """Calculate various spread metrics."""
    best_bid = lob.best_bid
    best_ask = lob.best_ask
    mid = lob.mid_price
    
    if any(v is None for v in (best_bid, best_ask, mid)):
        return {}
    
    return {
        'best_bid': best_bid,
        'best_ask': best_ask,
        'mid_price': mid,
        'absolute_spread': best_ask - best_bid,
        'relative_spread_pct': (best_ask - best_bid) / mid * 100,
        'spread_bps': (best_ask - best_bid) / mid * 10000,
        'half_spread': (best_ask - best_bid) / 2,
        'half_spread_bps': (best_ask - best_bid) / mid * 5000
    }

metrics = calculate_spread_metrics(lob)
print("=" * 50)
print("        SPREAD METRICS")
print("=" * 50)
for key, value in metrics.items():
    # Check unit suffixes first so '*_bps' and '*_pct' keys are not printed as dollar amounts
    if key.endswith('bps'):
        print(f"{key:25s}: {value:.2f} bps")
    elif key.endswith('pct'):
        print(f"{key:25s}: {value:.4f}%")
    else:
        print(f"{key:25s}: ${value:.4f}")

### Factors Affecting the Spread

1. **Volatility**: Higher volatility → Wider spreads (inventory risk)
2. **Volume/Liquidity**: Higher volume → Tighter spreads
3. **Information Asymmetry**: More adverse selection → Wider spreads
4. **Competition**: More market makers → Tighter spreads
5. **Tick Size**: Minimum price increment affects spread

In [None]:
# Simulate spread under different conditions
np.random.seed(42)

def simulate_spread_dynamics(n_samples: int = 1000,
                              base_spread: float = 0.05,
                              volatility_factor: float = 0.1,
                              volume_factor: float = -0.02) -> pd.DataFrame:
    """Simulate spread dynamics based on market conditions."""
    
    # Generate random market conditions
    volatility = np.random.exponential(0.02, n_samples)  # Realized vol
    volume = np.random.exponential(100000, n_samples)    # Trading volume
    time_of_day = np.linspace(0, 1, n_samples)           # 0=open, 1=close
    
    # U-shape pattern for intraday spread
    intraday_effect = 0.02 * (4 * (time_of_day - 0.5)**2)
    
    # Calculate spread (volume_factor is negative, so higher volume tightens the spread)
    spread = (base_spread + 
              volatility_factor * volatility + 
              volume_factor * np.log(volume/100000) +
              intraday_effect +
              np.random.normal(0, 0.005, n_samples))
    
    spread = np.maximum(spread, 0.01)  # Minimum 1 cent spread
    
    return pd.DataFrame({
        'time': time_of_day,
        'volatility': volatility,
        'volume': volume,
        'spread': spread
    })

spread_df = simulate_spread_dynamics()

# Visualize
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# 1. Spread vs Time of Day (U-shape)
ax1 = axes[0, 0]
spread_by_time = spread_df.groupby(pd.cut(spread_df['time'], 20))['spread'].mean()
ax1.plot(range(20), spread_by_time.values, 'b-o', linewidth=2, markersize=6)
ax1.set_xlabel('Time of Day (Open → Close)')
ax1.set_ylabel('Average Spread ($)')
ax1.set_title('Intraday Spread Pattern (U-Shape)', fontsize=12, fontweight='bold')
ax1.axhline(spread_df['spread'].mean(), color='r', linestyle='--', label='Daily Avg')
ax1.legend()

# 2. Spread vs Volatility
ax2 = axes[0, 1]
ax2.scatter(spread_df['volatility']*100, spread_df['spread'], alpha=0.3, s=10)
z = np.polyfit(spread_df['volatility']*100, spread_df['spread'], 1)
p = np.poly1d(z)
x_line = np.linspace(0, spread_df['volatility'].max()*100, 100)
ax2.plot(x_line, p(x_line), 'r-', linewidth=2, label=f'Trend (slope={z[0]:.3f})')
ax2.set_xlabel('Volatility (%)')
ax2.set_ylabel('Spread ($)')
ax2.set_title('Spread vs Volatility (+correlation)', fontsize=12, fontweight='bold')
ax2.legend()

# 3. Spread vs Volume
ax3 = axes[1, 0]
ax3.scatter(spread_df['volume']/1000, spread_df['spread'], alpha=0.3, s=10)
ax3.set_xlabel('Volume (thousands)')
ax3.set_ylabel('Spread ($)')
ax3.set_title('Spread vs Volume (-correlation)', fontsize=12, fontweight='bold')

# 4. Spread Distribution
ax4 = axes[1, 1]
ax4.hist(spread_df['spread'], bins=50, edgecolor='black', alpha=0.7)
ax4.axvline(spread_df['spread'].mean(), color='r', linestyle='--', 
            label=f'Mean: ${spread_df["spread"].mean():.4f}')
ax4.axvline(spread_df['spread'].median(), color='g', linestyle='--',
            label=f'Median: ${spread_df["spread"].median():.4f}')
ax4.set_xlabel('Spread ($)')
ax4.set_ylabel('Frequency')
ax4.set_title('Spread Distribution', fontsize=12, fontweight='bold')
ax4.legend()

plt.tight_layout()
plt.show()

---

## 5. Market Depth Analysis

### What is Market Depth?

**Market depth** measures the market's ability to absorb large orders without significant price impact.

### Key Depth Metrics

| Metric | Description | Formula |
|--------|-------------|----------|
| **Depth at Best** | Volume at BBO | $Q_{bid}^{best} + Q_{ask}^{best}$ |
| **Cumulative Depth** | Total volume within N levels | $\sum_{i=1}^{N} Q_i$ |
| **Depth Imbalance** | Asymmetry between sides | $\frac{Q_{bid} - Q_{ask}}{Q_{bid} + Q_{ask}}$ |
| **VWAP Distance** | Volume-weighted distance from mid (price impact estimate) | $\frac{\sum_i \lvert P_i - Mid \rvert \, Q_i}{\sum_i Q_i}$ |

In [None]:
def analyze_depth(lob: LimitOrderBook, levels: int = 10) -> Dict:
    """Comprehensive depth analysis."""
    depth = lob.get_depth(levels)
    mid = lob.mid_price
    
    # Calculate metrics
    bid_depth = sum(qty for _, qty, _ in depth['bids'])
    ask_depth = sum(qty for _, qty, _ in depth['asks'])
    total_depth = bid_depth + ask_depth
    
    # Depth at best
    bid_at_best = depth['bids'][0][1] if depth['bids'] else 0
    ask_at_best = depth['asks'][0][1] if depth['asks'] else 0
    
    # Order imbalance
    imbalance = (bid_depth - ask_depth) / total_depth if total_depth > 0 else 0
    imbalance_at_best = ((bid_at_best - ask_at_best) / (bid_at_best + ask_at_best) 
                         if (bid_at_best + ask_at_best) > 0 else 0)
    
    # Weighted average distance from mid
    bid_weighted_dist = sum((mid - p) * qty for p, qty, _ in depth['bids']) / bid_depth if bid_depth > 0 else 0
    ask_weighted_dist = sum((p - mid) * qty for p, qty, _ in depth['asks']) / ask_depth if ask_depth > 0 else 0
    
    return {
        'bid_depth': bid_depth,
        'ask_depth': ask_depth,
        'total_depth': total_depth,
        'bid_at_best': bid_at_best,
        'ask_at_best': ask_at_best,
        'depth_imbalance': imbalance,
        'imbalance_at_best': imbalance_at_best,
        'bid_weighted_distance': bid_weighted_dist,
        'ask_weighted_distance': ask_weighted_dist,
        'depth_data': depth
    }

depth_analysis = analyze_depth(lob)

print("=" * 50)
print("        DEPTH ANALYSIS")
print("=" * 50)
print(f"Bid Depth (total):        {depth_analysis['bid_depth']:,} shares")
print(f"Ask Depth (total):        {depth_analysis['ask_depth']:,} shares")
print(f"Total Depth:              {depth_analysis['total_depth']:,} shares")
print(f"\nBid at Best:              {depth_analysis['bid_at_best']:,} shares")
print(f"Ask at Best:              {depth_analysis['ask_at_best']:,} shares")
print(f"\nDepth Imbalance:          {depth_analysis['depth_imbalance']:.2%}")
print(f"Imbalance at Best:        {depth_analysis['imbalance_at_best']:.2%}")
print(f"\nNote: Positive imbalance = more buy pressure")

In [None]:
def visualize_order_book(lob: LimitOrderBook, levels: int = 10):
    """Create a visual representation of the order book."""
    depth = lob.get_depth(levels)
    
    fig, axes = plt.subplots(1, 2, figsize=(16, 7))
    
    # === Left Plot: Horizontal Bar Chart ===
    ax1 = axes[0]
    
    # Bids (green, left side)
    bid_prices = [p for p, _, _ in depth['bids']]
    bid_qtys = [q for _, q, _ in depth['bids']]
    
    # Asks (red, right side)
    ask_prices = [p for p, _, _ in depth['asks']]
    ask_qtys = [q for _, q, _ in depth['asks']]
    
    # Combined for plotting
    all_prices = bid_prices[::-1] + ask_prices
    all_qtys = [-q for q in bid_qtys[::-1]] + ask_qtys  # Negative for bids
    colors = ['green'] * len(bid_prices) + ['red'] * len(ask_prices)
    
    y_pos = range(len(all_prices))
    bars = ax1.barh(y_pos, all_qtys, color=colors, alpha=0.7, edgecolor='black')
    ax1.set_yticks(y_pos)
    ax1.set_yticklabels([f'${p:.2f}' for p in all_prices])
    ax1.set_xlabel('Quantity (Negative=Bids, Positive=Asks)')
    ax1.set_ylabel('Price')
    ax1.set_title('Order Book Depth', fontsize=14, fontweight='bold')
    ax1.axvline(0, color='black', linewidth=2)
    
    # Add quantity labels
    for bar, qty in zip(bars, all_qtys):
        width = bar.get_width()
        label_x = width + 20 if width > 0 else width - 20
        ax1.annotate(f'{abs(qty)}', xy=(label_x, bar.get_y() + bar.get_height()/2),
                     ha='left' if width > 0 else 'right', va='center', fontsize=9)
    
    # Mark the spread at the boundary between the best bid and best ask bars
    ax1.axhline(len(bid_prices) - 0.5, color='goldenrod', linestyle='--',
                linewidth=2, label=f'Spread: ${lob.spread:.2f}')
    ax1.legend(loc='upper right')
    
    # === Right Plot: Cumulative Depth ===
    ax2 = axes[1]
    
    # Calculate cumulative depth
    bid_cum = np.cumsum(bid_qtys)
    ask_cum = np.cumsum(ask_qtys)
    
    # Plot step functions
    ax2.step(bid_prices, bid_cum, 'g-', linewidth=2, where='post', label='Cumulative Bids')
    ax2.fill_between(bid_prices, bid_cum, step='post', alpha=0.3, color='green')
    
    ax2.step(ask_prices, ask_cum, 'r-', linewidth=2, where='post', label='Cumulative Asks')
    ax2.fill_between(ask_prices, ask_cum, step='post', alpha=0.3, color='red')
    
    ax2.axvline(lob.mid_price, color='blue', linestyle='--', linewidth=2,
                label=f'Mid: ${lob.mid_price:.2f}')
    ax2.axvline(lob.best_bid, color='green', linestyle=':', linewidth=1.5)
    ax2.axvline(lob.best_ask, color='red', linestyle=':', linewidth=1.5)
    
    ax2.set_xlabel('Price ($)')
    ax2.set_ylabel('Cumulative Quantity')
    ax2.set_title('Cumulative Depth Chart', fontsize=14, fontweight='bold')
    ax2.legend()
    ax2.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()

visualize_order_book(lob)

---

## 6. Order Book Imbalance and Price Prediction

**Order book imbalance** is a key signal used by HFT and algorithmic traders.

### Imbalance Formula

$$\text{Imbalance} = \frac{Q_{bid} - Q_{ask}}{Q_{bid} + Q_{ask}}$$

- **Imbalance > 0**: More buy pressure → Price likely to increase
- **Imbalance < 0**: More sell pressure → Price likely to decrease
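
As a quick numerical check of the formula, the sketch below evaluates it for a few quantity pairs. The helper name `book_imbalance` is an assumption for illustration; the first pair uses the top-of-book quantities of the sample book built earlier (800 bid, 600 ask).

```python
def book_imbalance(bid_qty: float, ask_qty: float) -> float:
    """(Q_bid - Q_ask) / (Q_bid + Q_ask); returns 0.0 for an empty book."""
    total = bid_qty + ask_qty
    return (bid_qty - ask_qty) / total if total > 0 else 0.0

print(f"{book_imbalance(800, 600):+.4f}")  # +0.1429
print(f"{book_imbalance(200, 600):+.4f}")  # -0.5000
print(f"{book_imbalance(0, 0):+.4f}")      # +0.0000
```

The measure is bounded in $[-1, 1]$: it reaches $+1$ when only bids are present and $-1$ when only asks are present.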

In [None]:
def calculate_weighted_imbalance(lob: LimitOrderBook, levels: int = 5, 
                                  decay: float = 0.5) -> float:
    """Calculate exponentially-weighted order book imbalance.
    
    Levels closer to the best bid/ask receive higher weight.
    
    Args:
        lob: Limit order book
        levels: Number of price levels to consider
        decay: Exponential decay factor (0-1)
    
    Returns:
        Weighted imbalance in range [-1, 1]
    """
    depth = lob.get_depth(levels)
    
    # Calculate weights
    weights = np.array([decay ** i for i in range(levels)])
    weights = weights / weights.sum()  # Normalize
    
    # Get quantities
    bid_qtys = np.array([q for _, q, _ in depth['bids']] + [0] * (levels - len(depth['bids'])))
    ask_qtys = np.array([q for _, q, _ in depth['asks']] + [0] * (levels - len(depth['asks'])))
    
    # Weighted sums
    weighted_bid = np.sum(bid_qtys[:levels] * weights[:len(bid_qtys)])
    weighted_ask = np.sum(ask_qtys[:levels] * weights[:len(ask_qtys)])
    
    total = weighted_bid + weighted_ask
    return (weighted_bid - weighted_ask) / total if total > 0 else 0

# Test with different decay factors
print("Order Book Imbalance Analysis")
print("=" * 40)

for decay in [0.3, 0.5, 0.7, 0.9]:
    imb = calculate_weighted_imbalance(lob, levels=5, decay=decay)
    direction = "BUY pressure" if imb > 0 else "SELL pressure"
    print(f"Decay={decay:.1f}: Imbalance={imb:+.4f} ({direction})")

In [None]:
# Simulate imbalance and price changes
np.random.seed(123)

def simulate_imbalance_price_relationship(n_obs: int = 500) -> pd.DataFrame:
    """Simulate order book imbalance and future price changes."""
    
    imbalances = np.random.uniform(-1, 1, n_obs)
    
    # Price change = f(imbalance) + noise
    # In reality, high imbalance predicts short-term price movement
    alpha = 0.05  # Sensitivity to imbalance
    noise = np.random.normal(0, 0.02, n_obs)
    
    price_changes = alpha * imbalances + noise
    
    return pd.DataFrame({
        'imbalance': imbalances,
        'price_change': price_changes,
        'price_change_pct': price_changes * 100
    })

sim_df = simulate_imbalance_price_relationship()

# Visualize
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Scatter plot
ax1 = axes[0]
ax1.scatter(sim_df['imbalance'], sim_df['price_change_pct'], alpha=0.5, s=20)

# Regression line
z = np.polyfit(sim_df['imbalance'], sim_df['price_change_pct'], 1)
p = np.poly1d(z)
x_line = np.linspace(-1, 1, 100)
ax1.plot(x_line, p(x_line), 'r-', linewidth=2, 
         label=f'y = {z[0]:.3f}x + {z[1]:.3f}')

ax1.set_xlabel('Order Book Imbalance')
ax1.set_ylabel('Future Price Change (%)')
ax1.set_title('Imbalance vs Future Price Change', fontsize=12, fontweight='bold')
ax1.axhline(0, color='black', linestyle='-', linewidth=0.5)
ax1.axvline(0, color='black', linestyle='-', linewidth=0.5)
ax1.legend()
ax1.grid(True, alpha=0.3)

# Binned average
ax2 = axes[1]
bins = pd.cut(sim_df['imbalance'], bins=10)
binned = sim_df.groupby(bins)['price_change_pct'].agg(['mean', 'std', 'count'])

x_centers = [interval.mid for interval in binned.index]
ax2.bar(range(len(x_centers)), binned['mean'], yerr=binned['std']/np.sqrt(binned['count']),
        capsize=3, alpha=0.7, color='steelblue', edgecolor='black')
ax2.set_xticks(range(len(x_centers)))
ax2.set_xticklabels([f'{x:.1f}' for x in x_centers], rotation=45)
ax2.set_xlabel('Imbalance Bin')
ax2.set_ylabel('Average Price Change (%)')
ax2.set_title('Binned Imbalance → Price Change', fontsize=12, fontweight='bold')
ax2.axhline(0, color='red', linestyle='--', linewidth=1)

plt.tight_layout()
plt.show()

# Correlation
corr = sim_df['imbalance'].corr(sim_df['price_change'])
print(f"\nCorrelation between imbalance and price change: {corr:.4f}")

---

## 7. Price Impact Estimation

**Price impact** is the adverse price movement caused by executing a large order.

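### Square Root Impact Model

Kyle (1985) derives a *linear* relation between order flow and price change. The square-root form below is an empirical regularity reported in later studies of institutional order execution.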

$$\text{Impact} = \lambda \cdot \sigma \cdot \sqrt{\frac{Q}{V}}$$

Where:
- $\lambda$ = Impact coefficient
- $\sigma$ = Volatility
- $Q$ = Order size
- $V$ = Average daily volume
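
To get a feel for the formula, the sketch below evaluates it for a few order sizes. The function name `sqrt_law_impact`, the default $\lambda = 1$, and the input values (2% daily volatility, 1,000,000 shares average daily volume) are illustrative assumptions, not calibrated estimates.

```python
import math

def sqrt_law_impact(order_size: float, adv: float, sigma: float, lam: float = 1.0) -> float:
    """Square-root impact: lam * sigma * sqrt(Q / V), in the same units as sigma."""
    if order_size < 0 or adv <= 0:
        raise ValueError("order_size must be >= 0 and adv must be > 0")
    return lam * sigma * math.sqrt(order_size / adv)

# Assumed inputs: 2% daily volatility, 1,000,000 shares average daily volume
example_sigma, example_adv = 0.02, 1_000_000
for q in (10_000, 40_000, 160_000):
    impact = sqrt_law_impact(q, example_adv, example_sigma)
    print(f"Q={q:>7,}: impact = {impact * 1e4:.1f} bps")
```

Each step quadruples the order size but only doubles the estimated impact, which is the concavity the square-root law is meant to capture.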

In [None]:
def estimate_price_impact(lob: LimitOrderBook, 
                          order_size: int, 
                          side: Side) -> Dict:
    """Estimate price impact of executing a large order.
    
    Returns actual impact from walking the book and theoretical estimates.
    """
    depth = lob.get_depth(levels=20)
    mid_price = lob.mid_price
    
    if side == Side.BUY:
        levels = depth['asks']
    else:
        levels = depth['bids']
    
    # Walk the book to calculate actual execution
    remaining = order_size
    total_cost = 0
    levels_consumed = 0
    
    for price, qty, _ in levels:
        if remaining <= 0:
            break
        
        fill_qty = min(remaining, qty)
        total_cost += fill_qty * price
        remaining -= fill_qty
        levels_consumed += 1
    
    filled_qty = order_size - remaining
    
    if filled_qty > 0:
        avg_price = total_cost / filled_qty
        
        # Calculate various impact measures
        if side == Side.BUY:
            absolute_impact = avg_price - mid_price
        else:
            absolute_impact = mid_price - avg_price
        
        relative_impact_bps = (absolute_impact / mid_price) * 10000
        
        return {
            'order_size': order_size,
            'filled_quantity': filled_qty,
            'unfilled_quantity': remaining,
            'average_price': avg_price,
            'mid_price': mid_price,
            'absolute_impact': absolute_impact,
            'relative_impact_bps': relative_impact_bps,
            'total_cost': total_cost,
            'levels_consumed': levels_consumed,
            'fill_rate': filled_qty / order_size
        }
    else:
        return {'error': 'No liquidity available'}

# Test with different order sizes
print("Price Impact Analysis for BUY Orders")
print("=" * 60)

order_sizes = [100, 500, 1000, 2000, 3000]
impacts = []

for size in order_sizes:
    result = estimate_price_impact(lob, size, Side.BUY)
    impacts.append(result)
    print(f"\nOrder Size: {size:,} shares")
    print(f"  Filled: {result['filled_quantity']:,} ({result['fill_rate']:.1%})")
    print(f"  Avg Price: ${result['average_price']:.4f}")
    print(f"  Impact: ${result['absolute_impact']:.4f} ({result['relative_impact_bps']:.2f} bps)")
    print(f"  Levels Consumed: {result['levels_consumed']}")

In [None]:
# Visualize price impact curve
order_sizes_extended = list(range(50, 5001, 50))
impacts_extended = []

for size in order_sizes_extended:
    result = estimate_price_impact(lob, size, Side.BUY)
    if 'error' not in result:
        impacts_extended.append({
            'size': size,
            'impact_bps': result['relative_impact_bps'],
            'fill_rate': result['fill_rate']
        })

impact_df = pd.DataFrame(impacts_extended)

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Impact curve
ax1 = axes[0]
ax1.plot(impact_df['size'], impact_df['impact_bps'], 'b-', linewidth=2)
ax1.fill_between(impact_df['size'], 0, impact_df['impact_bps'], alpha=0.3)
ax1.set_xlabel('Order Size (shares)')
ax1.set_ylabel('Price Impact (bps)')
ax1.set_title('Price Impact Curve', fontsize=12, fontweight='bold')
ax1.grid(True, alpha=0.3)

# Add sqrt fit
# Impact ∝ sqrt(size) is a common empirical approximation; here it is fitted to the cost of walking this static book
from scipy.optimize import curve_fit

def sqrt_impact(x, a):
    return a * np.sqrt(x)

valid_data = impact_df[impact_df['fill_rate'] == 1.0]
if len(valid_data) > 5:
    popt, _ = curve_fit(sqrt_impact, valid_data['size'], valid_data['impact_bps'])
    x_fit = np.linspace(valid_data['size'].min(), valid_data['size'].max(), 100)
    ax1.plot(x_fit, sqrt_impact(x_fit, *popt), 'r--', linewidth=2, 
             label=f'√x fit: {popt[0]:.3f}√x')
    ax1.legend()

# Fill rate
ax2 = axes[1]
ax2.plot(impact_df['size'], impact_df['fill_rate'] * 100, 'g-', linewidth=2)
ax2.axhline(100, color='red', linestyle='--', alpha=0.7)
ax2.set_xlabel('Order Size (shares)')
ax2.set_ylabel('Fill Rate (%)')
ax2.set_title('Fill Rate vs Order Size', fontsize=12, fontweight='bold')
ax2.set_ylim([0, 105])
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

---

## 8. Exercises

### Exercise 1: Implement LOB Snapshot
Create a function that returns a formatted string showing the top 5 levels of the order book.

In [None]:
def format_lob_snapshot(lob: LimitOrderBook, levels: int = 5) -> str:
    """Format order book as a readable table.
    
    TODO: Implement this function
    
    Expected output format:
    =========== ORDER BOOK ===========
           BIDS    |    ASKS
    --------------------------------
     800 @ 100.00  | 100.50 @ 600
     400 @  99.50  | 101.00 @ 500
     ...
    """
    # Your code here
    pass

# Test your implementation
# print(format_lob_snapshot(lob))

### Exercise 2: Calculate Effective Spread
Implement a function to calculate the effective spread for executed trades.

In [None]:
def calculate_effective_spread(trade_price: float, 
                                mid_price: float,
                                side: Side) -> float:
    """Calculate effective spread for a trade.
    
    Effective Spread = 2 * |Trade Price - Mid Price|
    
    For buys: trade_price > mid_price (positive cost)
    For sells: trade_price < mid_price (positive cost)
    
    TODO: Implement this function
    """
    # Your code here
    pass

# Test
# eff_spread = calculate_effective_spread(100.25, 100.00, Side.BUY)
# print(f"Effective Spread: ${eff_spread:.4f}")

### Exercise 3: Build an Order Flow Signal
Create a function that calculates rolling order book imbalance over multiple snapshots.

In [None]:
def calculate_rolling_imbalance(imbalances: List[float], 
                                 window: int = 10) -> List[float]:
    """Calculate exponentially weighted rolling imbalance.
    
    TODO: Implement using exponential moving average
    
    Args:
        imbalances: List of imbalance observations
        window: EMA window size
    
    Returns:
        List of smoothed imbalance values
    """
    # Your code here
    pass

# Test with simulated data
# np.random.seed(42)
# imbalances = list(np.random.uniform(-1, 1, 100))
# smoothed = calculate_rolling_imbalance(imbalances, window=10)
# plt.plot(imbalances, alpha=0.5, label='Raw')
# plt.plot(smoothed, label='Smoothed')
# plt.legend()
# plt.show()

---

## 9. Summary

### Key Takeaways

1. **Limit Order Book Structure**
   - Bids (buy orders) organized by decreasing price
   - Asks (sell orders) organized by increasing price
   - Price-Time priority for order matching

2. **Bid-Ask Spread**
   - Fundamental cost of immediacy
   - Affected by volatility, volume, information asymmetry
   - U-shaped intraday pattern

3. **Market Depth**
   - Measures liquidity available at various price levels
   - Depth imbalance predicts short-term price movements
   - Key input for execution algorithms

4. **Price Impact**
   - Increases non-linearly with order size, often approximated by a square-root (√x) relationship
   - Critical for optimal execution and cost estimation

### Next Steps
- **Day 02**: Market Making Strategies
- **Day 03**: Order Flow Toxicity (VPIN)
- **Day 04**: Optimal Execution (Almgren-Chriss)

---

## 10. References

1. **Harris, L.** (2003). *Trading and Exchanges: Market Microstructure for Practitioners*
2. **Kyle, A.S.** (1985). "Continuous Auctions and Insider Trading." *Econometrica*
3. **Guéant, O.** (2016). *The Financial Mathematics of Market Liquidity*
4. **Cartea, A., Jaimungal, S., & Penalva, J.** (2015). *Algorithmic and High-Frequency Trading*