# Market Microstructure Theory
## Week 21 - Comprehensive Overview

**Market microstructure** studies how markets operate at the microscopic level - the mechanisms through which latent demands are translated into transactions.

### Why It Matters for Quant Trading:
- **Execution optimization**: Minimize transaction costs and market impact
- **Alpha generation**: Order flow contains predictive information
- **Risk management**: Understand liquidity risk in stressed markets
- **Market making**: Profitably provide liquidity

### Key Topics Covered:
1. Market Structure & Order Types
2. Bid-Ask Spread Analysis
3. Order Book Modeling
4. Price Impact Models
5. Market Maker Inventory Models
6. Information Asymmetry (Adverse Selection)
7. Trade Classification Algorithms
8. Interview Questions & Key Formulas

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple
from collections import defaultdict
import warnings
warnings.filterwarnings('ignore')

np.random.seed(42)
print("Market Microstructure Theory - Libraries Loaded")

Market Microstructure Theory - Libraries Loaded


---
## 1. Market Structure Fundamentals

### Order Types

| Order Type | Description | Execution | Priority |
|------------|-------------|-----------|----------|
| **Market Order** | Execute immediately at best available price | Immediate | None |
| **Limit Order** | Execute at specified price or better | Conditional | Price-Time |
| **Stop Order** | Becomes market order when trigger price hit | Triggered | N/A |
| **Iceberg** | Large order shown in small portions | Hidden | Partial |
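
The price-time priority rule in the table can be illustrated with a small sort. This is a minimal sketch, assuming a hypothetical `RestingOrder` tuple and `priority_queue` helper introduced only for illustration; real matching engines typically keep per-price-level queues rather than sorting on demand.

```python
from typing import List, NamedTuple

class RestingOrder(NamedTuple):  # hypothetical type, for illustration only
    order_id: int
    side: str        # 'buy' or 'sell'
    price: float
    timestamp: float

def priority_queue(orders: List[RestingOrder], side: str) -> List[RestingOrder]:
    """Return resting orders on one side in price-time priority order."""
    same_side = [o for o in orders if o.side == side]
    # Bids: highest price first; asks: lowest price first.
    # Ties at the same price go to the earlier timestamp.
    if side == 'buy':
        return sorted(same_side, key=lambda o: (-o.price, o.timestamp))
    return sorted(same_side, key=lambda o: (o.price, o.timestamp))

resting = [
    RestingOrder(1, 'buy', 99.50, 0.002),
    RestingOrder(2, 'buy', 99.55, 0.003),
    RestingOrder(3, 'buy', 99.50, 0.001),
    RestingOrder(4, 'sell', 100.50, 0.004),
]
bid_queue_ids = [o.order_id for o in priority_queue(resting, 'buy')]
print(bid_queue_ids)  # [2, 3, 1]
```

Order 2 jumps the queue on price, while orders 3 and 1 share a price and are ranked by arrival time.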

### Market Participants

| Participant | Role | Information | Time Horizon |
|-------------|------|-------------|--------------|
| **Market Makers** | Provide liquidity, quote 2-sided | Low | Very short |
| **Informed Traders** | Trade on private information | High | Variable |
| **Noise Traders** | Trade for non-informational reasons | Low | Variable |
| **Arbitrageurs** | Exploit price discrepancies | Public | Short |
| **Institutional** | Large orders, index tracking | Mixed | Long |

### Market Quality Dimensions
- **Liquidity**: Ability to trade meaningful size quickly with little price impact
- **Efficiency**: Speed of price discovery
- **Transparency**: Visibility of orders/trades
- **Resilience**: Recovery speed after large trades

In [2]:
# Basic Order and Order Book Classes

class Side(Enum):
    BUY = 1
    SELL = -1

@dataclass
class Order:
    """Represents a limit order"""
    order_id: int
    side: Side
    price: float
    quantity: int
    timestamp: float
    
    def __repr__(self):
        return f"Order({self.side.name}, P={self.price:.2f}, Q={self.quantity})"

# Example orders
orders = [
    Order(1, Side.BUY, 99.50, 100, 0.001),
    Order(2, Side.BUY, 99.45, 200, 0.002),
    Order(3, Side.SELL, 100.50, 150, 0.003),
    Order(4, Side.SELL, 100.55, 100, 0.004),
]

print("Sample Limit Orders:")
for o in orders:
    print(f"  {o}")

Sample Limit Orders:
  Order(BUY, P=99.50, Q=100)
  Order(BUY, P=99.45, Q=200)
  Order(SELL, P=100.50, Q=150)
  Order(SELL, P=100.55, Q=100)


---
## 2. Bid-Ask Spread Analysis

### Spread Components

The bid-ask spread compensates market makers for:

1. **Adverse Selection Cost**: Risk of trading with informed traders
2. **Inventory Holding Cost**: Risk from holding unbalanced inventory  
3. **Order Processing Cost**: Fixed costs of market making

$$\text{Spread} = \text{Adverse Selection} + \text{Inventory} + \text{Order Processing}$$

### Spread Measures

| Measure | Formula | Interpretation |
|---------|---------|----------------|
| **Quoted Spread** | $S_q = P_{ask} - P_{bid}$ | Visible cost |
| **Relative Spread** | $S_{rel} = \frac{P_{ask} - P_{bid}}{M}$ | Percentage cost |
| **Effective Spread** | $S_e = 2 \cdot D \cdot (P_{trade} - M)$ | Actual cost paid |
| **Realized Spread** | $S_r = 2 \cdot D \cdot (P_{trade} - M_{t+\Delta})$ | Revenue to MM |

Where $M = \frac{P_{ask} + P_{bid}}{2}$ is the midprice and $D$ is trade direction (+1 buy, -1 sell)

In [3]:
def calculate_spreads(bid: float, ask: float, trade_price: float, 
                       direction: int, future_mid: Optional[float] = None) -> dict:
    """
    Calculate various spread measures
    
    Parameters:
    - bid, ask: Best bid and ask prices
    - trade_price: Execution price
    - direction: +1 for buy, -1 for sell
    - future_mid: Midpoint after Δt (for realized spread)
    """
    mid = (bid + ask) / 2
    
    quoted_spread = ask - bid
    relative_spread = quoted_spread / mid
    effective_spread = 2 * direction * (trade_price - mid)
    
    realized_spread = None
    if future_mid is not None:
        realized_spread = 2 * direction * (trade_price - future_mid)
    
    return {
        'quoted_spread': quoted_spread,
        'relative_spread_bps': relative_spread * 10000,
        'effective_spread': effective_spread,
        'realized_spread': realized_spread
    }

# Example
spreads = calculate_spreads(bid=99.95, ask=100.05, trade_price=100.03, 
                           direction=1, future_mid=100.02)
print("Spread Analysis:")
for k, v in spreads.items():
    if v is not None:
        print(f"  {k}: {v:.4f}")

Spread Analysis:
  quoted_spread: 0.1000
  relative_spread_bps: 10.0000
  effective_spread: 0.0600
  realized_spread: 0.0200
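
The gap between the effective and realized spread is often read as the price-impact (adverse selection) part of the trade. A minimal sketch using the same example numbers; the helper name `decompose_effective_spread` is an assumption made here for illustration.

```python
def decompose_effective_spread(mid: float, trade_price: float,
                               future_mid: float, direction: int):
    """Split the effective spread into realized spread and price impact."""
    effective = 2 * direction * (trade_price - mid)
    realized = 2 * direction * (trade_price - future_mid)
    # Whatever the market maker does not keep is the mid move after the trade
    price_impact = 2 * direction * (future_mid - mid)
    return effective, realized, price_impact

eff, real, impact = decompose_effective_spread(
    mid=100.00, trade_price=100.03, future_mid=100.02, direction=1)
print(f"effective={eff:.4f} realized={real:.4f} price_impact={impact:.4f}")
```

Here the buyer pays 0.06 relative to the mid, of which 0.02 stays with the liquidity provider and 0.04 is lost to the subsequent move in the midprice.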


### Roll Model (1984)

Estimates the effective spread from transaction prices alone, assuming:
- The efficient price follows a random walk
- Each trade is equally likely to be a buy (at the ask) or a sell (at the bid), independently of previous trades

**Key Result:**
$$\text{Cov}(\Delta P_t, \Delta P_{t-1}) = -c^2$$

Where $c$ is the effective half-spread. Therefore:
$$c = \sqrt{-\text{Cov}(\Delta P_t, \Delta P_{t-1})}$$
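
The covariance result can be derived in a few lines. This is a sketch of the standard argument, where $m_t$ (efficient price), $u_t$ (its innovation) and $d_t$ (trade sign) are notation introduced here:

$$P_t = m_t + c\,d_t, \qquad m_t = m_{t-1} + u_t, \qquad d_t \in \{+1, -1\}$$

$$\Delta P_t = u_t + c\,(d_t - d_{t-1})$$

$$\text{Cov}(\Delta P_t, \Delta P_{t-1}) = c^2\,\text{Cov}(d_t - d_{t-1},\; d_{t-1} - d_{t-2}) = -c^2\,\text{Var}(d_{t-1}) = -c^2$$

The last step relies on trade signs being independent and equally likely ($\text{Var}(d_t) = 1$) and on $u_t$ being uncorrelated with the trade signs.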

In [4]:
def roll_spread_estimate(prices: np.ndarray) -> float:
    """
    Estimate effective spread using Roll (1984) model
    """
    returns = np.diff(prices)
    cov = np.cov(returns[:-1], returns[1:])[0, 1]
    
    # Spread only defined if covariance is negative
    if cov < 0:
        half_spread = np.sqrt(-cov)
        return 2 * half_spread
    else:
        return np.nan  # Model assumption violated

# Simulate prices with bid-ask bounce
true_spread = 0.10
n_trades = 1000
efficient_prices = 100 + np.cumsum(np.random.randn(n_trades) * 0.05)
trade_directions = np.random.choice([-1, 1], n_trades)
observed_prices = efficient_prices + trade_directions * true_spread / 2

estimated_spread = roll_spread_estimate(observed_prices)
print(f"True spread: {true_spread:.4f}")
print(f"Roll estimated spread: {estimated_spread:.4f}")

True spread: 0.1000
Roll estimated spread: 0.0913


---
## 3. Order Book Modeling

### Limit Order Book (LOB) Structure

```
         ASK SIDE (Offers)
Level    Price    Quantity
  3     100.15      100
  2     100.10      150
  1     100.05      200  ← Best Ask
         --------- SPREAD ---------
  1      99.95      180  ← Best Bid
  2      99.90      250
  3      99.85      120
         BID SIDE (Bids)
```

### Order Book Metrics

**Book Imbalance:**
$$I = \frac{V_{bid} - V_{ask}}{V_{bid} + V_{ask}}$$

- $I > 0$: More buying pressure → price likely to rise
- $I < 0$: More selling pressure → price likely to fall

**Cumulative Depth through Level k:**
$$D_k = \sum_{i=1}^{k} Q_i^{bid} + \sum_{i=1}^{k} Q_i^{ask}$$

In [5]:
class LimitOrderBook:
    """Simple Limit Order Book implementation"""
    
    def __init__(self):
        self.bids = {}  # price -> list of (quantity, order_id)
        self.asks = {}
        self.order_id = 0
    
    def add_order(self, side: str, price: float, quantity: int) -> int:
        self.order_id += 1
        book = self.bids if side == 'buy' else self.asks
        if price not in book:
            book[price] = []
        book[price].append((quantity, self.order_id))
        return self.order_id
    
    def best_bid(self) -> Optional[float]:
        return max(self.bids.keys()) if self.bids else None
    
    def best_ask(self) -> Optional[float]:
        return min(self.asks.keys()) if self.asks else None
    
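    def midprice(self) -> Optional[float]:
        bb, ba = self.best_bid(), self.best_ask()
        # Compare against None explicitly so a price of 0.0 is not treated as missing
        return (bb + ba) / 2 if bb is not None and ba is not None else None
    
    def spread(self) -> Optional[float]:
        bb, ba = self.best_bid(), self.best_ask()
        return ba - bb if bb is not None and ba is not None else None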
    
    def book_imbalance(self, levels: int = 1) -> float:
        """Calculate order book imbalance"""
        bid_vol = sum(sum(q for q, _ in self.bids.get(p, [])) 
                     for p in sorted(self.bids.keys(), reverse=True)[:levels])
        ask_vol = sum(sum(q for q, _ in self.asks.get(p, [])) 
                     for p in sorted(self.asks.keys())[:levels])
        total = bid_vol + ask_vol
        return (bid_vol - ask_vol) / total if total > 0 else 0
    
    def display(self, levels: int = 3):
        print("\n--- ORDER BOOK ---")
        ask_prices = sorted(self.asks.keys())[:levels]
        for p in reversed(ask_prices):
            vol = sum(q for q, _ in self.asks[p])
            print(f"  ASK: {p:.2f} x {vol}")
        print(f"  --- SPREAD: {self.spread():.2f} ---")
        bid_prices = sorted(self.bids.keys(), reverse=True)[:levels]
        for p in bid_prices:
            vol = sum(q for q, _ in self.bids[p])
            print(f"  BID: {p:.2f} x {vol}")

# Build sample order book
lob = LimitOrderBook()
for price, qty in [(99.95, 180), (99.90, 250), (99.85, 120)]:
    lob.add_order('buy', price, qty)
for price, qty in [(100.05, 200), (100.10, 150), (100.15, 100)]:
    lob.add_order('sell', price, qty)

lob.display()
print(f"\nMidprice: {lob.midprice():.2f}")
print(f"Book Imbalance (1 level): {lob.book_imbalance(1):.3f}")
print(f"Book Imbalance (3 levels): {lob.book_imbalance(3):.3f}")


--- ORDER BOOK ---
  ASK: 100.15 x 100
  ASK: 100.10 x 150
  ASK: 100.05 x 200
  --- SPREAD: 0.10 ---
  BID: 99.95 x 180
  BID: 99.90 x 250
  BID: 99.85 x 120

Midprice: 100.00
Book Imbalance (1 level): -0.053
Book Imbalance (3 levels): 0.100
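
Depth summed over the top $k$ levels on both sides can be computed on the same sample book. A minimal sketch, assuming the book is held as plain `price -> quantity` dictionaries (a simplification of the class above) and using a hypothetical helper name `cumulative_depth`.

```python
from typing import Dict

def cumulative_depth(bids: Dict[float, int], asks: Dict[float, int], k: int) -> int:
    """Total quantity resting in the best k price levels on each side."""
    best_bids = sorted(bids, reverse=True)[:k]   # highest bids first
    best_asks = sorted(asks)[:k]                 # lowest asks first
    return sum(bids[p] for p in best_bids) + sum(asks[p] for p in best_asks)

sample_bids = {99.95: 180, 99.90: 250, 99.85: 120}
sample_asks = {100.05: 200, 100.10: 150, 100.15: 100}
depths = {k: cumulative_depth(sample_bids, sample_asks, k) for k in (1, 2, 3)}
print(depths)  # {1: 380, 2: 780, 3: 1000}
```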


---
## 4. Market Impact Models

### What is Market Impact?

**Market impact** is the change in asset price caused by a trade. Large orders move prices against the trader.
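
The most direct source of impact is mechanical: a market order larger than the best level has to "walk the book". A minimal sketch, assuming a static list of ask levels and a hypothetical helper `walk_the_book`; it ignores hidden liquidity and any replenishment during the fill.

```python
from typing import List, Tuple

def walk_the_book(ask_levels: List[Tuple[float, int]],
                  quantity: int) -> Tuple[float, float, int]:
    """Fill a market buy against ask levels sorted from the best (lowest) price up.

    Returns (average fill price, last fill price, filled quantity).
    """
    remaining = quantity
    notional = 0.0
    last_price = float('nan')
    for price, available in ask_levels:
        if remaining <= 0:
            break
        take = min(remaining, available)   # consume as much as this level offers
        notional += take * price
        remaining -= take
        last_price = price
    filled = quantity - remaining
    avg_price = notional / filled if filled > 0 else float('nan')
    return avg_price, last_price, filled

asks = [(100.05, 200), (100.10, 150), (100.15, 100)]
small_avg, small_last, _ = walk_the_book(asks, 100)
large_avg, large_last, large_filled = walk_the_book(asks, 400)
print(f"100 shares: avg {small_avg:.4f}, last {small_last:.2f}")
print(f"400 shares: avg {large_avg:.4f}, last {large_last:.2f}")
```

The 100-share order fills entirely at the best ask, while the 400-share order pays progressively worse prices and leaves the best ask two levels higher.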

### Why Market Impact Matters

| Aspect | Impact |
|--------|--------|
| **Transaction Costs** | Large orders pay more than quoted spread |
| **Information Leakage** | Market learns about your order flow |
| **Strategy Capacity** | Limits how much capital you can deploy |
| **Alpha Decay** | Impact costs erode trading signals |

### Impact Components

$$\text{Total Impact} = \underbrace{\text{Permanent Impact}}_{\text{Information}} + \underbrace{\text{Temporary Impact}}_{\text{Liquidity}}$$

**Permanent Impact ($g$)**: Price moves due to information content - doesn't revert
- Reflects fundamental value changes
- Increases with information quality of trader

**Temporary Impact ($h$)**: Price moves due to liquidity demand - reverts over time
- Reflects supply/demand imbalance
- Decays as market absorbs the trade
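
The two components can be visualized as a price path after a single buy. This is a sketch under a stated assumption: the temporary part decays exponentially with a fixed half-life. The decay shape and the helper name `price_path_after_trade` are illustrative choices, not something the decomposition itself prescribes.

```python
import numpy as np

def price_path_after_trade(p0: float, permanent: float, temporary: float,
                           half_life: float, times) -> np.ndarray:
    """Price after a buy: permanent shift plus a decaying temporary component."""
    times = np.asarray(times, dtype=float)
    decay = 0.5 ** (times / half_life)   # assumed exponential decay kernel
    return p0 + permanent + temporary * decay

times = np.array([0.0, 1.0, 2.0, 10.0])
path = price_path_after_trade(p0=100.0, permanent=0.05, temporary=0.10,
                              half_life=1.0, times=times)
for t, p in zip(times, path):
    print(f"t={t:>4.1f}  price={p:.4f}")
```

Immediately after the trade the price sits at 100.15; as the temporary part fades it settles toward 100.05, the level implied by the permanent impact alone.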

In [None]:
class MarketImpactModel:
    """
    Power-Law Market Impact Model
    
    Price Change (fraction) = η * sign(Q) * (|Q| / ADV)^δ
    
    Where:
    - η (eta): Impact coefficient
    - Q: Signed order size (positive = buy, negative = sell)
    - ADV: Average daily volume
    - δ (delta): Impact exponent (1 for linear, 0.5 for square root)
    """
    
    def __init__(self, eta: float = 0.01, delta: float = 0.5, 
                 sigma: float = 0.02, adv: float = 1000000):
        """
        Parameters:
        - eta: Impact coefficient
        - delta: Impact exponent
        - sigma: Daily volatility
        - adv: Average daily volume
        """
        self.eta = eta
        self.delta = delta
        self.sigma = sigma
        self.adv = adv
    
    def permanent_impact(self, quantity: float) -> float:
        """Permanent impact (doesn't revert)"""
        sign = np.sign(quantity)
        return self.eta * sign * (np.abs(quantity) / self.adv) ** self.delta
    
    def temporary_impact(self, quantity: float, urgency: float = 1.0) -> float:
        """
        Temporary impact (reverts over time)
        Higher urgency = trading faster = more temporary impact
        """
        sign = np.sign(quantity)
        return urgency * self.eta * 2 * sign * (np.abs(quantity) / self.adv) ** self.delta

    def total_impact(self, quantity: float, urgency: float = 1.0) -> float:
        """Total market impact"""
        return self.permanent_impact(quantity) + self.temporary_impact(quantity, urgency)
    
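    def impact_cost(self, quantity: float, price: float, urgency: float = 1.0) -> float:
        """Approximate dollar cost of market impact (non-negative for buys and sells)"""
        # Rough approximation: the price ramps up over the execution, so the
        # average fill pays about half of the total impact
        avg_impact = np.abs(self.total_impact(quantity, urgency)) / 2
        return np.abs(quantity) * price * avg_impact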

# Example: Impact of different order sizes
model = MarketImpactModel(eta=0.1, delta=0.5, sigma=0.02, adv=1_000_000)

print("Market Impact Analysis")
print("=" * 60)
print(f"{'Order Size':>12} {'% of ADV':>10} {'Perm Impact':>12} {'Temp Impact':>12} {'Total':>10}")
print("-" * 60)

for size in [1000, 5000, 10000, 50000, 100000]:
    perm = model.permanent_impact(size) * 100
    temp = model.temporary_impact(size) * 100
    total = model.total_impact(size) * 100
    pct_adv = size / model.adv * 100
    print(f"{size:>12,} {pct_adv:>9.1f}% {perm:>11.3f}% {temp:>11.3f}% {total:>9.3f}%")

---
## 5. Square Root Law of Market Impact

### The Universal Square Root Law

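Empirical studies (Almgren et al., Kyle-Obizhaeva) find that the impact of a large order is well approximated by:

$$\frac{\Delta P}{P} = \eta \cdot \sigma \cdot \sqrt{\frac{Q}{V}}$$

Where:
- $\Delta P / P$: Relative price change (a fraction, e.g. 0.01 = 1%)
- $\eta$: Dimensionless scaling constant of order one, fitted empirically
- $\sigma$: Daily volatility (in return terms)
- $Q$: Order quantity (shares)
- $V$: Daily volume (shares)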

### Key Properties

1. **Sublinear**: Doubling order size less than doubles impact
2. **Widely observed**: Reported across asset classes, markets, and time periods, though the fitted exponent and prefactor vary
3. **Concave**: Marginal impact decreases with size
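
The sublinear property is easy to check numerically. A minimal sketch with the prefactor `eta` set to 1 (an assumption for illustration) and a hypothetical helper `sqrt_impact`:

```python
import math

def sqrt_impact(q: float, v: float, sigma: float, eta: float = 1.0) -> float:
    """Square-root impact as a fraction of price."""
    return eta * sigma * math.sqrt(q / v)

base = sqrt_impact(10_000, 1_000_000, 0.02)      # 1% of daily volume
doubled = sqrt_impact(20_000, 1_000_000, 0.02)   # 2% of daily volume
ratio = doubled / base
print(f"base impact: {base * 1e4:.1f} bps, doubled size: {doubled * 1e4:.1f} bps")
print(f"impact ratio: {ratio:.4f}")
```

Doubling the order size multiplies the impact by $\sqrt{2} \approx 1.41$ rather than 2.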

### Derivation Intuition

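The square root law is primarily an empirical regularity, and it does not follow from Kyle's (1985) model:
- In Kyle's model the informed trader hides in noise-trader flow and the market maker prices the net order flow
- The resulting equilibrium impact is **linear** in order size: $\Delta P = \lambda Q$
- Concave impact needs additional structure, for example latent-liquidity models in which available volume grows roughly linearly with distance from the current price, so consuming $Q$ shares moves the price by an amount proportional to $\sqrt{Q}$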

### Empirical Evidence

| Study | Asset Class | Finding |
|-------|-------------|---------|
| Almgren et al. (2005) | US Equities | δ ≈ 0.5-0.6 |
| Moro et al. (2009) | LSE | δ ≈ 0.5 |
| Zarinelli et al. (2015) | US Equities | Square root fits mid-range sizes; a logarithmic form fits better over the full range |
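
Exponents like those in the table are typically estimated with a log-log regression of impact on participation rate. The sketch below runs that regression on synthetic data only; the true exponent, prefactor, and noise level are assumptions chosen for illustration, not estimates from any of the studies cited.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "metaorders": impact = eta * sigma * participation^delta * noise
true_delta, true_eta, sigma = 0.5, 0.8, 0.02
participation = rng.uniform(0.001, 0.2, size=5000)
noise = rng.normal(0.0, 0.2, size=participation.size)
impact = true_eta * sigma * participation ** true_delta * np.exp(noise)

# log(impact) = log(eta * sigma) + delta * log(participation) + noise
slope, intercept = np.polyfit(np.log(participation), np.log(impact), deg=1)
delta_hat = slope
eta_hat = np.exp(intercept) / sigma

print(f"estimated delta: {delta_hat:.3f} (true {true_delta})")
print(f"estimated eta:   {eta_hat:.3f} (true {true_eta})")
```

With real data the noise is far larger relative to the signal, so studies bin orders by size and need very large samples to pin down the exponent.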

In [None]:
def square_root_impact(order_size: float, daily_volume: float, 
                       volatility: float, eta: float = 1.0) -> float:
    """
    Square Root Law Market Impact
    
    ΔP/P = η * σ * √(Q/V)
    
    Parameters:
    - order_size: Number of shares to trade
    - daily_volume: Average daily volume
    - volatility: Daily volatility (e.g., 0.02 for 2%)
    - eta: Scaling factor (typically 0.1 to 1.0)
    
    Returns:
    - Price impact as fraction (0.01 = 1%)
    """
    participation_rate = order_size / daily_volume
    return eta * volatility * np.sqrt(participation_rate)


def visualize_square_root_law():
    """Visualize the square root impact relationship"""
    fig, axes = plt.subplots(1, 3, figsize=(14, 4))
    
    # Parameters
    daily_vol = 1_000_000
    sigma = 0.02  # 2% daily vol
    
    # 1. Impact vs Order Size
    sizes = np.linspace(1000, 200000, 100)
    impacts = [square_root_impact(s, daily_vol, sigma) * 100 for s in sizes]
    
    axes[0].plot(sizes/1000, impacts, 'b-', linewidth=2)
    axes[0].set_xlabel('Order Size (000s shares)')
    axes[0].set_ylabel('Price Impact (%)')
    axes[0].set_title('Square Root Law: Impact vs Size')
    axes[0].grid(True, alpha=0.3)
    
    # Compare with linear impact
    linear_impacts = [0.02 * (s/daily_vol) * 100 * 20 for s in sizes]  # Scaled linear
    axes[0].plot(sizes/1000, linear_impacts, 'r--', alpha=0.5, label='Linear (scaled)')
    axes[0].legend()
    
    # 2. Impact at different participation rates
    participation_rates = np.linspace(0.001, 0.20, 100)
    for vol_mult, label in [(0.5, 'Low Vol'), (1.0, 'Normal'), (2.0, 'High Vol')]:
        impacts = [sigma * vol_mult * np.sqrt(pr) * 100 for pr in participation_rates]
        axes[1].plot(participation_rates * 100, impacts, label=label, linewidth=2)
    
    axes[1].set_xlabel('Participation Rate (%)')
    axes[1].set_ylabel('Price Impact (%)')
    axes[1].set_title('Impact by Volatility Regime')
    axes[1].legend()
    axes[1].grid(True, alpha=0.3)
    
    # 3. Cost scaling - Impact Cost = Size × Price × Impact
    price = 100  # Assume $100 stock so the cost is in dollars
    sizes = np.linspace(1000, 200000, 100)
    costs = [s * price * square_root_impact(s, daily_vol, sigma) for s in sizes]
    
    # Cost scales as Q^1.5 due to square root impact
    axes[2].plot(sizes/1000, np.array(costs)/1000, 'g-', linewidth=2)
    axes[2].set_xlabel('Order Size (000s shares)')
    axes[2].set_ylabel('Impact Cost ($000s)')
    axes[2].set_title('Total Impact Cost (scales as Q^1.5)')
    axes[2].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # Print example calculations
    print("\nSquare Root Impact Examples:")
    print("=" * 65)
    print(f"{'Order Size':>12} {'% of ADV':>10} {'Impact (%)':>12} {'Cost ($)':>15}")
    print("-" * 65)
    
    price = 100  # Assume $100 stock
    for size in [10000, 25000, 50000, 100000, 200000]:
        impact = square_root_impact(size, daily_vol, sigma)
        cost = size * price * impact
        pct_adv = size / daily_vol * 100
        print(f"{size:>12,} {pct_adv:>9.1f}% {impact*100:>11.4f}% ${cost:>13,.0f}")

visualize_square_root_law()

---
## 6. Execution Algorithms Overview

### Why Use Execution Algorithms?

| Problem | Solution |
|---------|----------|
| Large orders move prices | Split orders over time |
| Need to benchmark performance | Track vs TWAP/VWAP |
| Balance urgency vs cost | Optimize execution schedule |
| Hide order intentions | Randomize timing/sizing |
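
Randomized sizing, the last row of the table, can be sketched by jittering equal slices and renormalizing so the total is preserved. The jitter range, the uniform distribution, and the helper name `randomized_slices` are all assumptions made for illustration.

```python
import numpy as np

def randomized_slices(total_quantity: float, n_slices: int,
                      jitter: float = 0.3, seed=None) -> np.ndarray:
    """Child order sizes that vary around an equal split but still sum to the total.

    jitter must be below 1 so every slice stays positive.
    """
    rng = np.random.default_rng(seed)
    weights = 1.0 + rng.uniform(-jitter, jitter, size=n_slices)
    return total_quantity * weights / weights.sum()

slices = randomized_slices(100_000, 10, jitter=0.3, seed=1)
print(np.round(slices))
print(f"total: {slices.sum():,.0f}")
```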

### Classification of Execution Algorithms

```
Execution Algorithms
├── Schedule-Based (Passive)
│   ├── TWAP (Time-Weighted Average Price)
│   ├── VWAP (Volume-Weighted Average Price)
│   └── POV (Percentage of Volume)
│
├── Opportunistic (Adaptive)
│   ├── Implementation Shortfall
│   ├── Arrival Price
│   └── Aggressive-in-the-Money
│
└── Liquidity-Seeking
    ├── Dark Pool Aggregation
    ├── Iceberg Orders
    └── Smart Order Routing
```

### Key Trade-offs

$$\text{Execution Quality} = f(\text{Urgency}, \text{Impact}, \text{Risk}, \text{Information})$$

| Strategy | Impact | Timing Risk | Information Leakage |
|----------|--------|-------------|---------------------|
| Execute All Now | High | Low | Low |
| TWAP | Medium | Medium | Medium |
| VWAP | Medium | Medium | Medium |
| IS-Optimal | Optimized | Balanced | Medium |
| Slow Execution | Low | High | High |

---
## 7. TWAP (Time-Weighted Average Price)

### Concept

**TWAP** splits the order evenly across time intervals:

$$q_i = \frac{Q}{N} \quad \forall i = 1, ..., N$$

Where:
- $Q$ = Total order quantity
- $N$ = Number of time intervals
- $q_i$ = Quantity to trade in interval $i$
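
In practice $Q/N$ is rarely a whole number of shares. One simple way to handle this, sketched here with a hypothetical helper `twap_integer_slices`, is to give the leftover shares to the earliest intervals; other conventions (for example, trading the remainder last) are equally plausible.

```python
from typing import List

def twap_integer_slices(total_quantity: int, num_intervals: int) -> List[int]:
    """Split an order into whole-share slices that differ by at most one share."""
    base, remainder = divmod(total_quantity, num_intervals)
    # The first `remainder` intervals each take one extra share
    return [base + 1 if i < remainder else base for i in range(num_intervals)]

slices = twap_integer_slices(100_003, 10)
print(slices)
print(sum(slices))  # 100003
```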

### TWAP Benchmark

$$\text{TWAP Benchmark} = \frac{1}{N} \sum_{i=1}^{N} P_i$$

### Pros and Cons

| Pros | Cons |
|------|------|
| Simple to implement | Ignores volume patterns |
| Spreads execution evenly, averaging out intraday price moves | Predictable (gameable) |
| Good for low-volume stocks | May trade in illiquid periods |
| Easy to understand/benchmark | Fixed schedule regardless of conditions |

### When to Use TWAP

- Low-information trades (index rebalancing)
- Illiquid stocks where volume profile is flat
- When spreading trades evenly over the window is enough to keep market impact low
- Passive orders not driven by alpha signals

In [None]:
class TWAPExecutor:
    """
    Time-Weighted Average Price Execution Algorithm
    
    Splits order evenly across time intervals
    """
    
    def __init__(self, total_quantity: int, num_intervals: int, 
                 start_time: float = 0.0, end_time: float = 1.0):
        """
        Parameters:
        - total_quantity: Total shares to execute
        - num_intervals: Number of time buckets
        - start_time, end_time: Normalized trading window [0, 1]
        """
        self.total_quantity = total_quantity
        self.num_intervals = num_intervals
        self.start_time = start_time
        self.end_time = end_time
        
        # Equal quantity per interval
        self.quantity_per_interval = total_quantity / num_intervals
        self.interval_duration = (end_time - start_time) / num_intervals
        
    def get_schedule(self) -> pd.DataFrame:
        """Generate the TWAP execution schedule"""
        schedule = []
        cumulative = 0
        
        for i in range(self.num_intervals):
            interval_start = self.start_time + i * self.interval_duration
            interval_end = interval_start + self.interval_duration
            qty = self.quantity_per_interval
            cumulative += qty
            
            schedule.append({
                'interval': i + 1,
                'start_time': interval_start,
                'end_time': interval_end,
                'quantity': qty,
                'cumulative': cumulative,
                'pct_complete': cumulative / self.total_quantity * 100
            })
        
        return pd.DataFrame(schedule)
    
    def calculate_benchmark(self, prices: np.ndarray) -> float:
        """Calculate TWAP benchmark price"""
        return np.mean(prices)
    
    def evaluate_execution(self, execution_prices: np.ndarray, 
                          benchmark_prices: np.ndarray) -> dict:
        """
        Evaluate execution performance vs TWAP benchmark
        """
        twap_benchmark = self.calculate_benchmark(benchmark_prices)
        avg_execution = np.mean(execution_prices)
        
        # Slippage (positive = worse for buys)
        slippage = avg_execution - twap_benchmark
        slippage_bps = (slippage / twap_benchmark) * 10000
        
        return {
            'twap_benchmark': twap_benchmark,
            'avg_execution_price': avg_execution,
            'slippage': slippage,
            'slippage_bps': slippage_bps,
            'total_cost': slippage * self.total_quantity
        }

# Example TWAP execution
twap = TWAPExecutor(total_quantity=100000, num_intervals=10)
schedule = twap.get_schedule()

print("TWAP Execution Schedule")
print("=" * 70)
print(schedule.to_string(index=False))

# Simulate execution
np.random.seed(42)
benchmark_prices = 100 + np.cumsum(np.random.randn(10) * 0.1)  # Market prices
execution_prices = benchmark_prices + np.random.randn(10) * 0.02  # Actual fills

results = twap.evaluate_execution(execution_prices, benchmark_prices)
print("\nExecution Performance:")
print(f"  TWAP Benchmark: ${results['twap_benchmark']:.4f}")
print(f"  Avg Execution:  ${results['avg_execution_price']:.4f}")
print(f"  Slippage:       {results['slippage_bps']:.2f} bps")
print(f"  Total Cost:     ${results['total_cost']:,.2f}")

---
## 8. VWAP (Volume-Weighted Average Price)

### Concept

**VWAP** matches trading schedule to historical volume profile:

$$q_i = Q \cdot \frac{v_i}{\sum_{j=1}^{N} v_j}$$

Where:
- $v_i$ = Expected volume in interval $i$
- The schedule tracks the market's volume distribution

### VWAP Benchmark

$$\text{VWAP} = \frac{\sum_{i=1}^{N} P_i \cdot V_i}{\sum_{i=1}^{N} V_i}$$
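
A small worked example shows how the VWAP benchmark differs from a simple time average when volume is uneven. The three prices and volumes are made-up numbers for illustration.

```python
import numpy as np

prices = np.array([100.0, 101.0, 102.0])
volumes = np.array([500, 100, 400])

twap_benchmark = prices.mean()                                # equal weight per interval
vwap_benchmark = np.sum(prices * volumes) / np.sum(volumes)   # weight by traded volume

print(f"TWAP benchmark: {twap_benchmark:.2f}")
print(f"VWAP benchmark: {vwap_benchmark:.2f}")
```

Because half of the volume traded at the lowest price, the VWAP (100.90) sits below the time-weighted average (101.00).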

### Volume Profile (U-Shape)

Markets typically exhibit **U-shaped** intraday volume:
- **Open**: High volume (overnight information)
- **Midday**: Low volume (lunch, low news)
- **Close**: High volume (rebalancing, MOC orders)

### Pros and Cons

| Pros | Cons |
|------|------|
| Matches market liquidity | Requires volume forecast |
| Industry standard benchmark | Still predictable |
| Minimizes impact relative to liquidity | Anchors to historical patterns |
| Good for institutional mandates | Doesn't adapt to real-time |

### VWAP vs TWAP Comparison

| Scenario | Better Algorithm |
|----------|-----------------|
| Liquid stocks | VWAP |
| Illiquid stocks | TWAP |
| Unknown volume profile | TWAP |
| Institutional benchmark | VWAP |
| Minimizing gaming risk | Neither as-is (both follow predictable schedules); randomize timing/sizing |

In [None]:
class VWAPExecutor:
    """
    Volume-Weighted Average Price Execution Algorithm
    
    Schedules orders according to expected volume profile
    """
    
    def __init__(self, total_quantity: int, volume_profile: Optional[np.ndarray] = None):
        """
        Parameters:
        - total_quantity: Total shares to execute
        - volume_profile: Relative volume in each interval (normalized internally to sum to 1)
        """
        self.total_quantity = total_quantity
        
        # Default: U-shaped intraday volume profile
        if volume_profile is None:
            # Typical intraday pattern: high at open/close, low midday
            volume_profile = self._generate_u_shape_profile(13)  # Half-hourly intervals
        
        self.volume_profile = volume_profile / volume_profile.sum()
        self.num_intervals = len(volume_profile)
        
    def _generate_u_shape_profile(self, n_intervals: int) -> np.ndarray:
        """Generate U-shaped volume profile"""
        x = np.linspace(-1, 1, n_intervals)
        # U-shape: higher at ends, lower in middle
        profile = 0.5 * x**2 + 0.5 + np.random.rand(n_intervals) * 0.1
        return profile
    
    def get_schedule(self) -> pd.DataFrame:
        """Generate the VWAP execution schedule"""
        schedule = []
        cumulative = 0
        
        for i in range(self.num_intervals):
            qty = self.total_quantity * self.volume_profile[i]
            cumulative += qty
            
            schedule.append({
                'interval': i + 1,
                'volume_weight': self.volume_profile[i],
                'quantity': qty,
                'cumulative': cumulative,
                'pct_complete': cumulative / self.total_quantity * 100
            })
        
        return pd.DataFrame(schedule)
    
    def calculate_benchmark(self, prices: np.ndarray, volumes: np.ndarray) -> float:
        """Calculate VWAP benchmark"""
        return np.sum(prices * volumes) / np.sum(volumes)
    
    def evaluate_execution(self, execution_prices: np.ndarray, 
                          execution_quantities: np.ndarray,
                          market_prices: np.ndarray, 
                          market_volumes: np.ndarray) -> dict:
        """Evaluate execution vs VWAP benchmark"""
        vwap_benchmark = self.calculate_benchmark(market_prices, market_volumes)
        avg_execution = self.calculate_benchmark(execution_prices, execution_quantities)
        
        slippage = avg_execution - vwap_benchmark
        slippage_bps = (slippage / vwap_benchmark) * 10000
        
        return {
            'vwap_benchmark': vwap_benchmark,
            'avg_execution_price': avg_execution,
            'slippage': slippage,
            'slippage_bps': slippage_bps,
            'total_cost': slippage * self.total_quantity
        }


def visualize_vwap_vs_twap():
    """Compare VWAP and TWAP schedules"""
    total_qty = 100000
    
    vwap = VWAPExecutor(total_qty)
    twap = TWAPExecutor(total_qty, num_intervals=len(vwap.volume_profile))
    
    vwap_schedule = vwap.get_schedule()
    twap_schedule = twap.get_schedule()
    
    fig, axes = plt.subplots(1, 3, figsize=(14, 4))
    
    # 1. Volume profile
    axes[0].bar(range(1, len(vwap.volume_profile)+1), vwap.volume_profile * 100, 
                alpha=0.7, label='Expected Volume %')
    axes[0].set_xlabel('Interval')
    axes[0].set_ylabel('Volume (%)')
    axes[0].set_title('Typical U-Shaped Volume Profile')
    axes[0].grid(True, alpha=0.3)
    
    # 2. Quantity per interval comparison
    x = np.arange(len(vwap_schedule))
    width = 0.35
    axes[1].bar(x - width/2, vwap_schedule['quantity'], width, label='VWAP', alpha=0.7)
    axes[1].bar(x + width/2, twap_schedule['quantity'], width, label='TWAP', alpha=0.7)
    axes[1].set_xlabel('Interval')
    axes[1].set_ylabel('Quantity')
    axes[1].set_title('VWAP vs TWAP: Quantity per Interval')
    axes[1].legend()
    axes[1].grid(True, alpha=0.3)
    
    # 3. Cumulative execution
    axes[2].plot(vwap_schedule['interval'], vwap_schedule['pct_complete'], 
                 'b-o', label='VWAP', linewidth=2)
    axes[2].plot(twap_schedule['interval'], twap_schedule['pct_complete'], 
                 'r--s', label='TWAP', linewidth=2)
    axes[2].set_xlabel('Interval')
    axes[2].set_ylabel('% Complete')
    axes[2].set_title('Cumulative Execution Progress')
    axes[2].legend()
    axes[2].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    print("\nVWAP Schedule (first 5 intervals):")
    print(vwap_schedule.head().to_string(index=False))

visualize_vwap_vs_twap()

---
## 9. Implementation Shortfall (IS)

### Definition

**Implementation Shortfall** measures the total cost of executing vs a paper portfolio:

$$\text{IS} = \text{Paper Return} - \text{Actual Return}$$

Or equivalently, for a fully filled buy order of $Q$ shares:

$$\text{IS} = (P_{\text{avg execution}} - P_{\text{decision}}) \times Q$$

### Components of Implementation Shortfall

$$\text{IS} = \underbrace{\text{Delay Cost}}_{\text{Pre-trade drift}} + \underbrace{\text{Trading Cost}}_{\text{Impact + Spread}} + \underbrace{\text{Opportunity Cost}}_{\text{Unfilled portion}}$$

| Component | Definition | Cause |
|-----------|------------|-------|
| **Delay Cost** | Price move from decision to first trade | Market movement, order routing |
| **Trading Cost** | Cost of executing (spread + impact) | Liquidity consumption |
| **Opportunity Cost** | Value of unfilled orders | Incomplete execution |
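
For a buy order, the three components can be written out explicitly. This is one common convention, sketched with notation assumed here: $Q_{\text{exec}}$ is the filled quantity, $P_{\text{first}}$ the first fill price, and $P_{\text{close}}$ the closing price used to value the unfilled part.

$$\text{Delay Cost} = (P_{\text{first}} - P_{\text{decision}}) \times Q_{\text{exec}}$$

$$\text{Trading Cost} = (P_{\text{avg execution}} - P_{\text{first}}) \times Q_{\text{exec}}$$

$$\text{Opportunity Cost} = (P_{\text{close}} - P_{\text{decision}}) \times (Q - Q_{\text{exec}})$$

The first two terms sum to $(P_{\text{avg execution}} - P_{\text{decision}}) \times Q_{\text{exec}}$, so with a complete fill ($Q_{\text{exec}} = Q$) the opportunity cost vanishes and the total reduces to the single-formula version above. For a sell order every term flips sign.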

### Why IS Matters

1. **Comprehensive**: Captures all trading costs
2. **Decision-based**: Benchmarks vs actual investment decision
3. **Incentive-aligned**: Rewards speed when alpha is high
4. **Industry standard**: Used by institutional investors

### IS vs VWAP

| Benchmark | Measures | Best For |
|-----------|----------|----------|
| **VWAP** | Execution quality vs market average | Passive/index trades |
| **IS** | Total implementation cost | Active/alpha-driven trades |

In [None]:
class ImplementationShortfall:
    """
    Implementation Shortfall Calculator
    
    Measures total cost of execution vs decision price
    """
    
    def __init__(self, decision_price: float, total_quantity: int, side: str = 'buy'):
        """
        Parameters:
        - decision_price: Price when investment decision was made
        - total_quantity: Target quantity to execute
        - side: 'buy' or 'sell'
        """
        self.decision_price = decision_price
        self.total_quantity = total_quantity
        self.side = side
        self.direction = 1 if side == 'buy' else -1
        
        # Track executions
        self.executions = []
        
    def add_execution(self, price: float, quantity: int, timestamp: float):
        """Record an execution"""
        self.executions.append({
            'price': price,
            'quantity': quantity,
            'timestamp': timestamp
        })
    
    def calculate_is(self, closing_price: Optional[float] = None) -> dict:
        """
        Calculate Implementation Shortfall components
        
        Parameters:
        - closing_price: End-of-day price (for opportunity cost)
        """
        if not self.executions:
            return {'error': 'No executions recorded'}
        
        # Calculate executed quantity and average price
        total_executed = sum(e['quantity'] for e in self.executions)
        total_value = sum(e['price'] * e['quantity'] for e in self.executions)
        avg_execution_price = total_value / total_executed if total_executed > 0 else 0
        
        # First execution price (for delay cost), taken as the earliest fill by timestamp
        first_trade_price = min(self.executions, key=lambda e: e['timestamp'])['price']
        
        # Paper portfolio value (if we bought at decision price)
        paper_value = self.decision_price * self.total_quantity
        
        # Components
        # Delay cost: price move from decision to first trade
        delay_cost = self.direction * (first_trade_price - self.decision_price) * total_executed
        
        # Trading cost: price move from first trade to average execution
        trading_cost = self.direction * (avg_execution_price - first_trade_price) * total_executed
        
        # Opportunity cost: value of unfilled orders
        unfilled = self.total_quantity - total_executed
        if closing_price is not None and unfilled > 0:
            opportunity_cost = self.direction * (closing_price - self.decision_price) * unfilled
        else:
            opportunity_cost = 0
        
        # Total IS
        total_is = delay_cost + trading_cost + opportunity_cost
        total_is_bps = (total_is / paper_value) * 10000
        
        return {
            'decision_price': self.decision_price,
            'avg_execution_price': avg_execution_price,
            'first_trade_price': first_trade_price,
            'total_executed': total_executed,
            'total_target': self.total_quantity,
            'fill_rate': total_executed / self.total_quantity * 100,
            'delay_cost': delay_cost,
            'trading_cost': trading_cost,
            'opportunity_cost': opportunity_cost,
            'total_is': total_is,
            'total_is_bps': total_is_bps
        }

# Example: Implementation Shortfall calculation
print("Implementation Shortfall Example")
print("=" * 60)

# Decision to buy 50,000 shares at $100
is_calc = ImplementationShortfall(decision_price=100.00, total_quantity=50000, side='buy')

# Simulate executions over time
# Price drifts up as we execute (realistic for large buy orders)
is_calc.add_execution(price=100.05, quantity=10000, timestamp=0.1)  # First trade
is_calc.add_execution(price=100.08, quantity=15000, timestamp=0.3)
is_calc.add_execution(price=100.12, quantity=12000, timestamp=0.5)
is_calc.add_execution(price=100.15, quantity=8000, timestamp=0.7)
# Left 5000 unfilled

results = is_calc.calculate_is(closing_price=100.20)

print(f"\nDecision Price:      ${results['decision_price']:.2f}")
print(f"Avg Execution Price: ${results['avg_execution_price']:.2f}")
print(f"Fill Rate:           {results['fill_rate']:.1f}%")
print(f"\nIS Components:")
print(f"  Delay Cost:       ${results['delay_cost']:,.2f}")
print(f"  Trading Cost:     ${results['trading_cost']:,.2f}")
print(f"  Opportunity Cost: ${results['opportunity_cost']:,.2f}")
print(f"\nTotal IS:            ${results['total_is']:,.2f}")
print(f"Total IS (bps):      {results['total_is_bps']:.2f} bps")

---
## 10. Almgren-Chriss Optimal Execution Model

### The Fundamental Trade-off

**Key Insight**: Faster execution → lower timing risk (less exposure to price moves) but higher impact cost

The Almgren-Chriss (2000) model formalizes optimal execution:

$$\min_{\{x_k\}} \quad E[\text{Cost}] + \lambda \cdot \text{Var}[\text{Cost}]$$

### Model Setup

**State Variables:**
- $X_k$: Remaining shares at time $k$ (start with $X_0 = X$)
- $x_k$: Shares sold in interval $k$, so $x_k = X_{k-1} - X_k$
- $S_k$: Stock price at time $k$

**Dynamics:**
$$S_k = S_{k-1} + \sigma \tau^{1/2} \xi_k - g(x_k)$$

Where:
- $\sigma$: Volatility
- $\tau$: Time interval
- $\xi_k$: Standard normal innovation
- $g(x_k)$: Permanent impact function (in the linear case, $g(x_k) = \gamma x_k$ with permanent impact coefficient $\gamma$)
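
The dynamics above can be simulated in a few lines. This is a minimal sketch, assuming linear permanent impact $g(x_k) = \gamma x_k$; the helper name `simulate_price_path` and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_price_path(s0, trades, sigma, tau, gamma, rng=None):
    """Simulate S_k = S_{k-1} + sigma*sqrt(tau)*xi_k - gamma*x_k for a sell schedule."""
    rng = np.random.default_rng() if rng is None else rng
    trades = np.asarray(trades, dtype=float)
    xi = rng.standard_normal(len(trades))            # xi_k ~ N(0, 1)
    steps = sigma * np.sqrt(tau) * xi - gamma * trades
    return np.concatenate(([s0], s0 + np.cumsum(steps)))

# Toy example (hypothetical values): sell 10,000 shares in 5 equal slices
trades = np.full(5, 2000.0)
path = simulate_price_path(s0=100.0, trades=trades, sigma=0.5, tau=0.2,
                           gamma=1e-5, rng=np.random.default_rng(0))

# With sigma = 0 only permanent impact remains: S_N = S_0 - gamma * sum(x_k)
no_noise = simulate_price_path(100.0, trades, sigma=0.0, tau=0.2, gamma=1e-5)
print(path)
print(no_noise[-1])
```

Setting `sigma=0` isolates the permanent component: the final price sits exactly $\gamma X_0$ below the starting price, regardless of how the order is sliced.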

### Cost Function

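**Expected Cost:**
$$E[\text{Cost}] = \sum_{k=1}^{N} E[x_k \cdot \text{Impact}(x_k)]$$

With linear permanent impact (coefficient $\gamma$) and linear temporary impact (coefficient $\eta$), this reduces, to leading order in $\tau$, to:
$$E[\text{Cost}] = \frac{1}{2}\gamma X_0^2 + \frac{\eta}{\tau} \sum_{k=1}^{N} x_k^2$$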

**Variance of Cost:**
$$\text{Var}[\text{Cost}] = \sigma^2 \sum_{k=1}^{N} X_k^2 \tau$$

### Optimal Trajectory

For linear impact, the optimal remaining holdings (from which the trades follow as $x_k^* = X_{k-1}^* - X_k^*$) are:

$$X_k^* = \frac{\sinh(\kappa(T-t_k))}{\sinh(\kappa T)} \cdot X_0$$

Where, to leading order in the interval length $\tau$:
$$\kappa = \sqrt{\frac{\lambda \sigma^2}{\eta}}$$

- $\lambda$: Risk aversion parameter
- $\eta$: Temporary impact coefficient
- $T$: Total execution horizon
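
To get a feel for the formula, the sketch below evaluates the sinh trajectory of remaining holdings for a few values of $\kappa T$ and reports how much of the order is sold in the first half of the horizon. The values of $\kappa$ are hypothetical, picked only to make the curvature visible, and `sinh_holdings` is an illustrative helper rather than part of any library.

```python
import numpy as np

def sinh_holdings(x0, kappa, T, n_intervals):
    """Remaining holdings X_k = X_0 * sinh(kappa*(T - t_k)) / sinh(kappa*T)."""
    t = np.linspace(0.0, T, n_intervals + 1)
    if kappa == 0:
        return x0 * (1.0 - t / T)        # limit kappa -> 0: linear (TWAP) path
    return x0 * np.sinh(kappa * (T - t)) / np.sinh(kappa * T)

x0, T, n = 100_000, 1.0, 10
for kappa in (0.0, 1.0, 5.0):            # hypothetical urgency levels
    holdings = sinh_holdings(x0, kappa, T, n)
    trades = -np.diff(holdings)          # shares sold in each interval
    share_first_half = trades[: n // 2].sum() / x0
    print(f"kappa*T = {kappa * T:>3.1f}: "
          f"{share_first_half:.1%} of the order is sold in the first half")
```

The fraction sold early grows with $\kappa T$, while $\kappa T \to 0$ recovers the even TWAP split. When $\kappa T$ is much smaller than 1 the curve is visually indistinguishable from a straight line.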

In [None]:
class AlmgrenChrissOptimalExecution:
    """
    Almgren-Chriss Optimal Execution Model
    
    Finds optimal trading trajectory that minimizes:
    E[Cost] + λ * Var[Cost]
    
    Reference: Almgren & Chriss (2000) "Optimal Execution of Portfolio Transactions"
    """
    
    def __init__(self, 
                 total_shares: float,
                 total_time: float,
                 num_intervals: int,
                 volatility: float,
                 permanent_impact: float,
                 temporary_impact: float,
                 risk_aversion: float):
        """
        Parameters:
        - total_shares: X0, initial position to liquidate
        - total_time: T, trading horizon (in days)
        - num_intervals: N, number of trading intervals
        - volatility: σ, daily volatility
        - permanent_impact: γ, permanent impact coefficient
        - temporary_impact: η, temporary impact coefficient  
        - risk_aversion: λ, risk aversion parameter
        """
        self.X0 = total_shares
        self.T = total_time
        self.N = num_intervals
        self.tau = total_time / num_intervals  # interval length
        self.sigma = volatility
        self.gamma = permanent_impact
        self.eta = temporary_impact
        self.lambd = risk_aversion
        
        # Compute kappa (urgency parameter)
        self.kappa = self._compute_kappa()
        
    def _compute_kappa(self) -> float:
        """
        Kappa determines how front-loaded the execution is
        Higher kappa = more aggressive (faster) execution
        """
        if self.lambd > 0 and self.eta > 0:
            kappa_sq = self.lambd * (self.sigma ** 2) / self.eta
            return np.sqrt(kappa_sq)
        return 0
    
    def optimal_trajectory(self) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
        """
        Compute optimal trading trajectory
        
        Returns:
        - times: time points
        - holdings: remaining shares X_k at each time
        - trades: shares to trade x_k in each interval
        """
        times = np.linspace(0, self.T, self.N + 1)
        holdings = np.zeros(self.N + 1)
        trades = np.zeros(self.N)
        
        if self.kappa == 0:
            # No risk aversion: linear (TWAP-like) trajectory
            holdings = self.X0 * (1 - times / self.T)
            trades = np.full(self.N, self.X0 / self.N)
        else:
            # Optimal trajectory using sinh formula
            for k in range(self.N + 1):
                t_k = times[k]
                holdings[k] = self.X0 * np.sinh(self.kappa * (self.T - t_k)) / np.sinh(self.kappa * self.T)
            
            # Trades are differences in holdings
            trades = -np.diff(holdings)
        
        return times, holdings, trades
    
    def expected_cost(self, trades: np.ndarray = None) -> float:
        """Calculate expected execution cost"""
        if trades is None:
            _, _, trades = self.optimal_trajectory()
        
        # Temporary impact cost: (η / τ) * Σ x_k²
        temp_cost = self.eta * np.sum(trades ** 2) / self.tau
        
        # Permanent impact cost: ½ γ X0² when the trades sum to X0
        perm_cost = self.gamma * np.sum(trades) * self.X0 / 2
        
        return temp_cost + perm_cost
    
    def variance_of_cost(self) -> float:
        """Calculate variance of execution cost"""
        _, holdings, _ = self.optimal_trajectory()
        
        # Variance from price uncertainty while holding inventory:
        # σ² τ Σ_{k=1..N} X_k², using the holdings left after each trade (X_N = 0)
        variance = (self.sigma ** 2) * self.tau * np.sum(holdings[1:] ** 2)
        return variance
    
    def efficient_frontier(self, risk_aversions: np.ndarray) -> pd.DataFrame:
        """
        Compute the efficient frontier of execution strategies
        """
        results = []
        original_lambd = self.lambd  # restored below so the sweep has no side effects
        
        for lambd in risk_aversions:
            self.lambd = lambd
            self.kappa = self._compute_kappa()
            
            _, _, trades = self.optimal_trajectory()
            exp_cost = self.expected_cost(trades)
            var_cost = self.variance_of_cost()
            
            results.append({
                'risk_aversion': lambd,
                'kappa': self.kappa,
                'expected_cost': exp_cost,
                'std_cost': np.sqrt(var_cost),
                'utility': exp_cost + lambd * var_cost
            })
        
        # Restore the model's own risk aversion and kappa
        self.lambd = original_lambd
        self.kappa = self._compute_kappa()
        
        return pd.DataFrame(results)


def visualize_almgren_chriss():
    """Visualize optimal execution trajectories"""
    
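    # Base parameters (toy values, not calibrated to a real stock).
    # Note: kappa * T stays below 0.01 for every λ used in this function, so the
    # sinh trajectories are nearly indistinguishable from the linear TWAP path.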
    params = {
        'total_shares': 100000,
        'total_time': 1.0,  # 1 day
        'num_intervals': 20,
        'volatility': 0.02,
        'permanent_impact': 0.0001,
        'temporary_impact': 0.001,
        'risk_aversion': 1e-6
    }
    
    fig, axes = plt.subplots(2, 2, figsize=(12, 10))
    
    # 1. Trajectory comparison for different risk aversions
    ax1 = axes[0, 0]
    for lambd, label, color in [(0, 'TWAP (λ=0)', 'blue'), 
                                  (1e-7, 'Low Risk Aversion', 'green'),
                                  (1e-6, 'Medium Risk Aversion', 'orange'),
                                  (1e-5, 'High Risk Aversion', 'red')]:
        model = AlmgrenChrissOptimalExecution(**{**params, 'risk_aversion': lambd})
        times, holdings, _ = model.optimal_trajectory()
        ax1.plot(times, holdings/1000, label=label, linewidth=2, color=color)
    
    ax1.set_xlabel('Time (days)')
    ax1.set_ylabel('Remaining Shares (000s)')
    ax1.set_title('Optimal Trajectories: Risk Aversion Effect')
    ax1.legend()
    ax1.grid(True, alpha=0.3)
    
    # 2. Trade size over time
    ax2 = axes[0, 1]
    for lambd, label, color in [(0, 'TWAP', 'blue'), 
                                  (1e-6, 'Optimal (λ=1e-6)', 'orange'),
                                  (1e-5, 'Aggressive (λ=1e-5)', 'red')]:
        model = AlmgrenChrissOptimalExecution(**{**params, 'risk_aversion': lambd})
        times, _, trades = model.optimal_trajectory()
        ax2.bar(range(len(trades)), trades/1000, alpha=0.5, label=label)
    
    ax2.set_xlabel('Interval')
    ax2.set_ylabel('Trade Size (000s)')
    ax2.set_title('Trade Size per Interval')
    ax2.legend()
    ax2.grid(True, alpha=0.3)
    
    # 3. Efficient frontier
    ax3 = axes[1, 0]
    model = AlmgrenChrissOptimalExecution(**params)
    risk_aversions = np.logspace(-8, -4, 50)
    frontier = model.efficient_frontier(risk_aversions)
    
    ax3.plot(frontier['std_cost'], frontier['expected_cost'], 'b-', linewidth=2)
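    # The first row has the lowest λ (closest to TWAP): lowest expected cost, highest variance.
    ax3.scatter(frontier['std_cost'].iloc[0], frontier['expected_cost'].iloc[0], 
                color='green', s=100, zorder=5, label='Min Cost (lowest λ)')
    # The last row has the highest λ: lowest variance, highest expected cost.
    ax3.scatter(frontier['std_cost'].iloc[-1], frontier['expected_cost'].iloc[-1], 
                color='red', s=100, zorder=5, label='Min Variance (highest λ)')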
    
    ax3.set_xlabel('Standard Deviation of Cost ($)')
    ax3.set_ylabel('Expected Cost ($)')
    ax3.set_title('Efficient Frontier of Execution Strategies')
    ax3.legend()
    ax3.grid(True, alpha=0.3)
    
    # 4. Cost breakdown
    ax4 = axes[1, 1]
    risk_aversions_sample = [0, 1e-7, 1e-6, 1e-5]
    labels = ['TWAP', 'Low λ', 'Med λ', 'High λ']
    
    exp_costs = []
    std_costs = []
    
    for lambd in risk_aversions_sample:
        model = AlmgrenChrissOptimalExecution(**{**params, 'risk_aversion': lambd})
        _, _, trades = model.optimal_trajectory()
        exp_costs.append(model.expected_cost(trades))
        std_costs.append(np.sqrt(model.variance_of_cost()))
    
    x = np.arange(len(labels))
    width = 0.35
    ax4.bar(x - width/2, exp_costs, width, label='Expected Cost', alpha=0.7)
    ax4.bar(x + width/2, std_costs, width, label='Std Dev of Cost', alpha=0.7)
    ax4.set_xticks(x)
    ax4.set_xticklabels(labels)
    ax4.set_ylabel('Cost ($)')
    ax4.set_title('Expected Cost vs Risk by Strategy')
    ax4.legend()
    ax4.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # Print numerical results
    print("\nAlmgren-Chriss Optimal Execution Analysis")
    print("=" * 70)
    print(f"Parameters: {params['total_shares']:,} shares over {params['total_time']} day")
    print(f"Volatility: {params['volatility']*100:.1f}%, Temp Impact: {params['temporary_impact']}")
    print("\n" + frontier[['risk_aversion', 'kappa', 'expected_cost', 'std_cost']].head(10).to_string())

visualize_almgren_chriss()

---
## 11. POV (Percentage of Volume) Algorithm

### Concept

**POV** trades a fixed percentage of market volume in each interval:

$$q_k = \rho \cdot V_k$$

Where:
- $\rho$: Target participation rate (e.g., 10%)
- $V_k$: Actual market volume in interval $k$

### Key Properties

1. **Adaptive**: Adjusts to real-time volume
2. **Impact-aware**: Naturally trades less in low liquidity
3. **Completion uncertain**: May not finish in target time
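
Property 3 can be made concrete with a short sketch: given a volume path, find the interval in which a POV order completes, if it completes at all. The helper `pov_completion_interval` and the volume figures are hypothetical and only illustrate the rule $q_k = \rho \cdot V_k$.

```python
import numpy as np

def pov_completion_interval(total_qty, rho, volumes):
    """Return the 0-based interval in which a POV order completes, or None."""
    cum_fill = np.cumsum(rho * np.asarray(volumes, dtype=float))
    done = np.nonzero(cum_fill >= total_qty)[0]
    return int(done[0]) if done.size else None

volumes_busy = [60_000] * 10     # hypothetical high-volume day
volumes_quiet = [20_000] * 10    # hypothetical low-volume day

print(pov_completion_interval(50_000, 0.10, volumes_busy))   # → 8
print(pov_completion_interval(50_000, 0.10, volumes_quiet))  # → None
```

On the quiet day the same 10% participation fills only 20,000 of the 50,000 shares, so the order is still open when the window ends.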

### POV vs VWAP

| Aspect | VWAP | POV |
|--------|------|-----|
| Volume input | Historical forecast | Real-time actual |
| Schedule | Pre-determined | Adaptive |
| Completion | Guaranteed at T | Not guaranteed |
| Market impact | Participation may spike when volume is low | Constant share of volume |
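
The contrast in the table can be sketched numerically. Below, a VWAP schedule is fixed in advance from a forecast volume profile, while POV reacts to realized volume. All numbers are hypothetical, and participation is measured as own trade size divided by realized market volume.

```python
import numpy as np

total_qty, rho = 30_000, 0.10
forecast = np.array([50_000, 50_000, 50_000, 50_000], dtype=float)  # hypothetical forecast
actual = np.array([50_000, 10_000, 50_000, 50_000], dtype=float)    # volume dries up in interval 1

# VWAP: schedule is set in advance from the forecast profile
vwap_trades = total_qty * forecast / forecast.sum()
vwap_participation = vwap_trades / actual

# POV: trade rho * realized volume, capped by what is left to execute
pov_trades = np.zeros_like(actual)
remaining = float(total_qty)
for k, vol in enumerate(actual):
    pov_trades[k] = min(rho * vol, remaining)
    remaining -= pov_trades[k]
pov_participation = pov_trades / actual

print("VWAP participation:", np.round(vwap_participation, 3))
print("POV participation: ", np.round(pov_participation, 3))
print("POV shares left unfilled:", remaining)
```

The VWAP schedule completes the full order but takes 75% of the market in the thin interval. POV holds 10% throughout and ends the window with 14,000 shares unfilled.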

In [None]:
class POVExecutor:
    """
    Percentage of Volume (POV) Execution Algorithm
    
    Trades a fixed percentage of market volume
    """
    
    def __init__(self, total_quantity: int, target_participation: float = 0.10):
        """
        Parameters:
        - total_quantity: Total shares to execute
        - target_participation: Target % of market volume (0.10 = 10%)
        """
        self.total_quantity = total_quantity
        self.target_participation = target_participation
        self.remaining = total_quantity
        self.executions = []
        
    def get_trade_size(self, market_volume: float) -> int:
        """
        Calculate trade size for current interval based on market volume
        """
        target_trade = int(market_volume * self.target_participation)
        # Don't trade more than remaining
        actual_trade = min(target_trade, self.remaining)
        return actual_trade
    
    def execute_interval(self, market_volume: float, price: float, timestamp: float) -> int:
        """Execute for one interval"""
        trade_size = self.get_trade_size(market_volume)
        
        if trade_size > 0:
            self.executions.append({
                'timestamp': timestamp,
                'market_volume': market_volume,
                'trade_size': trade_size,
                'price': price,
                'participation': trade_size / market_volume if market_volume > 0 else 0
            })
            self.remaining -= trade_size
        
        return trade_size
    
    def simulate_execution(self, market_volumes: np.ndarray, 
                          prices: np.ndarray) -> pd.DataFrame:
        """Simulate POV execution over multiple intervals"""
        self.remaining = self.total_quantity
        self.executions = []
        
        for i, (vol, price) in enumerate(zip(market_volumes, prices)):
            self.execute_interval(vol, price, timestamp=i)
            
        return pd.DataFrame(self.executions)


# Example POV execution
np.random.seed(42)

# Simulate varying market volume (U-shape)
n_intervals = 20
base_volume = 50000
volume_multipliers = 0.5 * (np.linspace(-1, 1, n_intervals)**2) + 0.5 + np.random.rand(n_intervals) * 0.2
market_volumes = base_volume * volume_multipliers
prices = 100 + np.cumsum(np.random.randn(n_intervals) * 0.05)

pov = POVExecutor(total_quantity=100000, target_participation=0.15)
results = pov.simulate_execution(market_volumes, prices)

print("POV Execution Summary")
print("=" * 70)
print(f"Target quantity: {pov.total_quantity:,}")
print(f"Target participation: {pov.target_participation*100:.0f}%")
print(f"Total executed: {results['trade_size'].sum():,.0f}")
print(f"Remaining: {pov.remaining:,}")
print(f"Avg participation: {results['participation'].mean()*100:.1f}%")
print(f"\nFirst 10 intervals:")
print(results.head(10).to_string(index=False))

---
## 12. Summary: Key Formulas & Interview Topics

### Essential Formulas

| Concept | Formula |
|---------|---------|
| **Mid Price** | $M = \frac{P_{bid} + P_{ask}}{2}$ |
| **Quoted Spread** | $S = P_{ask} - P_{bid}$ |
| **Effective Spread** | $S_e = 2 \cdot D \cdot (P_{trade} - M)$, with $D = +1$ for a buy and $-1$ for a sell |
| **Roll Spread** | $S = 2c$, with half-spread $c = \sqrt{-\text{Cov}(\Delta P_t, \Delta P_{t-1})}$ |
| **Book Imbalance** | $I = \frac{V_{bid} - V_{ask}}{V_{bid} + V_{ask}}$ |
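| **Square Root Impact** | $\Delta P \approx Y \sigma \sqrt{\frac{Q}{V}}$, with $Y$ a constant of order 1 |
| **Implementation Shortfall** | $IS = (P_{exec} - P_{decision}) \times Q$ for a buy (sign flipped for a sell), plus opportunity cost on unfilled shares |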
| **VWAP** | $\text{VWAP} = \frac{\sum P_i V_i}{\sum V_i}$ |
| **Almgren-Chriss κ** | $\kappa = \sqrt{\frac{\lambda \sigma^2}{\eta}}$ |
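
As a quick sanity check, several of these formulas can be evaluated on a toy snapshot. The quotes, sizes, and trade prints below are hypothetical, and the square-root impact is computed without a prefactor (constant taken as 1).

```python
import numpy as np

# Hypothetical top-of-book snapshot and one buyer-initiated trade
bid, ask = 99.98, 100.02
bid_size, ask_size = 800, 200
trade_price, direction = 100.01, +1      # direction D: +1 buy, -1 sell

mid = (bid + ask) / 2
quoted_spread = ask - bid
effective_spread = 2 * direction * (trade_price - mid)
imbalance = (bid_size - ask_size) / (bid_size + ask_size)

# VWAP over a few hypothetical prints
prices = np.array([100.00, 100.01, 100.03])
sizes = np.array([500, 300, 200])
vwap = np.sum(prices * sizes) / np.sum(sizes)

# Square-root impact: sigma in price units, Q order size, V daily volume
sigma, Q, V = 2.0, 10_000, 1_000_000
impact = sigma * np.sqrt(Q / V)

print(f"mid={mid:.2f}  quoted={quoted_spread:.2f}  effective={effective_spread:.2f}")
print(f"imbalance={imbalance:+.2f}  vwap={vwap:.4f}  impact={impact:.2f}")
```

The trade executes inside the quote, so the effective spread (0.02) is half the quoted spread (0.04). The positive imbalance (+0.60) reflects more resting size on the bid than on the ask.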

### Common Interview Questions

1. **LOB**: "How does order book imbalance predict short-term price movements?"
2. **Spread**: "What are the three components of the bid-ask spread?"
3. **Impact**: "Why does market impact follow a square root law?"
4. **TWAP vs VWAP**: "When would you choose TWAP over VWAP?"
5. **IS**: "How do you decompose implementation shortfall?"
6. **Almgren-Chriss**: "What happens to optimal trajectory as risk aversion increases?"

### Key Takeaways

1. **LOB** provides real-time view of supply/demand; imbalance predicts direction
2. **Market Impact** is permanent + temporary; scales sublinearly (√Q)
3. **TWAP** = equal time slices; **VWAP** = volume-weighted slices
4. **Implementation Shortfall** = total cost vs paper portfolio
5. **Almgren-Chriss** optimizes cost-risk tradeoff; higher λ → faster execution
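
Takeaway 4 can be checked by hand. The sketch below splits implementation shortfall for a partially filled buy order into delay, trading, and opportunity cost, using the first fill as a proxy for the arrival price. It reuses the fills from the Implementation Shortfall example; the standalone function `decompose_is` is a hypothetical helper written for this check.

```python
def decompose_is(decision_price, target_qty, fills, closing_price, side="buy"):
    """Split implementation shortfall into delay, trading and opportunity cost.

    fills: list of (price, quantity) tuples in execution order.
    """
    d = 1 if side == "buy" else -1
    executed = sum(q for _, q in fills)
    avg_price = sum(p * q for p, q in fills) / executed
    first_price = fills[0][0]

    delay = d * (first_price - decision_price) * executed
    trading = d * (avg_price - first_price) * executed
    opportunity = d * (closing_price - decision_price) * (target_qty - executed)
    total = delay + trading + opportunity
    bps = total / (decision_price * target_qty) * 1e4
    return {"delay": delay, "trading": trading, "opportunity": opportunity,
            "total": total, "bps": bps}

fills = [(100.05, 10_000), (100.08, 15_000), (100.12, 12_000), (100.15, 8_000)]
result = decompose_is(100.00, 50_000, fills, closing_price=100.20)
for name, value in result.items():
    print(f"{name:<12} {value:,.2f}")
```

Delay and trading cost together equal (average fill price minus decision price) times the executed quantity, about \$4,340 here. The remaining \$1,000 is the opportunity cost of the 5,000 unfilled shares, for a total of roughly 10.7 bps of the paper portfolio.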

In [None]:
# Comprehensive Algorithm Comparison
def compare_execution_algorithms():
    """
    Compare all execution algorithms on the same order
    """
    # Setup
    total_qty = 100000
    n_intervals = 20
    
    # Generate market data
    np.random.seed(42)
    base_volume = 50000
    volume_profile = 0.5 * (np.linspace(-1, 1, n_intervals)**2) + 0.5
    volume_profile = volume_profile / volume_profile.sum()  # Normalize
    market_volumes = base_volume * (volume_profile * n_intervals)
    
    # Price path: driftless Gaussian random walk starting near $100
    prices = 100 + np.cumsum(np.random.randn(n_intervals) * 0.08)
    
    # 1. TWAP
    twap_trades = np.full(n_intervals, total_qty / n_intervals)
    twap_vwap = np.sum(prices * twap_trades) / np.sum(twap_trades)
    
    # 2. VWAP (follows volume profile)
    vwap_trades = total_qty * volume_profile
    vwap_vwap = np.sum(prices * vwap_trades) / np.sum(vwap_trades)
    
    # 3. Almgren-Chriss (front-loaded as kappa*T grows; nearly TWAP for these toy parameters)
    ac = AlmgrenChrissOptimalExecution(
        total_shares=total_qty, total_time=1.0, num_intervals=n_intervals,
        volatility=0.02, permanent_impact=0.0001, temporary_impact=0.001,
        risk_aversion=1e-5
    )
    _, _, ac_trades = ac.optimal_trajectory()
    ac_vwap = np.sum(prices * ac_trades) / np.sum(ac_trades)
    
    # 4. POV
    pov = POVExecutor(total_quantity=total_qty, target_participation=0.15)
    pov_results = pov.simulate_execution(market_volumes, prices)
    pov_trades = np.zeros(n_intervals)
    for _, row in pov_results.iterrows():
        pov_trades[int(row['timestamp'])] = row['trade_size']
    pov_vwap = np.sum(prices * pov_trades) / np.sum(pov_trades) if pov_trades.sum() > 0 else 0
    
    # Market VWAP benchmark
    market_vwap = np.sum(prices * market_volumes) / np.sum(market_volumes)
    
    # Visualize
    fig, axes = plt.subplots(2, 2, figsize=(14, 10))
    
    # 1. Trade schedules
    x = np.arange(n_intervals)
    width = 0.2
    axes[0, 0].bar(x - 1.5*width, twap_trades/1000, width, label='TWAP', alpha=0.7)
    axes[0, 0].bar(x - 0.5*width, vwap_trades/1000, width, label='VWAP', alpha=0.7)
    axes[0, 0].bar(x + 0.5*width, ac_trades/1000, width, label='Almgren-Chriss', alpha=0.7)
    axes[0, 0].bar(x + 1.5*width, pov_trades/1000, width, label='POV', alpha=0.7)
    axes[0, 0].set_xlabel('Interval')
    axes[0, 0].set_ylabel('Trade Size (000s)')
    axes[0, 0].set_title('Trade Schedules Comparison')
    axes[0, 0].legend()
    axes[0, 0].grid(True, alpha=0.3)
    
    # 2. Cumulative execution
    for trades, label in [(twap_trades, 'TWAP'), (vwap_trades, 'VWAP'), 
                          (ac_trades, 'Almgren-Chriss'), (pov_trades, 'POV')]:
        cumsum = np.cumsum(trades)
        axes[0, 1].plot(range(n_intervals), cumsum/1000, '-o', label=label, linewidth=2, markersize=4)
    axes[0, 1].set_xlabel('Interval')
    axes[0, 1].set_ylabel('Cumulative Shares (000s)')
    axes[0, 1].set_title('Cumulative Execution Progress')
    axes[0, 1].legend()
    axes[0, 1].grid(True, alpha=0.3)
    
    # 3. Price and volume context
    ax3 = axes[1, 0]
    ax3.plot(range(n_intervals), prices, 'b-o', linewidth=2, label='Price')
    ax3.set_xlabel('Interval')
    ax3.set_ylabel('Price ($)', color='blue')
    ax3_twin = ax3.twinx()
    ax3_twin.bar(range(n_intervals), market_volumes/1000, alpha=0.3, color='gray', label='Volume')
    ax3_twin.set_ylabel('Market Volume (000s)', color='gray')
    ax3.set_title('Market Conditions')
    ax3.grid(True, alpha=0.3)
    
    # 4. Execution quality comparison
    algos = ['TWAP', 'VWAP', 'A-C', 'POV']
    exec_vwaps = [twap_vwap, vwap_vwap, ac_vwap, pov_vwap]
    slippage_bps = [(v - market_vwap) / market_vwap * 10000 for v in exec_vwaps]
    
    colors = ['green' if s < 0 else 'red' for s in slippage_bps]
    axes[1, 1].bar(algos, slippage_bps, color=colors, alpha=0.7)
    axes[1, 1].axhline(y=0, color='black', linestyle='-', linewidth=0.5)
    axes[1, 1].set_xlabel('Algorithm')
    axes[1, 1].set_ylabel('Slippage vs Market VWAP (bps)')
    axes[1, 1].set_title('Execution Quality (negative = better for buys)')
    axes[1, 1].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # Print summary
    print("\nExecution Algorithm Comparison")
    print("=" * 70)
    print(f"Total Order: {total_qty:,} shares")
    print(f"Market VWAP Benchmark: ${market_vwap:.4f}")
    print(f"\n{'Algorithm':<20} {'Avg Price':>12} {'Slippage (bps)':>15} {'Filled':>12}")
    print("-" * 70)
    for algo, vwap, trades in [('TWAP', twap_vwap, twap_trades),
                                ('VWAP', vwap_vwap, vwap_trades),
                                ('Almgren-Chriss', ac_vwap, ac_trades),
                                ('POV', pov_vwap, pov_trades)]:
        slip = (vwap - market_vwap) / market_vwap * 10000
        filled = np.sum(trades)
        print(f"{algo:<20} ${vwap:>11.4f} {slip:>14.2f} {filled:>11,.0f}")

compare_execution_algorithms()

print("\n" + "="*70)
print("THEORY NOTEBOOK COMPLETE - Week 21 Market Microstructure")
print("="*70)