In [1]:
import numpy as np

# 🧠 Solving the Infinite-Horizon HJB for Market Making

We compute the optimal bid and ask spreads that:

- Maximize expected utility of terminal wealth  
- Balance inventory risk and market opportunity  
- React to signals (price, imbalance, volatility)  

We solve a **stationary infinite-horizon HJB equation** to get the value function.


In [2]:
# No code needed for this high-level description

## 🧮 Step 1: Define Variables

### 🔢 State Variables (dynamically evolving)

Let the state be a tuple: $(q, s, z, \sigma)$

- $q$: Inventory  
- $s$: Midprice  
- $z$: Order imbalance  
- $\sigma$: Volatility  

---

### ⚙️ Control Variables

- $\delta_b$: Bid spread from midprice → quote at $s - \delta_b$  
- $\delta_a$: Ask spread from midprice → quote at $s + \delta_a$


In [3]:
# Define state and control variables
q = 0  # inventory
s = 30000.0  # midprice
z = 0.0  # imbalance
sigma = 0.05  # volatility
delta_b = 0.5  # bid spread
delta_a = 0.5  # ask spread

### 📈 Market Parameters

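- $\lambda_b(\delta_b) = A e^{-k \delta_b}$: Fill intensity of our bid (rate at which we buy)  
- $\lambda_a(\delta_a) = A e^{-k \delta_a}$: Fill intensity of our ask (rate at which we sell)  
- $A$, $k$: Baseline fill intensity and its decay with distance from the midprice  
- $\gamma$: Risk-aversion coefficient of the exponential utility  
- $\mu(s, z, \sigma)$: Drift of midprice  
- $\sigma$: Local volatility (state variable)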


In [4]:
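# Define market parameters
A = 0.1      # baseline fill intensity
k = 1.5      # decay of fill intensity with quote distance
gamma = 0.1  # risk aversion (used by the utility and the spread formulas)

def lambda_b(delta):
    """Fill intensity of a bid quoted at distance delta below the midprice."""
    return A * np.exp(-k * delta)

def lambda_a(delta):
    """Fill intensity of an ask quoted at distance delta above the midprice."""
    return A * np.exp(-k * delta)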

### 🎯 Utility and Value Function

We use exponential utility of terminal wealth:

$$
u(x, q, s, z, \sigma) = -\exp\left(-\gamma\left(x + q s + \phi(q, s, z, \sigma)\right)\right)
$$

- $x$: Cash (enters $u$ only through the factor $e^{-\gamma x}$, so it drops out of the control problem)  
- $\phi(q, s, z, \sigma)$: Value function to solve for
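
The claim that cash drops out can be checked numerically: under the exponential ansatz, adding an amount $c$ of cash rescales $u$ by $e^{-\gamma c}$ regardless of $q$, $s$ or $\phi$. The sketch below uses an illustrative $\gamma$, and the helper name `cara_utility` and the sample inputs are hypothetical.

```python
import math

GAMMA = 0.1  # illustrative risk aversion (assumed value)

def cara_utility(x, q, s, phi, gamma=GAMMA):
    """Exponential-utility ansatz: -exp(-gamma * (x + q*s + phi))."""
    return -math.exp(-gamma * (x + q * s + phi))

# Shift cash by c = 5 while holding q, s and phi fixed (arbitrary sample values).
base = cara_utility(x=0.0, q=2, s=10.0, phi=0.3)
shifted = cara_utility(x=5.0, q=2, s=10.0, phi=0.3)

# The ratio depends only on gamma and c, not on the other state variables.
ratio = shifted / base
expected_ratio = math.exp(-GAMMA * 5.0)
```

Because the rescaling factor is positive and independent of the quotes, the spreads that maximize $u$ are the same at every cash level.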


In [5]:
# Define the utility function and the ansatz
import math
def utility(x, q, s, phi):
    return -math.exp(-gamma * (x + q * s + phi))

## 📘 Step 2: The HJB Equation

In the stationary infinite-horizon case (no $\partial_t \phi$):

$$
0 = \mu(s, z, \sigma) \frac{\partial \phi}{\partial s}
+ \frac{1}{2} \sigma^2 \frac{\partial^2 \phi}{\partial s^2}
+ J_{\text{bid}} + J_{\text{ask}}
$$


## 📈 Including Volatility as a State Variable

Volatility $\sigma$ affects the risk of quoting and the optimal spread.

### ✅ Add volatility dynamics to the HJB:
If $\sigma$ follows an SDE like:
$$ d\sigma = \eta(\sigma)\,dt + \nu(\sigma)\,dW_t $$
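Then, provided $W_t$ is independent of the noise driving $s$ (correlated noise would add a mixed $\partial^2 \phi / \partial s\, \partial \sigma$ term), the HJB gains two new terms: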
$$ \eta(\sigma)\, \frac{\partial \phi}{\partial \sigma} + \frac{1}{2} \nu^2(\sigma) \frac{\partial^2 \phi}{\partial \sigma^2} $$

### ✅ The full HJB:
$$ \begin{aligned}
0 = &\ \mu(s, z, \sigma) \frac{\partial \phi}{\partial s}
+ \frac{1}{2} \sigma^2 \frac{\partial^2 \phi}{\partial s^2} \\
&+ \eta(\sigma) \frac{\partial \phi}{\partial \sigma}
+ \frac{1}{2} \nu^2(\sigma) \frac{\partial^2 \phi}{\partial \sigma^2} \\
&+ J_{\text{bid}} + J_{\text{ask}}
\end{aligned} $$

### ✅ Update optimal spreads:
$$ \delta_b^* = -\left(\phi(q+1, s, z, \sigma) - \phi(q, s, z, \sigma)\right) + C $$
$$ \delta_a^* = -\left(\phi(q-1, s, z, \sigma) - \phi(q, s, z, \sigma)\right) + C $$

### ✅ Discretize $\sigma$:
- Use finite differences for $\partial_\sigma \phi$ and $\partial^2_{\sigma\sigma} \phi$
- Result: a 4D grid $\phi[q][s][z][\sigma]$
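
A minimal sketch of the finite differences mentioned above, assuming a uniformly spaced $\sigma$ grid. The helper name `central_diffs`, the grid sizes and the choice to leave boundary points as NaN are illustrative assumptions; a real solver would need explicit boundary conditions.

```python
import numpy as np

def central_diffs(phi, axis, h):
    """First and second central differences of phi along `axis` (uniform spacing h).

    Only interior points are filled; boundary entries are left as NaN.
    """
    moved = np.moveaxis(np.asarray(phi, dtype=float), axis, -1)
    d1 = np.full_like(moved, np.nan)
    d2 = np.full_like(moved, np.nan)
    d1[..., 1:-1] = (moved[..., 2:] - moved[..., :-2]) / (2.0 * h)
    d2[..., 1:-1] = (moved[..., 2:] - 2.0 * moved[..., 1:-1] + moved[..., :-2]) / h**2
    return np.moveaxis(d1, -1, axis), np.moveaxis(d2, -1, axis)

# Check on phi = sigma**2, whose exact derivatives are 2*sigma and 2.
sigma_grid = np.linspace(0.01, 0.09, 5)
h = sigma_grid[1] - sigma_grid[0]
phi = np.broadcast_to(sigma_grid**2, (3, 4, 2, 5)).copy()  # axes: (q, s, z, sigma)
dphi, d2phi = central_diffs(phi, axis=3, h=h)
```

The same helper applies to the $s$ axis by passing `axis=1` and the $s$ grid spacing.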


In [7]:
# Example jump terms computation
phi_q = 1.0
phi_q_plus1 = 1.05
phi_q_minus1 = 0.95
C = (1/gamma) * np.log((k + gamma)/k)
# First-order condition: at the optimum, delta + (phi after fill - phi before fill) = C
delta_b = -(phi_q_plus1 - phi_q) + C
delta_a = -(phi_q_minus1 - phi_q) + C
# A fill on either side earns its spread, so delta enters both terms with a plus sign
J_bid = lambda_b(delta_b) * (np.exp(-gamma * (delta_b + phi_q_plus1 - phi_q)) - 1)
J_ask = lambda_a(delta_a) * (np.exp(-gamma * (delta_a + phi_q_minus1 - phi_q)) - 1)

### 🔁 Jump Terms

$$
J_{\text{bid}} = \lambda_b(\delta_b) \left[
\exp\left(-\gamma(\delta_b + \phi(q+1) - \phi(q))\right) - 1
\right]
$$

$$
J_{\text{ask}} = \lambda_a(\delta_a) \left[
\exp\left(-\gamma(\delta_a + \phi(q-1) - \phi(q))\right) - 1
\right]
$$


In [8]:
# Compute optimal spreads
C = (1/gamma) * np.log((k + gamma)/k)
delta_b_star = -(phi_q_plus1 - phi_q) + C
delta_a_star = -(phi_q_minus1 - phi_q) + C

### 🧮 Optimal Spreads

$$
\delta_b^* = -\left(\phi(q+1, s, z, \sigma) - \phi(q, s, z, \sigma)\right) + C
$$

$$
\delta_a^* = -\left(\phi(q-1, s, z, \sigma) - \phi(q, s, z, \sigma)\right) + C
$$

Where:

$$
C = \frac{1}{\gamma} \log\left(\frac{k + \gamma}{k}\right)
$$
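
The constant $C$ can be sanity-checked by brute force. The sketch below takes a flat value function (all $\phi$ differences zero, an assumption made only for this check), so the optimal spread should reduce to $C$ itself. Because $u$ is negative, the best quote is the one that makes $\lambda(\delta)\left[e^{-\gamma \delta} - 1\right]$ most negative. The helper name `spread_constant` and the search grid are illustrative.

```python
import numpy as np

A, k, gamma = 0.1, 1.5, 0.1  # same illustrative values as the parameter cell

def spread_constant(gamma, k):
    """C = (1/gamma) * log((k + gamma) / k), written with log1p for small gamma."""
    return np.log1p(gamma / k) / gamma

C = spread_constant(gamma, k)

# Grid search over delta for a flat value function (phi differences set to zero).
deltas = np.linspace(0.0, 3.0, 30001)
jump = A * np.exp(-k * deltas) * (np.exp(-gamma * deltas) - 1.0)
delta_numeric = deltas[np.argmin(jump)]

# As gamma -> 0, C approaches 1/k, the maximizer of delta * A * exp(-k * delta).
C_small_gamma = spread_constant(1e-6, k)
```

With these values `delta_numeric` and `C` should agree to within the grid resolution, and `C_small_gamma` should be close to `1 / k`.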


In [9]:
# Basic value iteration loop
phi_grid = np.zeros((3, 3, 3, 3))  # Example shape (q, s, z, sigma)
for _ in range(10):  # `_` avoids shadowing the built-in `iter`
    # Placeholder update: a real solver would subtract alpha * (HJB residual) here
    phi_grid += 0.01

## 🔧 Step 3: Numerical Value Iteration

1. Discretize $q, s, z, \sigma$  
2. Initialize grid $\phi[q][s][z][\sigma]$  
3. Repeat until convergence:
   - For each grid point:
     - Compute $\delta_b, \delta_a$
     - Compute jump terms $J_{\text{bid}}, J_{\text{ask}}$
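     - Approximate $\partial_s \phi$, $\partial^2_{ss} \phi$, $\partial_\sigma \phi$ and $\partial^2_{\sigma\sigma} \phi$ via finite differences
     - Evaluate HJB:
     
       $$
       HJB = \mu \frac{\partial \phi}{\partial s} + \frac{1}{2} \sigma^2 \frac{\partial^2 \phi}{\partial s^2} + \eta \frac{\partial \phi}{\partial \sigma} + \frac{1}{2} \nu^2 \frac{\partial^2 \phi}{\partial \sigma^2} + J_{\text{bid}} + J_{\text{ask}}
       $$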

     - Update:

       $$
       \phi_{\text{new}} = \phi - \alpha \cdot HJB
       $$
4. Measure:

   $$
   \text{max\_diff} = \max \left|\phi_{\text{new}} - \phi\right|
   $$

   Stop if $\text{max\_diff} < \varepsilon$
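
The loop structure in steps 3 and 4 can be sketched independently of the HJB itself. The code below implements the damped update $\phi_{\text{new}} = \phi - \alpha \cdot \text{residual}$ with the max-difference stopping rule, and runs it on a toy residual that is not the HJB: it is zero exactly at a known `target` grid, so convergence is easy to verify. The function name `iterate_to_fixed_point`, the toy residual and the default `alpha`, `eps` and `max_iter` are all assumptions for illustration.

```python
import numpy as np

def iterate_to_fixed_point(residual_fn, phi0, alpha=0.1, eps=1e-8, max_iter=10_000):
    """Repeat phi <- phi - alpha * residual_fn(phi) until max|phi_new - phi| < eps.

    Returns (phi, iterations_used, converged_flag).
    """
    phi = np.array(phi0, dtype=float)
    for n_iter in range(1, max_iter + 1):
        phi_new = phi - alpha * residual_fn(phi)
        max_diff = np.max(np.abs(phi_new - phi))
        phi = phi_new
        if max_diff < eps:
            return phi, n_iter, True
    return phi, max_iter, False

# Toy residual (NOT the HJB): zero exactly where phi equals `target`.
target = np.arange(-1.0, 2.0).reshape(3, 1, 1, 1) * np.ones((3, 3, 3, 3))
phi, n_iter, converged = iterate_to_fixed_point(
    lambda p: p - target, np.zeros((3, 3, 3, 3))
)
```

Note that `max_diff` equals `alpha` times the largest residual, so a small `alpha` can trigger the stopping rule while the residual is still noticeably nonzero; checking the residual directly is a reasonable extra safeguard.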


In [10]:
# Extract control policy example
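def optimal_spreads(phi, q, s, z, sigma):
    """Return (delta_b, delta_a) at integer grid indices (q, s, z, sigma)."""
    C = (1/gamma) * np.log((k + gamma)/k)
    n_q = phi.shape[0]
    # Assumption: at an inventory limit the corresponding side is not quoted (np.inf).
    # This also stops phi[q-1] from wrapping around to the last row when q == 0.
    delta_b = -(phi[q+1, s, z, sigma] - phi[q, s, z, sigma]) + C if q + 1 < n_q else np.inf
    delta_a = -(phi[q-1, s, z, sigma] - phi[q, s, z, sigma]) + C if q >= 1 else np.inf
    return delta_b, delta_a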

## 📤 Step 4: Extract Control Policy

After convergence:

$$
\delta_b(q, s, z, \sigma) = -(\phi(q+1) - \phi(q)) + C
$$

$$
\delta_a(q, s, z, \sigma) = -(\phi(q-1) - \phi(q)) + C
$$
