# High Frequency Trading - Avellaneda-Stoikov Model

---

## Introduction

Avellaneda & Stoikov's paper *High-frequency trading in a limit order book* (2008) develops an **optimal market-making strategy** for a stock dealer in a stochastic limit order book. The dealer dynamically posts bid and ask quotes to capture the spread while controlling inventory risk, and the problem is solved with continuous-time stochastic optimal control.

Market-making models traditionally address **two core risks**:

* **Inventory risk**: Uncertainty in the value of the dealer's holdings $q_t$. Terminal wealth is $X_T + q_T S_T$, where $X_T$ is the cash earned from realised spreads. Under driftless diffusive mid-price dynamics $dS_u = \sigma dW_u$, a position $q$ held over a horizon $T$ has variance $\text{Var}(q S_T) = q^2 \sigma^2 T$, which grows quadratically in $|q|$.
* **Asymmetric information risk** (adverse selection): Informed traders pick off stale quotes, so the dealer's fills tend to be followed by adverse price moves that can exceed the spread earned.

The paper primarily tackles **inventory risk optimization** via an (asymptotically) tractable Hamilton-Jacobi-Bellman PDE, deriving inventory-aware reservation prices $r(t,q_t)$ and optimal quotes that pull positions back toward zero.

## Model Setup & Assumptions

### State Variables
$$
\mathbb{S} := (t, q_t, X_t, S_t)
$$ 
where:
* $t$ represents the current time
* $q_t$ represents the current inventory position
* $X_t$ represents the current cash accumulated from executed trades
* $S_t$ is the mid-price of the stock

  
### Price Process
The efficient (mid) price of the stock follows a driftless arithmetic Brownian motion:

$$
dS_u = \sigma dW_u
$$

Over ultra-short HFT horizons (seconds to minutes), the drift $\mu \approx 0$ because the daily risk premium is dwarfed by volatility - trends and mean-reversion govern longer scales (hours+). 

This driftless diffusion is mathematically crucial: paired with CARA utility $u(x) = -e^{-\gamma x}$, it enables an (asymptotically) tractable HJB PDE.

Geometric Brownian motion $dS_u = \sigma S_u\,dW_u$ behaves similarly over short horizons but is awkward under CARA utility: the lognormal distribution has no finite moment generating function for positive arguments, so the expected utility of a short position diverges. For simplicity we make no further mention of it here. Arithmetic Brownian motion $dS_u = \sigma\,dW_u$ streamlines the derivations without much loss of generality. High-frequency empirical calibration (Avellaneda-Stoikov 2008; Guéant 2016) is cited as supporting the approximation over 1-60s horizons.


### Orders
The agent quotes limit orders at distances $\delta^a, \delta^b > 0$ from the mid-price $s$, i.e. at $p^a = s + \delta^a$ and $p^b = s - \delta^b$. Intuitively, smaller $\delta$ means higher priority: the order sits closer to the best bid/ask and executes first against incoming market orders.

Consider a large market buy order of size $Q$. It executes against the $Q$ cheapest limit asks; let $p^Q$ be the price of the last (highest) ask filled, and define the market impact as $\Delta p := p^Q - s$. A limit ask at $p^a$ executes if $\delta^a < \Delta p$, i.e. the quote must sit close enough to the mid-price to be reached by the market order.

Order arrivals are modeled as Poisson processes with intensities:
$$
\lambda^a(\delta^a), \quad \lambda^b(\delta^b)
$$

Intuitively, quotes further from the mid-price execute less frequently - far-away orders only fill against very aggressive market orders. Econophysics literature supports this: market order sizes follow a power law $f^Q(x) \propto x^{-1-\alpha}$ (large orders are rare), and impact satisfies $\Delta p \propto \ln(Q)$ (big orders walk the book deeper).

The intensity derives as:
$$
\begin{aligned}
\lambda(\delta) &= \Lambda \, P(\Delta p > \delta) \\
&= \Lambda \, P(\ln(Q) > K\delta) \\
&= \Lambda \, P(Q > e^{K\delta}) \\
&= \Lambda \int_{e^{K\delta}}^\infty x^{-1-\alpha} \, dx \\
&= A e^{-k\delta}
\end{aligned}
$$
where $A = \Lambda / \alpha$ and $k = \alpha K$ are constants to calibrate.
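The exponential shape can be checked numerically. The sketch below is an illustration under stated assumptions, not a calibration: it draws order sizes from a normalised Pareto distribution on $[1, \infty)$ (the normalising constant is absorbed into $A$), uses made-up values for $\alpha$ and $K$, and compares the empirical probability that $\ln(Q)/K$ exceeds $\delta$ with $e^{-k\delta}$. The function name `empirical_fill_probability` is hypothetical.

```python
import math
import random


def empirical_fill_probability(delta, alpha, K, n_samples=100_000, seed=0):
    """Estimate P(ln(Q) / K > delta) when order sizes Q are Pareto(alpha) on [1, inf)."""
    rng = random.Random(seed)
    threshold = math.exp(K * delta)  # a quote at distance delta fills when Q exceeds this
    hits = sum(1 for _ in range(n_samples) if rng.paretovariate(alpha) > threshold)
    return hits / n_samples


alpha, K = 1.5, 2.0  # illustrative values, not calibrated to data
k = alpha * K        # decay rate predicted by the derivation
for delta in (0.0, 0.1, 0.3, 0.5):
    estimate = empirical_fill_probability(delta, alpha, K)
    print(f"delta={delta:.1f}  empirical={estimate:.4f}  exp(-k*delta)={math.exp(-k * delta):.4f}")
```

The two columns should agree up to Monte Carlo noise, which is the statement that a power-law size distribution combined with logarithmic impact produces an exponentially decaying fill intensity.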

### Wealth Process
Given the stochastic arrival of market orders, wealth evolves as a jump process:
$$
dX_t = (S_t + \delta^a_t) \, dN^a_t - (S_t - \delta^b_t) \, dN^b_t
$$
where $N^a_t$ counts ask fills (one share sold, cash inflow $S_t + \delta^a_t$) and $N^b_t$ counts bid fills (one share bought, cash outflow $S_t - \delta^b_t$).


### Inventory Process
The inventory process follows:
$$
dq_t = dN^b_t - dN^a_t
$$
or equivalently $q_t = N^b_t - N^a_t$ when $q_0 = 0$, where $N^b_t$ counts bid fills (shares bought) and $N^a_t$ counts ask fills (shares sold).

Bid fills increase inventory (+1 share) and ask fills decrease it (-1 share), creating the inventory risk $|q_t| \sigma \sqrt{T-t}$ that the market maker must manage.
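The state dynamics can be simulated jointly. This is a minimal sketch under several assumptions that are not in the text: quote distances are held constant, each fill trades one share at the quoted price $S_t \pm \delta$ (the convention used in the HJB equation later on), and each Poisson process is approximated by a Bernoulli draw with probability $\lambda\,dt$ per step. The function name and parameter values are illustrative.

```python
import math
import random


def simulate_fills(delta_a, delta_b, A, k, sigma, s0, horizon, n_steps, seed=0):
    """Simulate cash X, inventory q and mid-price S under constant quote distances."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    lam_a = A * math.exp(-k * delta_a)  # ask fill intensity
    lam_b = A * math.exp(-k * delta_b)  # bid fill intensity
    s, cash, q = s0, 0.0, 0
    bid_fills = ask_fills = 0
    for _ in range(n_steps):
        s += sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)  # dS = sigma dW
        if rng.random() < lam_a * dt:  # ask fill: sell one share at s + delta_a
            cash += s + delta_a
            q -= 1
            ask_fills += 1
        if rng.random() < lam_b * dt:  # bid fill: buy one share at s - delta_b
            cash -= s - delta_b
            q += 1
            bid_fills += 1
    return {"cash": cash, "inventory": q, "mid": s,
            "bid_fills": bid_fills, "ask_fills": ask_fills}


result = simulate_fills(delta_a=0.5, delta_b=0.5, A=100.0, k=1.5,
                        sigma=2.0, s0=100.0, horizon=1.0, n_steps=10_000)
print(result)
print("terminal wealth X_T + q_T * S_T:", result["cash"] + result["inventory"] * result["mid"])
```

The Bernoulli approximation is only reasonable while $\lambda\,dt \ll 1$, which holds for the values above.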

### Utility
CARA (Constant Absolute Risk Aversion) utility $u(x) = -\exp(-\gamma x)$ enables tractable solutions to the otherwise unsolvable 4D HJB equation. The value function separates as:
$$
u(t,x,q,s) = -\exp(-\gamma x) \cdot v(t,q,s)
$$
where $v(t,q,s)$ solves a simpler 3D PDE independent of absolute wealth $x$.

With driftless $dS_t = \sigma dW_t$ and inventory held fixed at $q$, terminal wealth $x + q S_T$ is Gaussian. CARA + Gaussian yields closed-form expectations: for Gaussian $Z$,
$$
\mathbb{E}[-\exp(-\gamma Z)] = -\exp(-\gamma \mathbb{E}[Z] + \frac{\gamma^2}{2}\mathrm{Var}[Z])
$$

Without CARA, wealth $x$ and inventory $q$ couple inseparably, requiring numerical 4D HJB solutions. 

Economically, CARA utility $u(x)=-e^{-\gamma x}$ captures constant absolute risk aversion: the market maker's utility penalizes dollar exposure, not exposure as a percentage of total wealth. This suits a market maker whose main concern is inventory risk. For example, an HFT MM with \$1M capital holding $q=1000$ shares of a \$100 stock, with $\sigma = 0.02$ (annualised) over one trading day, faces inventory risk:
$$
\text{Dollar P\&L risk} = |q|\sigma\sqrt{T-t}\,S = 1000 \times 0.02 \times \sqrt{1/252} \times 100 \approx \$126
$$
which does not depend on how large the firm's capital is.

## Model Summary

$$
\begin{cases}
dX_t = (S_t + \delta^a_t) \, dN^a_t - (S_t - \delta^b_t) \, dN^b_t \\
dq_t = dN^b_t - dN^a_t \\
dS_t = \sigma \, dW_t
\end{cases}
$$
where $N^{a,b}_t$ are independent Poisson processes with control-dependent intensities:
$$
\lambda^b_t = A e^{-k \delta^b_t}, \quad \lambda^a_t = A e^{-k \delta^a_t}
$$

We wish to find the quoting strategy that maximises the agent's expected utility of terminal wealth, i.e.
$$
u(t,x,q,s) = \sup_{\delta^a,\delta^b \geq 0} \mathbb{E}^\delta \left[ -\exp\left(-\gamma (X_T + q_T S_T)\right) \,\Big|\, \mathcal{F}_t \right]
$$
where $\mathbb{E}^\delta$ denotes expectation under the fill dynamics induced by the quoting strategy $\delta = (\delta^a, \delta^b)$.

## CARA Utility under Frozen Inventory

In its current form the value function depends on four variables (three of them stochastic) and is unlikely to be analytically tractable. We first consider an agent with "frozen inventory", who holds an inventory of $q$ shares until the terminal time $T$ without trading.

We can now derive a closed form for the CARA utility under this assumption:

Given $S_T | S_t = s \sim \mathcal{N}(s, \sigma^2 \tau)$ where $\tau = T-t$, we derive the moment generating function $M_t(\theta) = \mathbb{E}[e^{\theta S_T} | S_t = s]$:
$$
M_t(\theta) = \int_{-\infty}^{\infty} e^{\theta z} \cdot \frac{1}{\sqrt{2\pi\sigma^2\tau}} \exp\left( -\frac{(z-s)^2}{2\sigma^2\tau} \right) dz
$$
Completing the square of the exponent:
$$
\begin{aligned}
\theta z - \frac{z^2 - 2sz + s^2}{2\sigma^2\tau} &= -\frac{1}{2\sigma^2\tau} \left[ z^2 - 2(s + \theta\sigma^2\tau)z \right] - \frac{s^2}{2\sigma^2\tau} \\
&= -\frac{(z - (s + \theta\sigma^2\tau))^2}{2\sigma^2\tau} + \underbrace{s\theta + \frac{1}{2}\theta^2\sigma^2\tau}_{\text{bonus term}}
\end{aligned}
$$
Recall the integral over the shifted Gaussian is 1, so:
$$
M_t(\theta) = \exp\left( s\theta + \frac{1}{2}\theta^2\sigma^2\tau \right)
$$
Plugging in the CARA condition $\theta = -\gamma q$ gives:
$$
u(t,x,q,s) = -\exp(-\gamma x) \exp(-\gamma q s) \exp\left( \frac{\gamma^2 q^2 \sigma^2}{2} (T-t) \right)
$$
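In particular, the separability $u(t,x,q,s) = -\exp(-\gamma x)\, v(t,q,s)$ holds with $v(t,q,s) = \exp\left(-\gamma q s + \frac{1}{2}\gamma^2 q^2 \sigma^2 (T-t)\right)$.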
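As a sanity check on the closed form, the expectation can also be estimated by simulating $S_T \sim \mathcal{N}(s, \sigma^2\tau)$ directly. The sketch below is illustrative only: the function names and parameter values are assumptions, and the Monte Carlo estimate carries sampling noise.

```python
import math
import random


def frozen_utility(t, x, q, s, gamma, sigma, T):
    """Closed-form expected CARA utility when inventory q is held until T."""
    tau = T - t
    return -math.exp(-gamma * x - gamma * q * s + 0.5 * gamma**2 * q**2 * sigma**2 * tau)


def frozen_utility_mc(t, x, q, s, gamma, sigma, T, n_paths=100_000, seed=0):
    """Monte Carlo estimate of E[-exp(-gamma (x + q S_T)) | S_t = s]."""
    rng = random.Random(seed)
    std = sigma * math.sqrt(T - t)
    total = 0.0
    for _ in range(n_paths):
        s_T = s + std * rng.gauss(0.0, 1.0)  # S_T ~ N(s, sigma^2 tau)
        total += -math.exp(-gamma * (x + q * s_T))
    return total / n_paths


params = dict(t=0.0, x=5.0, q=3, s=10.0, gamma=0.1, sigma=2.0, T=1.0)  # illustrative
print("closed form:", frozen_utility(**params))
print("Monte Carlo:", frozen_utility_mc(**params))
```

Changing `x` rescales both numbers by $e^{-\gamma x}$, which is the separability in wealth used throughout the rest of the derivation.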

## Indifference Prices

The central innovation of the paper is the combination of **indifference prices** with the microstructure of the order book.

### Reservation Bid Price

The **reservation bid** price $r^b(t, q, s)$ is the certain buy price making the dealer indifferent between:

1. Buying 1 share immediately at $r^b(t, q, s)$:
   $$
   u\big(t, x - r^b(t,q,s), q + 1, s\big)
   $$

2. Continuing optimal market-making from current state:
   $$
   u\big(t, x, q, s\big)
   $$

Indifference here means equality:
$$
u\big(t, x - r^b(t, q, s), q + 1, s\big) = u\big(t, x, q, s\big)
$$

We derive the closed form under the frozen inventory case for some intuition, though it is not formally needed below. Substituting the frozen-inventory CARA utility into the indifference condition and solving for $r^b$, we have
$$
r^b(t, q, s) = s + (- 1 - 2q) \frac{\gamma \sigma^2 (T - t)}{2}
$$

### Reservation Ask Price
A similar argument for selling one share, $u\big(t, x + r^a(t,q,s), q - 1, s\big) = u\big(t, x, q, s\big)$, gives the **reservation ask** price:
$$
r^a(t, q, s) = s + (1 - 2q) \frac{\gamma \sigma^2 (T - t)}{2}
$$

### Indifference Price
Finally, define the **indifference** price as the average of these two prices:
$$
r(t, q, s) = s - q \gamma \sigma^2 (T - t)
$$

The interpretation is simple:
* If the agent is long stock ($q > 0$), the indifference price is below mid-price, indicating an intention to liquidate.
* Conversely if the agent is short stock ($q < 0$), the indifference price is above mid-price, since the agent is willing to purchase at a higher price.
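The indifference conditions can be verified numerically against the frozen-inventory utility. The following is a small sketch with illustrative parameter values and hypothetical function names; it asserts that buying at $r^b$ or selling at $r^a$ leaves the utility unchanged, and prints the three prices for a short, flat and long position.

```python
import math


def frozen_utility(t, x, q, s, gamma, sigma, T):
    """Closed-form CARA utility with inventory q frozen until T."""
    return -math.exp(-gamma * x - gamma * q * s + 0.5 * gamma**2 * q**2 * sigma**2 * (T - t))


def reservation_bid(t, q, s, gamma, sigma, T):
    return s + (-1 - 2 * q) * gamma * sigma**2 * (T - t) / 2


def reservation_ask(t, q, s, gamma, sigma, T):
    return s + (1 - 2 * q) * gamma * sigma**2 * (T - t) / 2


def indifference_price(t, q, s, gamma, sigma, T):
    return s - q * gamma * sigma**2 * (T - t)


gamma, sigma, T = 0.1, 2.0, 1.0  # illustrative parameters
t, x, s = 0.25, 0.0, 100.0
for q in (-2, 0, 2):
    r_b = reservation_bid(t, q, s, gamma, sigma, T)
    r_a = reservation_ask(t, q, s, gamma, sigma, T)
    u_now = frozen_utility(t, x, q, s, gamma, sigma, T)
    u_buy = frozen_utility(t, x - r_b, q + 1, s, gamma, sigma, T)   # buy one share at r_b
    u_sell = frozen_utility(t, x + r_a, q - 1, s, gamma, sigma, T)  # sell one share at r_a
    assert math.isclose(u_buy, u_now, rel_tol=1e-9)
    assert math.isclose(u_sell, u_now, rel_tol=1e-9)
    print(f"q={q:+d}  r_b={r_b:.3f}  r_a={r_a:.3f}  r={indifference_price(t, q, s, gamma, sigma, T):.3f}")
```

For the long position both reservation prices sit below the mid-price of 100, and for the short position both sit above it, matching the interpretation above.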

## Derivation of the HJB PDE

By the dynamic programming principle:
$$
u(t,x,q,s) = \sup_{\delta_{[t,t+h]}} \mathbb{E}^\delta \Big[ u(t+h, X^\delta_{t+h}, q^\delta_{t+h}, S_{t+h}) \Big| \mathcal{F}_t \Big]
$$

By the Itô formula for jump-diffusions, writing $u_r := u(r, X_r, q_r, S_r)$:
$$
du_r = \mathcal{L}^0 u_r \, dr + \sigma u_s \, dW_r + \big[u(r, X_r - (S_r - \delta^b_r), q_r + 1, S_r) - u_r\big] dN^b_r 
$$
$$
+ \big[u(r, X_r + (S_r + \delta^a_r), q_r - 1, S_r) - u_r\big] dN^a_r
$$
where $\mathcal{L}^0 u = \partial_t u + \frac{1}{2}\sigma^2 u_{ss}$. A bid fill buys one share at $S_r - \delta^b_r$ and an ask fill sells one share at $S_r + \delta^a_r$.

Decompose the raw Poisson processes to expose their drifts, $dN^i_r = d\tilde{N}^i_r + \lambda^i(\delta^i) \, dr$ for $i \in \{a, b\}$, and abbreviate the post-fill values as $u(+) := u(r, X_r - (S_r - \delta^b_r), q_r + 1, S_r)$ and $u(-) := u(r, X_r + (S_r + \delta^a_r), q_r - 1, S_r)$:
$$
du_r = \Bigg\{ 
\mathcal{L}^0 u_r + 
\lambda^b \big[u(+) - u_r\big] + 
\lambda^a \big[u(-) - u_r\big] 
\Bigg\} dr 
+ \sigma u_s \, dW_r 
+ \big[u(+) - u_r\big] d\tilde{N}^b_r 
+ \big[u(-) - u_r\big] d\tilde{N}^a_r
$$

Full generator:
$$
\mathcal{L}^\delta u_r := \mathcal{L}^0 u_r + \lambda^b_r \big[u(+) - u_r\big] + \lambda^a_r \big[u(-) - u_r\big]
$$

Integrate over $[t,t+h]$ and take $\mathbb{E}^\delta[\cdot | \mathcal{F}_t]$:

Martingale terms vanish:
$$
\mathbb{E}^\delta \left[\int_t^{t+h} \sigma u_s \, dW_r \,\Big|\, \mathcal{F}_t \right] = 0, \quad 
\mathbb{E}^\delta \left[\int_t^{t+h} [u(\pm)-u] d\tilde{N}^i_r \,\Big|\, \mathcal{F}_t \right] = 0
$$


And we are left with:
$$
\mathbb{E}^\delta \Big[ u(t+h,X_{t+h},q_{t+h},S_{t+h}) \Big| \mathcal{F}_t \Big] = u(t,x,q,s) + \mathbb{E}^\delta \Big[ \int_t^{t+h} \mathcal{L}^\delta u_r \, dr \Big| \mathcal{F}_t \Big]
$$

Substituting into the DPP, dividing by $h$ and letting $h \to 0$ gives the HJB equation $0 = \sup_{\delta^a, \delta^b \geq 0} \mathcal{L}^\delta u$. Written out with its terminal condition:
$$
\begin{cases}
0 = \partial_t u + \frac{1}{2} \sigma^2 u_{ss} + \sup_{\delta^a,\delta^b \geq 0} \Big\{ 
\lambda^b(\delta^b) \big[ u(t, x - (s - \delta^b), q+1, s) - u(t,x,q,s) \big] + 
\lambda^a(\delta^a) \big[ u(t, x + (s + \delta^a), q-1, s) - u(t,x,q,s) \big]
\Big\} \\
u(T,x,q,s) = -\exp(-\gamma(x + q s))
\end{cases}
$$

## Simplification via Ansatz

Motivated by the frozen inventory certainty equivalent, we adopt the CARA ansatz:
$$
u(t,x,q,s) = -\exp\left( -\gamma \left[ x + \theta(t,q,s) \right] \right)
$$

We have via the chain rule:
$$
\begin{cases}
\partial_t u = -\gamma \, \partial_t \theta \, u \\
u_s = -\gamma \theta_s u \\
u_{ss} = -\gamma \theta_{ss} u + \gamma^2 \theta_s^2 u
\end{cases}
$$

And:
$$
\begin{cases}
u(t, x + (s + \delta^a), q-1, s) - u = u \left[ \exp\left( -\gamma (s + \delta^a) - \gamma \left[ \theta(t,q-1,s) - \theta(t,q,s) \right] \right) - 1 \right] \\
u(t, x - (s - \delta^b), q+1, s) - u = u \left[ \exp\left( \gamma (s - \delta^b) - \gamma \left[ \theta(t,q+1,s) - \theta(t,q,s) \right] \right) - 1 \right]
\end{cases}
$$

Substituting back into the HJB and dividing by $-\gamma u$ (which is positive since $u < 0$, so the supremum is preserved):
$$
0 = \partial_t \theta + \frac{1}{2} \sigma^2 \partial_{ss} \theta - \frac{1}{2} \gamma \sigma^2 \theta_s^2 
+ \sup_{\delta^a,\delta^b \geq 0} \Big\{ 
\frac{\lambda^a(\delta^a)}{\gamma} \left[ 1 - \exp\left( -\gamma (s + \delta^a) - \gamma \left[ \theta(t,q-1,s) - \theta(t,q,s) \right] \right) \right]
+ 
\frac{\lambda^b(\delta^b)}{\gamma} \left[ 1 - \exp\left( \gamma (s - \delta^b) - \gamma \left[ \theta(t,q+1,s) - \theta(t,q,s) \right] \right) \right]
\Big\}
$$

Now recall the reservation bid price is defined by
$$
u(t, x, q, s) = u(t, x - r^b, q+1, s)
$$

Similarly substituting the ansatz:
$$
\begin{aligned}
-\exp(-\gamma x) \exp(-\gamma \theta(t,s,q)) &= -\exp(-\gamma (x - r^b)) \exp(-\gamma \theta(t,s,q+1)) \\
r^b(t,s,q) &= \theta(t,s,q+1) - \theta(t,s,q)
\end{aligned}
$$

A similar argument gives the reservation ask price:
$$
\begin{aligned}
r^a(t,s,q) &= \theta(t,s,q) - \theta(t,s,q-1)
\end{aligned}
$$

Which further reduces the HJB PDE nicely to:
$$
\begin{cases}
0 = \partial_t \theta + \frac{1}{2} \sigma^2 \partial_{ss} \theta - \frac{1}{2} \gamma \sigma^2 \theta_s^2 
+ \sup_{\delta^a,\delta^b \geq 0} \Big\{ 
\frac{\lambda^a(\delta^a)}{\gamma} \left[ 1 - \exp\big(-\gamma (s + \delta^a - r^a)\big) \right] + 
\frac{\lambda^b(\delta^b)}{\gamma} \left[ 1 - \exp\big(\gamma (s - \delta^b - r^b)\big) \right]
\Big\} \\
\theta(T, q, s) = qs 
\end{cases}
$$


## Implicit Optimal Controls

We can find the optimal $\delta^a$ and $\delta^b$ via the first order conditions of the two terms inside the supremum.

### Bid Side FOC 
Setting the derivative of the bid term with respect to $\delta^b$ to zero, and cancelling a nonzero constant factor, gives:
$$
0 = \lambda^{b\prime}(\delta^b)\left[e^{\gamma(s-\delta^b-r^b)}-1\right] - \gamma\lambda^b(\delta^b)e^{\gamma(s-\delta^b-r^b)}
$$

Let $z^* := e^{\gamma(s-\delta^b_*-r^b)}$ and we have:
$$
\lambda^{b\prime}(z^*-1) = \gamma\lambda^b z^* \implies z^* = \frac{\lambda^{b\prime}(\delta^b_*)}{\lambda^{b\prime}(\delta^b_*) - \gamma\lambda^b(\delta^b_*)}
$$

Take logs and rearrange and we have:
$$
\boxed{s - r^b = \delta^b_* - \frac{1}{\gamma}\ln\left(1 - \frac{\gamma\lambda^b(\delta^b_*)}{\lambda^{b\prime}(\delta^b_*)}\right)}
$$


### Ask Side FOC

A similar argument gives:

$$
\boxed{r^a - s = \delta^a_* - \frac{1}{\gamma}\ln\left(1 - \frac{\gamma\lambda^a(\delta^a_*)}{\lambda^{a\prime}(\delta^a_*)}\right)}
$$

These two relations determine the optimal bid and ask distances implicitly.
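The implicit relations can be solved numerically for any differentiable intensity. The sketch below is an illustration, not the paper's procedure: it applies bisection to the boxed condition, with the intensity and its derivative passed in as callables (hypothetical names `lam` and `dlam`), and compares the result with the closed form that the exponential intensity $\lambda(\delta) = A e^{-k\delta}$ admits, since $\lambda/\lambda' = -1/k$ there. Parameter values are made up.

```python
import math


def foc_residual(delta, target, gamma, lam, dlam):
    """Residual of delta - (1/gamma) ln(1 - gamma lam/lam') = target."""
    return delta - math.log(1.0 - gamma * lam(delta) / dlam(delta)) / gamma - target


def solve_optimal_distance(target, gamma, lam, dlam, lo=0.0, hi=10.0, n_iter=100):
    """Bisection on the first order condition; assumes a sign change on [lo, hi]."""
    f_lo = foc_residual(lo, target, gamma, lam, dlam)
    f_hi = foc_residual(hi, target, gamma, lam, dlam)
    if f_lo * f_hi > 0:
        raise ValueError("root is not bracketed on [lo, hi]")
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        f_mid = foc_residual(mid, target, gamma, lam, dlam)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


A, k, gamma = 100.0, 1.5, 0.1          # illustrative parameters
lam = lambda d: A * math.exp(-k * d)   # exponential fill intensity
dlam = lambda d: -k * lam(d)           # its derivative with respect to delta

target = 0.05                          # s - r^b for the bid, or r^a - s for the ask
numeric = solve_optimal_distance(target, gamma, lam, dlam)
closed_form = target + math.log(1.0 + gamma / k) / gamma
print(numeric, closed_form)
```

With exponential intensities the logarithmic term is a constant, so the solver is overkill; it becomes useful only if a different intensity shape is substituted.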

## Asymptotic Expansion in Inventory and Arrival Terms

After substituting optimal $\delta^a_*, \delta^b_*$ and assuming symmetric exponential arrival rates:

$$\lambda^a(\delta) = \lambda^b(\delta) = A e^{-k\delta}$$

We obtain:
$$
\begin{cases}
\partial_t \theta + \frac{1}{2} \sigma^2 \theta_{ss} - \frac{1}{2} \sigma^2 \gamma \theta_s^2 + \frac{A}{k+\gamma} \left( e^{-k\delta^a_*} + e^{-k\delta^b_*} \right) = 0 \\
\theta(s,q,T) = q s
\end{cases} 
$$

The HJB PDE remains analytically intractable in full generality. The authors propose an asymptotic Taylor expansion of the reduced value function $\theta(t,q,s)$ around zero inventory:
$$
\theta(t,q,s) = \theta^0(t,s) + q\,\theta^1(t,s) + \frac{1}{2}q^2\,\theta^2(t,s) + O(q^3)
$$

Economically, this again reflects inventory risk control: market makers target $q \approx 0$ to minimize directional exposure $q\sigma\sqrt{T-t}$. Higher-order terms capture inventory aversion: $\theta^2 < 0$ penalizes large $|q|$ positions and drives the characteristic reservation price skew $r(t,q) = s - q\gamma\sigma^2(T-t)$.

The truncated expansion is accurate when inventory stays small, which is the regime the strategy itself aims for.

Recall the bid reservation price is given by:
$$
r^b(t,s,q) = \theta(t,s,q+1) - \theta(t,s,q)
$$

We can expand the difference:
$$
\begin{aligned}
\theta(t,s,q+1) &= \theta^0 + (q+1)\theta^1 + \frac{1}{2}(q+1)^2\theta^2 + \cdots \\
&= \theta^0 + (q + 1)\theta^1 + \frac{1}{2}(q^2 + 2q + 1)\theta^2 + \cdots 
\end{aligned}
$$
and now subtracting to get:
$$
\begin{aligned}
r^b &= [\theta^0 + q\theta^1 + \theta^1 + \frac{1}{2}(q^2 + 2q + 1)\theta^2] - [\theta^0 + q\theta^1 + \frac{1}{2}q^2\theta^2] + \cdots \\
&= \theta^1 + q\theta^2 + \frac{1}{2}\theta^2 + \cdots \\
&= \theta^1 + (q + \frac{1}{2})\theta^2 + \cdots
\end{aligned}
$$

A similar argument for $r^a(t,s,q) = \theta(t,s,q) - \theta(t,s,q-1)$ gives us the ask price as well:
$$
\begin{cases}
r^a(t,s,q) = \theta^1(s,t) + \left(-\frac{1}{2} + q\right)\theta^2(s,t) + \cdots \\
r^b(t,s,q) = \theta^1(s,t) + \left(q + \frac{1}{2}\right)\theta^2(s,t) + \cdots
\end{cases}
$$

Now approximate the order arrival terms via Taylor expansion around the optimal symmetric spread. Assuming $\delta^a \approx \delta^b \approx \delta^*$, expand:
$$
\lambda(\delta^a) + \lambda(\delta^b) = A \left( e^{-k\delta^a} + e^{-k\delta^b} \right) \approx A \left[ 2 e^{-k\delta^*} - k e^{-k\delta^*} (\delta^a + \delta^b - 2\delta^*) + O((\delta - \delta^*)^2) \right]
$$

Intuitively, this linearizes the fill rate around the profit-maximizing spread $\delta^*$, capturing first-order sensitivity of execution probability to quote aggressiveness while retaining the exponential decay structure from econophysics.
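The quality of the linearisation is easy to inspect numerically. The sketch below uses illustrative values for $A$, $k$ and $\delta^*$ (assumptions, not calibrated) and shifts the two quotes symmetrically around $\delta^*$, so that $\delta^a + \delta^b = 2\delta^*$ and the first-order term vanishes; whatever error remains is the neglected second-order term.

```python
import math


def arrival_sum(delta_a, delta_b, A, k):
    """Exact lambda(delta_a) + lambda(delta_b) for exponential intensities."""
    return A * (math.exp(-k * delta_a) + math.exp(-k * delta_b))


def arrival_sum_linearised(delta_a, delta_b, delta_star, A, k):
    """First-order Taylor expansion of the sum around delta_star."""
    base = math.exp(-k * delta_star)
    return A * (2.0 * base - k * base * (delta_a + delta_b - 2.0 * delta_star))


A, k, delta_star = 100.0, 1.5, 0.7  # illustrative values
for shift in (0.01, 0.05, 0.2):
    d_a, d_b = delta_star - shift, delta_star + shift
    exact = arrival_sum(d_a, d_b, A, k)
    approx = arrival_sum_linearised(d_a, d_b, delta_star, A, k)
    print(f"shift={shift:.2f}  exact={exact:.4f}  linearised={approx:.4f}  error={exact - approx:.4f}")
```

The error grows roughly quadratically in the shift, so the approximation is only trustworthy while the inventory skew is small compared with $1/k$.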

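By the optimality conditions, $\delta^a_* + \delta^b_* = r^a - r^b + \frac{2}{\gamma}\ln\left(1 + \frac{\gamma}{k}\right)$, and $r^a - r^b = -\theta^2 + \cdots$ does not depend on $q$. The arrival term therefore contributes no $q$-dependence at this order, and matching powers of $q$ in the PDE gives: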

$O(q)$ terms:
$$
\begin{cases}
\theta^1_t + \frac{1}{2} \sigma^2 \theta^1_{ss} = 0 \\
\theta^1(s, T) = s
\end{cases}
$$

Solution: 
$$\boxed{\theta^1(s,t) = s}$$

$O(q^2)$ terms (the term $-\frac{1}{2}\sigma^2\gamma\theta_s^2$ contributes $-\frac{1}{2}\sigma^2\gamma q^2(\theta^1_s)^2$, which is matched against $\frac{1}{2}q^2\theta^2_t$):
$$
\begin{cases}
\theta^2_t + \frac{1}{2} \sigma^2 \theta^2_{ss} - \sigma^2 \gamma (\theta^1_s)^2 = 0 \\
\theta^2(s, T) = 0
\end{cases}
$$

**Since $\theta^1_s = 1$** this simplifies to:
$$
\boxed{\theta^2_t - \sigma^2 \gamma = 0 \quad \to \quad \theta^2 = -\gamma \sigma^2 (T-t)}
$$

We can now assemble an approximation to the solution $\theta$.

## Reservation Price Rehashed


From the asymptotic expansion, the bid reservation price is 
$$
r^b(t,s,q) = \theta(t,s,q+1) - \theta(t,s,q)
$$

Using $\theta(t,s,q) = \theta^0 + q s - \frac{1}{2} q^2 \gamma \sigma^2 (T-t) + O(q^3)$, we have:
$$
\begin{aligned}
\theta(t,s,q+1) &= \theta^0 + (q+1) s - \frac{1}{2} (q+1)^2 \gamma \sigma^2 (T-t) \\
&= \theta^0 + (q+1) s - \frac{1}{2} (q^2 + 2q + 1) \gamma \sigma^2 (T-t)
\end{aligned}
$$

Subtracting gives:
$$
\begin{aligned}
r^b(t,s,q) &= s - \frac{1}{2} \gamma \sigma^2 (T-t) (2q + 1) \\
&= s - q \gamma \sigma^2 (T-t) - \frac{1}{2} \gamma \sigma^2 (T-t)
\end{aligned}
$$

The ask reservation price $r^a(t,s,q) = \theta(t,s,q) - \theta(t,s,q-1) = s - q \gamma \sigma^2 (T-t) + \frac{1}{2} \gamma \sigma^2 (T-t)$ follows similarly. Both agree with the frozen-inventory results, and their average is the reservation price 
$$
r(t,q) = s - q \gamma \sigma^2 (T-t).
$$
Economically, positive inventory $q>0$ creates inventory risk: the mid-price may drop before the position is unwound. The market maker therefore skews quotes downward ($r(t,q) < s$) to encourage ask fills and reduce $q$. Negative inventory $q<0$ skews quotes upward to encourage bid fills. The term $\gamma \sigma^2 (T-t)$ is the inventory risk premium per unit of inventory.

## Summary

The optimal quotes are symmetric around the reservation price (not around the mid-price), with half-spread $\delta^*(t)$:
$$
\boxed{
\begin{aligned}
p^b &= r(q, t) - \delta^*(t) \\
p^a &= r(q, t) + \delta^*(t)
\end{aligned}
}
$$

The reservation price is given by:
$$
\boxed{r(t,q) = s - q \gamma \sigma^2 (T-t)}
$$

and the half-spread follows from the first order conditions upon substituting $\theta^1$ and $\theta^2$ into the reservation bid/ask prices: 
$$
\boxed{\delta^*(t) = \frac{\gamma \sigma^2 (T-t)}{2} + \frac{1}{\gamma} \ln\left(1 + \frac{\gamma}{k}\right)}
$$

Measured from the mid-price, the quote distances are $\delta^a_* = \delta^*(t) - q\gamma\sigma^2(T-t)$ and $\delta^b_* = \delta^*(t) + q\gamma\sigma^2(T-t)$, so they coincide only at zero inventory.
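As a worked example, the quoting rule can be written as a pure function of the state. This is a minimal sketch with illustrative parameters and a hypothetical function name `as_quotes`; it uses the reservation price $r = s - q\gamma\sigma^2(T-t)$ from the Indifference Price section together with the half-spread formula above, and ignores tick rounding and order-book mechanics.

```python
import math


def as_quotes(s, q, t, gamma, sigma, k, T):
    """Return (bid, ask, reservation price) for mid s, inventory q and time t."""
    tau = max(0.0, T - t)
    reservation = s - q * gamma * sigma**2 * tau
    half_spread = 0.5 * gamma * sigma**2 * tau + math.log(1.0 + gamma / k) / gamma
    return reservation - half_spread, reservation + half_spread, reservation


for q in (-5, 0, 5):
    bid, ask, r = as_quotes(s=100.0, q=q, t=0.0, gamma=0.1, sigma=2.0, k=1.5, T=1.0)
    print(f"q={q:+d}  bid={bid:.3f}  reservation={r:.3f}  ask={ask:.3f}")
```

The spread is the same for every inventory level; only its centre moves, down when long and up when short.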

## Implementation (WIP)

```python
import logging

import numpy as np
from importnb import Notebook

with Notebook():
    from notebooks.finance.order_book.__intermediate__lob_implementation import (
        OrderBook,
        Side,
    )
```

```python
log = logging.getLogger(__name__)
log.setLevel(logging.INFO)
if not log.hasHandlers():
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter("[%(asctime)s] [%(levelname)s] %(message)s"))
    log.addHandler(handler)
```

```python
class ASMarketMaker:
    """Avellaneda-Stoikov market maker quoting one bid and one ask in an OrderBook."""

    def __init__(
        self,
        sigma: float = 0.02,
        gamma: float = 0.1,
        k: float = 1.5,
        A: float = 100,
        T: float = 1 / 252.0,
        tick_size: float = 0.01,
        order_book: OrderBook | None = None,
    ):
        self.sigma = sigma
        self.gamma = gamma
        self.k = k
        self.A = A
        self.T = T
        self.tick_size = tick_size

        self.order_book = OrderBook() if order_book is None else order_book
        self.inventory = 0
        self.cash = 0.0
        self.next_order_id = 1000
        self.active_orders = {}  # order_id -> Side, for our own resting quotes only
        self.last_quote_step = -1
        self.quote_interval = 50

        def on_fill(order_id, price, qty, ts) -> None:
            # Ignore fills that do not belong to one of our resting quotes
            if order_id in self.active_orders:
                side = self.active_orders.pop(order_id)
                if side == Side.BID:
                    # Customer SELLS to MM BID: MM BUYS
                    self.inventory += qty
                    self.cash -= price * qty  # MM pays customer
                else:
                    # Customer BUYS MM ASK: MM SELLS
                    self.inventory -= qty
                    self.cash += price * qty

        self.order_book.on_fill = on_fill

    def reservation_price(self, mid: float, t: float) -> float:
        # r = s - q * gamma * sigma^2 * (T - t)
        tau = max(0.0, self.T - t)
        return mid - self.inventory * self.gamma * self.sigma**2 * tau

    def optimal_half_spread(self, t: float) -> float:
        # delta* = gamma * sigma^2 * (T - t) / 2 + (1 / gamma) * ln(1 + gamma / k)
        tau = max(0.0, self.T - t)
        risk_term = 0.5 * self.gamma * self.sigma**2 * tau
        profit_term = (1 / self.gamma) * np.log(1 + self.gamma / self.k)
        return risk_term + profit_term

    def update_quotes(self, mid: float, t: float, step: int = 0):
        # Requote at most once every quote_interval steps
        if step - self.last_quote_step < self.quote_interval:
            return

        # Cancel only our own orders, which is why they are tracked in active_orders
        for oid in list(self.active_orders):
            self.order_book.cancel(oid)
        self.active_orders.clear()

        # AS formulas: quotes are symmetric around the reservation price
        r = self.reservation_price(mid, t)
        half_spread = self.optimal_half_spread(t)

        # Snap both quotes to the tick grid
        bid_price = round((r - half_spread) / self.tick_size) * self.tick_size
        ask_price = round((r + half_spread) / self.tick_size) * self.tick_size

        bid_id = self.next_order_id
        self.next_order_id += 1

        ask_id = self.next_order_id
        self.next_order_id += 1

        self.order_book.insert(bid_id, Side.BID, bid_price, 1, t)
        self.order_book.insert(ask_id, Side.ASK, ask_price, 1, t)

        self.active_orders[bid_id] = Side.BID
        self.active_orders[ask_id] = Side.ASK
        self.last_quote_step = step

    def pnl(self, mid: float) -> float:
        # Mark-to-market wealth: cash plus inventory valued at the given price
        return self.cash + self.inventory * mid

    def book_state(self) -> dict:
        bb = self.order_book.best_bid()
        ba = self.order_book.best_ask()
        return {
            "best_bid": bb,
            "best_ask": ba,
            "spread": (ba[0] - bb[0]) if bb and ba else 0,
            "inventory": self.inventory,
            "cash": self.cash,
            # Note: inventory is marked at the best bid here, not at the mid-price
            "mtm_pnl": self.pnl(bb[0] if bb else 0),
        }
```

```python
def as_market_maker_simulation():
    mm = ASMarketMaker()

    S0 = 100.0
    # t is measured in the same units as mm.T (years); with this dt, t exceeds
    # mm.T after roughly the first 100 steps and tau is clamped to 0 afterwards
    dt = 1 / 252 / 100
    n_steps = 10000
    # Note: this differs from the market maker's default sigma of 0.02
    sigma = 0.10

    S_path, q_path, cash_path = [S0], [0], [0.0]
    np.random.seed(42)

    for i in range(1, n_steps):
        t = i * dt
        dW = np.random.normal(0, np.sqrt(dt))
        S = S_path[-1] + sigma * dW  # arithmetic Brownian mid-price
        S_path.append(S)

        if i % 25 == 0:
            mm.update_quotes(S, t, i)

        # Every 10th step, with probability 0.3, send one marketable order per side
        if i % 10 == 0 and np.random.rand() < 0.3:
            state = mm.book_state()
            if state["best_ask"] and state["best_bid"]:
                # Market BUY crosses ASK
                mo_price = state["best_ask"][0] + 0.02
                mm.order_book.insert(9000 + i, Side.BID, mo_price, 1, t)

                # Market SELL crosses BID
                mo_price = state["best_bid"][0] - 0.02
                mm.order_book.insert(9001 + i, Side.ASK, mo_price, 1, t)

        q_path.append(mm.inventory)
        cash_path.append(mm.cash)

    final_pnl = mm.pnl(S_path[-1])

    log.info("=== Final Market Maker State ===")
    # n_steps * dt is in the units of T, despite the "s" suffix in the message
    log.info(f"Duration: {n_steps * dt:.1f}s")
    log.info(f"Final mid: ${S_path[-1]:.2f}")
    log.info(f"Final inventory: {mm.inventory}")
    log.info(f"Final cash: ${mm.cash:.2f}")
    log.info(f"Terminal PnL: ${final_pnl:.2f}")
    # Counts steps whose cash differs from the initial cash, not distinct fills
    log.info(f"Trades: {len([c for c in cash_path if c != cash_path[0]])}")


as_market_maker_simulation()
```

```
[2025-12-30 18:08:48,209] [INFO] === Final Market Maker State ===
[2025-12-30 18:08:48,210] [INFO] Duration: 0.4s
[2025-12-30 18:08:48,211] [INFO] Final mid: $99.94
[2025-12-30 18:08:48,212] [INFO] Final inventory: 0
[2025-12-30 18:08:48,212] [INFO] Final cash: $224.58
[2025-12-30 18:08:48,212] [INFO] Terminal PnL: $224.58
[2025-12-30 18:08:48,213] [INFO] Trades: 9950
```
