# Electronic Markets - Optimal execution with the Almgren-Chriss model

## 1. Model

The **Almgren-Chriss framework** (1999) is a mathematical model for estimating the optimal pace at which to build or unwind a trading position. Brokers and traders face this problem routinely, yet it has attracted comparatively little interest from statisticians, economists and econophysicists in past decades. Compared to Bertsimas-Lo (1998), Almgren-Chriss established the existence of a trade-off between fast execution (limited liquidity, high market impact, high transaction costs) and slow execution (exposure to price fluctuations). Other interesting facets of the Almgren-Chriss framework include:

- Allows one to graphically represent the trade-off between slow/fast execution;
- Makes it easy to plug in one's own preferred model of market impact costs;
- Avoids the temptation of modelling the market as a physical system fully enclosed in its own data (*physics envy*);
- Designed bottom-up: the modelling is based on the execution process itself;
- Provides a multitude of modelling approaches to compute the slow/fast execution trade-off (e.g. dynamic programming, stochastic control, reinforcement learning, etc.).

We will present four approaches to solving the Almgren-Chriss model:

1. Dynamic programming
2. Closed-form solution (original method)
3. Stochastic control
4. Reinforcement learning (e.g. Q-learning)
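
To make the fast/slow trade-off concrete, here is a minimal sketch of the closed-form inventory trajectory for linear impact, $x_j = X \, \sinh(\kappa (T - t_j)) / \sinh(\kappa T)$. The function name `ac_closed_form`, the risk-aversion parameter `lam` and its two values are assumptions chosen for illustration; the impact parameters simply reuse the constants of Section 2.1.

```python
import numpy as np

def ac_closed_form(X, T, N, sigma, eta, gamma, lam):
    """Inventory x_0..x_N of the closed-form trajectory (linear impact)."""
    tau = T / N                                  # length of one trading interval
    eta_tilde = eta - 0.5 * gamma * tau          # temporary impact adjusted for permanent impact
    kappa_tilde_sq = lam * sigma**2 / eta_tilde
    # kappa solves (2 / tau^2) * (cosh(kappa * tau) - 1) = kappa_tilde^2
    kappa = np.arccosh(1.0 + 0.5 * kappa_tilde_sq * tau**2) / tau
    t = np.linspace(0.0, T, N + 1)
    if kappa == 0.0:                             # risk-neutral limit: sell at a constant rate
        return X * (1.0 - t / T)
    return X * np.sinh(kappa * (T - t)) / np.sinh(kappa * T)

# Impact parameters follow Section 2.1; the lam values are hypothetical.
params = dict(X=10, T=15.0, N=15, sigma=0.3, eta=2.5e-3, gamma=2.5e-3)
slow = ac_closed_form(lam=0.0, **params)    # risk-neutral: straight line
fast = ac_closed_form(lam=0.01, **params)   # risk-averse: front-loaded selling
print(np.round(slow, 2))
print(np.round(fast, 2))
```

A larger `lam` bends the trajectory further below the straight line: more is sold early, paying more impact in exchange for less exposure to price fluctuations.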


## 2. Dynamic programming

### 2.1. Market impact functions

In [1]:
import numpy as np
import pandas as pd
import math

Utilities for the implementation of Almgren-Chriss:

- **Temporary market impact function**

$$ h\left(\frac{n_k}{\tau}\right) = \epsilon \, \operatorname{sign}(n_k) + \eta \, \frac{n_k}{\tau} $$

- **Permanent market impact function**

$$ g(v) = \gamma v $$

- **Hamiltonian equation**

$$ H(x,n) = n \, g\left(\frac{n}{\tau}\right) + \gamma (x-n) \, \tau \, h\left(\frac{n}{\tau}\right) - \frac{1}{2} \gamma^2 (x-n)^2 \sigma^2 \tau $$

In [2]:
# Utilities
def h(u):
    """
    Temporary market impact function.
    
    Inputs
    - u, speed of trading (n/tau, already divided by tau by the caller)
    """
    epsilon = 1./16.          # fixed cost component
    eta = 2.5 * 10 ** (-3)    # linear temporary impact coefficient
    return epsilon * np.sign(u) + eta * u

def g(u):
    """
    Permanent market impact function.
    """
    gamma = 2.5 * 10 ** (-3)
    return gamma * u

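def H(x,n):
    """
    Hamiltonian equation. To be minimized through DP.

    Inputs
    - x, current inventory
    - n, number of shares sold in the current step
    """
    tau = 1.
    gamma = 2.5 * 10 ** (-3)
    sigma = 0.3
    # The last term carries (x-n)**2, as in the formula for H(x,n)
    res = n*g(n/tau) + gamma*(x-n)*tau*h(n/tau) - 0.5*(gamma**2)*((x-n)**2)*(sigma**2)*tau
    return res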
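
As a quick sanity check of the three formulas, the sketch below evaluates them by hand-verifiable arithmetic. The function names and the choice to pass the constants as keyword arguments are assumptions made for this illustration (the default values are the ones used in this section); it is not a replacement for the utilities above.

```python
import numpy as np

def temporary_impact(v, epsilon=1/16, eta=2.5e-3):
    """h(v) = epsilon * sign(v) + eta * v, with v the trading speed n / tau."""
    return epsilon * np.sign(v) + eta * v

def permanent_impact(v, gamma=2.5e-3):
    """g(v) = gamma * v."""
    return gamma * v

def hamiltonian(x, n, tau=1.0, gamma=2.5e-3, sigma=0.3):
    """H(x, n) as written in the formula, including the (x - n)^2 factor."""
    return (n * permanent_impact(n / tau, gamma)
            + gamma * (x - n) * tau * temporary_impact(n / tau)
            - 0.5 * gamma**2 * (x - n)**2 * sigma**2 * tau)

print(temporary_impact(10.0))   # 0.0625 + 0.0025 * 10
print(permanent_impact(10.0))   # 0.0025 * 10
print(hamiltonian(10, 4))       # inventory 10, selling 4 shares
```

Keeping the constants in the signature makes it harder for the same symbol (for instance `gamma`) to take different values silently in different functions.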

### 2.2. Dynamic programming: a simplified version

Bellman equation:
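
Read off from the implementation in the next cell (so this is an interpretation of the code rather than a formula stated elsewhere), the recursion on the value function $u$ appears to be the following, with $T$ standing for `nb_T` and the minimising $n$ stored as the best move `b[t, x]`:

$$ u_{T-1}(x) = \exp\left(\gamma \, x \, h\left(\frac{x}{\tau}\right)\right), \qquad u_t(x) = \min_{0 \le n \le x} \; u_{t+1}(x-n) \, \exp\left(\gamma \, H(x,n)\right), \quad t = T-2, \dots, 0 $$

The terminal condition forces the remaining inventory $x$ to be sold in the last step, which is why the last row of the best-move matrix is simply $0, 1, \dots, X$.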

In [47]:
def dynamic_programming(nb_T, X_total):
    """
    Bellman equation for solving Markov decision processes.
    
    Inputs
    - nb_T, number of time steps
    - X_total, number of shares to be liquidated
    """
    # Init
    tau = 1.
    gamma = 2.5 * 10 ** (-5)
    u = np.zeros(shape=(nb_T, X_total+1), dtype="float")  # value function
    b = np.zeros(shape=(nb_T, X_total+1), dtype="int")    # best move
    inventoryforX = np.zeros(shape=(nb_T,1),dtype="int") # evolution of inventory
    inventoryforX[0] = X_total
    
    # Terminal condition
    for x in range(X_total+1):
        u[nb_T - 1, x] = math.exp(gamma*x*h(x/tau))
        b[nb_T - 1, x] = x
    
    # Backwards induction
    for t in range(nb_T-2, -1, -1):
        
        for x in range(X_total+1):
            
            best_value = u[t+1,0] * math.exp(gamma*H(x,x))
            best_n = x
            
            for n in range(x):
                current_value = u[t+1,x-n] * math.exp(gamma*H(x,n)) # we compute the utility function if we sell n shares
                
                if current_value < best_value:
                    best_value = current_value
                    best_n = n #nb of shares to liquidate
               
            u[t,x] = best_value
            b[t,x] = best_n
                    
    # Forward pass: follow the best moves starting from the initial inventory
    for t in range(1,nb_T):
        inventoryforX[t] = inventoryforX[t-1] - b[t,inventoryforX[t-1]]

    return u, b, inventoryforX

In [48]:
model = dynamic_programming(nb_T=15, X_total=10)
#print(model[0])
print(model[1])
print(model[2])

[[ 0  1  0  0  0  0  0  0  0  0  0]
 [ 0  1  0  0  0  0  0  0  0  0  0]
 [ 0  1  0  0  0  0  0  0  0  0  0]
 [ 0  1  0  0  0  0  0  0  0  0  0]
 [ 0  1  0  0  0  0  0  0  0  0  1]
 [ 0  1  0  0  0  0  0  0  0  1  2]
 [ 0  1  0  0  0  0  0  0  1  2  2]
 [ 0  1  0  0  0  0  0  1  2  2  2]
 [ 0  1  0  0  0  0  1  2  2  2  2]
 [ 0  1  0  0  0  1  2  2  2  2  2]
 [ 0  1  0  0  1  2  2  2  2  3  3]
 [ 0  1  0  1  2  2  2  3  3  3  4]
 [ 0  1  1  2  2  3  3  4  4  5  5]
 [ 0  1  2  3  4  5  6  7  8  9 10]
 [ 0  1  2  3  4  5  6  7  8  9 10]]
[[10]
 [10]
 [10]
 [10]
 [ 9]
 [ 8]
 [ 7]
 [ 6]
 [ 5]
 [ 4]
 [ 3]
 [ 2]
 [ 1]
 [ 0]
 [ 0]]
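
The second printed array is the inventory path, so the trade list can be recovered by differencing it. The snippet below is a small sketch that copies the printed values by hand; the variable names are chosen for illustration.

```python
import numpy as np

# Inventory path copied from the printed output (nb_T=15, X_total=10)
inventory = np.array([10, 10, 10, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 0])

# Shares sold between consecutive time steps
trades = -np.diff(inventory)
print(trades)
print("total sold:", trades.sum())
```

With these parameters the schedule waits for the first three steps and then sells one share per step until the inventory is empty.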


In [5]:
#print(model[1])

In [6]:
import matplotlib.pyplot as plt
plt.plot(np.linspace(0, 10, len(model[1][-1][::-1])), model[1][-1][::-1], marker='o')
plt.grid(True)
plt.show()

[Figure: last row of `model[1]` in reverse order, plotted against evenly spaced points on [0, 10] with circle markers]

In [7]:
model[1][-1][::-1]

array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

### 2.3. Complete dynamic programming

In [8]:
# INITIALIZATION
T = 5     # nb. of time steps
X = 5     # nb. of shares to liquidate
V = np.zeros((T, X))   # Value function
b = np.zeros((T, X))   # Best move (policy)

In [9]:
# STOCK PATH (S)
S = np.zeros((T, X))
zeta = np.random.standard_normal((T, X))
sigma = 0.05
dt = 0.5
S[0] = 100.0
tau = 1.0
r = 0.05

# Each column x is an independent geometric Brownian motion path starting at S[0]
for x in range(X):
    for t in range(1,T):
        S[t,x] = S[t-1,x] * np.exp((r-0.5*sigma**2)*dt + sigma*np.sqrt(dt)*zeta[t,x])
print(np.matrix(S))

[[100.         100.         100.         100.         100.        ]
 [100.58599138 102.4320257  101.88870864  99.66186424 100.86753896]
 [103.52902715  96.821238   104.70295863 107.97522333 104.25644185]
 [109.07718363 101.50237872 105.25466176 116.66872476 113.09398311]
 [113.20404459 109.0646642  116.26729903 130.37457562 114.70174641]]
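
The same paths can be generated without explicit loops and with a fixed seed, so that the printed numbers are reproducible. This is a sketch under the same geometric Brownian motion assumption; the function name `simulate_gbm_paths` and the `seed` argument are introduced here for illustration.

```python
import numpy as np

def simulate_gbm_paths(s0, r, sigma, dt, n_steps, n_paths, seed=0):
    """Return an (n_steps, n_paths) array of GBM prices, one path per column."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_steps - 1, n_paths))
    # Log-returns of each step, then cumulated along the time axis
    log_increments = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    log_paths = np.vstack([np.zeros((1, n_paths)), np.cumsum(log_increments, axis=0)])
    return s0 * np.exp(log_paths)

S = simulate_gbm_paths(s0=100.0, r=0.05, sigma=0.05, dt=0.5, n_steps=5, n_paths=5)
print(S.round(2))
```

Cumulating log-returns is equivalent to the step-by-step multiplication in the loop, since each price is the previous one times the exponential of the next log-return.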


In [10]:
# REVENUE (R)
R = np.zeros((T, X))
for x in range(X):
    for t in range(1,T):
        # Chained assignment: rows t and t-1 both receive the revenue computed from S[t,x]
        R[t,x] = R[t-1,x] = x * (S[t,x] - g(x))

print(np.matrix(R))

[[  0.         102.4295257  203.76741727 298.96309271 403.43015583]
 [  0.          96.818738   209.39591726 323.90316998 416.98576741]
 [  0.         101.49987872 210.49932353 349.98367429 452.33593244]
 [  0.         109.0621642  232.52459806 391.10122686 458.76698564]
 [  0.         109.0621642  232.52459806 391.10122686 458.76698564]]


In [11]:
# TERMINAL CONDITION
gamma = 0.5
for x in range(X):
    V[T-1,x] = np.exp(-gamma * (R[T-1,x] + x * (S[T-1,x] - 1)))
print(np.matrix(V).round(2))

[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0.]]


In [12]:
# BACKWARD INDUCTION
for t in range(T-2,-1,-1):
    for x in range(X):
        best_value = 

SyntaxError: invalid syntax (<ipython-input-12-4fdf4285eb16>, line 4)
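
The backward pass is left unfinished here, and the intended recursion cannot be read off from what precedes it. As one possible way to organise such a pass, the sketch below runs a backward induction on the logarithm of an exponential-utility value function, which avoids the underflow visible in the terminal values printed above (they round to zero). Everything in it is an assumption made for illustration: the function name, the `reward[t, n]` input (revenue from selling `n` shares at step `t`) and the toy numbers.

```python
import numpy as np

def backward_induction_log(reward, gamma):
    """
    reward[t, n]: revenue from selling n shares at step t (hypothetical input).
    Returns log_V (log of the value function) and b (best number of shares to sell).
    """
    T, n_states = reward.shape            # states are inventories 0..X
    log_V = np.zeros((T, n_states))
    b = np.zeros((T, n_states), dtype=int)

    # Terminal condition: whatever is left must be sold at the last step
    for x in range(n_states):
        log_V[T - 1, x] = -gamma * reward[T - 1, x]
        b[T - 1, x] = x

    # Backward induction: minimise the log of exp(-gamma * revenue) over n
    for t in range(T - 2, -1, -1):
        for x in range(n_states):
            candidates = [-gamma * reward[t, n] + log_V[t + 1, x - n]
                          for n in range(x + 1)]
            b[t, x] = int(np.argmin(candidates))
            log_V[t, x] = candidates[b[t, x]]
    return log_V, b

# Toy example: constant price of 100 and an impact of 1 per share sold in the step
T, X = 3, 3
n = np.arange(X + 1)
reward = np.tile(n * (100.0 - 1.0 * n), (T, 1))
log_V, b = backward_induction_log(reward, gamma=0.5)
print(b)
print(log_V[0, X])
```

In this toy setting the concave reward makes an even split optimal (one share per step), which gives a simple check that the recursion behaves as expected before plugging in simulated prices.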