# Black-Scholes-Model-Based Statistical Arbitrage in China 

Author: Jiaxuan Liang


## Setting

In the Black-Scholes framework of stock price dynamics, i.e. 

$$S_t=S_0 e^{(\alpha - \frac{\sigma^2}{2})t + \sigma W_t}$$ 
$$r_f \text{: the risk-free rate}$$ 

there exists statistical arbitrage if $\alpha− r_f \neq \frac{\sigma^2}{2}$. 
- If $\alpha-r_f > \frac{\sigma^2}{2}$, then there exists statistical arbitrage for long-until-barrier strategies.
- If $\alpha-r_f < \frac{\sigma^2}{2}$, then there exists statistical arbitrage for short-until-barrier strategies.

To exclude statistical arbitrage opportunities, the Sharpe ratio of any given stock must be bounded by $\frac{\sigma}{2}$. 

In a market where short selling is not allowed, the no-arbitrage condition is given by $0<\alpha-r_f<\frac{\sigma^2}{2}$.
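The conditions above can be checked mechanically for a given stock. The following is a minimal sketch, assuming annualized inputs; the function name `classify_stock` and its return labels are hypothetical and only illustrate the comparison of $\alpha - r_f$ against $\frac{\sigma^2}{2}$.

```python
def classify_stock(alpha: float, sigma: float, r_f: float) -> str:
    """Compare the excess drift with sigma^2 / 2 (illustrative helper).

    Returns "long" if alpha - r_f > sigma^2 / 2, "short" if it is smaller,
    and "none" if the two are exactly equal.
    """
    excess = alpha - r_f
    threshold = 0.5 * sigma ** 2
    if excess > threshold:
        return "long"
    if excess < threshold:
        return "short"
    return "none"


print(classify_stock(alpha=0.10, sigma=0.20, r_f=0.02))
```

Dividing both sides by $\sigma$ gives the equivalent Sharpe ratio form: the "long" branch corresponds to a Sharpe ratio above $\frac{\sigma}{2}$.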

If an investor knows, or believes that he knows, which stocks satisfy the statistical arbitrage condition, then this is sufficient to design a statistical arbitrage trading strategy.


## Analytical Analysis

### Statistical Arbitrage

#### Definition
Let $\{v(t): t \geq 0\}$ denote the stochastic process of discounted cumulative trading profits, defined on a probability space $(\Omega,\mathcal F,P)$. Statistical arbitrage is defined as follows:

*A statistical arbitrage is a zero initial cost, self-financing trading strategy $\{v(t): t \geq 0\}$ with cumulative discounted value $v(t)$ such that*
- $v(0) = 0$
- $\lim_{t \to \infty}\mathbb E[v(t)]>0$
- $\lim_{t \to \infty} P(v(t)<0)=0$
- $\lim_{t \to \infty} \frac{var(v(t))}{t}=0$ if $P(v(t)<0)>0$, $\forall t<\infty$ 


### Strategies

**Buy & Hold Until Deterministic Barrier Strategy**: a self-financing trading strategy with a deterministic boundary that consists of buying and holding one unit of stock, financed by borrowing from the bank at a constant risk-free rate. When the stock price first hits $S_0(1+k)e^{r_f t}$, sell the stock, realizing a discounted profit of $S_0 k$, and immediately invest the proceeds in the money market account.

$$v_t = \begin{cases}
S_0 k, & \text{if } t^* \leq t \\
S_t e^{-r_f t} - S_0, & \text{otherwise}
\end{cases}$$

where $t^* = \min\left\{t \geq 0: S_t = S_0(1+k)e^{r_f t}\right\}$.
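To make the payoff concrete, here is a minimal simulation sketch of the discounted profit on one geometric Brownian motion path. It assumes the barrier is only monitored on a discrete time grid (so crossings between grid points are missed); the function name and parameters are illustrative, not part of the original backtest.

```python
import math
import random


def simulate_discounted_profit(s0, alpha, sigma, r_f, k, horizon, n_steps, rng):
    """Simulate one path and return the discounted profit v(horizon).

    Tracks ln(S_t e^{-r_f t} / S_0), which hits ln(1 + k) exactly when
    S_t hits the barrier S_0 (1 + k) e^{r_f t}.
    """
    dt = horizon / n_steps
    log_discounted = 0.0
    log_barrier = math.log(1.0 + k)
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        log_discounted += (alpha - r_f - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        if log_discounted >= log_barrier:
            return s0 * k  # barrier hit: the profit S_0 k is locked in
    # barrier not hit by the horizon: mark to market, S_t e^{-r_f t} - S_0
    return s0 * (math.exp(log_discounted) - 1.0)


rng = random.Random(42)  # fixed seed so the run is reproducible
print(simulate_discounted_profit(100.0, 0.10, 0.20, 0.02, 0.05, 1.0, 252, rng))
```

By construction the result always lies in $(-S_0, S_0 k]$.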

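Define $X_t = \frac{\ln \left(\frac{S_t e^{-r_f t}}{S_0}\right)}{\sigma} = \mu t + W_t$, $\mu=\frac{\alpha-r_f-\frac{\sigma^2}{2}}{\sigma}$, $k^*=\frac{\ln(1 + k)}{\sigma}$, so that the stock hits the barrier exactly when $X_t$ first reaches $k^*$.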

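For $\mu > 0$, the first passage time $t^* = \tau_{k^*}$ of $X_t$ to the level $k^*$ follows an inverse Gaussian distribution with mean $\frac{k^*}{\mu}$ and shape ${k^*}^2$: $\tau_{k^*} \sim IG\left(\frac{k^*}{\mu}, {k^*}^2\right)$.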
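As an illustration of the first passage distribution, the sketch below evaluates the inverse Gaussian cdf in its standard closed form, using only the standard library. The helper names `norm_cdf` and `first_passage_cdf` are hypothetical, and the function assumes $\mu > 0$ and $k^* > 0$.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def first_passage_cdf(t: float, mu: float, k_star: float) -> float:
    """P(tau <= t) for tau ~ IG(mean = k*/mu, shape = k*^2), with mu > 0."""
    if t <= 0.0:
        return 0.0
    mean = k_star / mu
    shape = k_star ** 2
    a = math.sqrt(shape / t)
    return (norm_cdf(a * (t / mean - 1.0))
            + math.exp(2.0 * shape / mean) * norm_cdf(-a * (t / mean + 1.0)))


# Probability that the barrier has been hit by t = 1, 2, 5 (mu = 0.5, k* = 1)
for t in (1.0, 2.0, 5.0):
    print(t, first_passage_cdf(t, mu=0.5, k_star=1.0))
```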

- the expected discounted profit given $t$ is:
$$\mathbb E[v(t)] = \mathbb E[v(t)|\tau_{k^*}<t]F_{IG}(t)+\mathbb E[v(t)|\tau_{k^*}>t](1-F_{IG}(t)) $$

where $F_{IG}$ is the cdf of $\tau_{k^*}$ and

$$\mathbb E[v(t)|\tau_{k^*}<t] = k S_0$$

$$\mathbb E[v(t)|\tau_{k^*}>t] = S_0\left(e^{(\alpha-r_f)t}-1\right) $$

- the variance of the discounted profit given $t$ is:

$$var(v(t))=\mathbb E[v^2(t)] - \mathbb E[v(t)]^2$$

- probability of loss given $t$ is:
$$\begin{split} P(v(t)<0)
& = P(S_t < S_0 e^{r_f t},\ \tau_{k^*} > t) \\
& = P(S_t < S_0 e^{r_f t} \mid \tau_{k^*} > t)\,P(\tau_{k^*} > t) \\
& = P\left(\left(\alpha - r_f - \frac{\sigma^2}{2}\right)t + \sigma W_t < 0 \,\Big|\, M(t) < k^*\right)P(M(t) < k^*) \\
& = P\left(Z < -\frac{(\alpha - r_f - \frac{\sigma^2}{2})\sqrt t}{\sigma} \,\Big|\, \tau_{k^*} > t\right) P(\tau_{k^*}>t) \\
& = P\left(Z < -\frac{(\alpha - r_f - \frac{\sigma^2}{2})\sqrt t}{\sigma} \,\Big|\, \tau_{k^*} > t\right) \times \left(\Phi\left(\frac{k^*-\frac{(\alpha-r_f-\frac{\sigma^2}{2})t}{\sigma}}{\sqrt t}\right) -e^{2k^*\frac{\alpha-r_f-\frac{\sigma^2}{2}}{\sigma}}\Phi\left(\frac{-k^*-\frac{(\alpha-r_f-\frac{\sigma^2}{2})t}{\sigma}}{\sqrt t}\right)\right) \\
\end{split}$$

where $M(t)$ is the maximum of the process $X_t$ over the time interval $[0,t]$, $Z$ is a standard normal random variable, and $\Phi$ is the standard normal cdf.
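The three quantities above (expected discounted profit, variance, and probability of loss) can also be estimated by simulation. The following is a rough Monte Carlo sketch, assuming the barrier is checked only on a discrete grid; the function name, default grid size, and number of paths are illustrative choices, not values from the backtest.

```python
import math
import random


def monte_carlo_profit_stats(s0, alpha, sigma, r_f, k, horizon,
                             n_steps=250, n_paths=2000, seed=0):
    """Estimate (mean, variance, loss probability) of v(horizon)."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    drift = (alpha - r_f - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    log_barrier = math.log(1.0 + k)
    profits = []
    for _ in range(n_paths):
        x = 0.0  # ln of the discounted price relative to S_0
        for _ in range(n_steps):
            x += drift + vol * rng.gauss(0.0, 1.0)
            if x >= log_barrier:
                profits.append(s0 * k)  # barrier hit on this path
                break
        else:
            profits.append(s0 * (math.exp(x) - 1.0))  # never hit
    mean = sum(profits) / n_paths
    variance = sum((p - mean) ** 2 for p in profits) / (n_paths - 1)
    loss_probability = sum(p < 0 for p in profits) / n_paths
    return mean, variance, loss_probability


print(monte_carlo_profit_stats(100.0, 0.20, 0.20, 0.02, 0.05, horizon=5.0))
```

Increasing `horizon` for a stock with $\alpha - r_f > \frac{\sigma^2}{2}$ should push the estimated loss probability towards zero, in line with the analysis.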

For $\alpha-r_f \geq \frac{\sigma^2}{2}$, the probability of loss goes to zero as $t \to \infty$. For $\alpha-r_f < \frac{\sigma^2}{2}$, the barrier may never be hit, and on those paths we almost surely end with a loss, so the probability of loss does not decay to zero:
$$\lim_{t \to \infty} P(v(t)<0) = 1-e^{2k^*\frac{\alpha-r_f-\frac{\sigma^2}{2}}{\sigma}} \text{ for } \alpha-r_f < \frac{\sigma^2}{2}$$
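A small numerical check of this limit: the sketch below evaluates the closed-form survival probability $P(\tau_{k^*} > t) = P(M(t) < k^*)$ from the derivation and compares it with the limiting value at a large $t$. The function names are hypothetical, and $\mu$ and $k^*$ are passed in directly.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def survival_probability(t: float, mu: float, k_star: float) -> float:
    """P(M(t) < k*): the barrier has not been hit by time t (t > 0)."""
    root_t = math.sqrt(t)
    return (norm_cdf((k_star - mu * t) / root_t)
            - math.exp(2.0 * k_star * mu) * norm_cdf((-k_star - mu * t) / root_t))


def limiting_loss_probability(mu: float, k_star: float) -> float:
    """Limit of P(v(t) < 0): zero for mu >= 0, else 1 - exp(2 k* mu)."""
    if mu >= 0.0:
        return 0.0
    return 1.0 - math.exp(2.0 * k_star * mu)


mu, k_star = -0.5, 1.0
print(survival_probability(1e4, mu, k_star), limiting_loss_probability(mu, k_star))
```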

A statistical arbitrage opportunity exists when $\alpha-r_f \geq \frac{\sigma^2}{2}$ and $k > 0$.

Proof:

When $\mu > 0$ and $k^* > 0$ (i.e. $\alpha-r_f-\frac{\sigma^2}{2}>0$ given $\sigma>0$, and $k>0$), the barrier is hit almost surely, $P(\tau_{k^*} < \infty)=1$, thus:

$$\lim_{t \to \infty} \mathbb E[v(t)] = \mathbb E [\lim_{t \to \infty} v(t)] = S_0k > 0$$
$$\lim_{t \to \infty} \frac{var(v(t))}{t} = 0 $$ 

$$\lim_{t \to \infty} P(v(t)<0) = 0 \text{ for } \alpha−r_f \geq \frac{\sigma^2}{2}$$
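One way to see the variance condition, sketched here as a supporting step rather than taken from the original proof: for $k > 0$ the discounted profit is bounded, because the stock price is positive and the position is closed at the barrier.

$$-S_0 < v(t) \leq S_0 k \quad \Rightarrow \quad var(v(t)) \leq \mathbb E[v^2(t)] \leq S_0^2 \max(1, k^2)$$

$$0 \leq \frac{var(v(t))}{t} \leq \frac{S_0^2 \max(1, k^2)}{t} \to 0 \text{ as } t \to \infty$$

The same bound is what allows the limit and the expectation to be interchanged in the first line of the proof (dominated convergence).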


## Backtest in China

### Description

Assume a risk-free rate of 2%, based on the money market account deposit rate in China. This is an estimate rather than an exact value, since the time span for reinvestment is not fixed.

Assume a transaction cost of 0.06% of the deal price, covering commission fees, taxes, and other charges.

Set $k = 1$.

The strategy goes as follows: each day, if the stock is within the boundary, i.e. $\alpha_t-r_f > \frac{\sigma_t^2}{2}$ and $S_t \leq S_{t-1}(1+k)e^{r_f dt}$, buy and hold one unit of stock financed by borrowing from the bank at a constant risk-free rate. Whenever the stock falls outside the boundary, i.e. $\alpha_t-r_f \leq \frac{\sigma_t^2}{2}$ or $S_t > S_{t-1}(1+k)e^{r_f dt}$, sell the stock. Daily P&L is assumed to be invested immediately in the money market account.
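The daily rule can be sketched as follows. How $\alpha_t$ and $\sigma_t$ are estimated is not specified above, so this example assumes a rolling window of daily log returns, 252 trading days per year, and $\alpha_t$ recovered as the annualized mean log return plus $\frac{\sigma_t^2}{2}$; the function names and the window length are hypothetical.

```python
import math

TRADING_DAYS = 252  # assumption: trading days per year, used to annualize


def estimate_alpha_sigma(prices, window):
    """Annualized (alpha, sigma) from the last `window` daily log returns."""
    recent = prices[-(window + 1):]
    log_returns = [math.log(b / a) for a, b in zip(recent, recent[1:])]
    mean = sum(log_returns) / len(log_returns)
    var = sum((r - mean) ** 2 for r in log_returns) / (len(log_returns) - 1)
    sigma = math.sqrt(var * TRADING_DAYS)
    # Under the model, the log-return drift is alpha - sigma^2 / 2.
    alpha = mean * TRADING_DAYS + 0.5 * sigma ** 2
    return alpha, sigma


def should_hold(prices, window=60, r_f=0.02, k=1.0):
    """True if the stock is within the boundary on the latest day."""
    alpha, sigma = estimate_alpha_sigma(prices, window)
    dt = 1.0 / TRADING_DAYS
    within_barrier = prices[-1] <= prices[-2] * (1.0 + k) * math.exp(r_f * dt)
    return alpha - r_f > 0.5 * sigma ** 2 and within_barrier


# Toy price series with a steady upward drift (not real market data)
toy_prices = [100.0 * math.exp(0.002 * i + (0.001 if i % 2 else 0.0)) for i in range(80)]
print(should_hold(toy_prices))
```

Applying `should_hold` to each day's price history yields the sequence of buy/hold and sell decisions; transaction costs would be charged on each change of position.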

Over the life of a stock, each time it meets the statistical arbitrage condition a new run of the strategy starts, and the run lasts until the condition fails. Thus, more than one run of the strategy occurs during the backtest period.
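Given a daily sequence of hold/no-hold decisions, the separate runs of the strategy can be recovered as maximal stretches of consecutive holding days. A minimal sketch, with a hypothetical helper name:

```python
def split_into_episodes(signals):
    """Return (start, end) index pairs for each run of True values.

    `end` is exclusive, so signals[start:end] are all True.
    """
    episodes = []
    start = None
    for i, held in enumerate(signals):
        if held and start is None:
            start = i  # a new run of the strategy begins
        elif not held and start is not None:
            episodes.append((start, i))  # the run ends
            start = None
    if start is not None:
        episodes.append((start, len(signals)))  # run still open at the end
    return episodes


print(split_into_episodes([False, True, True, False, True]))
# prints [(1, 3), (4, 5)]
```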

### Results on Single Stock

The strategy performs well on sz000001, sh600000, and sh600001.

<img src= './demo/sz000001.png'>

<img src= './demo/sh600000.png'>

<img src= './demo/sh600001.png'>

The strategy performs poorly on sh600004.

<img src= './demo/sh600004.png'>

### Conclusion

The estimation of the stocks' $\alpha$s and $\sigma$s can be improved, and the parameters of the model can be optimized.

Further, it is possible to form a portfolio that selects better-performing stocks, reaching higher profit and reducing the probability of loss by decreasing the variance and avoiding stocks like sh600004.
