# Pairs Trading with the Distance Method (DM): Theory

## References:
- Distance Approach in Pairs Trading: Part I. https://hudsonthames.org/distance-approach-in-pairs-trading-part-i/
- Introduction to Distance Approach in Pairs Trading: Part II. https://hudsonthames.org/introduction-to-distance-approach-in-pairs-trading-part-ii/
- ArbitrageLab Presentation by Illya Barziy. https://docs.google.com/presentation/d/1oFpv7OUi3W9F2D30rEQRy_Ra5ejLtlCMnk9nTJnMDr4/edit#slide=id.gb6709680a6_0_273
- Distance Approach in Pairs Trading - Part 1. https://docs.google.com/presentation/d/1KrBUsROvx6aeFrsVbI1CgwIcH-VTD4xWVSlQKnknvR4/edit#slide=id.gd7dda458c4_0_232
- Pairs trading. Pair selection. Distance (Part 1). https://medium.com/@financialnoob/pairs-trading-pair-selection-distance-5ac4aeef0de0
- Distance Approach in Pairs Trading - Part 2. https://docs.google.com/presentation/d/1YjEHkEVn9T9K8UlWfO63qLU2qgU63cHxnO8xGFpxpHs/edit#slide=id.p1
- Statistical arbitrage pairs trading strategies: Review and outlook.
- Introduction to the Hurst exponent — with code in Python. https://towardsdatascience.com/introduction-to-the-hurst-exponent-with-code-in-python-4da0414ca52e

## Procedure:
1. Security selection
2. Price normalization
3. Euclidean distance calculation (can be replaced with an alternative distance measure)
4. Pairs selection
5. Position entry and exit rules


### Data Preparation:
- 1962-2002 US liquid stocks

### Pairs Formation

#### 1. Normalization
- Formula: $P_{normalized}=\frac{P-\min(P)}{\max(P)-\min(P)}$
- Effect: rescales each stock's price series to the range $[0,1]$, so that the prices of different stocks become comparable
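
The normalization above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `min_max_normalize`, its optional `p_min`/`p_max` arguments, and the sample prices are all made up for this example. It uses plain Python lists to stay dependency-free; in practice a NumPy or pandas version would be more typical.

```python
def min_max_normalize(prices, p_min=None, p_max=None):
    """Rescale a price series with min-max normalization.

    If p_min / p_max are omitted, the series' own extremes are used,
    which maps the series onto [0, 1].
    """
    p_min = min(prices) if p_min is None else p_min
    p_max = max(prices) if p_max is None else p_max
    if p_max == p_min:
        raise ValueError("a constant price series cannot be normalized")
    return [(p - p_min) / (p_max - p_min) for p in prices]


# Made-up prices, for illustration only.
formation_prices = [10.0, 12.0, 11.0, 15.0, 14.0]
normalized = min_max_normalize(formation_prices)
```

Passing explicit `p_min` and `p_max` allows the same scaling to be reused on another window of prices, in which case values outside $[0,1]$ can occur.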

#### 2. Distance Calculation for Normalized Prices
- Formula: $SSD=\sum_{t=1}^{N}(P_t^1-P_t^2)^2$
- Select the top $n$ pairs with the smallest SSD; typically $n=20$
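
A possible sketch of the SSD ranking, assuming the normalized prices are held in a dictionary keyed by ticker. The helper names `ssd` and `closest_pairs`, the tickers, and the numbers are hypothetical and only serve the illustration.

```python
from itertools import combinations


def ssd(series_a, series_b):
    """Sum of squared differences between two normalized price series."""
    if len(series_a) != len(series_b):
        raise ValueError("series must have the same length")
    return sum((a - b) ** 2 for a, b in zip(series_a, series_b))


def closest_pairs(normalized, n=20):
    """Return the n pairs with the smallest SSD as ((ticker_a, ticker_b), ssd)."""
    distances = {
        (x, y): ssd(normalized[x], normalized[y])
        for x, y in combinations(sorted(normalized), 2)
    }
    return sorted(distances.items(), key=lambda item: item[1])[:n]


# Made-up normalized prices for three hypothetical tickers.
normalized_prices = {
    "AAA": [0.0, 0.5, 1.0, 0.75],
    "BBB": [0.0, 0.25, 1.0, 0.5],
    "CCC": [1.0, 0.5, 0.0, 0.25],
}
top_pairs = closest_pairs(normalized_prices, n=2)
```

With $m$ stocks this evaluates all $m(m-1)/2$ combinations, which is fine for a sketch but worth vectorizing for a large universe.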

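#### 3. Historical Spread Volatility (Standard Deviation)
- Formula: $\sigma=\sqrt{\frac{\sum_{i=1}^{N}(x_i-\mu)^2}{N-1}}$, where $x_i$ are the spread values in the formation period and $\mu$ is their mean
- Multiples of $\sigma$ are used as the threshold for trading signals during the trading period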
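
As a small illustration, the sample standard deviation (with the $N-1$ denominator) of a formation-period spread can be computed with the standard library. The `spread` helper and the input series are assumptions made for this example.

```python
from statistics import stdev


def spread(series_a, series_b):
    """Element-wise difference between two normalized price series."""
    return [a - b for a, b in zip(series_a, series_b)]


# Made-up normalized prices of one pair over the formation period.
formation_spread = spread([0.0, 0.5, 1.0, 0.75], [0.0, 0.25, 1.0, 0.5])

# statistics.stdev uses the N-1 denominator, matching the formula above.
sigma = stdev(formation_spread)
```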

### Trading Period

1. **Normalize** the prices in the trading period using the $min(P)$ and $max(P)$ from the formation period
2. Calculate the `pairs spreads` in the trading period
  - the spread (portfolio value) series of a pair is the difference between the normalized price series of its two stocks.
3. Generate trading signals
  - if spread > $2\sigma$ → **sell** signal (short the spread)
  - if spread < $-2\sigma$ → **buy** signal (long the spread)
  - when the spread crosses 0 → **close** the open position
4. Strategy application details:
  - Time period: 12-month formation (training) period, followed by a 6-month trading (test) period
  - Threshold: $2\sigma$
  - Number of pairs: $n=20$
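
The signal rules above can be sketched as a small state machine. This is only one way to read them: it assumes a single open position per pair, encodes a short spread as `-1` and a long spread as `+1`, and treats "crosses 0" as the spread reaching or passing zero from the side of the open position. The function name and the sample spread are hypothetical.

```python
def generate_positions(spread, sigma, k=2.0):
    """Map a spread series to positions: -1 short, +1 long, 0 flat."""
    positions = []
    position = 0
    for value in spread:
        if position == 0:
            if value > k * sigma:
                position = -1      # spread too wide: sell the spread
            elif value < -k * sigma:
                position = 1       # spread too negative: buy the spread
        elif position == -1 and value <= 0:
            position = 0           # spread came back to zero: close short
        elif position == 1 and value >= 0:
            position = 0           # spread came back to zero: close long
        positions.append(position)
    return positions


# Made-up trading-period spread; sigma would come from the formation period.
trading_spread = [0.0, 0.1, 0.25, 0.15, -0.01, -0.3, -0.1, 0.02]
positions = generate_positions(trading_spread, sigma=0.1)
```

Lowering `k` makes the strategy trade more often, which is the "signal generation sensitivity" knob among the strategy variations.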

### Strategy Variations 

- Duration of the pairs formation and trading periods
- Number of pairs chosen
- Signal generation sensitivity
- Weights in the *portfolio* of pairs
- Weights of the *assets* in each pair


## Alternative Distance Approaches with Different Pair Selection Methods

1. Pairs with the smallest distance (the basic approach with $SSD$)
2. Pairs within the **same industry group**: SSDs are calculated only among stocks of the same industry group, then the $n$ closest pairs are selected.
  - Adding this criterion on top of the basic approach yields more profitable pairs.
3. Choose the top $n$ pairs with the highest **number of zero-crossings**
  - Zero-crossing: a point where the spread crosses the x-axis, i.e. changes sign
  - Intuition: 
    - it measures how frequently the two assets *diverge and converge*
    - suitable pairs track each other and exhibit frequent deviations that are *reversed* under the force of arbitrage.
  - Statistical expression: $\text{PairsReturn}_i=\text{Constant}+a_1\,\text{TimeTrend}+a_2\,SSD_i+a_3\,SSD_i^2+a_4\log(\text{ZeroCrossings})+a_5\,\text{SameIndustryFlag}+a_6\,\text{IndustryVolatility}_i+a_7(\text{IndustryVolatility}_i)^2+e_i$
    - $a_4$: the coefficient on the number of zero-crossings is *statistically significant* and has a **positive effect** on pairs return. 
    - The more zero-crossings a pair has, the higher its profitability.
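
Counting zero-crossings amounts to counting sign changes of the spread. The sketch below assumes exact zeros are skipped rather than counted as a crossing on their own; that convention, the function name, and the sample data are choices made for this example.

```python
def count_zero_crossings(spread):
    """Count how many times the spread changes sign (exact zeros are skipped)."""
    signs = [1 if value > 0 else -1 for value in spread if value != 0]
    return sum(1 for prev, cur in zip(signs, signs[1:]) if prev != cur)


# Made-up spread series for illustration.
example_spread = [0.1, -0.05, -0.02, 0.0, 0.03, 0.04, -0.01]
crossings = count_zero_crossings(example_spread)
```

Pairs could then be ranked by this count in descending order, analogous to ranking by SSD in ascending order.
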
4. Pairs with a higher historical **standard deviation** of the price spread
  - **Limitation** of the SSD method:
    - Formula: $\overline{SSD}_{ij}=\frac{1}{T}\sum_{t=1}^T(P_{it}-P_{jt})^2=Var(P_{it}-P_{jt})+\left[\frac{1}{T}\sum_{t=1}^T(P_{it}-P_{jt})\right]^2$
      - Rearranged for the spread variance: $Var(P_{it}-P_{jt})=\frac{1}{T}\sum_{t=1}^T(P_{it}-P_{jt})^2-\left[\frac{1}{T}\sum_{t=1}^T(P_{it}-P_{jt})\right]^2$
    - Intuition: 
      - Minimizing SSD means minimizing <u>the sum of the spread variance and the squared spread mean</u>.
      - But for *higher profit potential* we want the spread variance to be high, so the pairs with the lowest SSD are not necessarily the most profitable ones.
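
The decomposition of the average SSD into spread variance plus squared spread mean can be checked numerically. The snippet uses made-up normalized prices and the population variance (denominator $T$), which is the variance convention under which the identity holds exactly.

```python
from statistics import fmean, pvariance

# Made-up normalized prices of two stocks.
price_i = [0.10, 0.40, 0.55, 0.80, 1.00]
price_j = [0.00, 0.30, 0.60, 0.70, 0.95]

pair_spread = [a - b for a, b in zip(price_i, price_j)]

# Left-hand side: average squared spread.
avg_ssd = fmean([s ** 2 for s in pair_spread])

# Right-hand side: population variance of the spread plus squared mean spread.
decomposed = pvariance(pair_spread) + fmean(pair_spread) ** 2
```

Both quantities agree up to floating-point rounding, which shows that a low SSD forces both the spread variance and the mean spread to be small.
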
5. Select pairs with a sufficiently short **half-life of mean reversion**
6. Select only pairs with **Hurst exponent** < 0.5
  - Hurst exponent ($H$):
    - a measure of long-term memory and of how much a series deviates from a <u>random walk</u>.
    - indicates whether a series follows a <u>trending pattern</u> or a <u>mean-reverting pattern</u>
    - range $[0,1]$; the higher $H$, the more likely future values are to continue in the direction of past values.
      - $H < 0.5$: a **mean-reverting** series
         - anti-persistent, e.g. a high value tends to be followed by a low value.
         - the closer $H$ is to $0$, the stronger the mean-reversion pattern.
      - $H = 0.5$: geometric **random walk**
      - $H > 0.5$: a **trending** series
         - persistent, e.g. a high value tends to be followed by a higher one.
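
The two mean-reversion filters above (half-life and Hurst exponent) can be sketched as follows. Both estimators are common textbook choices rather than anything prescribed by these notes: the half-life uses the approximation $-\ln(2)/\lambda$, where $\lambda$ is the OLS slope of the spread change on the lagged spread, and the Hurst exponent is estimated as the log-log slope of the standard deviation of lagged differences against the lag. All function names, the lag range, and the simulated data are assumptions for illustration.

```python
import math
import random


def ols_slope(x, y):
    """Slope b of the least-squares line y = a + b * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def half_life(spread):
    """Approximate half-life of mean reversion, in periods."""
    lagged = spread[:-1]
    changes = [cur - prev for prev, cur in zip(spread[:-1], spread[1:])]
    lam = ols_slope(lagged, changes)
    if lam >= 0:
        return math.inf  # no mean reversion detected
    return -math.log(2) / lam


def hurst_exponent(series, max_lag=20):
    """Estimate H from how the std of lagged differences scales with the lag."""
    log_lags, log_stds = [], []
    for lag in range(2, max_lag):
        diffs = [series[t + lag] - series[t] for t in range(len(series) - lag)]
        mean = sum(diffs) / len(diffs)
        std = math.sqrt(sum((d - mean) ** 2 for d in diffs) / len(diffs))
        log_lags.append(math.log(lag))
        log_stds.append(math.log(std))
    return ols_slope(log_lags, log_stds)


# Deterministic decaying spread: every change is -0.5 times the previous level.
decaying = [0.5 ** t for t in range(20)]
hl = half_life(decaying)

# Simulated data: white noise (strongly mean-reverting) and its running sum
# (a random walk).
rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(5000)]
walk, level = [], 0.0
for step in noise:
    level += step
    walk.append(level)

h_noise = hurst_exponent(noise)
h_walk = hurst_exponent(walk)
```

On the simulated data the noise series should give an estimate close to 0 and the random walk one close to 0.5; a pair filter would keep spreads whose estimate is below 0.5 and whose half-life is short relative to the trading period.
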
7. Select the top $n$ pairs with the highest **Pearson correlation coefficient**
  - Return divergence formula (Pearson correlation on returns): $D_{ijt}=\beta(R_{it}-R_f)-(R_{jt}-R_f)$
    - $D$: return divergence
    - $\beta$: regression coefficient of stock $i$'s monthly return on its pair's return.
    - $R_f$: risk-free rate
  - Portfolio construction: sort the stocks in descending order of the previous month's return divergence ($D$) and pick the top $n$ stocks.
  - Portfolio Formation:
    1. Data Preprocessing
      - Data split:
         - Training (formation period) set: price data from year $t-4$ to year $t$ (5 years in total)
         - Testing (trading period) set: price data for year $t+1$
      - Data manipulation
        - to get the pairwise correlations of $m$ stocks, $m(m-1)/2$ correlations must be computed
        - to reduce the computational burden, we use <u>monthly data</u>
    2. Pairs Portfolio Formation
      - Find the *top $n$* pairs with the highest correlation coefficients
      - Weight allocation (3 ways):
         - Simple OLS on the returns of the pairs portfolio:
           - Formula: $R_{jt}=\alpha+\beta R_{it}$; $\beta$ is obtained in this step by linear regression
         - Equal-weighted portfolio: uses the <u>equal-weighted average return</u> of the top $n$ pairs of stocks
         - Correlation-weighted portfolio: 
           - Formula: $w_k=\frac{\rho_{k}}{\sum_{i=1}^n \rho_i}$
             - $k$: index of a stock in the *benchmark portfolio* built for each stock; $\rho_k$ is its correlation coefficient
    3. Trading Signal Generation
      - Assumption: if an *individual stock's return* <u>deviates</u> from its *pairs portfolio return* (as measured by $D$), the divergence should be *reversed* in the following month of the trading period (year $t+1$). 
      - Trading signal:
         1. Sort the stocks in descending order of the previous month's $D_{ijt}$
         2. Given the percentages of stocks to long and short: 
            - top $p$% of the sorted stocks → long 
            - bottom $q$% of the sorted stocks → short
         3. Special case:
            - *Dollar-neutral* portfolio construction: long decile 10 and short decile 1, then hold for one month.
              - Dollar-neutral definition: the dollar values of the long side and the short side completely offset each other, so the net dollar notional exposure is 0.
  - Comparison with the basic distance method
     1. Pearson correlation (PC) is less restrictive than the SSD method
       - PC relies on the `Variance of Return Spread`
         - a higher Pearson correlation implies a **lower** variance of the return spread.
         - Formula: $Var(R_{it}-R_{jt})=Var(R_{it})+Var(R_{jt})-2r(R_{it},R_{jt})\sqrt{Var(R_{it})}\sqrt{Var(R_{jt})}$
           - return correlation: $r(R_{it},R_{jt})$
       - SSD relies on the `Variance of Price Spread`
         - minimizing SSD requires **lowering** the variance of the price spread
           - if the pair is <u>perfectly correlated</u> and the two price time series have the <u>same variance</u>, the variance of the price spread → 0.
           - As a result, the SSD method selects pairs whose stock prices have <u>similar, low variance and high correlation</u>, so its selection metric is stricter than that of PC.
         - Formula: $Var(P_{it}-P_{jt})=Var(P_{it})+Var(P_{jt})-2r(P_{it},P_{jt})\sqrt{Var(P_{it})}\sqrt{Var(P_{jt})}$
           - price correlation: $r(P_{it},P_{jt})$
    2. Pearson correlation provides higher-level information
      - return divergences are mostly caused by **idiosyncratic movements** of stock $i$, which have the potential to *reverse*.
      - **Quasi-multivariate pairs trading** ($P_{it}=\sum_{k=1}^nw_kP_{kt}$) leads to <u>higher and more robust</u> annual excess returns than univariate pairs trading ($P_{jt}$)
         - Notation:
           - $P_{kt}$: the price time series entering the linear combination
           - $w_k$: the weights are calculated on a rolling basis; every 10 days they are re-estimated in one of the three ways mentioned above, using the past 2 years of returns.
         - Trading signal: 
           - open: the spread between the 2 price time series exceeds the threshold $k$
           - close: the spread falls below the threshold $k$
         - This method <u>reduces high transaction costs</u>.
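
The pieces of the Pearson-correlation approach above (correlation weights, return divergence $D$, and the long/short split on the sorted stocks) can be sketched as below. This is an illustration under stated assumptions: $\beta$ is taken as given rather than estimated, `r_i` and `r_j` simply mirror $R_{it}$ and $R_{jt}$ in the divergence formula, and all function names, tickers, and numbers are made up.

```python
def correlation_weights(correlations):
    """w_k = rho_k / sum(rho): weights proportional to correlation."""
    total = sum(correlations)
    return [rho / total for rho in correlations]


def return_divergence(beta, r_i, r_j, r_f=0.0):
    """D = beta * (R_i - R_f) - (R_j - R_f), as in the formula above."""
    return beta * (r_i - r_f) - (r_j - r_f)


def long_short_split(divergences, long_frac=0.1, short_frac=0.1):
    """Sort stocks by D (descending); long the top share, short the bottom."""
    ranked = sorted(divergences, key=divergences.get, reverse=True)
    n_long = max(1, int(len(ranked) * long_frac))
    n_short = max(1, int(len(ranked) * short_frac))
    return ranked[:n_long], ranked[-n_short:]


# Made-up correlations of three stocks in one benchmark portfolio.
weights = correlation_weights([0.9, 0.6, 0.5])

# Made-up inputs for a single stock's previous-month divergence.
d_example = return_divergence(beta=1.2, r_i=0.05, r_j=0.02, r_f=0.01)

# Made-up previous-month divergences for five hypothetical tickers.
last_month_d = {"A": 0.03, "B": -0.02, "C": 0.01, "D": 0.05, "E": -0.04}
longs, shorts = long_short_split(last_month_d, long_frac=0.2, short_frac=0.2)
```

Setting `long_frac` and `short_frac` to 0.1 corresponds to the decile-based special case; giving both legs equal dollar amounts would make the portfolio dollar neutral.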

