# Modeling Stock Movement and Algorithmic Trading
Brian Bahmanyar

***

## Introduction

* Understanding the movement of markets and stocks is fundamentally a very difficult problem.
    * Human behavior is unpredictable
* The goal of this project is to outperform the Standard and Poor's 500, which historically yields an average annual return of about 10%.
    * To do this we exploit patterns in stock movement

## Exploring a Single Stock

In [None]:
import pandas as pd
import sys

sys.path.append('./src/')
from plots import *
from random_walk_forcast import *
from pairs_trade import *

In [None]:
%matplotlib inline

In [None]:
tech = pd.read_csv('data/tech_bundle.csv', index_col=0)
tech.index = pd.to_datetime(tech.index)

In [None]:
plot_stocks(tech.index, [tech['AMZN']], ['Amazon'], label_annually=False )

* We see an obvious general positive trend, but we want to get a more granular look at returns

## Modeling Daily Returns

* Consider the daily return ratio $\frac{y_t}{y_{t-1}}$, where $y_t$ is the price on day $t$

Let's use maximum likelihood estimation (MLE) to fit an appropriate distribution.
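
As a rough illustration of what this fit involves, the sketch below computes the return ratios and the log-normal MLE by hand. The names `daily_return_ratio` and `lognormal_mle` are hypothetical stand-ins, not the project's `get_daily_return_ratio` and `mle_log_norm`, whose source is not shown here. It assumes the fit returns $(\mu, \sigma^2)$, which matches the `np.sqrt(fit[1])` call used later in this notebook. The toy prices are made up for the example.

```python
import numpy as np

def daily_return_ratio(prices):
    """Return y_t / y_{t-1} for every day after the first (hypothetical helper)."""
    prices = np.asarray(prices, dtype=float)
    return prices[1:] / prices[:-1]

def lognormal_mle(ratios):
    """MLE of (mu, sigma^2) for log-normal data.

    If Z ~ LogN(mu, sigma^2) then log(Z) ~ N(mu, sigma^2), so the estimates
    are the sample mean and the biased (divide by n) variance of log(Z).
    """
    logs = np.log(np.asarray(ratios, dtype=float))
    mu = logs.mean()
    sigma2 = ((logs - mu) ** 2).mean()
    return mu, sigma2

# Toy prices, for illustration only
prices = np.array([100.0, 102.0, 101.0, 105.0])
ratios = daily_return_ratio(prices)
mu_hat, sigma2_hat = lognormal_mle(ratios)
```

Because the log ratios telescope, `mu_hat` here equals `log(105 / 100) / 3`, the average daily log growth over the three steps.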

In [None]:
plt.figure(figsize=(15,5));
plt.title("Amazon's Daily Return Ratio");
plt.xlabel('Return Ratio');
plt.ylabel('Count');
sns.distplot(get_daily_return_ratio(tech['AMZN'].values), kde=False);

In [None]:
fit = mle_log_norm(get_daily_return_ratio(tech['AMZN'].values))
fit

In [None]:
plt.figure(figsize=(15,5));
plt.title(r'$LogN(0.0017, 0.00054)$');
plt.xlabel('Return Ratio');
plt.ylabel('Count');
sns.distplot(np.random.lognormal(fit[0], np.sqrt(fit[1]), size=1000000));

#### Simulating a Forecast

In [None]:
plot_simulated_forcast(tech['AMZN'], window=100, ahead=135, train_on=650, n=10);
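
One way such a simulation could work is sketched below: draw independent log-normal return ratios and compound them forward from the last observed price. This is a guess at the idea, not the code behind `plot_simulated_forcast`. The function `simulate_paths` is hypothetical, the reading of `n` as the number of simulated paths is an assumption, and the parameter values are the ones shown in the plot title above.

```python
import numpy as np

def simulate_paths(last_price, mu, sigma2, ahead, n, seed=0):
    """Simulate n random-walk price paths, each `ahead` days long (hypothetical helper).

    Each day's return ratio is drawn independently from LogN(mu, sigma2),
    and prices are the cumulative product of those ratios.
    """
    rng = np.random.default_rng(seed)
    ratios = rng.lognormal(mean=mu, sigma=np.sqrt(sigma2), size=(n, ahead))
    return last_price * np.cumprod(ratios, axis=1)

# Illustrative starting price; mu and sigma^2 taken from the fitted plot title
paths = simulate_paths(last_price=100.0, mu=0.0017, sigma2=0.00054, ahead=135, n=10)
```

Each row of `paths` is one possible future; the spread across rows widens with the horizon.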

#### Deriving a k-step-ahead forecast:

* __Expected Value__

$\mathrm{E}[LogN(\mu, \sigma^2)] = e^{\mu+\sigma^2/2}$

$\frac{Y_t}{Y_{t-1}} \sim LogN(\mu, \sigma^2)$ $\Rightarrow$   $\hat{Y_{t+1}} = Y_{t} \cdot \mathrm{E}[LogN(\mu, \sigma^2)]$ 

$\hat{Y_{t+k}} = Y_{t+(k-1)} \cdot \mathrm{E}[LogN(\mu, \sigma^2)]$ $\Rightarrow$ $\hat{Y_{t+k}} = Y_t \cdot \mathrm{E}[LogN(\mu, \sigma^2)]^k$
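
The point forecast above can be sketched in a few lines. The function name `expected_forecast` is hypothetical, and the inputs are illustrative rather than the project's fitted values.

```python
import numpy as np

def expected_forecast(last_price, mu, sigma2, k):
    """k-step-ahead expected price: Y_t * E[LogN(mu, sigma2)]^k (hypothetical helper)."""
    mean_ratio = np.exp(mu + sigma2 / 2.0)  # E[LogN(mu, sigma^2)]
    return last_price * mean_ratio ** k

# Expected price 30 days ahead from an illustrative price of 100
forecast_30 = expected_forecast(100.0, mu=0.0017, sigma2=0.00054, k=30)
```

With $\mu > 0$ the forecast grows geometrically in $k$, which is why the expected path curves upward rather than following a straight line.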

* __Variance__

$\mathrm{V}[LogN(\mu, \sigma^2)] = (e^{\sigma^2}\!\!-1) e^{2\mu+\sigma^2}$

Suppose $Z$ $\sim$  $LogN(\mu, \sigma^2)$

$\mathrm{V}[Y_{t+k}|Y_t] = \mathrm{E}[Y^2_{t+k}|Y_t] - \mathrm{E}[Y_{t+k}|Y_t]^2$

$\mathrm{V}[Y_{t+k}|Y_t] = Y^2_t \cdot \mathrm{E}[Z^2_{t+1}]^k - Y^2_t \cdot \mathrm{E}[Z_{t+1}]^{2k}$

$\mathrm{V}[Y_{t+k}|Y_t] = Y^2_t \cdot [\mathrm{E}[Z^2_{t+1}]^k - \mathrm{E}[Z_{t+1}]^{2k}]$

$\mathrm{V}[Y_{t+k}|Y_t] = Y^2_t \cdot [(\mathrm{V}[Z_{t+1}] + \mathrm{E}[Z_{t+1}]^{2})^k - \mathrm{E}[Z_{t+1}]^{2k}]$ defined as a function below

In [None]:
plot_expected_forcast(tech['AMZN'], window=30, ahead=150, train_on=635, error=2)

## Pairs Trade Algorithm

In [None]:
pairs = pd.read_csv('data/pairs_bundle.csv', index_col=0)
pairs.index = pd.to_datetime(pairs.index)

In [None]:
plot_pair(pairs['CVX'], pairs['XOM'], ['Chevron','Exxon'])

In [None]:
positions_gas = identify_positions(pairs['CVX/XOM'], 1)
plot_ratio(pairs['CVX/XOM'], 'Chevron, Exxon', deviations=[1], positions=positions_gas)

In [None]:
import numpy as np  # made explicit; previously relied on the star imports above

def back_trade(init_investment, numer_prices, denom_prices, ratio, positions, swap_count=50):
    """
    Back trades with the given positions.

    Args: init_investment (int)----initial total investment
          numer_prices (ndarray)---the price series of the numerator stock of the ratio
          denom_prices (ndarray)---the price series of the denominator stock of the ratio
          ratio (ndarray)----------the ratio of the two series
          positions (list of maps)-the positions to trade on
          swap_count (int)---------the number of shares to swap at a given open position
    Returns: (map) the result object
    """
    cur_portfolio_value = init_investment

    for position in positions:
        if all(ratio[position['open']] > ratio.mean()):
            openings = len(position['open'])
            cur_portfolio_value += np.sum(swap_count*numer_prices[position['open']])
            cur_portfolio_value -= np.sum(swap_count*denom_prices[position['open']])

            cur_portfolio_value -= openings*swap_count*numer_prices[position['close']]
            cur_portfolio_value += openings*swap_count*denom_prices[position['close']]
        elif all(ratio[position['open']] < ratio.mean()):
            openings = len(position['open'])
            cur_portfolio_value -= np.sum(swap_count*numer_prices[position['open']])
            cur_portfolio_value += np.sum(swap_count*denom_prices[position['open']])

            cur_portfolio_value += openings*swap_count*numer_prices[position['close']]
            cur_portfolio_value -= openings*swap_count*denom_prices[position['close']]
    
    return {'init_investment': init_investment,
            'net_gain': cur_portfolio_value - init_investment,
            'net_gain/year': (cur_portfolio_value - init_investment) / (len(ratio) / 252) } # 252 trade days / year

In [None]:
back_trade(10000, pairs['CVX'].values, pairs['XOM'].values, pairs['CVX/XOM'].values, positions_gas, 100)