**Disclosure:** I am not responsible for any losses that may occur if you choose to employ this rebalancing strategy in a live trading environment. The code and accompanying commentary are strictly for informational purposes. Nothing in this document should be construed as a recommendation for or against a financial services product, trading strategy, portfolio strategy, or any specific asset.


---


My bases for the LSTM module:
- Github: [Google Stock Price Prediction Using RNN-LSTM](https://github.com/ms723528/Google-Stock-Price-Prediction-Using-RNN---LSTM)
- Github: [CFA Institute Webinar: Neural Networks in Action (Apr 2020)](https://github.com/kritim13/cfa_neural_networks_in_action_2020)

My basis for the portfolio rebalancing module:
- [Portfolio Selection with Graph Algorithms and Deep Learning](https://www.linkedin.com/pulse/portfolio-selection-graph-algorithms-deep-learning-maya-benowitz/)
- Github: [Hedgecraft](https://github.com/mayabenowitz/Hedgecraft)

Generally helpful in constructing the Alpaca API modules:
- Youtube: [Part Time Larry](https://www.youtube.com/c/parttimelarry/videos)
- [Alpaca API Documentation](https://alpaca.markets/docs/api-documentation/api-v2/)


---


**How it Works: Modules and Dataflow**

Put simply, the entire Colab notebook is composed of four modules, each performing one primary function:
1. Requesting end-of-day price data for your stocks via Alpaca's API.
2. Running the price data through an LSTM model to predict the EOD price for the next trading day.
3. Running the historical + predicted price data for all your stocks through a portfolio optimization algorithm (Hedgecraft) to obtain an optimal portfolio allocation.
4. Buying/selling discrete stock units through Alpaca's API to realize this optimal portfolio allocation.


---


**How to Run the Algorithm in 8 Steps**
1. Apply for an Alpaca account (approval may take 1-2 days).
2. After you're approved, set up Paper Trading in your Alpaca account.
3. Copy the code from the Google Colab notebook provided.
4. In cell 2, assign your Alpaca ID and Secret Key to "APCA_API_KEY_ID" and "APCA_API_SECRET_KEY", respectively.
5. In cell 2, input the tickers you want in your portfolio into "target_list".
6. Make sure it's between 9:30 and 16:00 Eastern Time on a trading day.
7. Press Ctrl+F9 or click "Runtime" -> "Run All" to run the notebook.
8. Check your Alpaca Paper Trading account online.

Done.

(Note: You can set the LSTM "epochs" to "1" if you just want to run the notebook to see that it works. If you're running the Colab notebook with the default number of epochs, I recommend enabling the GPU hardware accelerator in "Runtime" -> "Change runtime type" to save time.)
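
Step 6 can be checked in code before running everything. The snippet below is a minimal sketch, assuming Python 3.9+ for `zoneinfo`; the helper name `in_regular_session` is hypothetical and not part of the notebook. It only looks at the weekday and the clock, so it does not know about market holidays. The Alpaca clock that the notebook already queries through `api.get_clock()` is likely the more reliable source.

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")

def in_regular_session(moment: datetime) -> bool:
    """Return True if `moment` is a weekday between 9:30 and 16:00 Eastern.

    Hypothetical helper: it ignores market holidays and early closes.
    """
    local = moment.astimezone(EASTERN)
    is_weekday = local.weekday() < 5  # Monday=0 ... Friday=4
    return is_weekday and time(9, 30) <= local.time() < time(16, 0)

# Check the current moment
print(in_regular_session(datetime.now(EASTERN)))
```

If this prints `False`, market orders submitted by the last cell would not fill right away.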


---


[Link](https://drive.google.com/file/d/1l5tMJBX5ZEGcc6chH0cvjzITcmdZU9jp/view?usp=sharing) to Google Colab environment.

In [None]:
!pip3 install dcor
!pip3 install networkx
!pip3 install PyPortfolioOpt
!pip3 install alpaca-trade-api
!pip3 install trading-calendars


import alpaca_trade_api as tradeapi
import requests, json

import dcor
import networkx as nx
from pypfopt import discrete_allocation
from pypfopt.expected_returns import mean_historical_return
from pypfopt.efficient_frontier import EfficientFrontier

import sklearn
from sklearn.metrics import mean_squared_error

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout

import pandas as pd
import numpy as np
from datetime import datetime
from datetime import date
import pytz
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
import warnings
warnings.filterwarnings("ignore")

In [None]:
#Put your target companies in "target_list"
#Alpaca's API may impose an upper bound on the number of tickers when requesting datasets of daily prices
target_list = ['AMZN', 'MA', 'GOOG', 'ADBE', 'MSFT', 'AAPL', 'V', 'DIS', 'DPZ', 'PYPL', 'FB', 'COST']

#Lower bound to daily price data you will be requesting from Alpaca
start_date = '2018-01-01'

#Input your Alpaca ID & Secret Key so that you can access Alpaca's API endpoint from this notebook
#Paste each value between the quotes
APCA_API_KEY_ID = ""
APCA_API_SECRET_KEY = ""

#This URL is used as a base url for accessing different endpoints
#DO NOT change BASE_URL to the live Alpaca URL unless you know what you are doing
#I am not responsible for any trading losses that may occur if you switch from Paper Trading to Live Trading
BASE_URL = 'https://paper-api.alpaca.markets'

#Change 'US/Eastern' to your own timezone
now = datetime.now(pytz.timezone('US/Eastern'))

#The number of days you want the LSTM model to use as the lookback period
lookback = 60

#Keep training_pct as 1.0 to use 100% of daily prices as training data
#This is a naive form of an online LSTM
training_pct = 1.0
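
If you would rather not paste the key pair into the notebook, one option is to read it from environment variables. This is a minimal sketch, assuming the variables are named like the notebook's own `APCA_API_KEY_ID` and `APCA_API_SECRET_KEY`; the helper `load_alpaca_credentials` is hypothetical and is not used elsewhere in the notebook.

```python
import os

def load_alpaca_credentials(env=None):
    """Read the Alpaca key pair from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    key_id = env.get("APCA_API_KEY_ID", "")
    secret_key = env.get("APCA_API_SECRET_KEY", "")
    if not key_id or not secret_key:
        raise ValueError("Set APCA_API_KEY_ID and APCA_API_SECRET_KEY first")
    return key_id, secret_key

# Demonstration with a stand-in mapping instead of the real environment
demo_env = {"APCA_API_KEY_ID": "demo-id", "APCA_API_SECRET_KEY": "demo-secret"}
print(load_alpaca_credentials(demo_env))
```

Failing early with a clear message is easier to debug than an authentication error from the first API call.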

In [None]:
#############################################################################
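def df_distance_correlation(df_train):
    #Pairwise distance correlation between the columns (tickers) of df_train
    stocks = df_train.columns.tolist()
    df_train_dcor = pd.DataFrame(index=stocks, columns=stocks)

    #The matrix is symmetric, so only the upper triangle is computed
    for k, i in enumerate(stocks):
        v_i = df_train.loc[:, i].values
        for j in stocks[k:]:
            v_j = df_train.loc[:, j].values
            dcor_val = dcor.distance_correlation(v_i, v_j)
            df_train_dcor.at[i,j] = dcor_val
            df_train_dcor.at[j,i] = dcor_val

    return df_train_dcor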
#############################################################################
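def build_corr_nx(df_train):
    cor_matrix = df_train.values.astype('float')
    #1 - distance correlation: small values mean strongly dependent pairs
    sim_matrix = 1 - cor_matrix

    #from_numpy_matrix was removed in NetworkX 3.0; from_numpy_array replaces it
    G = nx.from_numpy_array(sim_matrix)
    stock_names = df_train.index.values
    G = nx.relabel_nodes(G, lambda x: stock_names[x])

    H = G.copy()

    #Drop self-loops and edges between weakly dependent pairs (distance correlation <= 0.325)
    for (u, v, wt) in G.edges.data('weight'):
        if u == v or wt >= 1 - 0.325:
            H.remove_edge(u, v)

    return H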
#############################################################################
def centrality_to_portfolio_weights(weights):
    for key, value in weights.items():
        weights[key] = 1/value
    norm = 1.0 / sum(weights.values())

    for key in weights:
        weights[key] = round(weights[key] * norm, 3)
    return weights
#############################################################################
def get_account():
    r = requests.get(ACCOUNT_URL, headers=HEADERS)

    return json.loads(r.content)
#############################################################################
def create_order(symbol, qty, side, order_type, time_in_force):
    #order_type avoids shadowing Python's built-in type()
    data = {
        "symbol": symbol,
        "qty": qty,
        "side": side,
        "type": order_type,
        "time_in_force": time_in_force
    }

    r = requests.post(ORDERS_URL, json=data, headers=HEADERS)

    return json.loads(r.content)
#############################################################################
def get_orders():
    r = requests.get(ORDERS_URL, headers=HEADERS)

    return json.loads(r.content)
#############################################################################
ACCOUNT_URL = "{}/v2/account".format(BASE_URL)
ORDERS_URL = "{}/v2/orders".format(BASE_URL)
HEADERS = {'APCA-API-KEY-ID': APCA_API_KEY_ID, 'APCA-API-SECRET-KEY': APCA_API_SECRET_KEY}
api = tradeapi.REST(APCA_API_KEY_ID, APCA_API_SECRET_KEY, base_url=BASE_URL)
account = api.get_account()
today = date.today()
clock = api.get_clock()
#############################################################################
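
To make the edge filter in `build_corr_nx` concrete, here is a minimal sketch in plain Python, without NetworkX. It converts each distance correlation to `1 - dcor` and keeps only pairs whose value stays below `1 - 0.325`, which is the threshold used in the function above. The helper name `strong_pairs`, the toy tickers, and the toy matrix are all made up for illustration.

```python
def strong_pairs(names, dcor_matrix, min_dcor=0.325):
    """List the pairs that would keep their edge (hypothetical helper)."""
    kept = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # skip the diagonal and mirrored pairs
            distance = 1 - dcor_matrix[i][j]
            if distance < 1 - min_dcor:
                kept.append((names[i], names[j], round(distance, 3)))
    return kept

# Toy symmetric distance-correlation matrix for three made-up tickers
names = ["AAA", "BBB", "CCC"]
toy_dcor = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.5],
    [0.2, 0.5, 1.0],
]
pairs = strong_pairs(names, toy_dcor)
print(pairs)
```

The AAA/CCC pair is dropped because its distance correlation of 0.2 is below the 0.325 cutoff.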

In [None]:
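#Download daily bars for each ticker and stack them into one long DataFrame
#Note: api.polygon exists only in alpaca-trade-api releases that still bundle the Polygon client
stuff = []
for ticker in target_list:
  bars = api.polygon.historic_agg_v2(ticker, 1, 'day', _from=start_date, to=today).df
  bars['Ticker'] = ticker
  stuff.append(bars)

#DataFrame.append was removed in pandas 2.0, so the frames are combined with pd.concat
empty = pd.concat(stuff)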


empty_close = empty.pivot(columns='Ticker', values='close')
empty_close_copy = empty_close.copy()

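#Chronological split; with training_pct = 1.0 the test set is empty
split_at = int(training_pct*len(empty_close))
data_training_raw, data_test_raw = empty_close.iloc[:split_at], empty_close.iloc[split_at:]
#Use the training rows only so that the windows line up with date_index
array_data_training = data_training_raw.values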

date_index = pd.DataFrame(data_training_raw.index)
for i in range(0, len(date_index)):
  date_index.iloc[i,0] = date_index.iloc[i,0].date()

date_index = date_index[lookback:]
date_index = date_index.reset_index().iloc[:,1:]
date_index.loc[len(date_index)] = clock.next_open.date()

In [None]:
historical_prices_plus_one = date_index

for c in range(0, len(target_list)):
    X_train = []
    y_train = []

    print(target_list[c])

    #Each sample is a window of `lookback` days across all tickers;
    #the label is the following day's close for ticker c
    for i in range(lookback, array_data_training.shape[0]):
        X_train.append(array_data_training[i-lookback:i])
        y_train.append(array_data_training[i, c])

    X_train, y_train = np.asarray(X_train).astype('float32'), np.asarray(y_train).astype('float32')

    #Four stacked LSTM layers with dropout, followed by a single-unit regression output
    model = Sequential()
    model.add(LSTM(units = 60, activation = 'tanh', return_sequences = True, input_shape = (X_train.shape[1], X_train.shape[2])))
    model.add(Dropout(0.5))
    model.add(LSTM(units = 60, activation = 'tanh', return_sequences = True))
    model.add(Dropout(0.5))
    model.add(LSTM(units = 80, activation = 'tanh', return_sequences = True))
    model.add(Dropout(0.5))
    model.add(LSTM(units = 120, activation = 'tanh'))
    model.add(Dropout(0.5))
    model.add(Dense(units = 1))

    model.compile(optimizer='sgd', loss = 'mean_squared_error')
    model.fit(X_train, y_train, epochs=1, batch_size=64, verbose=0)

    #Predict the next close from the most recent `lookback` days
    inputs = np.array(data_training_raw.tail(lookback))
    inputs_reshaped = inputs.reshape(1, inputs.shape[0], inputs.shape[1])

    y_pred = model.predict(inputs_reshaped)

    #Append the prediction to the historical closes so the column lines up with date_index
    prices_plus_pred = np.append(y_train, y_pred)
    prices_plus_pred_df = pd.DataFrame(data=prices_plus_pred)
    historical_prices_plus_one[target_list[c]] = prices_plus_pred_df

df_train_close = historical_prices_plus_one.set_index("timestamp")
df_train_close.index = pd.to_datetime(df_train_close.index)
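
The windowing inside the training loop is easier to follow on a toy array. The sketch below mirrors that loop: each sample is `lookback` consecutive rows across all tickers, and the label is the next row's value for one ticker. The function name `make_windows` and the toy data are assumptions for illustration only.

```python
import numpy as np

def make_windows(prices, lookback, target_col):
    """Build (samples, lookback, tickers) inputs and next-day labels (hypothetical helper)."""
    X, y = [], []
    for i in range(lookback, prices.shape[0]):
        X.append(prices[i - lookback:i])   # the previous `lookback` days, all tickers
        y.append(prices[i, target_col])    # the following day's value for one ticker
    return np.asarray(X, dtype="float32"), np.asarray(y, dtype="float32")

# 10 days of made-up prices for 2 tickers
toy = np.arange(20, dtype="float32").reshape(10, 2)
X, y = make_windows(toy, lookback=3, target_col=1)
print(X.shape, y.shape)
```

With 10 rows and a lookback of 3, the loop yields 7 samples, which is why the notebook trims the first `lookback` dates from `date_index`.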

In [None]:
df_train_close_copy = df_train_close.copy()
stocks = df_train_close.columns.tolist()

df_train_list = [df_train_close]

for df in df_train_list:
    for s in stocks:
        df[s] = df[s].diff()
#This differencing step would be redundant if the LSTM inputs were normalized up front by taking percentage changes

for df in df_train_list:
    df.dropna(inplace=True)

df_buy_in = df_train_close_copy.loc[today.strftime("%Y-%m-%d")].sort_index().to_frame('Buy In')

In [None]:
df_train_dcor_list = [df_distance_correlation(df) for df in df_train_list]
H_close = build_corr_nx(df_train_dcor_list[0])
H_master = H_close
weights = nx.communicability_betweenness_centrality(H_master)
centrality_to_portfolio_weights(weights)
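
As a worked example of what `centrality_to_portfolio_weights` does, the sketch below inverts each centrality score and rescales the results so they sum to 1, giving less central assets a larger weight. It returns a new dict instead of modifying its input; the name `inverse_centrality_weights` and the toy scores are made up for illustration. Like the original, it assumes every score is nonzero, since a zero centrality would raise `ZeroDivisionError`.

```python
def inverse_centrality_weights(centrality):
    """Turn centrality scores into weights proportional to 1/score (hypothetical helper)."""
    inverse = {name: 1.0 / score for name, score in centrality.items()}
    total = sum(inverse.values())
    return {name: round(value / total, 3) for name, value in inverse.items()}

# Made-up centrality scores for three tickers
toy_centrality = {"AAA": 0.5, "BBB": 0.25, "CCC": 0.25}
toy_weights = inverse_centrality_weights(toy_centrality)
print(toy_weights)
```

Here the most central ticker, AAA, ends up with half the weight of each of the other two.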

In [None]:
capital = float(api.get_account().portfolio_value)

alloc = discrete_allocation.DiscreteAllocation(
    weights, 
    df_buy_in['Buy In'], 
    total_portfolio_value=capital
)

#greedy_portfolio() returns (allocation dict, leftover cash); keep only the allocation
alloc = alloc.greedy_portfolio()[0]


alloc_series = pd.Series(alloc, name='Desired Shares')
alloc_series.index.name = 'Assets'
df_alloc = alloc_series.sort_index().to_frame('Shares')
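
To illustrate the general idea of turning fractional weights into whole shares, here is a simplified greedy sketch in plain Python. It is not PyPortfolioOpt's implementation, and the function name, tickers, and prices are made up. It first buys the whole shares each weight can afford, then spends leftover cash one share at a time on whichever affordable ticker is furthest below its target weight.

```python
def greedy_whole_shares(weights, prices, capital):
    """Simplified greedy allocation (illustrative only, assumes positive prices)."""
    # First pass: whole shares affordable within each ticker's target budget
    shares = {t: int(weights[t] * capital // prices[t]) for t in weights}
    cash = capital - sum(shares[t] * prices[t] for t in shares)

    # Second pass: top up the most under-allocated ticker that is still affordable
    while True:
        affordable = [t for t in weights if prices[t] <= cash]
        if not affordable:
            break
        pick = max(affordable, key=lambda t: weights[t] - shares[t] * prices[t] / capital)
        shares[pick] += 1
        cash -= prices[pick]
    return shares, cash

toy_weights = {"AAA": 0.5, "BBB": 0.25, "CCC": 0.25}
toy_prices = {"AAA": 30.0, "BBB": 20.0, "CCC": 45.0}
toy_shares, leftover = greedy_whole_shares(toy_weights, toy_prices, capital=1000.0)
print(toy_shares, leftover)
```

In this toy case the first pass leaves 55 in cash, which buys one more share of CCC and leaves 10 unspent.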

In [None]:
#Cost of each position at the buy-in price, aligned on ticker
test = (df_alloc['Shares'] * df_buy_in.loc[df_alloc.index, 'Buy In']).to_frame('Total Price')

allocation = pd.concat([df_alloc,df_buy_in,test],axis=1)
print(allocation)

print("\n")

#Running total of capital committed, one line per position
counter = 0
for cost in test['Total Price']:
  counter += cost
  print(counter)

In [None]:
portfolio = api.list_positions()

#Collect current holdings as one row per position
#(DataFrame.append was removed in pandas 2.0, so the rows are gathered in a list first)
rows = []
for position in portfolio:
  print(position.symbol, position.qty)
  rows.append({'Assets': position.symbol, 'Current Shares': float(position.qty)})

df_current = pd.DataFrame(rows, columns=['Assets', 'Current Shares'])
df_current = df_current.set_index("Assets")
df_current

In [None]:
reconcil_df = pd.concat([df_alloc, df_current], axis=1).fillna(0)
reconcil_df['Qty Diff'] = reconcil_df['Shares'] - reconcil_df['Current Shares']
reconcil_df = reconcil_df.sort_values(by="Qty Diff")

In [None]:
#Rows are sorted by "Qty Diff", so sells (negative differences) are submitted before buys
for symbol, diff in reconcil_df['Qty Diff'].items():
  if diff == 0:
    continue  #already at the target quantity, nothing to trade
  qty = abs(diff)
  if diff > 0:
    side = 'buy'
  else:
    side = 'sell'
  order_type = 'market'
  time_in_force = 'gtc'
  create_order(symbol, qty, side, order_type, time_in_force)
  print(symbol, qty, side, order_type, time_in_force)
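
The reconciliation and order steps can be dry-run without touching the API. The sketch below is a minimal stand-in that uses plain dicts in place of `df_alloc` and `df_current`: it takes target minus current shares, sorts so that sells come first, skips tickers already at target, and builds the same payload fields that `create_order` sends. The function name `plan_orders` and the toy holdings are made up for illustration.

```python
def plan_orders(target, current):
    """Return order payloads needed to move from `current` to `target` (hypothetical helper)."""
    symbols = sorted(set(target) | set(current))
    diffs = {s: target.get(s, 0) - current.get(s, 0) for s in symbols}

    orders = []
    # Ascending sort puts the largest sells first and the largest buys last
    for symbol, diff in sorted(diffs.items(), key=lambda item: item[1]):
        if diff == 0:
            continue  # already at the target quantity
        orders.append({
            "symbol": symbol,
            "qty": abs(diff),
            "side": "buy" if diff > 0 else "sell",
            "type": "market",
            "time_in_force": "gtc",
        })
    return orders

toy_target = {"AAA": 10, "BBB": 5, "DDD": 4}
toy_current = {"AAA": 10, "BBB": 8, "CCC": 2}
toy_orders = plan_orders(toy_target, toy_current)
for order in toy_orders:
    print(order["symbol"], order["qty"], order["side"])
```

Printing the planned orders first is a cheap way to sanity-check the quantities before any of them reach the paper trading account.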
