# Stage 1: Generate Stock Universe

- Gather stocks of interest
- Gather stocks from specific criteria (SP500 top 50...)
- Gather stocks from specific portfolio account
- Assemble stock universe 
- Gather price histories

In [29]:
from platform import python_version
import time
from datetime import datetime
import os
import pandas as pd
import numpy as np
import math
from tqdm.notebook import tqdm
import matplotlib.pyplot as plt

%matplotlib inline
plt.style.use('ggplot')
plt.rcParams['figure.figsize'] = (20, 8)

# Set the import path for the tools directory
import sys
# insert at position 1 in the path, as 0 is the path of this file.
sys.path.insert(1, '../tools')
import importlib
import ameritrade_functions as amc
importlib.reload(amc)
import utils
importlib.reload(utils)

print(f'Python version: {python_version()}')
print(f'Pandas version: {pd.__version__}')

Python version: 3.8.10
Pandas version: 0.25.3


## Configure Ameritrade Information

Ameritrade credentials are stored in environment variables so that no unencrypted passwords are kept on disk.
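
A minimal sketch of reading such credentials while failing early when one is missing. The helper `require_env` is hypothetical and is not part of the notebook's tools; the variable name in the usage comment is one this notebook reads later.

```python
import os

def require_env(name):
    """Return the value of environment variable `name`; raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f'Environment variable {name!r} is not set')
    return value

# Usage (assumes the variable was exported before the notebook server started):
# username = require_env('maiotradeuser')
```

Raising immediately gives a clearer error than passing `None` into the authentication step.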

The module automatically masks the account numbers to protect the actual accounts. An Ameritrade user can have many investment accounts. We will be working with only one for this demonstration.
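
The masking happens inside `ameritrade_functions`, whose code is not shown in this notebook. The sketch below is one way it could work, assuming the mask keeps only the last four characters, as the `#---5311` format used in this notebook suggests. The function name `mask_account_number` and the sample account number are made up for illustration.

```python
def mask_account_number(account_number, visible=4):
    """Hide all but the last `visible` characters of an account number."""
    text = str(account_number)
    return '#---' + text[-visible:]

masked = mask_account_number('000005311')  # made-up account number
print(masked)  # #---5311
```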

## Authentication Tokens

To get data from Ameritrade, you will need to obtain a short-lived token (there is a re-use token, but that has not been coded yet). You only need to do this if you
are going to use an existing Ameritrade account to define an initial set of stocks to analyze.

To obtain a token, you will need to have a Chrome driver located somewhere on your system. This will allow the module to use your credentials to obtain an authentication token.

<span style="color:blue">Note: *Account numbers are masked for security purposes.*</span>

In [2]:
username = os.getenv('maiotradeuser')
password = os.getenv('maiotradepw')
client_id = os.getenv('maiotradeclientid')

# For Chromedriver
from pathlib import Path
chrome_executabel_path = str(Path.home()) + r'\Anaconda Projects\chromedriver\chromedriver'

# Make sure we have a data directory
Path('./data').mkdir(parents=True, exist_ok=True) 

# Which account are we interested in
masked_account_number = '#---5311'
account_portfolios_file_name = 'data/portfolio_data.csv'
portfolio_file_name = 'data/portfolio_' + masked_account_number[-4:] + '.csv'
price_histories_file_name = 'data/price_histories.csv'

In [3]:
td_ameritrade = amc.AmeritradeRest(username, password, client_id, chrome_executabel_path)
td_ameritrade.authenticate()

if len(td_ameritrade.authorization) == 0:
    print('Error: No authorization data: {}'.format(td_ameritrade.authorization))
else:
    print('You have authorization')

You have authorization


## Stock Universe

Here we set up the universe. This needs some work. The long-term goal is to use a pipeline process to help select stocks that are in the top 500 or something similar.

For now we will use stocks from the portfolio, plus stocks of interest (high news items) and a list of well-known stocks. This list has also been augmented with some stocks that made Ameritrade's top 10 movers for a couple of days. This Ameritrade function has not been coded yet, but should be added down the line to automate pulling these tickers.

## First, let's see which stocks we already own for a specific account

I only want to work with equity investments. This is somewhat confusing: at the account level, assets that can be traded are called "EQUITY", but when you get quotes for each asset, the same asset can be something like "ETF".

I also use Ameritrade's portfolio planner tool to create an asset mix based on their recommendations. I don't want these stocks (or in my case mutual funds and ETFs) to be part of this analysis, so I'll remove them here.

In [4]:
# Specific Portfolio Account
account_portfolio_df = utils.get_account_portfolio_data(td_ameritrade.parse_portfolios_list(), masked_account_number)
equity_investments_df = utils.get_investments_by_type(account_portfolio_df, investment_type='EQUITY')

# Filter out non-equity investments: keep only symbols whose quote reports assetType == "EQUITY"
portfolio_symbols = utils.get_investment_symbols(equity_investments_df)
quotes_df = amc.AmeritradeRest(username, password, client_id).get_quotes(portfolio_symbols)
current_stocks = quotes_df.query('assetType == "EQUITY"').index.tolist()
current_investments_df = equity_investments_df[equity_investments_df['symbol'].isin(current_stocks)]
current_investments_df

Unnamed: 0,account,shortQuantity,averagePrice,currentDayProfitLoss,currentDayProfitLossPercentage,longQuantity,settledLongQuantity,settledShortQuantity,marketValue,maintenanceRequirement,currentDayCost,previousSessionLongQuantity,assetType,cusip,symbol,description,type
12,#---5311,0.0,21.62526,0.0,0.0,783.0,783.0,0.0,13491.09,13491.09,0.0,783.0,EQUITY,88688T100,TLRY,0,0
19,#---5311,0.0,0.14474,0.0,0.0,45000.0,45000.0,0.0,9900.0,0.0,0.0,45000.0,EQUITY,Q3860H107,FGPHF,0,0
20,#---5311,0.0,15.05,0.0,0.0,1000.0,1000.0,0.0,14170.0,14170.0,0.0,1000.0,EQUITY,98138J206,WKHS,0,0
21,#---5311,0.0,0.0496,0.0,0.0,250.0,250.0,0.0,8.58,0.0,0.0,250.0,EQUITY,Q3394D101,EEENF,0,0
22,#---5311,0.0,59.22,0.0,0.0,50.0,50.0,0.0,2564.0,2564.0,0.0,50.0,EQUITY,26142R104,DKNG,0,0
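
The two-level filter described above can be illustrated with toy data. This is only a sketch: the tickers and values are made up, and the column layout (positions with a `symbol` column, quotes indexed by symbol with an `assetType` column) is assumed from the cell above rather than taken from the Ameritrade API documentation.

```python
import pandas as pd

# Toy positions: every tradable asset is labelled EQUITY at the account level.
positions = pd.DataFrame({
    'symbol': ['AAA', 'BBB', 'CCC'],              # made-up tickers
    'assetType': ['EQUITY', 'EQUITY', 'EQUITY'],
    'marketValue': [100.0, 250.0, 75.0],
})

# Toy quotes, indexed by symbol: the quote-level type is more specific.
quotes = pd.DataFrame(
    {'assetType': ['EQUITY', 'ETF', 'EQUITY']},
    index=['AAA', 'BBB', 'CCC'],
)

# Keep only positions whose quote-level type is EQUITY.
equity_symbols = quotes.query('assetType == "EQUITY"').index.tolist()
stocks_only = positions[positions['symbol'].isin(equity_symbols)]
print(stocks_only['symbol'].tolist())  # ['AAA', 'CCC']
```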


## Remove other assets

There may be some stocks that you are speculating on and do not want to be part of the analysis. Being a conservative investor, I keep a percentage of my active portfolio (the part that is not in the portfolio planner) in stocks that I have personally speculated on and am holding as a long-term play. These stocks will not be part of the portfolio optimization.

In [7]:
speculative_stocks = ['FGPHF', 'EEENF']
final_investments_df = current_investments_df[~current_investments_df['symbol'].isin(speculative_stocks)]
final_existing_stocks = utils.get_investment_symbols(final_investments_df)
final_existing_stocks 

['TLRY', 'WKHS', 'DKNG']

In [30]:
symbols_of_interest = ['MGM', 'PDYPF', 'NNXPF', 'WKHS']
# Hardcoded for now
symbols_via_specific_criteria = ['AAPL',
                                 'MSFT',
                                 'GOOG',
                                 'TSLA',
                                 'COKE',
                                 'IBM',
                                 'BABA',
                                 'GMGMF',
                                 'OEG',
                                 'LX',
                                 'AIH',
                                 'NMRD',
                                 'CAN',
                                 'MOSY',
                                 'QFIN',
                                 'OCG',
                                 'PRTK',
                                 'ZKIN', 
                                 'EFOI',
                                 'CONN',
                                 'LEDS',
                                 'TELL',
                                 'JZXN',
                                 'VTNR',
                                 'AEI',
                                 'IPQ',
                                 'RCON'
                                ]

stock_universe = set(symbols_of_interest + symbols_via_specific_criteria + final_existing_stocks)
holdings = utils.get_holdings(final_investments_df, stock_universe)['marketValue']
display(holdings)
utils.save_port_data(holdings.reset_index(), portfolio_file_name)

symbol
AAPL         0.00
AEI          0.00
AIH          0.00
BABA         0.00
CAN          0.00
COKE         0.00
CONN         0.00
DKNG      2564.00
EFOI         0.00
GMGMF        0.00
GOOG         0.00
IBM          0.00
IPQ          0.00
JZXN         0.00
LEDS         0.00
LX           0.00
MGM          0.00
MOSY         0.00
MSFT         0.00
NMRD         0.00
NNXPF        0.00
OCG          0.00
OEG          0.00
PDYPF        0.00
PRTK         0.00
QFIN         0.00
RCON         0.00
TELL         0.00
TLRY     13491.09
TSLA         0.00
VTNR         0.00
WKHS     14170.00
ZKIN         0.00
Name: marketValue, dtype: float64
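
`utils.get_holdings` is not shown in this notebook. A rough sketch of how such a series could be built, assuming it simply aligns the owned positions to the full universe and fills unowned symbols with zero (the universe here is shortened, and `holdings_sketch` is an illustrative name):

```python
import pandas as pd

# Market values of the positions actually held (taken from the output above).
owned = pd.DataFrame({
    'symbol': ['TLRY', 'WKHS', 'DKNG'],
    'marketValue': [13491.09, 14170.00, 2564.00],
})
universe = {'AAPL', 'MSFT', 'TLRY', 'WKHS', 'DKNG'}  # shortened universe

# Align to the sorted universe; symbols that are not owned get 0.0.
holdings_sketch = (
    owned.set_index('symbol')['marketValue']
         .reindex(sorted(universe), fill_value=0.0)
)
holdings_sketch.index.name = 'symbol'
print(holdings_sketch)
```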

## Portfolio weights

With the portfolio stocks and the additional stocks, show how each of them contributes to the portfolio. Later, once we produce an optimized portfolio, we can generate a report on how much stock to buy or sell based on what we already have.

In [31]:
holding_weights = utils.get_portfolio_weights(holdings)
display(holding_weights)

symbol
AAPL     0.000000
AEI      0.000000
AIH      0.000000
BABA     0.000000
CAN      0.000000
COKE     0.000000
CONN     0.000000
DKNG     0.084830
EFOI     0.000000
GMGMF    0.000000
GOOG     0.000000
IBM      0.000000
IPQ      0.000000
JZXN     0.000000
LEDS     0.000000
LX       0.000000
MGM      0.000000
MOSY     0.000000
MSFT     0.000000
NMRD     0.000000
NNXPF    0.000000
OCG      0.000000
OEG      0.000000
PDYPF    0.000000
PRTK     0.000000
QFIN     0.000000
RCON     0.000000
TELL     0.000000
TLRY     0.446354
TSLA     0.000000
VTNR     0.000000
WKHS     0.468816
ZKIN     0.000000
dtype: float64
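
The weights above are consistent with each holding's market value divided by the total market value. `utils.get_portfolio_weights` is not shown here, so the following is a sketch under that assumption, using the three non-zero holdings from this notebook plus one unowned symbol; `portfolio_weights` is a hypothetical name.

```python
import pandas as pd

market_values = pd.Series(
    {'AAPL': 0.0, 'DKNG': 2564.00, 'TLRY': 13491.09, 'WKHS': 14170.00},
    name='marketValue',
)

def portfolio_weights(values):
    """Return each position's share of the total market value."""
    total = values.sum()
    if total == 0:
        raise ValueError('Total market value is zero; weights are undefined')
    return values / total

weights_sketch = portfolio_weights(market_values)
print(weights_sketch.round(6))
```

The weights sum to 1, and symbols with no position get a weight of 0.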

## Price history data

Once you have a set of investments you want to work with, you will need to pull some historical data for them.

In [10]:
number_of_years = 2
price_histories = td_ameritrade.get_price_histories(stock_universe, datetime.today().strftime('%Y-%m-%d'), num_periods=number_of_years)
utils.save_price_histories(price_histories, price_histories_file_name)

Empty candle data for IPQ


In [11]:
price_histories.head()

Unnamed: 0,open,high,low,close,volume,ticker,date
0,3.15,3.15,3.0,3.15,25986,LEDS,2019-07-02
11668,7.7,7.78,7.26,7.31,1522801,TELL,2019-07-02
912,175.13,175.55,174.25,175.45,14155171,BABA,2019-07-02
11162,311.49,313.7434,308.1225,312.88,58777,COKE,2019-07-02
10656,4.08,4.0973,3.92,4.01,179243,PRTK,2019-07-02


In [18]:
price_histories = utils.read_price_histories(price_histories_file_name)
close = utils.get_close_values(price_histories)
close.tail()

date,AAPL,AEI,AIH,BABA,CAN,COKE,CONN,DKNG,EFOI,GMGMF,...,PDYPF,PRTK,QFIN,RCON,TELL,TLRY,TSLA,VTNR,WKHS,ZKIN
2021-06-28 00:00:00+00:00,134.78,6.11,7.96,228.59,7.95,402.0,26.43,52.71,4.03,1.85,...,189.0,6.9,41.72,4.19,4.41,18.63,688.72,9.94,16.96,4.31
2021-06-29 00:00:00+00:00,136.33,6.02,8.28,229.44,8.15,398.39,25.43,52.03,4.08,1.72,...,185.5,6.81,42.55,4.1,4.23,17.86,680.76,10.27,17.2,4.2
2021-06-30 00:00:00+00:00,136.96,5.68,7.96,226.78,8.15,402.13,25.5,52.17,3.98,1.69536,...,181.615,6.82,41.84,4.33,4.65,18.08,679.7,13.23,16.59,4.29
2021-07-01 00:00:00+00:00,137.27,5.31,8.1,221.87,7.62,397.16,25.9,51.8,3.9,1.7,...,187.375,7.24,38.79,4.2,4.66,17.835,677.92,12.04,15.64,4.23
2021-07-02 00:00:00+00:00,139.96,5.63,8.22,217.75,7.32,393.39,25.77,51.28,4.42,1.72,...,186.0,6.94,35.74,4.09,4.43,17.23,678.9,11.04,14.17,3.95
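
The step from the long price-history table (one row per ticker and date) to the wide close-price table (one column per ticker) can be sketched with a pandas pivot. This assumes `utils.get_close_values` does something equivalent, which the notebook does not show. The first-day closes are taken from the `head()` output above; the second-day values are made up.

```python
import pandas as pd

# Long format: one row per (date, ticker), as in price_histories.
long_prices = pd.DataFrame({
    'date': pd.to_datetime(['2019-07-02', '2019-07-02', '2019-07-03', '2019-07-03']),
    'ticker': ['BABA', 'TELL', 'BABA', 'TELL'],
    'close': [175.45, 7.31, 174.00, 7.50],  # second-day values are made up
})

# Wide format: dates as the index, one column of closing prices per ticker.
close_sketch = long_prices.pivot(index='date', columns='ticker', values='close')
print(close_sketch)
```

A ticker with no candle data, such as IPQ in the download above, would simply have no column in the pivoted table.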
