<a href="https://colab.research.google.com/github/Jakub-MFP/My_FIRE_Project/blob/master/portfolio_management/cashposition_backtest.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Project Objective**

**Problem**

Lots of resources advocate for investors to have between 10% to 30% of their portfolio in cash so we can seize upon market opportunities. The problem is that it is never explained when we should have 30% or 20% or 10% cash position, what factors determine how much cash we should have at any given time?


**Question**

What is the optimal cash position to keep in our portfolio at anytime for the past 20 years. We will assume that our portfolio only has one asset and it is the ETF SPY. Which is an ETF that tracks the SP500 Index and is a good overall indicator for the status of the "market"


**Hypothesis**

I think we will figure out that just investing 100% of the money into a portfolio during our regular deposit cycles and keeping no cash position will under-perform having a cash position at all times. 

*For example*

* If the SPY is up, we keep 30% of our portfolio in cash
* IF SPY is down 10% from ATH (all time high) than we keep 20% of portfolio in cash
* IF SPY is down 20% from ATH than we keep 10% of portfolio in cash
* IF SPY is down 30% from ATH than we keep 5% of portfolio in cash 

I believe that having a cash position on the side and investing more when the market is down will yield a higher ROI and CAGR over the same 20 year period. 

If SPY just regular investments returned a 9% CAGR, than I believe doing something as simple as just holding some cash will return 12% CAGR+. Something significant like 30-40% higher returns than otherwise possible. 

This will be tested not just on SPY but a number of stocks and ETFs with the eventually goal of doing this  with multiple stocks in a portfolio and using the efficient frontier to determine how much we allocate to any stock when the market trigger is in effect to reduce or increase our cash position. 

**The Procedure**
1. We run the experiment to see how much 10,000 in 2000 would yield by end of Dec 2019 with no additional investments. We want to calcualte the CAGR or Compound Annual Growth Rate
2. We run another experiment to see how much 10,000 in 2000 would yield by end of Dec 2019 with a 1,000 monthly deposit. Does it change the CAGR? 
3. Finally we will run a combinatorics trial 

   1. Start with 10,000
        1. Deposit each month 1,000 into portfolio
        2. Adjust the invested / cash ratio based on the market performance
            1. If the SPY is all time high than keep 70% invested and 30% in cash
            2. If the SPY drops 10% from ATM than keep 80% invested and 20% cash
            3. If SPY drops 20% from ATM than keep 90% invested and 10% cash
            4. If SPY drops 30% from ATM than keep 95% invested and 5% cash
        3. Do every combination of cash allocation based on how much market drops between -1% and -50%
        4. Calcualte the CAGR for each combination

Daily Progress Update 
https://myfireproject.com/topic/301-portfolio-management-research/


# **Initial Setup**

Installs Alpha Vantage and imports required modules

**Install Alpha Vantage API**

In [None]:
# Install Alpha Vantage
!pip install alpha_vantage

**Import All The Modules You Need**

In [153]:
import pandas as pd
import json
import requests
import sqlite3
import time
import datetime
import numpy as np
import matplotlib.pyplot as plt 
import sys
import math

from pandas import DataFrame
from datetime import datetime as dt
from alpha_vantage.timeseries import TimeSeries
from dateutil.relativedelta import relativedelta
from math import log

# **Alpaha Vantage API**

This will create a pandas dataframe of all historical prices

https://www.alphavantage.co/documentation/



Assign Stock Ticker and API Key

In [None]:
stock_ticker = 'SPY'
   
    # Update this for your own API key
api_key = open('/content/drive/My Drive/Colab Notebooks/key').read()
#api_key = '4545sdsad5s4dsd'

Run API Request to get stock data

    # Date / Open / High / Low / Close / Adjusted Close / Volume / Dividend / Split

In [None]:
ts = TimeSeries (key=api_key, output_format = "pandas")
data_daily, meta_data = ts.get_daily_adjusted(symbol=stock_ticker, outputsize ='full')

Assign metrics from dataframe to individual variables

In [None]:
    # data_daily['column name'][row number] 
# stock_price_open = data_daily['1. open'][0]
# stock_price_close = data_daily['5. adjusted close'][0]
# dividend_amount = data_daily['7. dividend amount'][0]

#**Procedure**
Investment Period = Jan 1, 2000 to Dec 31, 2019
1. ***ROI*** and ***CAGR*** of Investing `$10,000` and just allocating `100%` Position
2. ***ROI*** and ***CAGR*** of Investing `$10,000` and depositing `$1,000` each month and allocating `100%` position 
3. ***ROI*** and ***CAGR*** of Investing `$10,000` and depositing `$1,000` each month and allocating a `variable` position using `market_status`

## Step 1. `$10,000` and `100% Position` 

1. The goal is to get the dataframe to start on `start_date` and end on `end_date`
2. Grab the `adjusted_close_price` for `start_date` or the next trading day therafter. 
3. `starting_shares` = `initial_investment` / `adjusted_close_price` for the `start_date`
4. Grab the `adjusted_close_price` for the `end_date` 
5. `step1_final_value` = `starting_shares` * `adjusted_close_price`
6. `step1_roi` = (`step1_final_value` - `initial_investment`) / `initial_investment`
7. `step7_cagr` = round(((math.log(`step1_final_value`/`initial_investment`)) / (math.log(1+`number_of_years`))),2)

In [185]:
start_date = datetime.datetime(2000, 1, 1)
end_date = datetime.datetime(2019, 12, 31)
time_difference = relativedelta(end_date, start_date)
number_of_years = time_difference.years + 1

date_filter = data_daily[(data_daily.index > start_date) & (data_daily.index <= end_date)]

    # Starting Settings
initial_investment = 10000
monthly_deposit = 1000

    # Start Price
start_adjusted_close_price = date_filter['5. adjusted close'][-1]
starting_shares = initial_investment / start_adjusted_close_price

    # End Price

end_adjusted_close_price = date_filter['5. adjusted close'][0]


    # Final Math
step1_final_value = round((starting_shares * end_adjusted_close_price), 2)
step1_profit = round((step1_final_value - initial_investment), 2)
step1_roi = round((((step1_final_value - initial_investment) / initial_investment)* 100), 2)
step1_cagr = round((((step1_final_value / initial_investment) ** (1 / (number_of_years - 1)) - 1)* 100), 2)

    # Print out results

print("Start Share Price : ${} ".format(start_adjusted_close_price))
print("Starting Shares : {} ".format(starting_shares))
print("Number of years : {} ".format(number_of_years))
print("")

print("Ending Share Price : ${} ".format(end_adjusted_close_price))
print("")
print("Total Deposits : ${} ".format(initial_investment))
print("Portfolio Value : ${} ".format(step1_final_value))
print("Investment Profit : ${} ".format(step1_profit))
print("ROI : {}% ".format(step1_roi))
print("CAGR : {}% ".format(step1_cagr))

Start Share Price : $98.5115 
Starting Shares : 101.51099110256163 
Number of years : 20 

Ending Share Price : $317.2003 

Total Deposits : $10000 
Portfolio Value : $32199.32 
Investment Profit : $22199.32 
ROI : 221.99% 
CAGR : 6.35% 


## Step 2. `$10,000` + `$1,000` Monthly and `100% Position` 

In [192]:
start_date = datetime.datetime(2000, 1, 1)
end_date = datetime.datetime(2019, 12, 31)
time_difference = relativedelta(end_date, start_date)
number_of_years = time_difference.years + 1

initial_investment = 1000
monthly_deposit = 1000


    # Create a filtered dataframe, and change the order it is displayed. 
date_filter = data_daily[(data_daily.index > start_date) & (data_daily.index <= end_date)]
date_filter = date_filter.sort_index(ascending=True)

    # Set starting balances
start_current_balance = initial_investment
start_adjusted_close_price = date_filter['5. adjusted close'][0]
start_current_shares = start_current_balance / start_adjusted_close_price

    # Settings for loop
current_balance = 0
current_shares = 0
deposit_month = 1
total_deposits_count = 0

    # Settings to create age of investments
transactions = [] # log all dates and transactions amounts
transactions.append([start_date, initial_investment])

    # add starting balance to deposit_dates_list
# deposit_dates.append(start_date, int(initial_investment))
#deposit_dates[start_date] = int(initial_investment)

    # step 4 - Itterate trough all the rows in the dataframe and buy shares if it is a new month
for index, row in date_filter.iterrows():

    current_date = str(index) # set current index date row to current_date 
    current_month = int(current_date[5:7]) # grab the current month segement from current date

    if current_month != deposit_month: # if it is a new month than run loop
        current_balance += monthly_deposit # add $1000 to current_balance
        shares_purchased = (current_balance / row['5. adjusted close'])
        current_shares += (current_balance / row['5. adjusted close']) # figure out how many new shares we are buying 
        total_deposits_count += 1 # add to total_deposits that we made a new deposit
        current_balance -= monthly_deposit # take of $1000 deposit for purchase of shares

        new_date = index
        new_transaction = monthly_deposit
        transactions.append([new_date, new_transaction]) #add current date for each deposit to the dic 
        
        deposit_month = current_month
        #print(deposit_month)
        #print(shares_purchased)
        
    else:
        continue

    # Figure out average age of deposits
age_list = []
for trans_date, deposit in transactions: # for items in list transactions
    calculation = (end_date - trans_date) #get difference in days
    age_years = float((int(calculation.days))/365) # convert to how many years
    age_weights = age_years * deposit #figure out the weight of each dollar invested
    age_list.append(age_weights) #add to list


age_ratio = sum(age_list)



    # Ending Metrics
end_adjusted_close_price = date_filter['5. adjusted close'][-1]
final_balance = round((current_shares * end_adjusted_close_price), 2)
total_deposits = int((total_deposits_count * monthly_deposit) + initial_investment)
investment_profit = final_balance - total_deposits
roi = round(((investment_profit / total_deposits) * 100), 2)
age = age_ratio / total_deposits
cagr = round((((final_balance / total_deposits) ** (1 / (age - 1)) - 1)* 100), 2)

Analysis

In [191]:
print("STARTING METRICS")
print("Initial Balance : $ {} ".format(start_current_balance))
print("Share Price : $ {} ".format(start_adjusted_close_price))
print("Purchased Shares : {} ".format(starting_shares))

print("")
print("-----------------------------------------------------------")
print("")

print("ENDING METRICS")
print("Number of years : {} ".format(number_of_years))
print("Average Age of each Dollar invested : {} years ".format(age))
print("Share Price : $ {} ".format(end_adjusted_close_price))
print("Final Shares : {} ".format(current_shares))

print("")
print("-----------------------------------------------------------")
print("")

print("RESULTS")
print("Total Deposits : $ {} ".format(total_deposits))
print("Final Balance : $ {} ".format(final_balance))
print("Investment Profit : ${} ".format(investment_profit))
print("ROI : {}% ".format(roi))
print("CAGR : {}% ".format(cagr))

STARTING METRICS
Initial Balance : $ 10000 
Share Price : $ 98.5115 
Purchased Shares : 101.51099110256163 

-----------------------------------------------------------

ENDING METRICS
Number of years : 20 
Average Age of each Dollar invested : 10.024342951926098 years 
Share Price : $ 317.2003 
Final Shares : 43675.943714604946 

-----------------------------------------------------------

RESULTS
Total Deposits : $ 4790000 
Final Balance : $ 13854022.45 
Investment Profit : $9064022.45 
ROI : 189.23% 
CAGR : 12.49% 


## Step 3. `$10,000` + `$1,000` Monthly and `Variable Position` 

In [200]:
start_date = datetime.datetime(2000, 1, 1)
end_date = datetime.datetime(2019, 12, 31)
time_difference = relativedelta(end_date, start_date)
number_of_years = time_difference.years + 1

initial_investment = 1000
monthly_deposit = 1000


    # Create a filtered dataframe, and change the order it is displayed. 
date_filter = data_daily[(data_daily.index > start_date) & (data_daily.index <= end_date)]
date_filter = date_filter.sort_index(ascending=True)


    # Set starting balances
start_current_balance = initial_investment
start_adjusted_close_price = date_filter['5. adjusted close'][0]
start_current_shares = start_current_balance / start_adjusted_close_price

    # Settings for loop
current_balance = 0
portfolio_value = 0
current_shares = 0
deposit_month = 1
total_deposits_count = 0
total_shares_purchases_count = 0

    # Creating all the lists we need
transactions_deposits = [] #log every deposit in dictionary 
transactions.append([start_date, initial_investment])
transactions_trades = [] #log every trade in dictrionary
market_status_list = [] #log every market status for each day

    # settings for market status
market_all_time_high = date_filter['5. adjusted close'][0] #defautl all time high is the first price of the share 

    # Main Loop
for index, row in date_filter.iterrows():

    current_date = str(index) # set current index date row to current_date 
    current_month = int(current_date[5:7]) # grab the current month segement from current date

    if current_month != deposit_month: # if it is a new month than run loop
        current_balance += monthly_deposit # add $1000 to current_balance
        total_deposits_count += 1 # add to total_deposits that we made a new deposit
        new_date = index
        new_transaction = monthly_deposit
        transactions.append([new_date, new_transaction]) #add current date for each deposit to the dic 
        deposit_month = current_month #set the deposit month to be current month
        
    else:
        continue

    # Figure out what the market status is, if it is down or up compared to all time high
    current_market_price = row['5. adjusted close'] #setting current market price to current row

    if (current_market_price > market_all_time_high): #checking if the current price is a new record for all time high
        market_all_time_high = current_market_price
    else:
        continue

    current_market_change = float(current_market_price - market_all_time_high) # dollar value difference between all time high and current price
    current_market_rating = float(current_market_change / market_all_time_high) # % difereance gain or drop from all time high
    market_status_list.append([current_date, current_market_price, market_all_time_high, current_market_change, current_market_rating]) #logs market status into our list


        # How much cash do we need to kee on hand based on the market status
            # https://www.tutorialspoint.com/python/comparison_operators_example.htm
    if (current_market_status > 0):  #greater than 0
        current_cash_required_equity = 0.30
    elif (current_market_status > -0.05): #less than 5%
        current_cash_required_equity = 0.25
    elif (current_market_status > -0.10): #less than 10%
        current_cash_required_equity = 0.20 
    elif (current_market_status > -0.15): #less than 15%
        current_cash_required_equity = 0.15
    elif (current_market_status > -0.20): #less than 20%
        current_cash_required_equity = 0.10
    elif (current_market_status > -0.25): #less than 25%
        current_cash_required_equity = 0.05
    elif (current_market_status > -0.30): #less than 30%
        current_cash_required_equity = 0

        # What is our current cash balance, and portfolio value
    portfolio_value = float((current_balance + (current_shares * current_market_price))) #figuring out the total value of portfolio
    current_cash_required = float(current_cash_required_equity * portfolio_value) #figuring ouit the dollar cash amount we need to keep in portoflio 

        # How many shares do we need to buy if any based on market status
    if (current_cash_required < current_balance): 
        cash_to_purchase_shares = float(current_balance - current_cash_required)
        current_shares =+ (cash_to_purchase_shares / row['5. adjusted close'] ) # figure out how many new shares we are buying
        current_balance =- (cash_to_purchase_shares) # adjust current balance for shares purchased
        total_shares_purchases_count =+ 1 # add 1 to total count
        portfolio_trades.append(current_date, cash_to_purchase_shares, current_shares, current_balance)
    else:
        continue
  

    # Figure out average age of deposits
age_list = []
for trans_date, deposit in transactions: # for items in list transactions
    calculation = (end_date - trans_date) #get difference in days
    age_years = float((int(calculation.days))/365) # convert to how many years
    age_weights = age_years * deposit #figure out the weight of each dollar invested
    age_list.append(age_weights) #add to list

age_ratio = sum(age_list)



    # Ending Metrics
end_adjusted_close_price = date_filter['5. adjusted close'][-1]
final_balance = round((current_shares * end_adjusted_close_price), 2)
total_deposits = int((total_deposits_count * monthly_deposit) + initial_investment)
investment_profit = final_balance - total_deposits
roi = round(((investment_profit / total_deposits) * 100), 2)
age = age_ratio / total_deposits
cagr = round((((final_balance / total_deposits) ** (1 / (age - 1)) - 1)* 100), 2)


NameError: ignored

Analysis

In [None]:
print("STARTING METRICS")
print("Initial Balance : $ {} ".format(start_current_balance))
print("Share Price : $ {} ".format(start_adjusted_close_price))
print("Purchased Shares : {} ".format(starting_shares))

print("")
print("-----------------------------------------------------------")
print("")

print("ENDING METRICS")
print("Number of years : {} ".format(number_of_years))
print("Average Age of each Dollar invested : {} years ".format(age))
print("Share Price : $ {} ".format(end_adjusted_close_price))
print("Final Shares : {} ".format(current_shares))

print("")
print("-----------------------------------------------------------")
print("")

print("RESULTS")
print("Total Deposits : $ {} ".format(total_deposits))
print("Final Balance : $ {} ".format(final_balance))
print("Investment Profit : ${} ".format(investment_profit))
print("ROI : {}% ".format(roi))
print("CAGR : {}% ".format(cagr))

# **Reasearch Notes**

Testing out elif loops

In [194]:
current_cash_required = 0
current_market_status = -0.09

if (current_market_status > 0):  #greater than 0
    current_cash_required_equity = 0.30
elif (current_market_status > -0.05): #less than 5%
    current_cash_required_equity = 0.25
elif (current_market_status > -0.10): #less than 10%
    current_cash_required_equity = 0.20 
elif (current_market_status > -0.15): #less than 15%
    current_cash_required_equity = 0.15
elif (current_market_status > -0.20): #less than 20%
    current_cash_required_equity = 0.10
elif (current_market_status > -0.25): #less than 25%
    current_cash_required_equity = 0.05
elif (current_market_status > -0.30): #less than 30%
    current_cash_required_equity = 0

print(float(current_cash_required_equity))

0.2


## **Possible structure for the loop**

In [195]:
current_cash = 10000
deposit_amount = 1000
deposit_frequency = 5 #every 7 days
deposit_dates = [] # this list is of dates that a deposit will execute one 
    # we need to create a loop that will append 7 week to the current date and generate a list of all mondays
    # If monday is closed for trading, than we will use next avaiable trading day
    # than it resets back to original 
    # print out the generated deposit dates to test if this works 



count_trades = 0 #number of times 
count_deposits = 0 #count number of deposits
count_cash_deposited = 0 # count total value of cash that was depoisited 
count_profit = 0 #total profit of portfolio

current_shares_count = 0 # total number of shares owend
current_stock_price = 0 # current stock market price



for i in df.index:
     
    current_trading_date = # the next date in the loop trough the dataframe
    current_stock_price = # the next stock price in  the loop troough the dataframe
    current_market_all_time_high = # this needs to be an if statement
        # if current price is above market high than set new current_market_all_time_high
        # if its not ,t han the current one stays 
    current_market_change = current_stock_price - current_market_all_time_high

    # Setting market status
    current_market_status = current_market_change - current_market_all_time_high
    
    current_cash_required = # loop based on current marekt status
        if current_market_status > 0 than current_cash_required = 0.30
        elif current_market_status < 0  >=-5 than current_cash_required = 0.25
        elif current_market_status < -5  >=-10 than current_cash_required = 0.20
        elif current_market_status < -15  >=-15 than current_cash_required = 0.10


    # now that we know market status, we need see if we can deposit or not
    if current_trading_date == a date inside the lsit deposit_dates:
        current_cash =+ deposit_amount # we deposit the amoutn set into current cash
    elif continue 

    current_cash_equity = current_cash / current_portfolio_value
    
    # now we need to figure out if we need to buy shares or not
    if current_cash_equity > required_cash_equity
        stock_purchase_cash = current_cash_equity - required_cash_equity
        total_shares_purchased = stock_purchase_cash / current_stock_price
        current_shares_count =+ total_shares_purchased

    # Now we set current portfolio value 
    current_portfolio_value = (current_shares_count * current_stock_price) + current_cash

    # now we add all of this data to new dataframe that will track the results for each day
        # save date, and all the metrics that where set bc of this 

# Now at the end print out the to tal profit, total tradess, ect 


SyntaxError: ignored

## **Itteration Test**

In [None]:
    # Creating a list of all days thaht will have a trade executed

matches = []
for index, row in df.iterrows():
    item = index
for match in date_list:
    if index in match:
        matches.append(match)

    #print(index, row['1. open'])
print(matches)



##  **Interacting With DataFrame**

For each row in the dataframe, displaying the date and the open price

In [None]:
for index, row in data_daily.iterrows():
    print(index, row['1. open'])

In [None]:
for index, row in data_daily.iterrows():
    

print out all columns names in dataframe

In [None]:
for (columnName, columnData) in data_daily.iteritems():
   print('Colunm Name : ', columnName)
   print('Column Contents : ', columnData.values)

Plotting Charts

In [None]:
data_daily['4. close'].plot()
plt.title('Stock chart')
plt.show()

Filtering Date Ranges in the pandas dataframe 

In [None]:
start_date = "2019-1-1"
end_date = "2019-1-31"

after_start_date = data_daily['index'] >= start_date
before_end_date = data_daily['index'] <= end_date
between_two_dates = after_start_date & before_end_date
filtered_dates = df.loc[between_two_dates]

print(filtered_dates)

In [None]:
start_date = '03-01-1996'
end_date = '06-01-1997'

mask = (df['date'] > start_date) & (df['date'] <= end_date)

df = df.loc[mask]
df

>**Prints Empty DataFrame Error**

>```
	1. open	2. high	3. low	4. close	5. adjusted close	6. volume	7. dividend amount	8. split coefficient
date	
```
- https://stackoverflow.com/questions/22898824/filtering-pandas-dataframes-on-dates

In [None]:
print(data_daily.loc[datetime.date(year=2014,month=1,day=1):datetime.date(year=2015,month=2,day=1)])

In [None]:
df.loc['2014-01-01':'2014-02-01']

Adding days to `start_date` and adding them to `date_list=[]`

In [None]:
# Test example to create a list of possible deposit days
date_list = []

start_date = datetime.date(2018, 12, 31)
end_date = datetime.date(2020, 1, 4)
delta = datetime.timedelta(days=30)

while start_date <= end_date:
    print(start_date)
    start_date += delta
    date_list.append(start_date)
    
print(date_list)

## Date range filters

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.filter.html

In [None]:
print(data_daily)

In [None]:
data_daily.filter(like='1999', axis = 0)