<a href="https://colab.research.google.com/github/uprotom/espp/blob/main/eric_stock_explorer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

---
DISCLAIMER: Nothing here is financial advice, and it is definitely not *professional* advice.

---

This notebook runs scenarios on historical data.

Do your *own* analysis and make your *own* decisions before investing any of your money in anything.

This includes ESPP.

Have fun.

# 1. Data preparation and Configuration 


## 1.1 Load data
Load the clean, combined stock and currency data from a manually prepared CSV file stored on GitHub.

In [None]:
import pandas as pd
import altair as alt

#bf : base data frame, has all datapoints
bf = pd.read_csv("https://raw.githubusercontent.com/uprotom/espp/main/eri_stock_sekpln.csv")

## 1.2 Set configuration
Set your own values for the monthly investment, company contribution and Computershare fees, or use the defaults.

It is also possible to select the stock price type when buying.

In [None]:
#@title Modify default values:

cfg = {
}

def storeConfig():
  #@markdown Monthly investment (PLN) : equals approx 250-4166 SEK
  monthlyInvestment = 1000 #@param {type:"slider", min:110, max:1820, step:10}

  #@markdown Company contribution [%]
  companyContribution = 15 #@param {type: "number"}

  #@markdown Computershare buy fee [%]
  buyFee = 0.2  #@param {type: "number"}

  #@markdown Computershare sell fees [% / SEK] (whichever is higher)
  sellFee = 0.25 #@param {type: "number"}
  sellFeeFlat = 150 #@param {type: "number"} 

  #@markdown Which stock price to use when buying (@market open, close, average or high/low)
  buyType = "average"  #@param ['open', 'close', 'high', 'low', 'average']

  cfg['monthlyInvestment'] = monthlyInvestment
  cfg['monthlyInvestmentPostFee'] = (1 - buyFee / 100) * monthlyInvestment
  cfg['companyContribution'] = monthlyInvestment * companyContribution / 100
  cfg['buyFee'] = buyFee / 100
  cfg['sellFee'] = sellFee / 100
  cfg['sellFeeFlat'] = sellFeeFlat
  cfg['buyType'] = buyType
  # ranges for graphs:
  cfg['5years'] = -62
  cfg['3years'] = -38
  cfg['1year'] = -14

storeConfig()

## 1.3 Basic pre-calculations

Calculate the number of shares that can be bought each month given the declared monthly investment, buy fee, share price and SEK/PLN rate.

The buy date is the 15th of each month (or the first business day after that date). All other days should remain at 0.
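
The buy-day rule and the share calculation can be sketched in plain Python as follows. This is only an illustration, not the notebook's implementation: `find_buy_days`, `shares_bought` and the small trading calendar are hypothetical, and the exchange rate is assumed to be quoted as PLN per 1 SEK.

```python
from datetime import date

def find_buy_days(trading_days, buy_day=15):
    """Return the first trading day on or after `buy_day` in each month."""
    buy_days = []
    seen_months = set()
    for day in sorted(trading_days):
        month_key = (day.year, day.month)
        if day.day >= buy_day and month_key not in seen_months:
            seen_months.add(month_key)
            buy_days.append(day)
    return buy_days

def shares_bought(investment_pln, buy_fee, sek_pln, price_sek):
    """Apply the buy fee, convert PLN to SEK, then divide by the share price."""
    return investment_pln * (1 - buy_fee) / sek_pln / price_sek

# Hypothetical trading calendar: 15 Jan 2022 was a Saturday,
# so the first trading day after it (17 Jan) becomes the buy day.
trading_days = [date(2022, 1, 13), date(2022, 1, 14), date(2022, 1, 17),
                date(2022, 1, 18), date(2022, 2, 14), date(2022, 2, 15)]
buy_days = find_buy_days(trading_days)

# Assumed inputs: 1000 PLN, 0.2% buy fee, 0.45 PLN per SEK, 100 SEK per share.
example_shares = shares_bought(1000, 0.002, 0.45, 100.0)
```

Keying on `(year, month)` rather than the month number alone avoids mixing up the same month from different years if the data ever has gaps.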

In [None]:
def findBuyDays():
  # pre-fill with number of shares = 0
  # use floats so that fractional share counts and contributions keep the column dtype
  bf['buyShares'] = 0.0
  bf['companyContribution'] = 0.0

  # calculate/set sellPrices for all days based on the configured price type:
  if cfg['buyType'] == "open":
    bf['buySellPrice'] = bf['Open']
  elif cfg['buyType'] == "close":
    bf['buySellPrice'] = bf['Close']
  elif cfg['buyType'] == "average":
    bf['buySellPrice'] = (bf['Open'] + bf['Close']) / 2
  elif cfg['buyType'] == "high":
    bf['buySellPrice'] = bf['High']
  else:
    bf['buySellPrice'] = bf['Low']

  # set starting data for search loop
  counter = 0
  prevMonth = -1

  # iterate through all rows to fill in the number of shares on buy dates
  for index in bf.index:

    # clear foundBuyDate flag and set new latest month at month change
    if bf.loc[index, 'Date'].month != prevMonth:
      prevMonth = bf.loc[index, 'Date'].month
      foundBuyDate = False

    if bf.loc[index, 'Date'].day >= 15:
      if not foundBuyDate:
        #debug: print("found a new buy date", bf.loc[index, 'Date'])
        foundBuyDate = True
        counter += 1
        bf.loc[index, 'companyContribution'] = cfg['companyContribution']
        bf.loc[index, 'buyShares'] = cfg['monthlyInvestmentPostFee'] / bf.loc[index, 'SekPln'] / bf.loc[index, 'buySellPrice']

  print("Found ", counter, " buy days in the dataset")


# parse dates from text format to dates
bf['Date'] = pd.to_datetime(bf['Date'],dayfirst=True)

#df : locate buy dates and create a filtered data frame with only buy dates to speed up most calculations
findBuyDays()

df = bf.loc[bf["buyShares"] > 0,["Date","SekPln","buySellPrice","buyShares","companyContribution"]]
df = df.reset_index()

# 2. Scenario exploration

In this section we will explore a couple of different approaches to the ESPP. 

The aim is to evaluate their real-life performance based on the historical stock price and currency exchange rates. 

All models follow the same buy dates (the 15th of every month, or the first following business day).
The models differ only in when they recommend selling the stock (some more than once).

The resulting *value* is calculated per model for all starting days in the dataset, which means we will get at least 266 results per approach.

*Note: some results may differ depending on the size of the monthly investment, e.g. the Buy & Sell model from 2.1 has a better return rate for larger values.*

## 2.0 Do not invest in ESPP

pros: no risk

cons: no gains

Simplest approach first. Any other model will have to beat this approach to be viable.

The resulting value is the number of buying periods from the starting date to the end of the range, multiplied by the monthly investment.
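
As a minimal sketch of that calculation (the helper name `value_no_investment` is hypothetical and not part of the notebook), a reversed running sum gives the value for every possible starting buy day at once:

```python
from itertools import accumulate

def value_no_investment(num_buy_days, monthly_investment):
    """For each starting buy day, the cash kept from that day to the end of the data."""
    per_month = [monthly_investment] * num_buy_days
    # Accumulate from the end, then flip back so index 0 is the earliest start.
    return list(accumulate(reversed(per_month)))[::-1]

values = value_no_investment(4, 1000)
```

The earliest start date gets the largest value because it covers the most buying periods.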

### 2.0.1 Code

In [None]:
df['valNoInvestment'] = 0

# to simplify calculations iterate in reverse order and take the previous result as input to the next step
# add full monthly investment value on buyDays
for index in df.index[df.buyShares != 0][::-1]:
    df.loc[index, 'valNoInvestment'] = df.loc[min(df.index.max(),index + 1), 'valNoInvestment'] + cfg['monthlyInvestment']


## 2.1 Invest and sell immediately after

pros: 
* very low risk
* still get the company contribution bonus

cons: 
* relatively small gains as the flat fee eats a big part of company contribution

Second simplest approach. Should be better than not participating depending on the monthly investment size.
Any other model will have to beat this one too to be viable.

The resulting value is the sum of investments after the buy fee, decreased by the flat sell fee (converted to PLN) on each sale, plus the company contributions.

**Note: transfer fees and currency spread will apply as well, but are not included for any model**

It will also most likely be impossible to sell on the day that the shares are bought. The code might need to be redone to add 3-4 days.
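
To see why the flat fee matters so much, here is a rough sketch of the net gain from a single buy-and-sell cycle compared to keeping the cash. The function name is hypothetical, the exchange rate of 0.45 PLN per SEK is an assumption for illustration only, and the sell fee is taken as the higher of the percentage and flat fees.

```python
def buy_and_sell_monthly_gain(investment_pln, sek_pln, contribution_pct=15.0,
                              buy_fee_pct=0.2, sell_fee_pct=0.25,
                              sell_fee_flat_sek=150.0):
    """Net PLN gain from one buy-and-sell-immediately cycle versus keeping the cash."""
    invested_after_fee = investment_pln * (1 - buy_fee_pct / 100)
    sale_value_sek = invested_after_fee / sek_pln
    # Sell fee: percentage or flat amount, whichever is higher (in SEK).
    sell_fee_sek = max(sell_fee_flat_sek, sale_value_sek * sell_fee_pct / 100)
    contribution = investment_pln * contribution_pct / 100
    return invested_after_fee - sell_fee_sek * sek_pln + contribution - investment_pln

# Assumed exchange rate (PLN per 1 SEK), for illustration only.
gains = {inv: buy_and_sell_monthly_gain(inv, sek_pln=0.45)
         for inv in (300, 600, 1200, 1800)}
```

With these assumed inputs the gain is negative at 300 PLN and positive from 600 PLN upward, which matches the note that the model works better for larger monthly investments.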

### 2.1.1 Code

In [None]:
def calculateValBuyAndSell():

  df['valBuyAndSell'] = 0.0

  # use two helper variables, we'll iterate in reverse order and keep increasing both
  # accBuyAndSellInv - accumulated buy&sell cash from this day until the end of data
  # accContribInv - accumulated company contributions from this day until the end of data
  accBuyAndSellInv = 0
  accContribInv = 0

  for index in df.index[::-1]: 
    accBuyAndSellInv += cfg['monthlyInvestmentPostFee'] - cfg['sellFeeFlat'] * df.loc[index, 'SekPln']
    accContribInv += df.loc[index, 'companyContribution']   
    df.loc[index, 'valBuyAndSell'] = accBuyAndSellInv + accContribInv 

calculateValBuyAndSell()

## 2.2 Buy and hold model

pros: 

*   gets a % of company contribution for each stock purchase
*   averages buy price over long term
*   generally successful long term as long as there is some growth

cons: 

*   volatile end result making it hard to figure out when to sell

    note: this is best shown when using final price from 15.02 (buy day) vs 16.02 (last data point)

The buy and hold model is the simplest participation approach. It does not sell any of the acquired stock, aiming for a very long-term investment.
The resulting value is the number of shares purchased from a given date until the end of the data, valued at the last known price and converted to PLN by multiplying by the SEK/PLN rate.

Sell fees are applied.
Company contribution is counted.
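
A small sketch of this valuation under assumed inputs is shown below. The function name and the sample numbers are hypothetical; the rate is assumed to be PLN per 1 SEK and the fee is the higher of the flat and percentage sell fees.

```python
def buy_and_hold_value(shares_per_buy, final_price_sek, final_sek_pln,
                       contribution_per_buy, sell_fee_pct=0.25,
                       sell_fee_flat_sek=150.0):
    """PLN value of all shares sold at the final price, plus company contributions."""
    total_shares = sum(shares_per_buy)
    gross_sek = total_shares * final_price_sek
    fee_sek = max(sell_fee_flat_sek, gross_sek * sell_fee_pct / 100)
    contributions = contribution_per_buy * len(shares_per_buy)
    return (gross_sek - fee_sek) * final_sek_pln + contributions

# Three hypothetical purchases, a final price of 100 SEK and a rate of 0.5 PLN per SEK.
value = buy_and_hold_value([20.0, 25.0, 22.0], final_price_sek=100.0,
                           final_sek_pln=0.5, contribution_per_buy=150.0)
```

Because only one sale happens at the very end, the sell fee is paid once regardless of how many purchases were made.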

### 2.2.1 Code

In [None]:
#TODO: use price from +3-4 days after buy date? [ ]

def calculateValBuyAndHold():

  df['valBuyAndHold'] = 0.0

  # use two helper variables, we'll iterate in reverse order and keep increasing both
  # accSharesInv - accumulated number of shares from this day until the end of data
  # accContribInv - accumulated company contributions from this day until the end of data
  accSharesInv = 0
  accContribInv = 0

  # sale prices are last known prices from full data set (16.02) - note the 15% drop
  finalPrice = bf.loc[bf.index.max(), 'buySellPrice']
  finalSekPlnRate = bf.loc[bf.index.max(), 'SekPln']

  for index in df.index[::-1]: 
    accSharesInv = accSharesInv + df.loc[index, 'buyShares']
    accContribInv = accContribInv + df.loc[index, 'companyContribution']
    
    # TODO: decide if we want to include the fee or not
    fee = max(cfg['sellFeeFlat'], cfg['sellFee'] * accSharesInv * finalPrice)
    df.loc[index, 'valBuyAndHold'] = (accSharesInv * finalPrice - fee) * finalSekPlnRate + accContribInv 
    # don't use fee since we're not actually selling
    # df.loc[index, 'valBuyAndHold'] = (accSharesInv * finalPrice) * finalSekPlnRate + accContribInv 

calculateValBuyAndHold()


### 2.2.2 Plots

2.2.2.1 Long term

In [None]:
# plot
data = df.melt(id_vars =['Date'], value_vars = ['valNoInvestment','valBuyAndHold','valBuyAndSell'])
alt.Chart(data).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Buy & Hold vs Buy & Sell vs no investment. 2000-2022"
).interactive()

2.2.2.2 Shorter term. Interactive comparison

In [None]:
# plot
data = df[cfg['3years']:].copy()
data['referenceNoInvestment'] = data['valNoInvestment'] - data['valNoInvestment']
data['compToNIBuyAndHold'] = data['valBuyAndHold'] - data['valNoInvestment']
data['compToNIBuyAndSell'] = data['valBuyAndSell'] - data['valNoInvestment']

pData = data.melt(id_vars =['Date'], value_vars = ['referenceNoInvestment','compToNIBuyAndHold','compToNIBuyAndSell'])
alt.Chart(pData).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Buy & Hold vs Buy & Sell vs no investment. 2019-2022"
).interactive()

2.2.2.3 Last 12M

In [None]:
# plot

pData = data[cfg['1year']:].melt(id_vars =['Date'], value_vars = ['referenceNoInvestment','compToNIBuyAndHold','compToNIBuyAndSell'])
alt.Chart(pData).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Buy & Hold vs Buy & Sell vs no investment as baseline. 2021"
)

### 2.2.3 Conclusions

It is no surprise that Buy & Sell is a better approach than not participating.

In the short term (1Y or less) or in flat/decreasing market conditions (like on 16.02), it can also match or beat Buy & Hold.
This approach works only as long as the company contribution is larger than the buy/sell/transfer fees. Because of this it is easier to do with larger contributions (rerun the computations for e.g. 600, 1200 and 1800 PLN monthly investments).

Buy & Hold is still the most effective long term, but then it needs to be compared to other non-ESPP investments, as B&S can potentially provide better overall gains by immediately reinvesting elsewhere.

## 2.3 Buy and keep for N months

pros: 

*   gets a % of company contribution for each stock purchase
*   averages buy price over a given term
*   lowers sell and transfer fees by selling less frequently 
*   generally successful as long as there is not a big decrease
*   for low values of N cash is not frozen for a long period of time

cons: 

*   risky if sold during a drop in price
*   low gain for short hold periods
*   volatile for longer hold periods

'Buy and keep for N months' model will aggregate the stock during a predefined time to get full company contribution and then sell regardless of the stock price.

This is used as a benchmark and will be tuned in the following models.
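
The selling rhythm of this model can be sketched for a single starting point as follows. The helper `hold_n_months_value` is hypothetical, and as a simplification it values leftover shares at the last buy-day price, without a fee, instead of looking up the last data point.

```python
def hold_n_months_value(shares, prices_sek, rates, contribution_pln, n,
                        sell_fee_pct=0.25, sell_fee_flat_sek=150.0):
    """PLN value when everything accumulated is sold after every n-th purchase."""
    total_pln = 0.0
    held = 0.0
    for i, (bought, price, rate) in enumerate(zip(shares, prices_sek, rates)):
        held += bought
        total_pln += contribution_pln
        # Sell at the end of each hold period of n purchases.
        if i % n == n - 1:
            gross_sek = held * price
            fee_sek = max(sell_fee_flat_sek, gross_sek * sell_fee_pct / 100)
            total_pln += (gross_sek - fee_sek) * rate
            held = 0.0
    # Shares left over from an unfinished period are valued without a fee.
    if held > 0:
        total_pln += held * prices_sek[-1] * rates[-1]
    return total_pln

# Five hypothetical monthly purchases, sold every 2 months.
value_2m = hold_n_months_value([10, 10, 10, 10, 10],
                               [100, 100, 100, 200, 200],
                               [0.5] * 5, contribution_pln=150.0, n=2)
```

A larger `n` means fewer sales and therefore fewer flat fees, at the cost of more exposure to the price on each sale day.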

### 2.3.1 Code


In [None]:
def calculateValBuyAndHoldPeriod(buyAndHoldPeriod):

  result = []  
  # we'll iterate in normal order, nested loop, multiple times
  # there has to be a smarter way to do it, but never mind
  # don't do this at home

  for index in df.index[::1]: 
    # clear helper variables:
    # accSharesSinglePeriod - temporarily accumulated number of shares in the buy and hold period
    # allAccSharesValue - value of shares in all buy and hold periods
    # accContrib - accumulated company contribution
    accSharesSinglePeriod = 0
    allAccSharesValue = 0
    accContrib = 0
    iix = 0
    
    # debug
    #loggerList = []
    #loggerList2 = []

    while index + iix <= df.index.max():
      # buy shares and collect contribution on each step
      accSharesSinglePeriod += df.loc[index + iix, 'buyShares']
      accContrib += cfg['companyContribution']

      #'sell' shares at the end of each hold period
      if (iix % buyAndHoldPeriod) == (buyAndHoldPeriod - 1):
        saleIndex = index + iix
        salePrice = df.loc[saleIndex, 'buySellPrice']
        saleSekPlnRate = df.loc[saleIndex, 'SekPln']
        fee = max(cfg['sellFeeFlat'], cfg['sellFee'] * accSharesSinglePeriod * salePrice)
        allAccSharesValue += (accSharesSinglePeriod * salePrice - fee) * saleSekPlnRate 
        #loggerList2.append((accSharesSinglePeriod * salePrice - fee) * saleSekPlnRate)
        accSharesSinglePeriod = 0
      
      iix += 1
    
    # calculate the value of the remaining shares if there are any at the end
    if accSharesSinglePeriod > 0:
      #'sell' shares and count contribution 
      saleIndex = bf.index.max()
      salePrice = bf.loc[saleIndex, 'buySellPrice']
      saleSekPlnRate = bf.loc[saleIndex, 'SekPln']
      allAccSharesValue += (accSharesSinglePeriod * salePrice) * saleSekPlnRate

    # store
    result.append(allAccSharesValue + accContrib)

  return result

In [None]:
# run calculations for a couple of N 
df['valBuyAndHold3M'] = calculateValBuyAndHoldPeriod(3)
df['valBuyAndHold6M'] = calculateValBuyAndHoldPeriod(6)
df['valBuyAndHold12M'] = calculateValBuyAndHoldPeriod(12)
df['valBuyAndHold24M'] = calculateValBuyAndHoldPeriod(24)
df['valBuyAndHold36M'] = calculateValBuyAndHoldPeriod(36)
df['valBuyAndHold48M'] = calculateValBuyAndHoldPeriod(48)


### 2.3.2 Plots

2.3.2.1 Full period (2000-2022)


In [None]:
# plot
data = df.melt(id_vars =['Date'], value_vars = ['valBuyAndSell','valBuyAndHold','valBuyAndHold3M','valBuyAndHold6M','valBuyAndHold12M','valBuyAndHold24M','valBuyAndHold36M','valBuyAndHold48M'])
alt.Chart(data).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Buy and hold periods vs Buy&Sell. 2000-2022"
).interactive()

2.3.2.2 Last 5 years (2017-2022)


In [None]:
# plot
data = df[cfg['5years']:].copy()
data['referenceBuyAndSell'] = data['valBuyAndSell'] - data['valBuyAndSell']
data['compToBSBuyAndHold'] = data['valBuyAndHold'] - data['valBuyAndSell']
data['compToBSBuyAndHold3M'] = data['valBuyAndHold3M'] - data['valBuyAndSell']
data['compToBSBuyAndHold6M'] = data['valBuyAndHold6M'] - data['valBuyAndSell']
data['compToBSBuyAndHold12M'] = data['valBuyAndHold12M'] - data['valBuyAndSell']
data['compToBSBuyAndHold24M'] = data['valBuyAndHold24M'] - data['valBuyAndSell']
data['compToBSBuyAndHold36M'] = data['valBuyAndHold36M'] - data['valBuyAndSell']
data['compToBSBuyAndHold48M'] = data['valBuyAndHold48M'] - data['valBuyAndSell']

pData = data.melt(id_vars =['Date'], value_vars = ['referenceBuyAndSell','compToBSBuyAndHold','compToBSBuyAndHold3M','compToBSBuyAndHold6M','compToBSBuyAndHold12M','compToBSBuyAndHold24M','compToBSBuyAndHold36M','compToBSBuyAndHold48M'])
alt.Chart(pData).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Buy and hold periods against Buy&Sell reference. 2017-2022"
).interactive()

2.3.2.3 Last 12M

In [None]:
pData = data[cfg['1year']:].melt(id_vars =['Date'], value_vars = ['referenceBuyAndSell','compToBSBuyAndHold','compToBSBuyAndHold3M','compToBSBuyAndHold6M','compToBSBuyAndHold12M'])
alt.Chart(pData).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Buy and hold periods against Buy&Sell reference. 2021"
)

### 2.3.3 Conclusions

For longer investment periods, the simple Buy and Hold for N months models are almost always better than Buy and Sell. Without additional conditions, however, they are often too simple and unpredictable, and sometimes (rarely!) end up with worse results for some starting points.

However, in recent years, some of them would have performed better than even Buy and Hold. More so, considering the latest drop.

3 and 6 month Buy and Holds have consistently performed better than a simple B&S (for all data in long and mid term).
However, Buy & Sell was more effective for investments that are closer to the maximum allowed contribution.

This means that the following sections will be even more interesting as we're now getting closer to more real-life scenarios.

In [None]:
# plot
data = df[cfg['3years']:].copy()
data['referenceBuyAndHold'] = data['valBuyAndHold'] - data['valBuyAndHold']
data['compToBH_BS'] = data['valBuyAndSell'] - data['valBuyAndHold']
data['compToBH_BH3M'] = data['valBuyAndHold3M'] - data['valBuyAndHold']
data['compToBH_BH6M'] = data['valBuyAndHold6M'] - data['valBuyAndHold']

pData = data.melt(id_vars =['Date'], value_vars = ['referenceBuyAndHold','compToBH_BS','compToBH_BH3M','compToBH_BH6M'])
alt.Chart(pData).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Buy&Sell and 3M/6M hold periods against Buy & Hold reference. 2019-2022"
)

## 2.4 Buy and keep until X PLN of value

pros: 

*   gets a % of company contribution for each stock purchase
*   averages buy price over a period of time
*   aims to minimize fees by paying out in predefined parts

cons: 

*   cash is potentially frozen for a long period of time for larger X
*   unknown time scope of investment

The 'Buy and keep until X PLN' model aggregates the stock until a defined value in PLN is reached, then sells regardless of the stock price.
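
The core check can be sketched as below: for each day, work out the share price in SEK at which the current holding would be worth the target value in PLN, and compare it with that day's high. The function name and sample data are hypothetical, and the rate is assumed to be PLN per 1 SEK.

```python
def first_sale_day(daily_high_sek, daily_sek_pln, shares_held, target_value_pln):
    """Return (day index, target price in SEK) for the first day the target is reachable.

    Returns None if the target value is never reached.
    """
    for day, (high, rate) in enumerate(zip(daily_high_sek, daily_sek_pln)):
        # Share price in SEK at which the holding is worth target_value_pln.
        target_price_sek = target_value_pln / rate / shares_held
        if high >= target_price_sek:
            return day, target_price_sek
    return None

# Hypothetical data: 100 shares held, target value of 5500 PLN.
sale = first_sale_day([100.0, 105.0, 112.0], [0.5, 0.5, 0.5],
                      shares_held=100, target_value_pln=5500)
```

Since the target is defined in PLN, a change in the exchange rate alone can trigger or delay a sale even if the SEK share price does not move.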



### 2.4.1 Code


In [None]:
# This version uses all datapoints and allows a sale in between buy days
# note: this is surprisingly worse in some cases compared to the old version that checked buy days

def calculateValBuyAndHoldUntilValue(targetSellValue):

  # precalculate resulting values of first sale and end indices when starting the simulation at any given step  
  # ---------------------------------------------------------------------------------------
  # clean store
  valuesWhenStartingAtN = []
  endIxWhenStartingAtN = []

  # start at each index in df, need to do a brute-force sale once
  for index in df.index[::1]: 
    # accSharesThisPeriod - shares accumulated if calc period starts @ index
    # accContrib - company contribution
    accSharesThisPeriod = 0
    accContrib = 0
    
    # calculate as usual, go forward from index, but just until one sale
    iix = 0
    stop = False

    while not stop and (index + iix <= df.index.max()):
      # gather shares and contribution on each step
      accContrib += cfg['companyContribution']
      accSharesThisPeriod += df.loc[index + iix, 'buyShares']

      # set the correct lookahead range
      bfStart = df.loc[index + iix]['index']
      if index + iix == df.index.max():
        bfEnd = bf.index.max()
      else:
        bfEnd = df.loc[index + iix + 1]['index']

      # look for a good price between buy days          
      for bfIx in range(bfStart, bfEnd):

        # calculate with High price each day - we assume the sell order is automated
        currSekPlnRate = bf.loc[bfIx, 'SekPln']
        currHighPrice = bf.loc[bfIx,'High']
        currBuySellPrice = bf.loc[bfIx,'buySellPrice']
        targetSharePrice = targetSellValue / currSekPlnRate / accSharesThisPeriod
        
        if currHighPrice >= targetSharePrice:
          # take bigger of the values in case targetSharePrice is too low
          salePrice = max(targetSharePrice, currBuySellPrice)         
          fee = max(cfg['sellFeeFlat'], cfg['sellFee'] * accSharesThisPeriod * salePrice)

          valuesWhenStartingAtN.append((accSharesThisPeriod * salePrice - fee) * currSekPlnRate + accContrib)
          endIxWhenStartingAtN.append(index + iix + 1)
              
          #stop and skip rest of the range since we sold
          stop = True
          break
      
      iix += 1
    
    # add the value of shares remaining at the end, skip fee (not selling at this point)
    if not stop and accSharesThisPeriod > 0:
      #'sell' shares and count contribution 
      saleIndex = bf.index.max()
      salePrice = bf.loc[saleIndex, 'buySellPrice']
      saleSekPlnRate = bf.loc[saleIndex, 'SekPln']
      valuesWhenStartingAtN.append(accSharesThisPeriod * salePrice * saleSekPlnRate + accContrib)
      endIxWhenStartingAtN.append(index + iix)

  result = []
  # do the actual calculations by traversing the preresults
  # ---------------------------------------------------------------------------------------
  # start at each index in df, need to do it once
  for index in df.index[::1]: 
    # accValueAtIndex - sum of precalculated values
    accValueAtIndex = 0
    iix = index
    while (iix <= df.index.max()):
      accValueAtIndex += valuesWhenStartingAtN[iix]
      iix = endIxWhenStartingAtN[iix]
    
    result.append(accValueAtIndex)

  return result

In [None]:
# calculate for some predefined values
df['valBuyAndHold5K'] = calculateValBuyAndHoldUntilValue(5000)
df['valBuyAndHold10K'] = calculateValBuyAndHoldUntilValue(10000)
df['valBuyAndHold20K'] = calculateValBuyAndHoldUntilValue(20000)
df['valBuyAndHold40K'] = calculateValBuyAndHoldUntilValue(40000)
df['valBuyAndHold60K'] = calculateValBuyAndHoldUntilValue(60000)
df['valBuyAndHold100K'] = calculateValBuyAndHoldUntilValue(100000)

### 2.4.2 Plots

2.4.2.1 Full period (2000-2022)

In [None]:
# plot
data = df.melt(id_vars =['Date'], value_vars = ['valBuyAndSell','valBuyAndHold','valBuyAndHold5K','valBuyAndHold10K','valBuyAndHold20K','valBuyAndHold40K','valBuyAndHold60K','valBuyAndHold100K'])
alt.Chart(data).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Buy until X value vs B&S. 2000-2022"
).interactive()

2.4.2.2 Last 5 years (2017-2022)

In [None]:
# plot
data = df[cfg['5years']:].copy()
data['referenceBuyAndSell'] = data['valBuyAndSell'] - data['valBuyAndSell']
data['compToBSBuyAndHold'] = data['valBuyAndHold'] - data['valBuyAndSell']
data['compToBSBuyAndHold5K'] = data['valBuyAndHold5K'] - data['valBuyAndSell']
data['compToBSBuyAndHold10K'] = data['valBuyAndHold10K'] - data['valBuyAndSell']
data['compToBSBuyAndHold20K'] = data['valBuyAndHold20K'] - data['valBuyAndSell']
data['compToBSBuyAndHold40K'] = data['valBuyAndHold40K'] - data['valBuyAndSell']
data['compToBSBuyAndHold60K'] = data['valBuyAndHold60K'] - data['valBuyAndSell']
data['compToBSBuyAndHold100K'] = data['valBuyAndHold100K'] - data['valBuyAndSell']

pData = data.melt(id_vars =['Date'], value_vars = ['referenceBuyAndSell','compToBSBuyAndHold','compToBSBuyAndHold5K','compToBSBuyAndHold10K','compToBSBuyAndHold20K','compToBSBuyAndHold40K','compToBSBuyAndHold60K','compToBSBuyAndHold100K'])
alt.Chart(pData).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Buy until X value against Buy&Sell reference. 2017-2022"
).interactive()

2.4.2.3 Last 12M

In [None]:
pData = data[cfg['1year']:].melt(id_vars =['Date'], value_vars = ['referenceBuyAndSell','compToBSBuyAndHold','compToBSBuyAndHold5K','compToBSBuyAndHold10K','compToBSBuyAndHold20K'])
alt.Chart(pData).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Buy until X value against Buy&Sell reference. 2021"
)

## 2.5 Buy and keep until X% increase in value

pros: 

*   gets a % of company contribution for each stock purchase
*   at most X% of return long term as long as there is some growth

cons: 

*   cash is potentially frozen for a long period of time in bad market conditions
*   might miss better results if selling too early
*   unknown time scope of investment

The 'Buy and keep until X% increase in value' model aggregates the stock until a defined % increase in value is reached, then sells at the required price as soon as possible.
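
As a rough sketch of how the sell price follows from the average buy price (the helper name and the sample numbers are hypothetical, and the rate is assumed to be PLN per 1 SEK):

```python
def target_price_sek(money_spent_pln, shares_held, target_percent, sek_pln):
    """SEK share price at which the holding is worth target_percent more than it cost."""
    avg_price_pln = money_spent_pln / shares_held
    target_price_pln = avg_price_pln * (1 + target_percent / 100)
    return target_price_pln / sek_pln

# Hypothetical holding: 2000 PLN spent on 40 shares, aiming for +10%.
target = target_price_sek(2000.0, 40.0, target_percent=10, sek_pln=0.5)
```

Each new purchase changes the average buy price, so the target has to be recalculated after every buy day.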


### 2.5.1 Code

In [None]:
# This version uses all datapoints which allows a sale in between buy days
# note: it's also 'worse' in most cases because it sets a price to match the % increase and sells as soon as possible
# previous version checked only on buy days and sold when the % change was over the threshold

def calculateValBuyAndHoldUntilPercentIncrease(targetPercentIncrease):

  # precalculate resulting values of first sale and end indices when starting the simulation at any given step  
  # ---------------------------------------------------------------------------------------
  # clean store
  valuesWhenStartingAtN = []
  endIxWhenStartingAtN = []

  for index in df.index[::1]: 
    # store number of bought shares and money spent on them - needed for avgPrice
    # accContrib - count company contribution
    accSharesThisPeriod = 0
    accMoneySpentThisPeriod = 0    
    accContrib = 0
    
    # same as before, go forward starting from index, but just until one sale is possible
    iix = 0
    stop = False

    while not stop and (index + iix <= df.index.max()):
      # gather shares and contribution on each step
      accContrib += cfg['companyContribution']
      accSharesThisPeriod += df.loc[index + iix, 'buyShares']
      accMoneySpentThisPeriod += cfg['monthlyInvestmentPostFee']

      # recalculate the average and target share prices in pln (+X %)
      avgPricePln = accMoneySpentThisPeriod / accSharesThisPeriod 
      targetSharePricePln = (1 + targetPercentIncrease / 100) * avgPricePln

      # set the correct lookahead range
      bfStart = df.loc[index + iix]['index']
      if index + iix == df.index.max():
        bfEnd = bf.index.max()
      else:
        bfEnd = df.loc[index + iix + 1]['index']

      # look for a good price between buy days          
      for bfIx in range(bfStart, bfEnd):
        # calculate target price in SEK and compare with High price each day - we assume the sell order is automated
        currSekPlnRate = bf.loc[bfIx, 'SekPln']        
        currHighPrice = bf.loc[bfIx,'High']
        targetSharePriceSek = targetSharePricePln / currSekPlnRate
      
        # sell if targetSharePriceSek is reached, use this price for counting value
        if currHighPrice >= targetSharePriceSek:
          fee = max(cfg['sellFeeFlat'], cfg['sellFee'] * accSharesThisPeriod * targetSharePriceSek)
          valuesWhenStartingAtN.append((accSharesThisPeriod * targetSharePriceSek - fee) * currSekPlnRate + accContrib)
          # set next df ix as next starting point
          endIxWhenStartingAtN.append(index + iix + 1)

          #skip rest of the range since we sold, and stop the iteration for iix as well
          stop = True
          break
      
      iix += 1
    
    # if we have not sold for some reason (end of table or too high % req)
    # store the value of shares remaining at the end, skip fee (not selling at this point)
    if not stop and accSharesThisPeriod > 0:
      #'sell' shares and count contribution 
      saleIndex = bf.index.max()
      salePrice = bf.loc[saleIndex, 'buySellPrice']
      saleSekPlnRate = bf.loc[saleIndex, 'SekPln']
      valuesWhenStartingAtN.append(accSharesThisPeriod * salePrice * saleSekPlnRate + accContrib)
      endIxWhenStartingAtN.append(index + iix)

  result = []
  # do the actual calculations by traversing the preresults
  # ---------------------------------------------------------------------------------------
  # start at each index in df
  for index in df.index[::1]: 
    # accValueAtIndex - sum of precalculated values
    accValueAtIndex = 0
    iix = index
    # add value and move to the previously calculated next starting ix
    while (iix <= df.index.max()):
      accValueAtIndex += valuesWhenStartingAtN[iix]
      iix = endIxWhenStartingAtN[iix]
    # store the result when done
    result.append(accValueAtIndex)

  return result

In [None]:
# calculate for some predefined values of X
# even with this optimization this might take 1 min because of the last 2 calculations
df['valBuyAndHold5Perc'] = calculateValBuyAndHoldUntilPercentIncrease(5)
df['valBuyAndHold7Perc'] = calculateValBuyAndHoldUntilPercentIncrease(7)
df['valBuyAndHold9Perc'] = calculateValBuyAndHoldUntilPercentIncrease(9)
df['valBuyAndHold15Perc'] = calculateValBuyAndHoldUntilPercentIncrease(15)
df['valBuyAndHold25Perc'] = calculateValBuyAndHoldUntilPercentIncrease(25)
df['valBuyAndHold50Perc'] = calculateValBuyAndHoldUntilPercentIncrease(50)
df['valBuyAndHold66Perc'] = calculateValBuyAndHoldUntilPercentIncrease(66)

### 2.5.2 Plots


2.5.2.1 Full period (2000-2022)

In [None]:
# plot
data = df.melt(id_vars =['Date'], value_vars = ['valBuyAndSell','valBuyAndHold','valBuyAndHold9Perc','valBuyAndHold5Perc','valBuyAndHold7Perc','valBuyAndHold15Perc','valBuyAndHold25Perc','valBuyAndHold50Perc','valBuyAndHold66Perc'])
alt.Chart(data).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Buy until +X% vs B&S. 2000-2022"
)

Note: Hold until +50% is surprisingly effective for start dates early in the dataset, and also for some start dates closer to the present (until 2015 or in H1 2018).

2.5.2.2 Last 5 years (2017-2022)


In [None]:
# plot
data = df[cfg['5years']:].copy()
data['referenceBuyAndSell'] = data['valBuyAndSell'] - data['valBuyAndSell']
data['compToBSBuyAndHold'] = data['valBuyAndHold'] - data['valBuyAndSell']
data['compToBSBuyAndHold5Perc'] = data['valBuyAndHold5Perc'] - data['valBuyAndSell']
data['compToBSBuyAndHold7Perc'] = data['valBuyAndHold7Perc'] - data['valBuyAndSell']
data['compToBSBuyAndHold9Perc'] = data['valBuyAndHold9Perc'] - data['valBuyAndSell']
data['compToBSBuyAndHold15Perc'] = data['valBuyAndHold15Perc'] - data['valBuyAndSell']
data['compToBSBuyAndHold25Perc'] = data['valBuyAndHold25Perc'] - data['valBuyAndSell']
data['compToBSBuyAndHold50Perc'] = data['valBuyAndHold50Perc'] - data['valBuyAndSell']
data['compToBSBuyAndHold66Perc'] = data['valBuyAndHold66Perc'] - data['valBuyAndSell']

pData = data.melt(id_vars =['Date'], value_vars = ['referenceBuyAndSell','compToBSBuyAndHold','compToBSBuyAndHold9Perc','compToBSBuyAndHold5Perc','compToBSBuyAndHold7Perc','compToBSBuyAndHold15Perc','compToBSBuyAndHold25Perc','compToBSBuyAndHold50Perc','compToBSBuyAndHold66Perc'])
alt.Chart(pData).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Buy until +X% vs B&S. 2017-2022"
).interactive()

2.5.2.3 Last 12M

In [None]:
pData = data[cfg['1year']:].melt(id_vars =['Date'], value_vars = ['referenceBuyAndSell','compToBSBuyAndHold','compToBSBuyAndHold9Perc','compToBSBuyAndHold5Perc','compToBSBuyAndHold7Perc','compToBSBuyAndHold15Perc'])
alt.Chart(pData).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Buy until +X% vs B&S. 2021"
)

### 2.5.3 Conclusions

This is a fun model, though it takes a while to calculate. Lower values of X% seem quite easy to reach, hence the 'linear' results for most of the run.

For larger values of X there are two outcomes: either the target is reachable and the result beats Buy & Hold for that start date, or it is not reachable and the scenario turns into Buy & Hold.

3, 5 and 7% were good candidates in 2021, giving decent results.
9% was even better in H2 2021, but not reachable when starting in H1.
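
The two outcomes for larger X can be illustrated with a toy version of the rule. This is only a sketch under assumptions, not the code used in 2.5: `buy_until_gain` and the price list are made up for illustration, fees, currency conversion and company contribution are ignored, and after a sale the remaining monthly amounts are simply kept as cash.

```python
def buy_until_gain(prices, target_pct, monthly=1000.0):
    """Toy 'buy until +X%' rule (hypothetical helper, not from the notebook).

    Buys for `monthly` at every price. Once the price is at least
    target_pct above the average buy price, everything is sold and
    later monthly amounts are kept as cash.
    Returns (final_value, sold).
    """
    shares, spent, cash, sold = 0.0, 0.0, 0.0, False
    for price in prices:
        if sold:
            cash += monthly  # simplification: no re-entry after the sale
            continue
        shares += monthly / price
        spent += monthly
        avg_buy_price = spent / shares
        if price >= avg_buy_price * (1 + target_pct / 100):
            cash += shares * price
            shares, sold = 0.0, True
    # any shares still held are valued at the last price
    return cash + shares * prices[-1], sold


prices = [100, 100, 110, 90]            # made-up monthly prices
low_x = buy_until_gain(prices, 5)       # target reached in month 3
high_x = buy_until_gain(prices, 50)     # target never reached
buy_and_hold = sum(1000.0 / p for p in prices) * prices[-1]
print(low_x, high_x, buy_and_hold)
```

In this made-up series the 5% target is hit before the price drops, so it ends above Buy & Hold, while the 50% target is never hit and the result is identical to Buy & Hold.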

## 2.6 Base for 'smart' scenarios

This model is an attempt to combine several approaches and to use the whole dataset including the stock prices between the buy dates.

In the smart scenario we'll have a set of configurable rules to decide when to sell:
*   hold for at least N buy periods (to reduce fees)
*   AND don't sell until the price is X% higher than the average buy price
*   OR sell immediately if the price is Y% higher than the average buy price

This way it'll be a good base to add sell-stop or other approaches.
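
The rules above can be sketched as a single decision function. This is a minimal illustration with hypothetical names (`decide_sale` and its parameters are not part of the notebook); it assumes the sell order is a limit order, so the sale happens at the configured threshold price rather than at the day's high.

```python
def decide_sale(day_high, avg_buy_price, periods_held,
                min_hold_period, min_gain_pct, immediate_pct):
    """Return the limit price to sell at, or None if no rule fires.

    Hypothetical helper: thresholds are given in percent relative to
    the average buy price.
    """
    immediate_price = avg_buy_price * (1 + immediate_pct / 100)
    min_gain_price = avg_buy_price * (1 + min_gain_pct / 100)

    # the immediate-sale rule overrides the hold period
    if day_high >= immediate_price:
        return immediate_price
    # otherwise both the hold period and the minimum gain must be met
    if periods_held >= min_hold_period and day_high >= min_gain_price:
        return min_gain_price
    return None


# made-up numbers: average buy price 100, hold 4 periods, +6% / +7% thresholds
print(decide_sale(108.0, 100.0, 1, 4, 6, 7))   # immediate rule fires
print(decide_sale(106.5, 100.0, 2, 4, 6, 7))   # held too short, no sale
print(decide_sale(106.5, 100.0, 4, 4, 6, 7))   # hold period and gain both met
```

Keeping the decision separate from the bookkeeping is one way to make room for a sell-stop or other rules later.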

Narrowing down the simulation to a 5Y period to speed up calculations.

Some good values (in the order minHoldPeriod, minimumGainThr, immediateSaleThr):

*   4 1 5
*   4 0 8
*   6 -2 10

In [None]:
#@title Modify default values:

sellCfg = {
}

def storeSellConfig():
  #@markdown Minimum period to aggregate the stock before selling [months]
  minHoldPeriod = 4 #@param {type:"slider", min:0, max:24, step:1}

  #@markdown Minimum gain to allow a sale [%]
  minimumGainThr = 6 #@param {type:"slider", min:-10, max:50, step:1}

  #@markdown Threshold for immediate sale, overrides minHoldPeriod [%]
  immediateSaleThr = 7 #@param {type:"slider", min:1, max:50, step:1}

  sellCfg['minHoldPeriod'] = minHoldPeriod
  sellCfg['minimumGainThr'] = (1 + minimumGainThr / 100)
  sellCfg['immediateSaleThr'] = (1 + immediateSaleThr / 100)

storeSellConfig()

### 2.6.1 Code

In [None]:
# TODO: rework like 2.4 and 2.5 to use a precalculated table to speed things up and allow for a full scope of calculations

# store number of sale types, used for tuning
intraRangeSales = 0
immediateSales = 0
holdAndGrowthRule = 0

result = []

# starting points are taken from a subset of df (last 5 years)
for index in df.index[cfg['5years']::1]: 
  # clear helper variables:
  # accSharesThisPeriod - shares accumulated in current period
  # accMoneySpentSinglePeriod - own investment in current period
  # accAllSoldSharesValue - value of shares sold in all periods
  # accContrib - company contribution
  accSharesThisPeriod = 0
  accMoneySpentSinglePeriod = 0
  accAllSoldSharesValue = 0
  accContrib = 0
  buyPeriodsSinceLastSale = 0
  
  iix = 0

  while index + iix <= df.index.max():
    # gather shares and contribution on each step, count spending
    buyPeriodsSinceLastSale += 1
    accContrib += cfg['companyContribution']
    accSharesThisPeriod += df.loc[index + iix, 'buyShares']
    accMoneySpentSinglePeriod += cfg['monthlyInvestmentPostFee'] / df.loc[index + iix, 'SekPln']
    
    # precalc so we compare only share prices in the inner loop to speed this up
    avgSharePrice = accMoneySpentSinglePeriod / accSharesThisPeriod

    # set the correct lookahead range
    bfStart = df.loc[index + iix, 'index']
    if index + iix == df.index.max():
      bfEnd = bf.index.max()
    else:
      bfEnd = df.loc[index + iix + 1, 'index']
    #lg print("st/stp: ", bfStart, bfEnd)
    
    for bfIx in range(bfStart, bfEnd):
      # compare with High price each day - we assume the sell order is automated
      maxSalePrice = bf.loc[bfIx,'High']
      maxPercentChange = maxSalePrice / avgSharePrice
      
      sellAllowed = False
        
      # check for immediate sale
      if maxPercentChange >= sellCfg['immediateSaleThr']:
        sellAllowed = True
        immediateSales += 1
        # we don't take the highest price but the one that was set in the order
        salePrice = avgSharePrice * sellCfg['immediateSaleThr']
        #lg saleType = "immediate"

      # check for hold period and minimumGain:
      elif buyPeriodsSinceLastSale >= sellCfg['minHoldPeriod']:
        if maxPercentChange >= sellCfg['minimumGainThr']:
          sellAllowed = True
          holdAndGrowthRule += 1
          # same here: we sell at the price set in the order, not at the day's high
          salePrice = avgSharePrice * sellCfg['minimumGainThr']
          #lg saleType = "minGain"

      if sellAllowed:
        #lg print("sell price: ", round(salePrice,2) , "sale type: ", saleType, "sell bfix:", bfIx)

        intraRangeSales += 1
        #sell shares
        fee = max(cfg['sellFeeFlat'], cfg['sellFee'] * accSharesThisPeriod * salePrice)
        saleSekPlnRate = bf.loc[bfIx, 'SekPln']
        accAllSoldSharesValue += (accSharesThisPeriod * salePrice - fee) * saleSekPlnRate 
        accSharesThisPeriod = 0
        accMoneySpentSinglePeriod = 0
        buyPeriodsSinceLastSale = 0
        #and skip rest of the range since we sold
        break
    
    iix += 1
  
  # add the value of shares remaining at the end, skip fee (not selling at this point)
  if accSharesThisPeriod > 0:
    #'sell' shares and count contribution 
    saleIndex = bf.index.max()
    salePrice = bf.loc[saleIndex, 'buySellPrice']
    saleSekPlnRate = bf.loc[saleIndex, 'SekPln']
    accAllSoldSharesValue += accSharesThisPeriod * salePrice * saleSekPlnRate

  df.loc[index, 'valSmartScenario'] = accAllSoldSharesValue + accContrib

print("all intra range sales:", intraRangeSales)
print("number of immediateSales:", immediateSales)
print("number of hold and growth sales:",holdAndGrowthRule)
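
The fee line in the loop above takes the larger of the flat fee and the percentage fee. A small standalone sketch of that rule, using the default values from section 1.2 (0.25% and 150 SEK); `sell_fee` is a hypothetical helper and takes the percentage as a plain percent value:

```python
def sell_fee(gross_sek, fee_pct=0.25, flat_fee_sek=150.0):
    """Sell fee in SEK: percentage of the sale or the flat fee, whichever is higher.

    Hypothetical helper; defaults mirror the notebook's configuration form.
    """
    return max(flat_fee_sek, gross_sek * fee_pct / 100)


for gross in (10_000, 60_000, 100_000):
    print(gross, sell_fee(gross))
```

With these defaults the flat fee dominates for any sale below 60,000 SEK, which is why selling small monthly batches is relatively expensive and why the hold-period rule exists.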

### 2.6.2 Plots

In [None]:
# plot
data = df[cfg['5years']:].copy()
data['referenceBuyAndSell'] = data['valBuyAndSell'] - data['valBuyAndSell']
data['compToBSBuyAndHold'] = data['valBuyAndHold'] - data['valBuyAndSell']
data['compToBSSmartScenario'] = data['valSmartScenario'] - data['valBuyAndSell']
data['compToBSBuyAnd7Perc'] = data['valBuyAndHold7Perc'] - data['valBuyAndSell']
data['compToBSBuyAndHold10K'] = data['valBuyAndHold10K'] - data['valBuyAndSell']

pData = data.melt(id_vars =['Date'], value_vars = ['referenceBuyAndSell','compToBSBuyAndHold','compToBSSmartScenario','compToBSBuyAndHold10K','compToBSBuyAnd7Perc'])
alt.Chart(pData).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Smart scenario 2017-2022"
)

In [None]:
# plot
pData = data[cfg['3years']:].melt(id_vars =['Date'], value_vars = ['referenceBuyAndSell','compToBSBuyAndHold','compToBSSmartScenario','compToBSBuyAndHold10K','compToBSBuyAnd7Perc'])
alt.Chart(pData).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Smart scenario. Last 3Y"
).interactive()

In [None]:
# plot
pData = data[cfg['1year']:].melt(id_vars =['Date'], value_vars = ['referenceBuyAndSell','compToBSBuyAndHold','compToBSSmartScenario','compToBSBuyAndHold10K','compToBSBuyAnd7Perc'])
alt.Chart(pData).mark_line(interpolate='step-after').encode(
    x='yearmonth(Date):T',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Smart scenario 2021"
).interactive()

# 3. Summary

In [None]:
# results of the smart model over the 5Y window, relative to Buy & Sell:
data['compToBSSmartScenario'].describe()

In [None]:
# results of all models over the last 1Y, sorted by the 75th percentile:
df[cfg['1year']:][df.columns[6:]].describe().transpose().sort_values(by=['75%'],ascending=False)

# Random visualizations and checks

In [None]:
alt.data_transformers.disable_max_rows()

base = alt.Chart(bf.reset_index()).encode(x='Date').properties(
)

alt.layer(
    base.mark_line(color='orange').encode(y='SekPln'),
).properties(
    width=1200,
    height=200,
    title="SekPln rate from 2000"
).interactive()

In [None]:
base = alt.Chart(bf.reset_index()).encode(x='Date').properties(
)

alt.layer(
    base.mark_line(interpolate='step-after',color='orange').encode(y='buySellPrice'),
).properties(
    width=1200,
    height=400,
    title="ERIC-B avg stock price. Interactive"
).interactive()

In [None]:
# plot
bf['dailyPriceRange'] = bf['High'] - bf['Low']
bf['dailyPriceDiff'] = bf['Close'] - bf['Open']

pData = bf.melt(id_vars =['Date'], value_vars = ['dailyPriceRange','dailyPriceDiff'])
alt.Chart(pData).mark_line(interpolate='step-after').encode(
    x='Date',
    y='value',
    color='variable',
).properties(
    width=1200,
    height=400,
    title="Daily price range and price difference. Interactive"
).interactive()