# Investing in the stock market (the eighth world wonder)

**Project deadline:** This project is due for submission on Wednesday, 11.05.2022, 23:59. Please read the *About the Projects* section below carefully for further details.

**Important:** You have the choice to work either on this project or on another one from Nina. We strongly advise you to read through both project notebooks completely before you make a decision.

## About the Projects
- You will get one project approximately every other week.
- Besides the homework assignments, you need to solve the projects in order to pass the course. Your final course mark is the mean of your project marks. We aim to hand out six projects during the term, and we do not count your worst project mark towards the final course mark. Projects that you do not hand in are counted with a mark of 4.
- The project needs to be submitted by uploading a modified version of this notebook to [Projects/Project 1](https://ecampus.uni-bonn.de/goto_ecampus_exc_2645968.html) on eCampus. Your upload must be done by Wednesday, 11.05.2022, 23:59. **No late uploads can be accepted!**
- **In contrast to the homework exercises, each student must hand in their own solution for the projects! Of course you can and should discuss problems with each other! However, you need to be able to explain your solution in detail to your tutor and/or the lecturers! We might ask you for an interview about your project if the solution is (close to) identical to another student's submission.**

**Note:** The tutors, Nina and I are very happy to help you out with difficulties you might have with the project tasks! You can ask questions any time but please do so well in advance of the deadlines!

## Introduction

In this project, we want to look at a long-term strategy to invest money in the stock market.

On the stock market, we can invest in individual companies such as Apple, Sony, Deutsche Telekom, McDonald's etc. While investments in individual companies can result in very high profits if you pick the right ones, they also carry the risk of significant or even total losses. Perhaps some of you remember Wirecard (a once very famous German tech company), which went bankrupt within a few days during 2020 (featured in this very well done [video](https://www.youtube.com/watch?v=ivACzzW5wyA)). To minimise risks, a stock market portfolio should be broadly diversified over countries and industrial sectors. An attractive way for private investors to achieve this is to use so-called index funds. The `MSCI-World` index allows you to invest with little money in $\approx 1500$ large companies from 23 countries of the developed world.

If we believe that the world economy grows steadily (it had better!), an investment in such an index has an expected return larger than 0 **in the long term**! In this notebook, we want to investigate what time horizon an investment should have in order not to lose money.

The figure below shows the chart development of the MSCI index from December 1978 to October 2021.

<img src="figs/MSCI_chart.png" width="400" height="200" />

The value of the index is in arbitrary units (normalised to 1 for the first data point). The absolute values (e.g. in Dollars or Euros) are irrelevant for the entire project. 

## 0. Chart development of the MSCI-World Index

In [None]:
%matplotlib inline

# required imports
import numpy as np
import numpy.random as nr
import matplotlib.pyplot as plt

The file [MSCI.txt](MSCI.txt) contains data for the MSCI-World index from December 1978 to October 2021:

In [None]:
!head data/MSCI.txt

The two columns contain the date (one data point for each month) and the mean value of the index during that month. The dates are represented with numbers such as 1979.XX. The exact meaning of the decimal places (XX) does not matter for the tasks ahead.

In the following cell, I show you how to read the file, decompose its contents into two `numpy`-arrays `dates` and `values` and reproduce the figure above. We did not yet cover this in class.

In [None]:
# read data into a 'two-dimensional'
# numpy-array 'data' and split it up in
# two one-dimensional arrays:
data = np.loadtxt('data/MSCI.txt')
dates = data[:,0]
values = data[:,1]

plt.plot(dates, values)
plt.xlabel('year')
plt.ylabel('MSCI-World index [dimensionless units]')

## 1. Wonderful returns

An investment in the `MSCI-World` is often advertised with phrases such as *The MSCI-World yielded a mean yearly interest of about 11% since 1979!*

Let us first verify this statement.

If you invest an amount $x$ at a fixed yearly interest rate $z$ (given as a fraction, e.g. $z = 0.11$ for 11%), your money has increased after $n$ years to an amount $y$ of

$$
y = x(1+z)^{n}
\label{eq:interest} \tag{1}
$$

(compound interest formula). 
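As a quick illustration of eq. (1), here is a minimal sketch with made-up numbers (100 units at 5% for 10 years; these values are purely illustrative and unrelated to the MSCI data). Note that $z$ enters as a fraction, not in percent:

```python
# Illustrative numbers only (not taken from the MSCI data)
x = 100.0   # amount invested
z = 0.05    # yearly interest rate as a fraction (5 %)
n = 10      # investment time in years

# eq. (1): compound interest
y = x * (1 + z)**n
print(f"value after {n} years: {y:.2f}")

# for comparison: simple interest without compounding
y_simple = x * (1 + z * n)
print(f"without compounding:   {y_simple:.2f}")
```

The difference between the two printed numbers is the effect of earning interest on previously earned interest.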

Assume in the following that you invest the value of the MSCI index at a given time and that, at the end of the investment, you receive the index value at the corresponding date. Hence, the value of your investment always equals the value of the index. If you had invested a different amount of money, you would only need to scale the index values accordingly. This is irrelevant for the tasks, however, because we are only interested in interest rates.

**Example:**
Investing in Dec. 1978, we do this with one unit of money (first data point). In October 2021 (last data point), the investment returns 94.78 units.

Given those values, calculate with eq. (\ref{eq:interest}) the annual percentage rate of an investment into the MSCI-World from Dec. 1978 to Oct. 2021.

**Hint:** For this **very long-term investment**, the advertisement is correct and you should obtain 11%.

In [None]:
# Your solution here please

In [None]:
## SOLUTION

# We solve eq. (1) for z:
time = dates[-1] - dates[0]  # investment time in years
z = np.exp(np.log(values[-1] / values[0]) / time) - 1

# We can verify the advertisement's promise
# of an 11% interest rate over a 40-year
# investment:
print(z)

In charts with exponential growth, one often gets the misleading impression that recent years were very special, with extreme growth rates. Please plot the MSCI index together with eq. (1) and the interest rate from the last cell on a logarithmic scale to demonstrate that nothing special happened in recent years (in contrast to the first years of the period).

**Hint:** To obtain a logarithmic y-axis in a plot, you can use the command `plt.yscale('log')`.
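
The reason a logarithmic axis helps can be read off eq. (1) directly. Taking the logarithm of both sides gives

$$
\log y = \log x + n \log(1+z),
$$

so a fixed interest rate $z$ appears as a straight line with slope $\log(1+z)$ when $y$ is drawn on a logarithmic axis against the time $n$. Periods in which the index grows faster or slower than this rate show up as deviations from that line.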

In [None]:
# Your solution here please

In [None]:
## SOLUTION

# eq. (1) with the calculated interest rate of 11.2%
years = dates - dates[0]
y = (1.112)**years

#fig, ax = plt.subplots()

# fixed interest curve
plt.plot(dates, y, label='fixed interest')

# MSCI index
plt.plot(dates, values, label='MSCI World')

plt.yscale('log')
plt.xlabel('year')
plt.ylabel('MSCI-World index [dimensionless units]')
plt.legend()

#plt.savefig('MSCI_zins_log.png')

Please now write a function `interest` with the following signature:

```python
def interest(x, y, n):
    """
    The function's inputs are the money
    to invest, the amount of money at the
    end of the investment and the investment
    time in years. The function's result (return
    value) is the fixed yearly interest rate
    according to eq. (1).

    The function works for scalar input as well
    as for numpy arrays. In the latter case, a
    corresponding array with interest rates is returned.
    """
```

In [None]:
# Your solution here please

In [None]:
## SOLUTION

def interest(x, y, n):
    """
    The function's inputs are the money
    to invest, the amount of money at the
    end of the investment and the investment
    time in years. The function's result (return
    value) is the fixed yearly interest rate
    according to eq. (1).

    The function works for scalar input as well
    as for numpy arrays. In the latter case, a
    corresponding array with interest rates is returned.
    """
    
    # ATTENTION: We do not test whether x is 0 or negative!!
    return np.exp((np.log(y / x)) / n) - 1.

In [None]:
# This cell is for testing purposes. The following
# code-line should output an interest rate of 11 %:
interest(values[0], values[-1], dates[-1] - dates[0])

## 2. Shorter time-horizons for an investment
The MSCI index chart suggests that the high 11% *mean* interest rate is not what you can expect for shorter investments. Indeed, you can lose a lot of money. For instance, the markets went down significantly after the year 2000 (the famous [dotcom bubble](https://en.wikipedia.org/wiki/Dot-com_bubble), which resulted from excessive speculation on internet-related companies).

In [None]:
# Please convince yourself that the following lines calculate
# the interest rate for an investment from 2000 to 2003
# (we have data for each month starting from Dec. 1978).
# I include this cell to give you some help for the tasks below.

# You should get a result of -0.259 (another test of your function
# 'interest' above).
x = values[1 + 12 * 22]
y = values[1 + 12 * 24]
n = dates[1 + 12 * 24] - dates[1 + 12 * 22]
print(dates[1 + 12 * 22], dates[1 + 12 * 24])
print(interest(x, y, n))

Please now create a `numpy` array `interests` which contains the interest rates *for all possible MSCI investments of exactly one year*. The first entry of your array is the interest rate for an investment between Dec. 1978 and Dec. 1979, the second entry represents an investment from Jan. 1979 to Jan. 1980, the third one from Feb. 1979 to Feb. 1980 and so on.

Plot a histogram of your `interests` array - please have a look at the function `plt.hist` for this.

Calculate the mean and standard deviation of your `interests` array. Which was, to date, the best and which the worst year for a one-year investment in the MSCI-World? Print all these quantities.

**Hints:**
(1) The function `interest` accepting `numpy`-arrays and array-slicing of the `dates` and `values` arrays allows you to get the `interests` array in about three very short lines; (2) For the last part of this task, the functions `np.argmin` and `np.argmax` might be helpful.
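
The slicing idea from hint (1) can be tried on a small toy array first. The sketch below uses made-up values (`toy_values` and `step` are hypothetical names, not part of the project data) to show how two shifted slices pair each entry with the entry `step` positions later:

```python
import numpy as np

# toy data (illustrative only, not the MSCI values)
toy_values = np.array([1.0, 1.2, 1.1, 1.5, 1.4])
step = 2

start = toy_values[:-step]   # all entries except the last `step`
end = toy_values[step:]      # all entries except the first `step`

# element-wise growth factor for every possible (start, end) pair
growth = end / start
print(start)
print(end)
print(growth)

# positions of the pairs with the largest and smallest growth factor
best = np.argmax(growth)
worst = np.argmin(growth)
print(best, worst)
```

Both slices have the same length, so any function that works element-wise on arrays can process all pairs in a single call.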

In [None]:
# Your solution here please

In [None]:
## SOLUTION

# I immediately solve the task in the general form (variable years)
# needed below.
years = 1
months = 12 * years  # 12 months in each year
x = values[:-months]
y = values[months:]
interests = interest(x, y, years)
maxind = np.argmax(interests)
minind = np.argmin(interests)
print(f"Best year started at {dates[maxind]}; interest rate: {interests[maxind]}")
print(f"Worst year started at {dates[minind]}; interest rate: {interests[minind]}")
mean_interest = interests.mean()
std_interest = interests.std()
print(f"mean interest for {years} year(s): {mean_interest:.2f} +/- {std_interest:.2f}.")

fig, ax = plt.subplots()
a = ax.hist(interests * 100, color="lightblue", edgecolor='black')
ax.set_xlabel("interest rate [%]")
ax.set_ylabel("N [years]")

Please generalise your code from the last cell to work with an arbitrary investment period. Introduce a variable `years`. This variable should be the only place you need to change to obtain your histogram for an investment time-horizon `years`. Look at the histograms for different times between one and 30 years. What do you observe?

You ask a student peer for the expected interest-rate of an $n$-year investment into the MSCI-World. She answers something like:

$$
z_{n} = z_{n, \rm{mean}} \pm \sigma_{n},
$$

where $z_{n, \rm{mean}}$ and $\sigma_{n}$ are the mean and standard deviation from above. Is this description of the interest rate and its error statistically justified (hint: Gaussian random variable)? What would you do if somebody asked you for the expected interest rate and an uncertainty on that quantity?

## Your answer here please

!! SOLUTION

You should observe fewer and fewer negative interest rates as the time horizon of your investment gets longer.

The stock market variations are clearly not a Gaussian random variable. In such cases, one *should not* try to quantify the error of a measurement with a single number (the standard deviation). A fair and complete characterisation of the error in such cases is to show the complete histogram.
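
To illustrate why a single number $\sigma$ can mislead for non-Gaussian data, the following sketch compares a Gaussian sample with a skewed one. Both samples are synthetic (drawn with Python's `random` module, an assumption made purely for illustration) and have nothing to do with the MSCI data. For a Gaussian, about 68% of the values lie within one standard deviation of the mean; for a skewed distribution this fraction can differ noticeably:

```python
import random
import statistics

random.seed(1)  # fixed seed so that the numbers are reproducible
n_samples = 20000

# synthetic samples (NOT stock market data)
gauss = [random.gauss(0.0, 1.0) for _ in range(n_samples)]
skewed = [random.expovariate(1.0) for _ in range(n_samples)]

def fraction_within_one_sigma(sample):
    """Fraction of the sample lying in [mean - std, mean + std]."""
    mean = statistics.mean(sample)
    std = statistics.pstdev(sample)
    inside = sum(1 for s in sample if abs(s - mean) <= std)
    return inside / len(sample)

frac_gauss = fraction_within_one_sigma(gauss)
frac_skewed = fraction_within_one_sigma(skewed)
print(f"Gaussian sample: {frac_gauss:.3f}")
print(f"skewed sample:   {frac_skewed:.3f}")
```

For the exponential example the fraction is roughly 86% instead of 68%, so quoting mean $\pm\,\sigma$ would suggest the wrong confidence level.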

Please use what you have done up to now to reproduce the following plot:

<img src="figs/interest_rates.png" width="400" height="200" />

In [None]:
# Your solution here please

In [None]:
## SOLUTION

invest_years = np.array([1, 5, 10, 15, 20, 25, 30])
mean_invest = np.zeros(len(invest_years))
max_invest = np.zeros(len(invest_years))
min_invest = np.zeros(len(invest_years))

# we did not treat enumerate in class. Of course
# the following loop can be done in many different
# ways:
for i, years in enumerate(invest_years):
    months = years * 12
    x = values[:-months]
    y = values[months:]
    result = interest(x, y, years)

    mean_invest[i] = result.mean()
    max_invest[i] = result.max()
    min_invest[i] = result.min()
    
fig, ax = plt.subplots()

ax.plot(invest_years, np.array(mean_invest) * 100, 
        label="mean interest")
ax.plot(invest_years, np.array(min_invest) * 100, 'ro', 
        label="min interest")
ax.plot(invest_years, np.array(max_invest) * 100, 'go', 
        label="max interest")
ax.plot(invest_years, np.zeros(len(invest_years)))

ax.set_xlabel("time-horizon [years]")
ax.set_ylabel("interests [%]")
ax.legend()

#plt.savefig("interest_rates.png", dpi=200)

How should you invest in the MSCI-World if you would like to be (statistically) sure not to lose money with your investment?

## Your solution here please

!! SOLUTION

Stocks are only a solid investment on a very long-term basis, assuming that you want to invest your money without keeping a close watch on it all the time. With a time horizon of 15 years or more, one can be statistically sure (from historical data) that no money is lost. Of course, the matter is more complex in reality. We did not take into account taxes or inflation. For long-term investments, the best and worst interest rates approach the mean growth rate of the world economy (regression to the mean).

## 3. When is the right time to invest?

We have three friends who want to invest into the stock market:

- Amelie is a bit clumsy and she invests her money at a peak of the MSCI-index chart, i.e. stocks are very expensive at that time.
- Linda is very smart and she manages to time her investment to a dip in the MSCI-index chart.
- Lydia does not care and she just decides to invest *right now*, no matter what the chart says or what analysts tell her.

You would like to investigate for your own strategy whether any of the three friends does significantly better or worse than the others. From the previous task, you obtained a time horizon with which you are quite sure not to lose money in the long term. Choose that time frame as `my_time` in the following.

1. Determine all the local maxima of the MSCI-World index curve and calculate all the interest rates for investments starting at those maxima. Visualise your result in a histogram. This shows the expected results of Amelie's strategy.
2. Repeat subtask 1 but with the minima of the index curve as starting points for the investment. This simulates Linda's investments.
3. Finally, to model Lydia's approach, perform the analysis with random starting points for her investments.

**Hints:** (1) Please see [this task from the lecture review](03_Lecture_Review.ipynb/#min_max) on how to get the minima and maxima; (2) Have a look at the function `nr.randint` to obtain starting points for the investments.
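
Both hints can be tried on a tiny example before applying them to the index data. The sketch below is one possible approach, using a made-up array `toy` (a hypothetical name); it may differ from the method in the lecture review, which is not reproduced here. A point counts as a local maximum (minimum) if it is larger (smaller) than both of its neighbours:

```python
import numpy as np
import numpy.random as nr

# toy curve (illustrative values only)
toy = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 1.0])

# compare every interior point with its left and right neighbour
inner = toy[1:-1]
left = toy[:-2]
right = toy[2:]

# np.where returns indices into `inner`; adding 1 converts
# them to indices into `toy`
toy_max = np.where((inner > left) & (inner > right))[0] + 1
toy_min = np.where((inner < left) & (inner < right))[0] + 1
print(toy_max)
print(toy_min)

# three random start indices drawn from 0, 1, 2, 3
# (the upper bound 4 is exclusive)
toy_rand = nr.randint(0, 4, size=3)
print(toy_rand)
```

The strict comparisons ignore plateaus of equal neighbouring values; using `>=` and `<=` instead would include them.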

What are your conclusions from these simulations?

In [None]:
# your solution here please

In [None]:
## SOLUTION

# I choose 10/15 years as my time-horizon
my_time = 15

# get all the minima, maxima and random points from the MSCI-curve.
# Note that we get rid of points not allowing for an investment
# of my_time years
max_MSCI = np.where((values[1:-1] >= values[:-2]) & (values[1:-1] >= values[2:]))[0] + 1
max_MSCI = max_MSCI[max_MSCI < (len(values) - my_time * 12)]
min_MSCI = np.where((values[1:-1] <= values[:-2]) & (values[1:-1] <= values[2:]))[0] + 1
min_MSCI = min_MSCI[min_MSCI < (len(values) - my_time * 12)]
# I saw that the min and max arrays contained about 80 points:
rand_MSCI = nr.randint(0, len(values) - my_time * 12, size=80)

# just plot them for visualisation
fig, axs = plt.subplots(2, 3, figsize = (10,10))
axs[0,0].plot(dates, values)
axs[0,0].plot(dates[max_MSCI], values[max_MSCI], '.', 
              label='local maxima')
axs[0,0].set_yscale('log')
axs[0,0].legend()
axs[0,1].plot(dates, values)
axs[0,1].plot(dates[min_MSCI], values[min_MSCI], '.', 
              label='local minima')
axs[0,1].set_yscale('log')
axs[0,1].legend()
axs[0,2].plot(dates, values)
axs[0,2].plot(dates[rand_MSCI], values[rand_MSCI], '.', 
              label='random')
axs[0,2].set_yscale('log')
axs[0,2].legend()

# now get the interests and plot the histograms
max_int = interest(values[max_MSCI], 
                   values[max_MSCI + 12 * my_time], 
                   my_time)

axs[1,0].hist(max_int * 100, color="lightblue", edgecolor='black')

min_int = interest(values[min_MSCI], 
                   values[min_MSCI + 12 * my_time], 
                   my_time)

axs[1,1].hist(min_int * 100, color="lightblue", edgecolor='black')

rand_int = interest(values[rand_MSCI], 
                   values[rand_MSCI + 12 * my_time], 
                   my_time)

axs[1,2].hist(rand_int * 100, color="lightblue", edgecolor='black')

print(f"Amelie: expected interest: {max_int.mean():.3f} +/- {max_int.std():.3f}")
print(f"Linda: expected interest: {min_int.mean():.3f} +/- {min_int.std():.3f}")
print(f"Lydia: expected interest: {rand_int.mean():.3f} +/- {rand_int.std():.3f}")

## Your solution here please

!! SOLUTION

If you plan to invest your money on a long-term basis, the best time to start is always *right now*. As discussed above, the expected interest rate approaches a mean and it does not matter significantly when you start. This is of course **not** valid for short-term investments! But a short-term position is typically speculation rather than investment!

The simulations show that Linda does a tiny bit better than the other two - as expected. But to time the market for a dip is very unrealistic. We also note that the worst possible strategy is not significantly worse than the best. The most important decision is to start investing money at all!

## Epilogue

We investigated the consequences of a *one-time* investment into the MSCI-World index. In most cases, people do not have a large amount of money they can invest in a single-shot. Instead, a typical investment is with a regular savings rate, e.g. an investment of 100 Euros per month. Although the calculations for this case are a bit more complicated, you could do all of them with what you know already! You will notice that the results are very similar but the time-frame of your investment needs to be longer to reach the *no-loss* zone.
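
One possible way to set up such a savings-plan calculation is sketched below. It rests on simplifying assumptions that are not part of the project: the same amount is invested every month at that month's index value, everything is sold in the last month, and fees, taxes and inflation are ignored. The names `toy_index` and `monthly_rate` and all numbers are hypothetical:

```python
# toy index values, one per month (illustrative only)
toy_index = [1.0, 2.0, 4.0, 4.0]
monthly_rate = 100.0   # money invested each month

# every month except the last one, buy index units at the current value
units = sum(monthly_rate / price for price in toy_index[:-1])

paid_in = monthly_rate * (len(toy_index) - 1)
final_value = units * toy_index[-1]   # sell everything in the last month

print(f"paid in:     {paid_in:.2f}")
print(f"final value: {final_value:.2f}")
```

Repeating this for every possible starting month of the real index would give a histogram comparable to the ones for the one-time investment.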

I am making monthly investments in the MSCI-World myself as part of my retirement provision. If you are interested, here is a [very good German video playlist](https://www.youtube.com/watch?v=R8fUq8e8I-I&list=PLIRB0hpiwW9D5vpLBeTucW9EAM_ub7SfE) on how you can get started yourself. With a time horizon of about 40 years until retirement and a lot of time for the compound interest effect to work, you could easily obtain a significant bonus for your retirement with only a moderate investment each month. Albert Einstein is often quoted as having called the compound interest effect the eighth world wonder!