#**Introduction to Financial Python**
##Rate of Return, Mean and Variance

###**Introduction**

In this chapter we introduce some basic concepts in quantitative finance, starting with rate of return, mean and variance. These values may seem simple to calculate, but there are a number of different methods for calculating them, and it's important to choose the appropriate method case by case.

###**Rate of Return**

**Single-period Return**

The single-period rate of return can be calculated as follows:
\
	<center>$$r =\frac{p_t}{p_0} - 1 = \frac{p_t - p_0}{p_0}$$</center>

Where $r$ is the rate of return, $p_t$ is the asset price at time $t$, and $p_0$ is the asset price at time 0.


In [None]:
import numpy as np
rate_return = 102.0/100 - 1
print (rate_return)

0.020000000000000018


In [None]:
# My example
result = (2/3)**9
print(result)

0.02601229487374891


Let's say we bought a stock at $\$100$. Half a year later it has grown to $\$102$, and a year after the purchase the price reaches $\$104$. How do we calculate our total return? We can either treat it as a single period:
\
	<center>$$r =\frac{104}{100} - 1 = 0.04$$</center>

or as a two-stage period:
\
	<center>$$r = (1+ r_1)(1+r_2)- 1=\frac{102}{100}\times\frac{104}{102}-1 = 0.04$$</center>

Here we compute returns twice a year, which is called semi-annual compounding. How about quarterly compounding? Let's assume the stock prices at the end of each quarter are $p_1,p_2,p_3,p_4$ respectively, and let $r_i = p_i/p_{i-1} - 1$ be the return in quarter $i$.
\
	<center>$$r = (1+ r_1)*(1+r_2)*(1+r_3)*(1+r_4)- 1$$</center>

The rate of return we calculate here is called **cumulative return** or **overall return**. It measures the total return of this asset over a period of time.
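
As a minimal sketch of the chaining above, the snippet below compounds four quarterly returns and checks the result against the return computed directly from the first and last price. The quarter-end prices are hypothetical values chosen for illustration, not market data.

```python
# Hypothetical prices: purchase price followed by four quarter-end prices
prices = [100.0, 102.0, 101.0, 105.0, 104.0]

# Single-period return for each quarter: r_i = p_i / p_{i-1} - 1
period_returns = [p1 / p0 - 1 for p0, p1 in zip(prices[:-1], prices[1:])]

# Chain the periods: (1 + r_1)(1 + r_2)(1 + r_3)(1 + r_4) - 1
growth = 1.0
for r in period_returns:
    growth *= 1 + r
cumulative_return = growth - 1

# The same quantity computed directly from the first and last price
direct_return = prices[-1] / prices[0] - 1

print(round(cumulative_return, 6), round(direct_return, 6))
```

The intermediate prices cancel in the product, which is why both numbers agree.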

Now consider the following situation: we have two strategies, A and B. We backtested strategy A for 1 year and its cumulative return is 20%, while we backtested strategy B for 3 months (one quarter) and its cumulative return is 6%. Which strategy has the higher rate of return? The common approach is to convert all returns into a **compounding annual return**, regardless of the investing horizon of each strategy, so that strategies with different time horizons can be compared. Since there are four quarters in a year, the annual return of strategy B is
\
	<center>$$1 + r = (1+0.06)^{4}$$</center>
	<center>$$r = 0.262$$</center>

Strategy B has the higher compounding annual return: 26% compared with 20% for strategy A.
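
The comparison can be reproduced with a small helper. This is only a sketch: the function name `annualize` and its `years` parameter are assumptions made for illustration and do not come from any library.

```python
def annualize(cumulative_return, years):
    """Convert a cumulative return earned over `years` years
    into a compounding annual return (hypothetical helper)."""
    return (1 + cumulative_return) ** (1 / years) - 1

annual_a = annualize(0.20, 1.0)    # strategy A: 20% over one year
annual_b = annualize(0.06, 0.25)   # strategy B: 6% over one quarter

print(round(annual_a, 3), round(annual_b, 3))
```

Expressing the horizon in years lets the same formula handle backtests of any length.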

**Logarithm Return**

In the above example, strategy B has a 6% return over three months. Nominally, the annual return would be 4 × 6% = 24%. This nominal annual interest rate is called the stated annual interest rate. It is calculated as the periodic interest rate times the number of periods per year, so it works like simple interest and does not take compounding into account. The effective annual interest rate, by contrast, is the 26% we calculated above, and it does account for intra-year compounding. The effective annual interest rate is an essential tool for evaluating the real return on an investment. If the number of compounding periods in one year is $n$, the formula to convert the stated annual interest rate to the effective annual interest rate is
\
	<center>$$r_{effective} = (1+\frac{r_{nominal}}{n})^{n} -1$$</center>
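
A minimal sketch of this conversion, using the 24% stated rate with quarterly compounding from the example; the function name `effective_annual_rate` is a hypothetical choice for illustration.

```python
def effective_annual_rate(nominal_rate, periods_per_year):
    """Effective annual rate for a stated (nominal) annual rate
    compounded `periods_per_year` times a year."""
    return (1 + nominal_rate / periods_per_year) ** periods_per_year - 1

# Stated annual rate of 24%, compounded quarterly (6% per quarter)
ear = effective_annual_rate(0.24, 4)
print(round(ear, 4))
```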

Now imagine the price of an asset changes every second or even every millisecond, so that the number of compounding periods $n$ approaches infinity. This is called **continuous compounding**. The formula is given below:

<center>$$\lim\limits_{n\to\infty}(1+\frac{r}{n})^{n}= e^{r}$$</center>
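
To make the limit concrete, the sketch below evaluates $(1+\frac{r}{n})^{n}$ for increasingly frequent compounding and compares it with $e^{r}$. The rate of 24% and the list of compounding frequencies are arbitrary choices for illustration.

```python
import math

r = 0.24  # illustrative nominal annual rate
frequencies = [1, 4, 12, 365, 1_000_000]

# Growth factor (1 + r/n)^n for each compounding frequency n
growth_factors = [(1 + r / n) ** n for n in frequencies]
limit = math.exp(r)

for n, g in zip(frequencies, growth_factors):
    print(n, g)
print("e^r", limit)
```

The growth factor rises with $n$ and settles just below $e^{r}$.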

From the limit above, we know that under continuous compounding:

<center>$$e^{r_{nominal}}=1+r_{effective}= \frac{p_t}{p_0}$$</center>

Taking the natural logarithm of both sides gives

<center>$${r_{nominal}}=\ln\frac{p_t}{p_0}=\ln p_t-\ln p_0$$</center>

This is the **logarithmic return**, or **continuously compounded return**. It is the nominal return with interest compounded continuously. To see how close it is to the effective interest rate, recall the equation above:

<center>$$e^{r_{nominal}}=1+r_{effective}$$</center>

then we have

<center>$$r_{effective}= e^{r_{nominal}} - 1\thickapprox r_{nominal}$$</center>

where the approximation follows from the first-order Taylor expansion $e^{x}\approx 1+x$, which is accurate when the interest rate is small. Logarithmic returns are used frequently because, once we take the logarithm of asset prices, we can calculate the return with a simple subtraction. Here we use Apple stock prices as an example:

In [None]:
!pip3 install quandl



In [None]:
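import numpy as np
import quandl

# Set your own Quandl API key here (placeholder, not a real key)
quandl.ApiConfig.api_key = 'YOUR_API_KEY'
# Get Quandl data
aapl_table = quandl.get('WIKI/AAPL')
# .copy() avoids pandas' SettingWithCopyWarning when adding columns below
aapl = aapl_table.loc['2017-3', ['Open', 'Close']].copy()
# Take the log of the closing price, then difference it to get the log return
aapl['log_price'] = np.log(aapl.Close)
aapl['log_return'] = aapl['log_price'].diff()
print(aapl)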

               Open   Close  log_price  log_return
Date                                              
2017-03-01  137.890  139.79   4.940141         NaN
2017-03-02  140.000  138.96   4.934186   -0.005955
2017-03-03  138.780  139.78   4.940070    0.005884
2017-03-06  139.365  139.34   4.936917   -0.003153
2017-03-07  139.060  139.52   4.938208    0.001291
2017-03-08  138.950  139.00   4.934474   -0.003734
2017-03-09  138.740  138.68   4.932169   -0.002305
2017-03-10  139.250  139.14   4.935481    0.003311
2017-03-13  138.850  139.20   4.935912    0.000431
2017-03-14  139.300  138.99   4.934402   -0.001510
2017-03-15  139.410  140.46   4.944923    0.010521
2017-03-16  140.720  140.69   4.946559    0.001636
2017-03-17  141.000  139.99   4.941571   -0.004988
2017-03-20  140.400  141.46   4.952017    0.010446
2017-03-21  142.110  139.84   4.940499   -0.011518
2017-03-22  139.845  141.42   4.951734    0.011235
2017-03-23  141.260  140.92   4.948192   -0.003542
2017-03-24  141.500  140.64   4

In [None]:
# My example
#get quandl data
aapl_2_table = quandl.get('WIKI/AAPL')
aapl_2 = aapl_2_table.loc['2017-4',['Open','Close']]
#take log return
aapl_2['log_price'] = np.log(aapl_2.Close)
aapl_2['log_return'] = aapl_2['log_price'].diff()
print (aapl_2)

                Open     Close  log_price  log_return
Date                                                 
2017-04-03  143.7100  143.7000   4.967728         NaN
2017-04-04  143.2500  144.7700   4.975146    0.007418
2017-04-05  144.2200  144.0200   4.969952   -0.005194
2017-04-06  144.2900  143.6600   4.967449   -0.002503
2017-04-07  143.7300  143.3400   4.965219   -0.002230
2017-04-10  143.6000  143.1700   4.964033   -0.001187
2017-04-11  142.9400  141.6300   4.953218   -0.010815
2017-04-12  141.6000  141.8000   4.954418    0.001200
2017-04-13  141.9100  141.0500   4.949114   -0.005303
2017-04-17  141.4800  141.8300   4.954629    0.005515
2017-04-18  141.4100  141.2000   4.950177   -0.004452
2017-04-19  141.8800  140.6800   4.946488   -0.003690
2017-04-20  141.2200  142.4400   4.958921    0.012433
2017-04-21  142.4400  142.2700   4.957727   -0.001194
2017-04-24  143.5000  143.6400   4.967310    0.009584
2017-04-25  143.9100  144.5400   4.973556    0.006246
2017-04-26  144.4700  143.65

Here we calculated the daily logarithmic return of Apple stock. Given the daily logarithmic returns for the month, we can calculate the monthly return by simply summing them up.

In [None]:
month_return = aapl.log_return.sum()
print (month_return)

0.0273081001636184


In [None]:
# My example
month_return_2 = aapl_2.log_return.sum()
print (month_return_2)

-0.0003480076596806825


It may sound incorrect to sum up the daily returns, but we can prove that it's mathematically correct. Let's assume the stock prices over a period of time are represented by $[p_0,p_1,p_2,p_3,\ldots,p_t]$. Then

<center>$$r_{effective}\thickapprox r_{nominal}=\ln\frac{p_t}{p_0}=\ln\frac{p_t}{p_{t-1}}+\ln\frac{p_{t-1}}{p_{t-2}}+\cdots+\ln\frac{p_1}{p_0}$$</center>

According to the equation above, we can simply sum the logarithmic returns in a period to get the cumulative logarithmic return. This convenience is one of the reasons why logarithmic returns are used in quantitative finance.
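
The identity can be checked numerically on a short series. This is a sketch with hypothetical closing prices (not the Apple data above), using only the standard library.

```python
import math

# Hypothetical daily closing prices p_0 ... p_4
prices = [100.0, 101.5, 100.8, 102.3, 103.0]

# Daily log returns: ln(p_i / p_{i-1})
log_returns = [math.log(p1 / p0) for p0, p1 in zip(prices[:-1], prices[1:])]

# Summing the daily log returns ...
total_log_return = sum(log_returns)
# ... matches the log return computed from the first and last price
direct_log_return = math.log(prices[-1] / prices[0])

# Exponentiating recovers the exact (effective) cumulative return
cumulative_simple_return = math.exp(total_log_return) - 1

print(total_log_return, direct_log_return)
print(cumulative_simple_return, prices[-1] / prices[0] - 1)
```

The summed log return is slightly smaller than the simple cumulative return, in line with the approximation $r_{effective}\thickapprox r_{nominal}$ for small returns.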

###**Mean**

**Arithmetic Mean**

The mean is a measure of the central tendency of a data series. It captures a key characteristic of the distribution of the data. When we talk about the mean, by default we are referring to the arithmetic mean. It's defined as the sum of the values divided by the number of observations:

<center>$$\mu=\frac{\sum_{i=1}^n x_i}{n}$$</center>

where $(x_1,x_2,x_3,\ldots,x_n)$ are the observations and $n$ is the number of observations.

In Python we can use NumPy's mean function, np.mean(), to do the calculation:

In [None]:
print (np.mean(aapl.log_price))

4.94597446550658


In [None]:
# My example
print (np.mean(aapl_2.log_price))

4.962010799550092


**Geometric Mean**

The geometric mean is an average that is useful for data series of positive numbers that are better interpreted according to their product, such as growth rates. It's calculated by:

<center>$$\overline{x}=\sqrt[n]{x_1x_2x_3...x_n}$$</center>

Let's calculate the geometric mean of a series of single-period returns:

<center>$$1+\overline{r}=\sqrt[t]{\frac{p_t}{p_{t-1}}*\frac{p_{t-1}}{p_{t-2}}* ......*\frac{p_2}{p_1}*\frac{p_1}{p_0}}$$</center>

<center>$$(1+\overline{r})=\sqrt[t]{\frac{p_t}{p_{0}}}$$</center>

Raising both sides to the power $t$ gives a form we are already familiar with:

<center>$$(1+\overline{r})^{t}={\frac{p_t}{p_{0}}}$$</center>

This is why the geometric mean makes sense when applied to growth rates: compounding the geometric mean return over $t$ periods reproduces the cumulative return.
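
As a rough sketch of the derivation above, the snippet below computes the geometric mean return from per-period growth factors and from the shortcut $\sqrt[t]{p_t/p_0}$. The prices are hypothetical values chosen for illustration.

```python
# Hypothetical prices p_0 ... p_4 (t = 4 periods)
prices = [100.0, 102.0, 101.0, 105.0, 104.0]
t = len(prices) - 1

# Per-period growth factors p_i / p_{i-1}
growth_factors = [p1 / p0 for p0, p1 in zip(prices[:-1], prices[1:])]

# Geometric mean of the growth factors, minus 1
product = 1.0
for g in growth_factors:
    product *= g
geometric_mean_return = product ** (1 / t) - 1

# Shortcut: only the first and last prices matter
shortcut = (prices[-1] / prices[0]) ** (1 / t) - 1

print(geometric_mean_return, shortcut)
# Compounding the mean return over t periods recovers the cumulative return
print((1 + geometric_mean_return) ** t - 1)
```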

###**Variance and Standard Deviation**

**Variance**

**Variance** is a measure of dispersion. In finance, most of the time variance is a synonym for risk. The higher the variance of an asset price is, the higher risk the asset bears. Variance is usually represented by $σ^{2}$

<center>$$σ^{2}=\frac{\sum_{i=1}^n (x_i- \mu)^2}{n}$$</center>

In Python we can use np.var() to calculate it. With its default setting (ddof=0) it divides by $n$, matching the formula above:

In [None]:
print (np.var(aapl.log_price))

0.00014203280448152512


In [None]:
# My example
print (np.var(aapl_2.log_price))

7.032898268435797e-05


**Standard Deviation**

The most commonly used measure of dispersion in finance is standard deviation. It's usually represented by $σ$

<center>$$σ=\sqrt{σ^2}=\sqrt{\frac{\sum_{i=1}^n (x_i- \mu)^2}{n}}$$</center>

NumPy also provides us a method to calculate standard deviation.

In [None]:
print (np.std(aapl.log_price))

0.011917751653794651


In [None]:
# My example
print (np.std(aapl_2.log_price))

0.008386237695436373
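
The two formulas above can also be evaluated by hand. The sketch below does so on a small hypothetical series of log prices and cross-checks the result against the population variance and standard deviation from Python's standard `statistics` module, which divide by $n$ in the same way as the formulas in this section.

```python
import math
import statistics

# Hypothetical log prices (illustrative values only)
data = [4.94, 4.93, 4.95, 4.96, 4.92]
n = len(data)

# Arithmetic mean
mu = sum(data) / n

# Variance: average squared deviation from the mean (divide by n)
variance = sum((x - mu) ** 2 for x in data) / n

# Standard deviation: square root of the variance
std_dev = math.sqrt(variance)

print(variance, statistics.pvariance(data))
print(std_dev, statistics.pstdev(data))
```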


###**Summary**

In this chapter we introduced several types of rate of return, which can be a little tricky to calculate. Mean and standard deviation are also very important concepts when we conduct hypothesis tests or measure the risk associated with an asset. We will use these concepts intensively in later chapters.