# Key Takeaways

This notebook explains the differences between simple and log returns.
We will almost always use simple returns, and this notebook explains why.

***The key takeaways from this notebook are:***

1. Calculating simple and log returns
1. Simple returns are always appropriate
1. But log returns provide a more computationally efficient way to compound returns

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

In [None]:
plt.rcParams['figure.dpi'] = 150
np.set_printoptions(precision=4, suppress=True)
pd.options.display.float_format = '{:.4f}'.format

# Introduction

We will typically calculate *simple* returns as in finance 101:
$$R_{simple,t} = \frac{P_t + D_t - P_{t-1}}{P_{t-1}} = \frac{P_t + D_t}{P_{t-1}} - 1.$$
The simple return is the return that investors receive on invested dollars.
We can calculate simple returns from Yahoo Finance data with the `.pct_change()` method on the adjusted close column.
The adjusted close column is a reverse-engineered close price that incorporates dividends and splits and makes this simple return calculation correct.

However, we may see *log* returns elsewhere, which are the (natural) log of one plus simple returns:
\begin{align*}
    R_{log,t} &= \log(1 + R_{simple,t}) \\
    &= \log\left(1 + \frac{P_t - P_{t-1} + D_t}{P_{t-1}} \right) \\
    &= \log\left(1 +  \frac{P_t + D_t}{P_{t-1}} - 1 \right) \\
    &= \log\left(\frac{P_t + D_t}{P_{t-1}} \right) \\
    &= \log(P_t + D_t) - \log(P_{t-1})
\end{align*}
Therefore, we can calculate log returns as either the log of one plus simple returns or the difference of the logs of the adjusted close column.[^cc]

[^cc]: *Log* returns are also known as *continuously-compounded* returns.
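
As a minimal sketch of the two equivalent calculations above, the following code uses a short, made-up series of adjusted close prices (the values and the `adj_close` name are illustrative assumptions, not Yahoo Finance data). It computes simple returns with `.pct_change()`, then computes log returns both ways.

```python
import numpy as np
import pandas as pd

# Hypothetical adjusted close prices (illustrative values, not real data)
adj_close = pd.Series([100.0, 102.0, 99.96, 101.0], name='Adj Close')

# Simple returns: P_t / P_{t-1} - 1 (dividends are already in the adjusted close)
r_simple = adj_close.pct_change()

# Log returns, calculated two equivalent ways
r_log_from_simple = np.log(1 + r_simple)      # log of one plus simple returns
r_log_from_prices = np.log(adj_close).diff()  # difference of log prices

print(pd.DataFrame({
    'simple': r_simple,
    'log_from_simple': r_log_from_simple,
    'log_from_prices': r_log_from_prices,
}))
```

The first row is missing in every column because the first price has no prior price.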

Again, we will typically calculate (and apply) *simple* returns instead of *log* returns.
However, for completeness, this notebook explains the differences between simple and log returns and when each is appropriate.

# Simple and Log Returns are Similar for Small Returns

Simple and log returns are similar for small changes because $\log(1 + R) \approx R$ for small values of $R$.
Daily and monthly returns are typically small, so the difference between simple and log returns is small at these frequencies.
The following figure shows that $R_{simple,t} \approx R_{log,t}$ for small $R$s.

In [None]:
R = np.linspace(-0.75, 0.75, 100)

In [None]:
plt.plot(R, np.log(1 + R))
plt.plot([-1, 1], [-1, 1])
plt.xlabel('Simple Return')
plt.ylabel('Log Return')
plt.title('Log Versus Simple Returns')
plt.legend(['Actual Relation', 'If Log = Simple'])
plt.show()
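
To put numbers on the figure, here is a small sketch that tabulates the gap between simple and log returns on a handful of values we chose for illustration (the `R_grid` name and its values are assumptions). The gap is tiny near zero and grows as returns move away from zero.

```python
import numpy as np

# Illustrative simple returns, from large losses to large gains
R_grid = np.array([-0.50, -0.10, -0.01, 0.01, 0.10, 0.50])

# Matching log returns; np.log1p(x) computes log(1 + x)
log_R = np.log1p(R_grid)

# Gap between simple and log returns, which is never negative
gap = R_grid - log_R

print('simple     log     gap')
for r, l, g in zip(R_grid, log_R, gap):
    print(f'{r:+.2f}  {l:+.4f}  {g:.4f}')
```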

# Simple Return Advantage: Portfolio Calculations

We can only perform portfolio calculations with simple returns.
For a portfolio of $N$ assets held with portfolio weights $w_i$, the portfolio return $R_{p}$ is the weighted average of the returns of its assets: $$R_{p} = \sum_{i=1}^N w_i \cdot R_{i}.$$
For two stocks with portfolio weights of 50%, our portfolio return is: $$R_{portfolio} = 0.5 \cdot R_{\text{stock 1}} + 0.5 \cdot R_{\text{stock 2}}.$$
However, the same is not true with log returns because the log of sums is not the same as the sum of logs.
*So we cannot calculate a portfolio return as the weighted average of the log returns.*
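
A short numeric sketch makes the point concrete. The two stock returns below are hypothetical values we picked for illustration. The weighted average of simple returns gives the portfolio return, but the weighted average of log returns does not match the log of one plus the portfolio return.

```python
import numpy as np

w = np.array([0.5, 0.5])             # portfolio weights
R_stocks = np.array([0.10, -0.05])   # hypothetical one-period simple returns

# Correct: the portfolio simple return is the weighted average of simple returns
R_p = w @ R_stocks

# The portfolio log return, from the portfolio simple return
r_p_log = np.log(1 + R_p)

# Incorrect: the weighted average of the stocks' log returns
r_p_wrong = w @ np.log(1 + R_stocks)

print(f'Portfolio simple return:         {R_p:.6f}')
print(f'Portfolio log return:            {r_p_log:.6f}')
print(f'Weighted average of log returns: {r_p_wrong:.6f}')
```

The last two numbers differ, so averaging log returns across assets misstates the portfolio return.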

# Log Return Advantage: Log Returns are Additive

The advantage of log returns is that log returns are additive instead of multiplicative.
The additive property of log returns helps make code fast (and mathematical proofs easy) because we have to multiply (compound) returns when we consider more than one period.
We calculate the return from $t=0$ to $t=T$ as follows.
$$1 + R_{0, T} = (1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_T)$$
With log returns instead of simple returns, this calculation becomes additive instead of multiplicative.

First, take the (natural) log of both sides of the previous equation, then use the principle that the log of a product is the sum of the logs.
\begin{align*}
    \log(1 + R_{0, T}) &= \log((1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_T)) \\
    &= \log(1 + R_1) + \log(1 + R_2) + \cdots + \log(1 + R_T) \\
    &= \sum_{t=1}^T \log(1 + R_t)
\end{align*}
Second, take the exponential of both sides of the previous equation.
\begin{align*}
    e^{\log(1 + R_{0, T})} &= e^{\sum_{t=1}^T \log(1 + R_t)} \\
    1 + R_{0,T} &= e^{\sum_{t=1}^T \log(1 + R_t)} \\
    R_{0 ,T} &= e^{\sum_{t=1}^T \log(1 + R_t)} - 1
\end{align*}
So we can say that $R_{0,T}$, the return from $t=0$ to $t=T$, is the exponential of the sum of the log returns, minus one.
This description does not roll off the tongue, but its code executes fast (at least in pandas, because the `.sum()` method is optimized).
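
Before timing anything, here is a minimal check of the derivation on three hypothetical one-period returns (the values are ours, chosen for illustration). Compounding by multiplication and compounding by summing log returns give the same total return.

```python
import numpy as np

# Hypothetical simple returns for periods 1, 2, and 3
R_t = np.array([0.05, -0.02, 0.03])

# Multiplicative compounding of simple returns
R_via_product = np.prod(1 + R_t) - 1

# Additive compounding of log returns, then convert back to a simple return
R_via_logs = np.exp(np.sum(np.log(1 + R_t))) - 1

print(f'Via product of (1 + R): {R_via_product:.6f}')
print(f'Via sum of log returns: {R_via_logs:.6f}')
```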

The following code generates 10,000 random log returns.
The `np.random.standard_normal()` call generates normally distributed random numbers.
To generate the equivalent simple returns, we take the exponential of the log returns, then subtract one.
Note that the minimum simple return is greater than -100%, so these are well-behaved stock returns.

In [None]:
np.random.seed(42)
df = pd.DataFrame({'R': np.exp(np.random.standard_normal(10000)) - 1}, index=np.arange(10000, dtype='int'))
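
The claim about the minimum follows from the exponential being strictly positive, so $e^{x} - 1 > -1$ for any finite log return $x$. As a quick sketch, the code below regenerates the same random draws under the same seed, with the intermediate names `log_returns` and `simple_returns` introduced here for illustration, and checks the minimum directly.

```python
import numpy as np

np.random.seed(42)
log_returns = np.random.standard_normal(10000)  # same draws as the cell above
simple_returns = np.exp(log_returns) - 1        # convert log to simple returns

# The exponential is positive, so every simple return exceeds -100%
print(f'Minimum simple return: {simple_returns.min():.4f}')
print(f'All above -100%? {(simple_returns > -1).all()}')
```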

Now we can time the calculation of 12-observation rolling returns.
We use `.apply()` for the simple return version because `.rolling()` does not have a product method.
We find that `.rolling()` is slower with `.apply()` than with `.sum()` by a factor of roughly 2,000, although the exact ratio depends on hardware and library versions.

In [None]:
%%timeit
df['R12_via_simple'] = df['R'].rolling(12).apply(lambda x : np.prod(1 + x) - 1)

In [None]:
%%timeit
df['R12_via_log'] = np.exp((np.log(1 + df['R'])).rolling(12).sum()) - 1

In [None]:
np.allclose(df['R12_via_simple'], df['R12_via_log'], equal_nan=True)

We get the same answer with both approaches, but the log-return approach saves us time!

# Conclusion

Simple and log returns each have their advantages.
When periods are short and returns are small, simple and log returns are similar.
Log returns are a fast way to compound returns over time but are inappropriate for portfolio calculations.
If we compound via log returns, we will typically convert them back to simple returns, which are easier to interpret.