# Lecture Note: Yield to Maturity and Duration

## By Albert S. (Pete) Kyle

## BUFN400---University of Maryland 


$\require{\newcommand}$
$\require{\renewcommand}$
$\newcommand{\E}{\mathrm{E}}$
$\newcommand{\e}{\mathrm{e}}$
$\newcommand{\drm}{\mathrm{\, d}}$
$\newcommand{\var}{\mathrm{var}}$
$\newcommand{\stdev}{\mathrm{stdev}}$
$\newcommand{\sm}{ {\scriptstyle{\text{*}}}}$ 
$\renewcommand{\mm}{{\scriptsize @}}$
$\renewcommand{\t}{^{\mathsf{T}}}$
$\renewcommand{\comma}{\, , \,}$
$\renewcommand{\vec}[1]{\mathbf{#1}}$


In [1]:
import datetime
# timestamp = datetime.datetime.now().strftime('%Y-%m%d-%H:%M:%S.%f %p')
timestamp = datetime.datetime.now().strftime('%B %d, %Y, %H:%M:%S.%f %p')

try:
    from IPython.display import display, Markdown
    display(Markdown(rf"### This version: {timestamp}"))
except ImportError:
    print("This version: ", timestamp)


### This version: September 22, 2023, 23:45:24.530691 PM

In [2]:
import os
import pandas as pd
import numpy as np
import scipy
import scipy.special
import scipy.optimize
import matplotlib
import matplotlib.pyplot as plt
import nbconvert
import sys
import math
import string
import time
import timeit
import io
from pprint import pprint

pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 50)
pd.set_option('display.width', 100)

print('Python version ' + sys.version)
print('Pandas version ' + pd.__version__)
print('NumPy version ' + np.__version__)
print('SciPy version ' + scipy.__version__)
print('matplotlib version ' + matplotlib.__version__)

tstart = timeit.default_timer()


Python version 3.8.11 (default, Aug  6 2021, 09:57:55) [MSC v.1916 64 bit (AMD64)]
Pandas version 1.5.3
NumPy version 1.23.5
SciPy version 1.10.1
matplotlib version 3.7.1


## Yield to Maturity

Before the era of computers, when securities were valued using pencil-and-paper calculations, it was too much trouble for traders to value bonds by constructing a yield curve with potentially different interest rates determining discount factors for pricing cash flows at different dates.  Instead, the more intuitive concept of **yield to maturity** was developed.

The **yield to maturity** on a fixed income security is defined mathematically as the constant interest rate which makes the present value of the security equal to its current price. The yield to maturity is also called the **internal rate of return** on a fixed income security. From a static perspective, the constant interest rate can be interpreted as a **flat** yield curve which uses the same interest rate to calculate discount factors for cash flows at every future date. From a dynamic perspective, the constant interest rate can also be interpreted as the hypothetical assumption that future short-term interest rates are known to be the same as the yield to maturity; they do not change over time. This implies that when cash flows on the security are received, they can be reinvested at the same rate (the yield to maturity). Together, these assumptions imply that current interest rates are the same for all maturities, and future interest rates are known to be the same as the current rate for all maturities as well.

Defined this way, the yield to maturity is associated with two functions: (1) a function mapping the yield to maturity to the price of the security, and (2) an inverse function mapping the price of a security to its yield.



### Present Value and Net Present Value as functions of yield to maturity

The function mapping the yield to the hypothetical price (the present value) is straightforward.

Consider a sequence (vector) of known cash flows denoted $\vec{c} = (c[0], c[1], \ldots, c[K])$ at dates $\vec{t} = (t[0], t[1], \ldots, t[K])$. Here, the date $t[0]$ is typically assumed to be the current date and the dates $t[1]$, $\ldots$, $t[K]$ are assumed to be monotonically increasing and greater than $t[0]$, so they represent dates in the future. The initial cash payment $c[0]$ is typically negative and represents the price paid for a security. The future cash flows $c[1]$, $\ldots$, $c[K]$ are typically assumed to be positive and represent the fixed (known, nonrandom) cash flows obtained from investing in a security.

One complication is that these functions depend on the compounding convention used to quote the yield.  Typically, the compounding convention is chosen to match the timing of cash flows on a bond.  For example, with U.S. Treasury notes and bonds, the coupon interest is paid semi-annually, so a **bond-equivalent yield** $y_2$ with semi-annual compounding is used.  For mortgages, a monthly payment convention is used, so the yield might be written $y_{12}$ to denote monthly payments. 

Given a number of compounding periods $N$ per year (e.g. $N=2$ or $N=12$), an annual yield $y_N$, and current date $t[0]$, the value of the cash flows can be represented in two slightly different ways, which we call **present value** and **net present value**.

The term **present value** is shorthand for "present value of future cash flows" and therefore ignores the first cash flow $c[0]$, which occurs at the current date, not in the future. We can define a present value function corresponding to a given yield to maturity $y_N$ as

$$
PV(y_N ; \vec{c}, \vec{t}, N) = \sum_{k=1}^K \frac{c[k]}{(1 + \tfrac{y_N}{N})^{N \sm (t[k] - t[0])}}.
$$

Since the sum in this formula starts with $k=1$, it ignores the first cash flow $c[0]$.

If we include the first cash flow $c[0]$ (which is presumably negative) in the calculation, we call the result **net present value**:

$$
NPV(y_N ; \vec{c}, \vec{t}, N) = \sum_{k=0}^K \frac{c[k]}{(1 + \tfrac{y_N}{N})^{N \sm (t[k] - t[0])}}.
$$

Since the first cash flow is not discounted, we can write the net present value equivalently as

$$
NPV(y_N ; \vec{c}, \vec{t}, N) = c[0] + \sum_{k=1}^K \frac{c[k]}{(1 + \tfrac{y_N}{N})^{N \sm (t[k] - t[0])}}.
$$

The term **net present value** is shorthand for "present value of future cash flows with the initial cost netted out".
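
The two formulas above translate almost directly into code. The following is a minimal sketch, not part of the original notebook: the function names `present_value` and `net_present_value` are hypothetical, and the inputs are borrowed from the two-year note used as an example in these notes.

```python
import numpy as np

def present_value(y, c, t, N):
    """Present value of the future cash flows c[1:], ignoring c[0]."""
    c = np.asarray(c, dtype=np.float64)
    t = np.asarray(t, dtype=np.float64)
    exponents = N * (t[1:] - t[0])  # number of compounding periods to each date
    return np.sum(c[1:] / (1.0 + y / N) ** exponents)

def net_present_value(y, c, t, N):
    """Net present value: the undiscounted c[0] plus the present value."""
    return float(c[0]) + present_value(y, c, t, N)

# Illustrative inputs: a two-year 3.00 percent note bought at 99.98.
c = [-99.98, 1.50, 1.50, 1.50, 101.50]
t = [0.0, 0.5, 1.0, 1.5, 2.0]

pv = present_value(0.03, c, t, N=2)
npv = net_present_value(0.03, c, t, N=2)
print(f"{pv=:.6f}  {npv=:.6f}")
```

Because the dates enter only through $t[k] - t[0]$, the same sketch should also handle cash flows that are not evenly spaced.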


### PV and NPV are inner products

Notice that the present value function $PV(y_N ; \vec{c}, \vec{t}, N)$ and the net present value function $NPV(y_N ; \vec{c}, \vec{t}, N)$ are essentially implemented as inner products. Let $\vec{df}$ be a vector of discount factors defined by $\vec{df}[k] = (1 + y_N / N)^{-N \sm (t[k] - t[0])}$ for $k=0$, $\ldots$, $K$; when the cash flows are exactly one compounding period apart, this reduces to $\vec{df}[k] = (1 + y_N / N)^{-k}$. Then the present value and net present value are given by

$$
NPV(y_N ; \vec{c}, \vec{t}, N) = \vec{c} \mm \vec{df},
\qquad
PV(y_N ; \vec{c}, \vec{t}, N) = \vec{c}[1:] \mm \vec{df}[1:].
$$


### Zero NPV investments

In a typical application, an investor contemplating an investment opportunity uses a yield to maturity to calculate the security's present value, then compares this present value with its market price $p_0$. The market price defines the initial negative cash flow $c[0] := -p_0$, which is used as an input into the NPV formula.  If the net present value is positive, this means that the present value of the future cash flows is greater than the cost $p_0 = -c[0]$. In this case, the investment opportunity has **positive net present value**, and the investor will want to purchase the security for a price $p_0 = -c[0]$. If the net present value is negative, the investor will want to avoid purchasing the security and may in fact want to sell it if the security is held in the investor's portfolio. He may even want to **sell the security short** by borrowing it from the portfolio of another investor; this requires a **security lending transaction**, which requires paying something to borrow the security, so the borrowing cost must be compared to the NPV to determine whether a short sale is worthwhile.

If the NPV is zero, we have a **zero NPV investment opportunity** and often say that the security is **fairly priced** in the sense that the investor neither wants to buy nor sell the security at the price $p_0=-c[0]$.

If the initial cash flow is chosen so that $-c[0] = PV(y_N ; \vec{c}, \vec{t}, N)$, then the NPV of the resulting vector of cash flows $\vec{c}$ is zero:

$$
\begin{aligned}
NPV(y_N ; \vec{c}, \vec{t}, N) &= c[0] + \sum_{k=1}^K \frac{c[k]}{(1 + \tfrac{y_N}{N})^{N \sm (t[k] - t[0])}}\\
&= -PV(y_N ; \vec{c}, \vec{t}, N) + \sum_{k=1}^K \frac{c[k]}{(1 + \tfrac{y_N}{N})^{N \sm (t[k] - t[0])}}\\
&= 0.
\end{aligned}
$$



### Inverse relationship between present value and yield to maturity

Regardless of whether the initial undiscounted cash flow $c[0]$ is positive, negative, or zero, the present value function $PV(y_N ; \vec{c}, \vec{t}, N)$ and net present value function $NPV(y_N ; \vec{c}, \vec{t}, N)$ are monotonically decreasing in the yield to maturity, provided the future cash flows $c[1]$, $\ldots$, $c[K]$ are positive.  A higher yield to maturity corresponds to a lower price. There is an **inverse relationship** between yield to maturity and present value.  If an investor is willing to accept a lower yield to maturity, the investor will be willing to pay a higher price for a fixed income investment.
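
Because the NPV is monotonically decreasing in the yield when the future cash flows are positive, the inverse function (price to yield) can be computed with a bracketing root finder. The following is a sketch under stated assumptions, not the notebook's own method: it uses `scipy.optimize.brentq`, a hypothetical helper named `npv`, an arbitrary bracket of $-5$ percent to $100$ percent, and the two-year note from the example in these notes.

```python
import numpy as np
from scipy.optimize import brentq

def npv(y, c, t, N):
    """Net present value as an inner product of cash flows and discount factors."""
    c = np.asarray(c, dtype=np.float64)
    t = np.asarray(t, dtype=np.float64)
    df = (1.0 + y / N) ** (-N * (t - t[0]))  # df[0] = 1, so c[0] is undiscounted
    return c @ df

# Illustrative inputs: two-year 3.00 percent note priced at 99.98.
c = [-99.98, 1.50, 1.50, 1.50, 101.50]
t = [0.0, 0.5, 1.0, 1.5, 2.0]

# The bracket must contain a sign change of the NPV; these bounds are an assumption.
ytm = brentq(npv, -0.05, 1.00, args=(c, t, 2), xtol=1e-12)
print(f"yield to maturity = {100.0 * ytm:.4f} percent")
```

Since the note is priced slightly below par, the solved yield should come out slightly above the 3.00 percent coupon.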

 

## Example

Problem: Calculate the present values and net present values of a two-year note offered for sale at a price of 99.98 percent of par on its day of issuance. The two-year note has a 3.00 percent coupon. Use 11 different hypothetical bond-equivalent yields to maturity $\vec{y}_2[0] = 2.00$, $\vec{y}_2[1] = 2.20$, $\ldots$, $\vec{y}_2[10] = 4.00$ percent.

We will use the typical market convention that the **par value** (principal paid back at maturity) is 100, which means "100 percent of the par value" or "per 100 dollars of face value". The coupons are 3.00 percent of the par value per year, which corresponds to 1.50 percent of par value per half-year.  The yield to maturity is a fraction: 0.0200, 0.0220, $\ldots$, 0.0400, so the semi-annual rate corresponding to a yield of 4.00 percent per year is $0.0400 / 2 = 0.0200 = 2$ percent per half-year.

The key calculation is an inner product of a cash flow vector and a vector of discount factors.

Implementation details: The following example illustrates some generic programming principles:

1. Define assumptions with descriptive names. (Avoid hard-coding assumptions.)

2. Use built-in numpy functions to perform calculations efficiently. (Avoid writing explicit loops by hand in Python. NB: List comprehensions are Python loops, so I have used `np.linspace` to create arrays; `np.linspace` is typically better than the very similar function `np.arange`, which could also be used.)

3. Use pandas for display and manipulation of data. (In this problem, there is little data manipulation, so all the computational work is done with numpy. But pandas is used to illustrate display of information.)

4. Make code easy to update or modify. For example, it is sometimes convenient to be able to change the `dtype` easily. I have explicitly defined the dtype as `np.float64`. This is useful if you may switch to a different dtype, such as `np.float32`, for computational efficiency. In our example, this line is not really necessary since numpy tends to create arrays of type `np.float64` by default. By defining the dtype once, the line "`dtyp=np.float64`" makes it easy to change the dtype for the entire example. For finance calculations, accuracy requirements often make it better to use `np.float64` instead of `np.float32`. Statistical estimation is sometimes computationally more efficient using `np.float32`.


In [3]:
# Define assumptions

dtyp = np.float64

coupon = 3.00 # annual
par_value = 100.00 # percent of par
maturity = 2 #years (integer!)
Nfreq = 2 # bond-equivalent yield (integer!), 2 payments per year
price = 99.98

yield_min = 0.0200
yield_max = 0.0400
yield_num = 11
yield_inc = (yield_max - yield_min) / (yield_num - 1)

# Set up data as numpy arrays:

# ytm = np.arange(start=yield_min, stop=yield_max + 0.0001, step=yield_inc, dtype=dtyp)
ytm = np.linspace(start=yield_min, stop=yield_max, num=yield_num, endpoint=True, dtype=dtyp)

cf_num = maturity * Nfreq + 1
cf = np.full(shape=(cf_num,), fill_value=coupon / Nfreq, dtype=dtyp)
cf[0] = -price
cf[-1] = cf[-1] + par_value

t = np.linspace(start=0.00, stop=dtyp(maturity), num=cf_num, dtype=dtyp)

df = (1.00 / (1.00 + ytm / Nfreq)).reshape(yield_num, 1)**(Nfreq * t.reshape(1, cf_num))

# Perform calculations using numpy:

npv = df @ cf # net-present-value operator is linear ("@" means "matrix-vector product"!)
pv = -cf[0] + npv 

print(f"{coupon=}\n{par_value=}\n{maturity=}\n{Nfreq=}\n{price=}\n")
print(f"{t=}\n{cf=}\n{ytm=}\n{df=}\n")

dframe = pd.DataFrame({'yield' : ytm, 'PV' : pv, 'NPV' : npv})
dframe['yield_pct'] = dframe['yield'] * 100.00
display(dframe)


coupon=3.0
par_value=100.0
maturity=2
Nfreq=2
price=99.98

t=array([0. , 0.5, 1. , 1.5, 2. ])
cf=array([-99.98,   1.5 ,   1.5 ,   1.5 , 101.5 ])
ytm=array([0.02 , 0.022, 0.024, 0.026, 0.028, 0.03 , 0.032, 0.034, 0.036,
       0.038, 0.04 ])
df=array([[1.        , 0.99009901, 0.98029605, 0.97059015, 0.96098034],
       [1.        , 0.98911968, 0.97835775, 0.96771291, 0.95718388],
       [1.        , 0.98814229, 0.97642519, 0.96484703, 0.95340615],
       [1.        , 0.98716683, 0.97449835, 0.96199245, 0.94964704],
       [1.        , 0.98619329, 0.97257721, 0.95914913, 0.94590644],
       [1.        , 0.98522167, 0.97066175, 0.95631699, 0.94218423],
       [1.        , 0.98425197, 0.96875194, 0.953496  , 0.93848032],
       [1.        , 0.98328417, 0.96684776, 0.95068609, 0.93479459],
       [1.        , 0.98231827, 0.96494919, 0.94788722, 0.93112693],
       [1.        , 0.98135427, 0.9630562 , 0.94509931, 0.92747725],
       [1.        , 0.98039216, 0.96116878, 0.94232233, 0.92384543]])

|  | yield | PV | NPV | yield_pct |
|---|---|---|---|---|
| 0 | 0.02 | 101.950983 | 1.970983 | 2.0 |
| 1 | 0.022 | 101.55695 | 1.57695 | 2.2 |
| 2 | 0.024 | 101.164846 | 1.184846 | 2.4 |
| 3 | 0.026 | 100.774661 | 0.794661 | 2.6 |
| 4 | 0.028 | 100.386383 | 0.406383 | 2.8 |
| 5 | 0.03 | 100.0 | 0.02 | 3.0 |
| 6 | 0.032 | 99.615502 | -0.364498 | 3.2 |
| 7 | 0.034 | 99.232877 | -0.747123 | 3.4 |
| 8 | 0.036 | 98.852116 | -1.127884 | 3.6 |
| 9 | 0.038 | 98.473205 | -1.506795 | 3.8 |


## Problem for self-study

Study both the finance content and the Python/numpy syntax for the previous cell carefully to make sure you know all of the commands and concepts backwards and forwards.

1.  Why is it better to use the function `np.linspace` than the seemingly almost equivalent function `np.arange` (which is commented out)?  Relatedly, why is the scalar 0.0001 added to `yield_max` in the commented-out line?

2.  When the yield to maturity of the security is equal to its coupon, is the present value exactly 100.00, or is the value of 100.00 in the dataframe a "coincidence" which depends on something specific to this example?

3. Can you develop a one-sentence theory for why bond-equivalent yield `y2` is used for calculations involving bonds with semi-annual coupons?  What intuitive financial tradeoffs occur in the choice between using $r_\infty$ and using $r_2$ (or $r_{12}$)?


# Yield to maturity: hypothetical meaning

To examine what the yield-to-maturity concept means financially and economically, consider a hypothetical market in which market participants believe that short-term interest rates will be constant forever, and these beliefs are correct.

In this hypothetical market, the market price of a fixed income security should be given by the present value formula in the previous equation, with the yield to maturity equal to the known short rate. Why?  The short answer is that under the hypothetical assumptions we are making, fixed income securities are all **priced by arbitrage**. An **arbitrage opportunity** is a trading strategy which generates a positive cash flow with no initial investment (or, alternatively, generates positive cash flow today with no negative cash flows in the future). The phrase **priced by arbitrage** means that there is a price at which there is no arbitrage opportunity available for market participants. For all other prices, an arbitrage opportunity is available to market participants. This no-arbitrage price is the zero-NPV price. Why? At lower prices, investors would want to buy the security. At higher prices, investors would want to sell the security. At the no-arbitrage price, investors want to neither buy nor sell the security.

The longer answer explains how the arbitrage works: If the security is priced in the market at a level above its present value, a market participant could sell the security, then invest the proceeds in a hypothetical **money-market fund** (a fund which rolls over safe short-term securities and earns a return equal to the short-term interest rate). The investor can use the cash in the money-market fund to replicate the cash flows on the overvalued security by withdrawing from the fund an amount equal to each cash flow on each date a cash flow is paid. To replicate a cash flow of $c[k]$ on date $t[k]$, the investor needs to have deposited $\frac{c[k]}{(1 + \tfrac{y_N}{N})^{N \sm (t[k] - t[0])}}$ into the money-market fund at date $t[0]$ because this will allow the amount deposited to grow to exactly $c[k]$ over the period $t[k] - t[0]$, *assuming that the future short rate, which defines the reinvestment rate earned by the fund, is constant and equal to the yield to maturity on the fixed income security*.  Since the amount needed to replicate all of the cash flows is precisely the present value of the bond, the investor who sells the bond for more than its present value makes an arbitrage profit equal to the difference between the higher market value and the lower present value. If the investor only deposits the present value into the fund, while keeping the difference and spending it on personal consumption, he still has enough cash in the fund to replicate all of the cash flows on the security.
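
The replication argument can be checked numerically. The sketch below is only an illustration under assumed numbers: a hypothetical market price of 100.50 for a two-year 3.00 percent note, a constant short rate equal to a 3.00 percent bond-equivalent yield, and a fund that credits interest once per half-year. An explicit loop is used here on purpose, because it mirrors the period-by-period story.

```python
import numpy as np

y, N = 0.03, 2                                    # assumed constant short rate (bond-equivalent)
cash_flows = np.array([1.50, 1.50, 1.50, 101.50])  # future cash flows of the note
periods = np.arange(1, cash_flows.size + 1, dtype=np.float64)

present_value = np.sum(cash_flows / (1.0 + y / N) ** periods)
market_price = 100.50                             # hypothetical overvalued price
arbitrage_profit = market_price - present_value   # kept today by the seller

# Deposit only the present value, then pay out each cash flow as it comes due.
balance = present_value
for cf in cash_flows:
    balance = balance * (1.0 + y / N) - cf        # earn one period of interest, then withdraw

print(f"{present_value=:.6f}  {arbitrage_profit=:.6f}  final {balance=:.2e}")
```

The final balance should be zero up to rounding error, which is the sense in which depositing the present value exactly replicates the cash flows.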

This arbitrage strategy also works in reverse.  If the market price is less than the present value, the investor can buy the asset and pay for it by withdrawing cash from a money market fund, then deposit all of the cash flows from the asset bought back into the fund as they are earned.  Eventually, the asset bought will mature, at which point the investor will have more cash in his money-market fund than he would have had otherwise, *assuming that the future short rate, which defines the reinvestment rate earned by the fund, is constant and equal to the yield to maturity on the fixed income security*.

If there are no arbitrage opportunities in this hypothetical market, *all securities should have exactly the same yield to maturity*, and this yield to maturity makes the NPV equal to zero.

A related arbitrage strategy makes it easy to show that if the yield to maturity on a coupon bond is equal to its coupon (divided by its par value of 100), then the present value of the bond is equal to its par value.  Suppose the bond has coupon rate $c$ and one coupon payment left. If it is priced at par, then I can buy it for 100, and I obtain $100 + c / N$ at maturity. If the yield to maturity is $y = c / 100.00$, then investing 100.00 in a money market fund paying rate $y$ gives me the same amount, $100 \sm (1.00 + y / N) = 100.00 + c / N$. If I buy the bond at par two periods before maturity and sell it one period before maturity, then the same argument goes through because I have already shown that the price with one period to maturity is equal to the bond's par value.  Now I can go back one period at a time to show that the same result holds for every period:  Buying the bond at par, holding it for one period, then selling it at par converts the par value of 100 into $100 + c / N$; putting 100 into a money market fund and holding for one period pays off $100 + 100 \sm y / N$.  These two quantities are the same if $y = c / 100$.
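
The par-pricing result can be verified numerically in two ways: by direct discounting for several maturities, and by stepping backward one period at a time as in the argument above. This is a minimal sketch; the helper `bond_pv` and the chosen maturities are illustrative assumptions.

```python
import numpy as np

def bond_pv(coupon, y, years, N=2, par=100.0):
    """Present value of a coupon bond with `years * N` remaining payments."""
    k = np.arange(1, years * N + 1, dtype=np.float64)   # payment index 1..K
    cf = np.full(k.shape, coupon / N, dtype=np.float64)  # coupon each period
    cf[-1] += par                                        # principal at maturity
    return cf @ (1.0 + y / N) ** (-k)

# Direct check: yield equal to coupon / 100 for several maturities.
for years in (1, 2, 10, 30):
    print(years, bond_pv(3.00, 0.03, years))

# Backward induction: start at par at maturity and step back 60 periods.
price = 100.0
for _ in range(60):
    price = (price + 3.00 / 2) / (1.0 + 0.03 / 2)
print(price)
```

Each printed value should equal 100 up to floating-point rounding, regardless of the maturity.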


# Yield to maturity: practical meaning

Obviously, the hypothetical discussion in the paragraphs above does not describe the real financial world:

1. Different securities have different yields to maturity corresponding to their market prices.

2. The short-term interest rate is not constant over time. Instead, it fluctuates randomly (unpredictably) over time. Even when central banks try to keep the short rate constant for specific intervals of time, central banks do in practice change the short rate from time to time.

3. There may be arbitrage opportunities available in the market.  In other words, securities may be **mispriced**.  In a more **efficient capital market**, the mispricing should be very small.

If the yield curve happens to be approximately flat at a given point in time, then comparing yields to maturity on similar securities is a good way to look for mispriced securities.  A security with a relatively high yield to maturity is likely to be underpriced, and a security with a relatively low yield to maturity is likely to be overpriced.

Most of the time, the yield curve is not flat. In this case, we can compare the valuation based on a non-constant yield curve model with a valuation based on a yield-to-maturity assumption:

$$
PV(y_N ; \vec{c}, \vec{t}) = \sum_{k=1}^K \frac{c[k]}{\left( 1 + \tfrac{y_N}{N} \right)^{N \sm (t[k] - t[0])}},
$$

$$
PV^{model}(\pi ; \vec{c}, \vec{t}) = \sum_{k=1}^K \e^{-f_{yc}^{model}(\pi, m[k]) \sm m[k]} \sm c[k],
\qquad \text{where} \qquad
m[k] = t[k] - t[0].
$$

A comparison of these two formulas reveals two differences:

1. The yield in the yield-to-maturity formula is based on a quotation convention of compounding $N$ times per year, but the interest rate in the yield-curve model is based on a quotation convention of continuous compounding.  To provide an apples-to-apples comparison, this issue can be dealt with either by converting the yield-to-maturity quotation convention to continuous compounding or by converting the yield-curve quotation convention to compounding $N$ times per year.  This requires solving an equation like
$$
\left( 1 + \tfrac{y_N}{N} \right)^{N} = \e^{y_\infty \sm 1}
$$
for $y_N$ in terms of $y_\infty$ or vice versa.  We can think of this as a minor difference which is easy to deal with.

2.  In the yield-curve model, the rates used to discount cash flows at different dates are different because $f_{yc}^{model}(\pi, m[k])$ is a function of $m[k] = t[k] - t[0]$, but all of the rates in the various terms of the yield-to-maturity formula are the same value $y_N$.  This is a fundamental difference between a valuation based on a yield-curve model (with nonconstant term structure of interest rates) and a valuation based on yield to maturity (which "implies" a flat term structure of interest rates).
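
The conversion in the first item can be sketched as follows, requiring that both quotation conventions produce the same one-year growth factor, $\left( 1 + \tfrac{y_N}{N} \right)^{N} = \e^{y_\infty}$. The function names `to_continuous` and `from_continuous` are hypothetical; `np.log1p` and `np.expm1` are used so the conversion stays accurate for small rates.

```python
import numpy as np

def to_continuous(y_N, N):
    """Continuously compounded rate equivalent to y_N compounded N times per year."""
    return N * np.log1p(y_N / N)

def from_continuous(y_inf, N):
    """Rate compounded N times per year equivalent to continuously compounded y_inf."""
    return N * np.expm1(y_inf / N)

y_2 = 0.04
y_inf = to_continuous(y_2, N=2)
print(f"{y_2=:.6f}  {y_inf=:.6f}  round trip={from_continuous(y_inf, N=2):.6f}")
```

For positive rates the continuously compounded quote is slightly lower than the semi-annual quote, because more frequent compounding needs a lower quoted rate to reach the same growth factor.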

Given these differences, what is a useful intuitive financial interpretation of yield to maturity? I like to think of yield to maturity as a nonlinear way of "averaging together" the different interest rates on the yield curve to obtain one "summary interest rate" which gives the same valuation as would be obtained by discounting cash flows at different dates using different rates from the "correct" yield curve model. Of course, this averaging works differently for different securities; different securities will have different yields to maturity, even if their values correspond with those from a "correct" non-flat yield curve.  For example, if the yield curve is upward sloping, then longer maturity securities will tend to have higher yields to maturity than shorter maturity securities.
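
To make the "averaging" interpretation concrete, the sketch below prices two bonds off a made-up upward-sloping curve and then solves for each bond's yield to maturity. Everything here is an assumption for illustration: the linear curve `zero_rate`, the helper `curve_price_and_ytm`, the root-finding bracket, and the use of `scipy.optimize.brentq`.

```python
import numpy as np
from scipy.optimize import brentq

def zero_rate(m):
    """Hypothetical continuously compounded zero rate for maturity m (in years)."""
    return 0.02 + 0.002 * m

def curve_price_and_ytm(coupon, years, N=2, par=100.0):
    """Price a coupon bond off the curve, then solve for its bond-equivalent yield."""
    m = np.arange(1, years * N + 1, dtype=np.float64) / N  # payment dates in years
    cf = np.full(m.shape, coupon / N, dtype=np.float64)
    cf[-1] += par
    price = cf @ np.exp(-zero_rate(m) * m)                 # curve-based valuation

    def pricing_error(y):
        return cf @ (1.0 + y / N) ** (-N * m) - price      # flat-yield valuation minus price

    y_N = brentq(pricing_error, -0.05, 1.00, xtol=1e-12)
    return price, y_N

price_short, ytm_short = curve_price_and_ytm(3.00, years=2)
price_long, ytm_long = curve_price_and_ytm(3.00, years=10)
print(f"2-year:  price={price_short:.4f}  ytm={ytm_short:.5f}")
print(f"10-year: price={price_long:.4f}  ytm={ytm_long:.5f}")
```

With positive cash flows, each solved yield (converted to continuous compounding) must lie between the lowest and highest zero rates the bond's cash flows are discounted at, which is the sense in which it is an average.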

How do we use yield to maturity to spot mispriced securities? Having corrected for quotation conventions, we must realize that yield differences result both from different temporal patterns of cash flows and from mispricing. To identify mispricing, we need to adjust for the differences in yield which might be implied by different maturities.  It may be difficult to do these calculations in one's head, but this is indeed how such calculations were frequently done in the past!



# Annuities

An **annuity** is loosely defined as a sequence of payments made at regular time intervals, such as annually, semi-annually, quarterly, or monthly. This loose definition leaves flexibility concerning whether the number of payments is known in advance or random and whether the payments are constant, time varying according to a fixed, nonstochastic formula, or uncertain. The term **annuity** has many different definitions in practice.

For these notes, I will focus attention on the simplest annuities, which have a fixed number of cash flows of fixed nominal size made at constant intervals of time.

Annuities occur frequently in finance:

1.  The coupons on a U.S. Treasury note or bond are annuities, with payments twice per year. The number of payments on a note is a contractually defined constant.

2.  The payments on standard 30-year, fixed rate mortgages are annuities which pay both principal and interest over time. In this case, the number of payments made may not be precisely known in advance because the homeowner may **prepay** (and perhaps **refinance**) the mortgage.

3. Retirees sometimes convert a nest-egg in a retirement account into a monthly annuity which continues as long as the beneficiary is alive. Obviously, the number of payments made is not known in advance since the date of death is random.



# Perpetuities

An annuity which continues forever is called a **perpetuity**. Conceptually, fixed-coupon bonds with infinite maturity are perpetuities. There are historical examples of perpetuities in finance. For example, the British government issued perpetuities called **consols**; these were bonds which paid fixed coupons forever and had no mandatory maturity date. Nowadays, perpetuities are mostly used in hypothetical examples to derive annuity formulas (as we are about to do).

If the yield to maturity on a perpetuity with coupon rate $c$ per year is $y$, what is the present value of the perpetuity?

There is more than one way to obtain the correct answer.

1.  Use the result that the yield to maturity on a bond priced at par is equal to its coupon.  Now let us assume that this result extends to bonds with infinite maturity; this will be the case if the present value of payments beyond period $t$ goes to zero as $t \rightarrow \infty$ because the value of a bond with very, very long maturity will converge to the value of a bond with infinite maturity due to the very distant cash flows having a vanishing present value. Then a bond with coupon equal to $100.00 \sm y$ will have a present value of $100.00$. If we multiply the coupon by a factor of $c / (100.00 \sm y)$, so that the coupon becomes $c$, the present value will be multiplied by the same factor. We obtain the result that the present value of a perpetuity with coupon $c$ is $100.00 \sm c / (100.00 \sm y) = c / y$. This result depends on the quoted yield having a compounding convention corresponding to the number of coupon payments per year. For example, if the coupon is paid twice per year, then the value is $c / y_2$, where $y_2$ is the bond-equivalent yield corresponding to compounding twice per year. With continuous compounding, this result says

$$
\int_{m=0}^{\infty} c \sm \e^{-y \sm m} \mathrm{d}m = \frac{c}{y}.
$$

2. Use mathematical formulas for the sums of infinite series. For $0 < \alpha < 1$, the sum of the geometric series is given by

$$
\sum_{n=1}^{\infty} \alpha^n = \alpha + \alpha^2 + \alpha^3 + ... = \frac{\alpha}{1 - \alpha}
$$

This can be proved by beginning with the finite sum

$$
\sum_{k=1}^{K} \alpha^k \sm(1 - \alpha) = \alpha - \alpha^{K+1}.
$$

Notice that the equality holds because all the terms except the first and last cancel, then notice that the "leftover" quantity satisfies $\alpha^{K+1} \rightarrow 0$ as $K \rightarrow \infty$.

Compare this with the present value formula for a perpetuity corresponding to a yield of $y$:

$$
PV = \sum_{n=1}^{\infty} \frac{\frac{c}{N}}{\left( 1 + \frac{y}{N} \right)^n}.
$$

Now define $\alpha = \frac{1}{1 + y / N}$ and apply the geometric series formula to obtain

$$
PV = \frac{c}{N} \sm \frac{\frac{1}{1 + \frac{y}{N}}}{{1 - \frac{1}{1 + \frac{y}{N}}}} = \frac{c}{y}.
$$
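
As a sanity check on the perpetuity result, a long but finite sum of discounted coupons should come very close to $c / y$. The numbers below (a 3.00 coupon, a 4.00 percent yield, semi-annual payments, and a cutoff of 5000 payments) are arbitrary choices for illustration.

```python
import numpy as np

c, y, N = 3.00, 0.04, 2
alpha = 1.0 / (1.0 + y / N)                    # one-period discount factor

n = np.arange(1, 5001, dtype=np.float64)       # 5000 payments approximate "forever"
geometric_sum = np.sum(alpha ** n)
pv_truncated = (c / N) * geometric_sum

print(f"{geometric_sum=:.10f}  alpha/(1-alpha)={alpha / (1.0 - alpha):.10f}")
print(f"{pv_truncated=:.10f}  c/y={c / y:.10f}")
```

The omitted tail is proportional to $\alpha^{5000}$, which is far below double-precision resolution here, so the truncated sum and the closed form should agree to many digits.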


# Present value of an annuity

The present value of an annuity of $K$ equal payments can be obtained by taking the inner product of the constant vector of cash flows with the vector of discount factors $(1 / (1 + y_N/N))^k$. In the days of pencil-and-paper calculations, this was far too burdensome a calculation to be practical.  Fortunately, there are some simple mathematical shortcuts which make the calculation simpler.

Notice that an annuity is the difference between a perpetuity whose first payment is one period from now and a perpetuity whose first payment starts one period after the annuity ends. The present value of the perpetuity starting one period from today is $\frac{c}{y}$, and the present value of a perpetuity starting one period after the annuity ends is $\frac{1}{\left( 1 + y / N \right)^K} \sm \frac{c}{y}$, assuming the annuity makes $K$ payments. The difference is the desired formula

$$
\begin{aligned}
PV_{\text{annuity}}(y_N, c, K, N) 
&= \sum_{k=1}^{K} \frac{\frac{c}{N}}{\left( 1 + \frac{y_N}{N} \right)^k} \\
&= \left(1 - \frac{1}{\left( 1 + \frac{y_N}{N} \right)^K} \right) \sm \frac{c}{y_N} = \text{Discrete Annuity Formula}.
\end{aligned}
$$

The key point here is that the second line defines an **annuity formula** which is much easier to calculate than the first line.

Using continuous compounding, this formula becomes

$$
\int_{m=0}^{T} c \sm \e^{-y_\infty \sm m} \mathrm{d}m
= \left( 1 - \e^{-y_\infty \sm T} \right) \sm \frac{c}{y_\infty}
= \text{Continuous Annuity Formula}.
$$
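
Both annuity formulas can be checked against brute-force calculations. The sketch below compares the discrete formula with a direct sum and the continuous formula with numerical integration via `scipy.integrate.quad`. The parameter values are arbitrary illustrations, and the yields are kept well away from zero so that the naive closed forms are numerically safe.

```python
import numpy as np
from scipy.integrate import quad

# Discrete case: K payments of c / N each, yield y compounded N times per year.
c, y, N, K = 3.00, 0.04, 2, 20
k = np.arange(1, K + 1, dtype=np.float64)
pv_direct = np.sum((c / N) / (1.0 + y / N) ** k)
pv_formula = (1.0 - 1.0 / (1.0 + y / N) ** K) * c / y

# Continuous case: cash flow rate c per year for T years, discounted at y_inf.
y_inf, T = 0.04, 10.0
pv_integral, _ = quad(lambda m: c * np.exp(-y_inf * m), 0.0, T)
pv_cont = (1.0 - np.exp(-y_inf * T)) * c / y_inf

print(f"discrete:   {pv_direct=:.8f}  {pv_formula=:.8f}")
print(f"continuous: {pv_integral=:.8f}  {pv_cont=:.8f}")
```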

# Note on computation

What happens in the discrete and continuous annuity formulas above if $y_N$ or $y_\infty$ is exactly zero, close to zero, or even negative?  

If the yield to maturity $y_N$ or $y_\infty$ is exactly zero, the formulas imply division by zero, so the result seems mathematically meaningless.

If the yield to maturity $y_N$ or $y_\infty$ is negative, the present value of the *perpetuity* becomes infinite, so the annuity formula is obtained as the result of subtracting two infinite quantities, which is undefined. Fortunately, the annuity formulas happen to give the correct results when the yield to maturity is negative.

Computationally, when $y_\infty$ or $y_N$ is close to zero (whether positive or negative), there is a catastrophic loss of precision in calculating the numerator $1 - \e^{-y_\infty \sm T}$ because $\e^{-y_\infty \sm T} \rightarrow 1$, which leads to subtraction of numbers of approximately equal value. This is a big issue for numerical computations in any field of study, not just finance.  With **double precision** numbers of dtype `np.float64`, we obtain $1.00 - (1.00 + z) = 0.00$, not $1.00 - (1.00 + z) = -z$, when $|z|$ is sufficiently small.  Thus, for $y$ sufficiently close to zero (not just $y = 0$ exactly), the computed annuity formula has the ambiguity of $0/0$.


Here are examples illustrating failure of conventional calculations when $y$ is small.

Note that the function `np.expm1(x)`, which calculates $\e^x-1$, gives accurate results:

In [4]:
epsilon = np.array([10.0**(-n) for n in range(10, 20)])

df = pd.DataFrame({'epsilon' : epsilon, 
                  '-epsilon?' : 1.00 - (1.00 + epsilon),
                  'one?' : (1.00 - (1.00 + epsilon)) / epsilon, 
                 'exp(epsilon)-1 ~ epsilon?' : np.exp(epsilon) - 1.00,
                 'np.expm1(epsilon) = epsilon!' : np.expm1(epsilon)})

display(df) 

|  | epsilon | -epsilon? | one? | exp(epsilon)-1 ~ epsilon? | np.expm1(epsilon) = epsilon! |
|---|---|---|---|---|---|
| 0 | 1e-10 | -1e-10 | -1.0 | 1e-10 | 1e-10 |
| 1 | 1e-11 | -1e-11 | -1.0 | 1e-11 | 1e-11 |
| 2 | 1e-12 | -1.000089e-12 | -1.000089 | 1.000089e-12 | 1e-12 |
| 3 | 1e-13 | -9.992007e-14 | -0.999201 | 9.992007e-14 | 1e-13 |
| 4 | 1e-14 | -9.992007e-15 | -0.999201 | 9.992007e-15 | 1e-14 |
| 5 | 1e-15 | -1.110223e-15 | -1.110223 | 1.110223e-15 | 1e-15 |
| 6 | 1e-16 | 0.0 | 0.0 | 0.0 | 1e-16 |
| 7 | 1e-17 | 0.0 | 0.0 | 0.0 | 1e-17 |
| 8 | 1e-18 | 0.0 | 0.0 | 0.0 | 1e-18 |
| 9 | 9.999999999999999e-20 | 0.0 | 0.0 | 0.0 | 9.999999999999999e-20 |


## Optional problem for self-study

In the cell above: 

1. How small is `epsilon` when dramatic loss of precision occurs?

2. How small is `epsilon` when the nonzero result becomes numerically equal to zero?

3. Does `np.expm1` provide an accurate answer?

#### Advice

You might think that catastrophic loss of precision is unlikely to occur because the above dataframe indicates that problems arise only for very small values of `epsilon`.  My experience is that if a problem might theoretically occur, then it will probably happen sooner than you think.  If it does, it may be difficult to debug. Therefore, it is best to recognize the problem early and deal with it before it occurs.

For example, small values of epsilon are sometimes used when calculating numerical derivatives, and numerical derivatives are common in financial calculations.

Numpy has the functions `np.expm1` and `np.log1p` to deal with this. These functions are very useful in finance calculations. Scipy also has the function `scipy.special.exprel`, which is useful when dealing with continuous compounding.
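
As a small illustration of why these functions matter in finance, the sketch below converts a periodically compounded yield to its continuously compounded equivalent and back. The helper names `to_continuous` and `to_periodic` are hypothetical (they are not NumPy functions), and the test yields are arbitrary.

```python
import numpy as np

def to_continuous(yN, N):
    """Continuously compounded rate equivalent to yN compounded N times per year."""
    return N * np.log1p(yN / N)

def to_periodic(r, N):
    """Rate compounded N times per year equivalent to continuously compounded r."""
    return N * np.expm1(r / N)

yN = np.array([0.06, 1.0e-6, 1.0e-12, 1.0e-18, 0.0])
r = to_continuous(yN, N=12)
round_trip = to_periodic(r, N=12)

# Naive version for comparison: log(1 + x) loses precision when x is tiny.
r_naive = 12 * np.log(1.00 + yN / 12)

print(np.column_stack([yN, r, round_trip, r_naive]))
```

The round trip should reproduce `yN` to nearly full precision, while the naive calculation should return exactly zero for the smallest nonzero yield.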

## Optional: Numerically robust annuity formula

With periodic compounding, we can write the discrete annuity formula

$$
PV_{\text{annuity}}(y_N, c, K, N) = \left(1 - \frac{1}{\left( 1 + \frac{y_N}{N} \right)^K} \right) \sm \frac{c}{y_N}
$$

as

$$
PV_{\text{annuity}}(y_N, c, K, N) = \frac{1 - \e^{-K \sm \log(1 + y_N / N)}}{y_N} \sm c.
$$

Now we can use the functions `np.expm1` and `np.log1p` to obtain a more numerically robust implementation of the annuity formula when $y_N$ is small and use `np.where` to deal with the case $y_N=0.00$.


In [5]:
# New version:

# Annuity formula without adjustment for rounding error or division by zero:

def pv_annuity_bad(yN, c, K, N):
    """Returns present value of K payments of c dollars per payment, paid with frequency N (N=12 ~ monthly), 
    given annualized yield yN. Number of years of payments is K / N."""
    pv = (1.00 - 1.00 / (1.00 + yN / N)**K) * c / yN
    return pv

# Annuity formula using np.expm1 and np.log1p functions should give same result:
# This version should be more accurate.
# This formula also has adjustments to avoid division by zero.

def pv_annuity_robust(yN, c, K, N):
    """Returns present value of K payments of c dollars per payment, paid with frequency N (N=12 ~ monthly), 
    given annualized yield yN. Number of years of payments is K / N.
    The algorithm replaces values of yN with magnitude below eps by eps to avoid division by zero
    and also uses np.log1p and np.expm1 to make calculations more numerically stable for small yN."""
    
    eps = 1.0e-50
    yNx = np.where(np.abs(yN) <= eps, eps, yN)
    z = K * np.log1p(yNx / N)
    res =  -np.expm1(-z) * c / yNx
    return res

c = 1.00
K = 360
N = 12
yN = np.array([10.00**(-n) for n in (list(range(1, 20)) + [30, 40, 50])] + [0.00])  # Division by zero!
#yN = np.array([10.00**(-n) for n in range(6, 50)])
#yN = np.linspace(start=-0.50000, stop = 0.50000, num=41, endpoint=True)

pv_bad = pv_annuity_bad(yN, c, K, N)
pv_robust = pv_annuity_robust(yN, c, K, N)

df = pd.DataFrame({'yN' : yN, 'pv_bad' : pv_bad,
                   'pv_robust' : pv_robust,
                   'ratio_minus_one' : (pv_bad / pv_robust - 1.00)
                  })

display(df)

%timeit -r 7 -n 10000 pv0 = pv_annuity_bad(yN, c, K, N)
%timeit -r 7 -n 10000 pv1 = pv_annuity_robust(yN, c, K, N)





Unnamed: 0,yN,pv_bad,pv_robust,ratio_minus_one
0,0.1,9.495902,9.495902,-5.551115e-16
1,0.01,25.908922,25.908922,-9.436896e-14
2,0.001,29.553253,29.553253,1.209033e-12
3,0.0001,29.95492,29.95492,6.549872e-12
4,1e-05,29.995488,29.995488,-1.268173e-10
5,1e-06,29.999549,29.999549,1.35314e-10
6,1e-07,29.999955,29.999955,2.80231e-09
7,1e-08,29.999998,29.999995,8.253678e-08
8,1e-09,30.000002,30.0,8.297907e-08
9,1e-10,30.000002,30.0,8.424454e-08




17.7 µs ± 587 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
20.1 µs ± 409 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


## Optional problem for self-study

In the above cell:

1. For what values of `yN` are the errors in the functions `pv_annuity_bad` and `pv_annuity_robust` the greatest?

2. (There is no "right" or "wrong" answer to this question.) As a practical matter, what is the more important numerical issue: (A) avoiding rounding error, or (B) avoiding division by zero?

Comment:  When significant loss of precision occurs as a result of subtracting quantities of nearly equal value, results of calculations can appear to be incorrect. Finding the "error" which led to the incorrect result can be very difficult if you are not specifically looking for rounding error.


## Optional: Numerically accurate annuity formula with continuous compounding

A numerically accurate annuity formula with continuous compounding can be implemented using the function `scipy.special.exprel`, which calculates $(\e^z - 1) / z$ accurately for small $z$ and even avoids division by zero when $z = 0$.

In [6]:
def pv_continuous_robust(y, c, T):
    """Present value of an annuity of c dollars per year paid continuously for T years at yield y."""
    res = scipy.special.exprel(-y * T) * (c * T)
    return res

c = 10.00
T = 30
y = np.array([0.00, 1e-50, 0.000001, 0.06, 1.00]) 

pv = pv_continuous_robust(y, c, T)

df = pd.DataFrame({'y' : y, 'pv' : pv})

display(df)

%timeit -r 7 -n 10000 pv = pv_continuous_robust(y, c, T)


Unnamed: 0,y,pv
0,0.0,300.0
1,9.999999999999999e-51,300.0
2,1e-06,299.9955
3,0.06,139.116852
4,1.0,10.0


6.07 µs ± 60.4 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


## Optional problem for self-study

1. Does the function `scipy.special.exprel(z)` give the correct answer when $z = 0$? 

2. Does there appear to be a problem with rounding error?

## Calculating annuity payment from yield

Sometimes we are given the present value of an annuity and need to calculate the annuity payment corresponding to a given yield. It is easy to solve for $c$ in terms of the present value to obtain

$$
c = \frac{PV \sm y_N}{1 - \frac{1}{\left( 1 + \frac{y_N}{N} \right)^K}}.
$$

Note that $c$ is the annual payment; the payment each period is $c / N$.

## Example

You have a retirement nest egg of one million dollars. You buy an annuity which pays a fixed dollar amount per month for 20 years.  If you die before the 20-year period ends, your heirs continue to receive the annuity's monthly payments.

1.  Calculate the amount of your monthly payment if an annual yield to maturity of $r_{12}= 3.00$ percent is used to value the annuity.

2.  What is the present value of the remaining annuity payments after 10 years using the same yield to maturity (calculated as of 10 years from now)?

The solutions are *cmonthly* and *pv_left* in the next cell:

In [7]:
pv = 10**6
r12 = 0.0300 # annual rate to be compounded monthly
N = 12  # monthly compounding
years = 20
years_left = 10

cannual = pv * r12 / (1.00 - 1.00 / (1.00 + r12 / N)**(years * N))
cmonthly = cannual / 12
pv_left = (cannual / r12) * (1.00 - 1.00 / (1.00 + r12 / N)**(years_left * N))

print(f"{pv=}\n{r12=}\n{N=}\n{years=}\n")
print(f"{cannual=}\n{cmonthly=}\n{pv_left=}")



pv=1000000
r12=0.03
N=12
years=20

cannual=66551.71174247048
cmonthly=5545.975978539206
pv_left=574350.9948957138
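
The same payment calculation can be wrapped in a function which uses `np.expm1` and `np.log1p`, as discussed earlier. This is only a sketch: the function name `annuity_payment` is hypothetical, and it assumes a nonzero yield.

```python
import numpy as np

def annuity_payment(pv, yN, K, N):
    """Annual payment rate c such that K payments of c / N, valued at yield yN
    compounded N times per year, have present value pv.
    Assumes yN != 0 (when yN == 0 the annual payment is simply pv * N / K)."""
    z = K * np.log1p(yN / N)
    return pv * yN / (-np.expm1(-z))

# Same inputs as the retirement example: one million dollars, 3 percent, 240 months.
cannual = annuity_payment(pv=1.0e6, yN=0.03, K=240, N=12)
cmonthly = cannual / 12
print(f"{cannual=:.2f}, {cmonthly=:.2f}")
```

The result should agree with `cannual` and `cmonthly` from the cell above up to rounding error.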


## Problem for self-study

Consider a 30-year fixed-rate mortgage for 400,000 dollars with mortgage (coupon) rate $c=6.00$ percent per year (compounded monthly). 

1. Calculate the monthly payment on the mortgage (i.e., annuity value based on the assumption that the mortgage is priced at par).

2.  Create a dataframe with columns giving the monthly payment (a constant), the amount of principal owed after the monthly payment is made, the amount of each month's payment which goes to paying down principal, and the amount of each month's payment which is interest.

3. Plot the principal amount owed as a function of the number of payments made.

Hints:

1. The amount of each month's payment which goes toward paying down principal is equal to the change in principal outstanding from month to month.

2. The amount owed after the last payment should be exactly zero (up to rounding error)!  The amount owed after the first month's payment is slightly less than 400,000 dollars because some of the first month's payment went to principal.

3. The division of payments between principal and interest can be calculated in more than one way: (A) Calculate the present value of the remaining payments each month, then see how this present value falls over time. (B) Start with 400,000 owed, calculate the interest owed for the first month, then credit the rest of the first month's payment to principal, which reduces the principal.  Now repeat this over and over each month based on the reduced principal each month. 


# Solving for yield to maturity as a function of asset price

While solving for asset prices (present value) as a function of yield to maturity is a straightforward closed-form calculation, using either an inner product with discount factors or an annuity formula, solving for yield to maturity as a function of asset price involves calculating an inverse function for which there is generally no closed-form solution.

Computationally, the difficulty of calculating the inverse value is simplified when all the future cash flows on the asset are positive because this makes the present value a monotonically decreasing function of the yield to maturity. (The first cash flow can be negative.)  This guarantees that a unique yield to maturity exists and also makes it easier to guess whether an estimated yield needs to be revised upward or downward.

The Python package Scipy has functions for solving equations of one variable. These functions can be used to calculate yield to maturity as a function of present value.

Let's create a general function for solving for the yield to maturity given a price for some cash flows:


First, define a function which calculates the cash flows on a coupon bond which pays interest only until the last period, at which the principal value is paid back. The cash flow at the initial period (current date) is a negative value equal to the price of the bond in the market:

In [8]:
def f_bond_cash_flows_and_dates(price, coupon, maturity, nfreq=2, par_value=100.00):
    """
    price = price of bond
    coupon = bond coupon paid on par value (e.g., coupon=4.00 if 4 percent coupon on par_value=100)
    maturity = date of principal payment
    nfreq = compounding convention (e.g., nfreq=12 for monthly payments), should be integer
    par_value = principal paid back in final payment
    
    returns:
    cf = np.array of cash flows, including negative cash flow of -price at t[0]
    t = np.array of dates when cash flows paid, t[0] = current date.
    """
    number_of_time_periods = maturity * nfreq + 1
    interest_payment = coupon / nfreq
    t = np.linspace(start=0.00, stop=float(maturity), num=number_of_time_periods, dtype=np.float64)
    cf = np.full(shape=(number_of_time_periods,), fill_value=interest_payment, dtype=np.float64)
    cf[0] = -price
    cf[-1] = cf[-1] + par_value
    return cf, t

Use this function to define cash flows and dates:

In [9]:
price = 100.00
coupon = 4.00 # percent of par_value per year
maturity = 2 # years
num_payments_per_year = 2  # bond equivalent yield
par_value = 100.00

cf, t = f_bond_cash_flows_and_dates(price=price, coupon=coupon, maturity=maturity, 
                                    nfreq=num_payments_per_year, par_value=par_value)

print(f"{cf=}\n{t=}")

cf=array([-100.,    2.,    2.,    2.,  102.])
t=array([0. , 0.5, 1. , 1.5, 2. ])


Now define a function to calculate the NPV of the cash flows given a yield to maturity:

In [10]:
# Function to calculate the net present value of cash flows 
# from yield to maturity and present value

def fnpvy(ytm, cf, t, nfreq=None):
    """
    ytm = yield to maturity (guessed at each iteration), e.g., 0.0300 for 3 percent annual yield
    cf = vector (np.array) of cash flows, including -price at t[0]
    t = vector (np.array) of dates when cash flows are paid
    nfreq = compounding convention used for yield (e.g., monthly implies nfreq=12)
    
    npv = net present value calculated from yield to maturity ytm
    """
    if nfreq is None or nfreq == np.inf:
        # continuous compounding
        df = np.exp(-ytm * (t - t[0]))
    elif isinstance(nfreq, (int, np.integer)) and nfreq > 0:
        # Two equivalent methods for calculating discount factors:
        df = 1.00 / (1.00 + ytm / nfreq)**((t - t[0]) * nfreq)
        #df = np.exp(-((t - t[0]) * nfreq) * np.log1p(ytm / nfreq))
    else:
        assert False, "nfreq should be None, np.inf, or a positive integer!"
    npv = df @ cf
    return npv

Test this function:

In [11]:
# Test the function: NPV should be zero when price=100.00 and yield = coupon / 100:

y = coupon / 100.00
res = fnpvy(ytm=y, cf=cf, t=t, nfreq=num_payments_per_year) # result should be 0.00!

print(f"{y=}, {res=} should equal 0.00")

y=0.04, res=-8.881784197001252e-16 should equal 0.00


Now test two different root-finding algorithms:

In [12]:
# Test two different root-finding algorithms:

args = (cf, t, num_payments_per_year)

rootres = scipy.optimize.root_scalar(fnpvy, args, method=None, bracket=None,
                                    fprime=None, fprime2=None, x0=0.01, x1=0.10,
                                    xtol=1.0e-10, rtol=1.0e-10, maxiter=100, options=None)

#print(f"\nResults of root_scalar:\n{rootres=}")

(yb, rres) = scipy.optimize.brentq(fnpvy, 0.01, 0.10, args, 
                                   xtol=1.0e-10, rtol=1.0e-10, maxiter=100, 
                                   full_output=True, disp=False)

print(f"\nResults of brentq:\n{yb=}\n{rres=}")    



Results of brentq:
yb=0.040000000000881275
rres=      converged: True
           flag: 'converged'
 function_calls: 7
     iterations: 6
           root: 0.040000000000881275


Define a function to calculate yield to maturity from security price:

In [13]:
def calculate_yields_from_prices(prices, cf, t, nfreq):

    yas = []
    ybs = []
    ias = []
    ibs = []
    for pv in prices:
        #nf = np.inf
        cf[0] = -pv  # note: this modifies the caller's cf array in place
        args = (cf, t, nfreq)

        rootres = scipy.optimize.root_scalar(fnpvy, args, method=None, bracket=None,
                                        fprime=None, fprime2=None, x0=0.00, x1=0.10,
                                        xtol=1.0e-10, rtol=1.0e-10, maxiter=100, options=None)

        try:
            (yb, rres) = scipy.optimize.brentq(fnpvy, a=-0.20, b=0.20, args=args, 
                                       xtol=1.0e-10, rtol=1.0e-10, maxiter=100, 
                                       full_output=True, disp=False)
        except ValueError:
            print("You may need to decrease a and increase b in scipy.optimize.brentq to get better bracketing interval")
            raise
            
        yas.append(rootres.root)
        ybs.append(yb)
        ias.append(rootres.iterations)
        ibs.append(rres.iterations)

    df = pd.DataFrame({'prices' : prices, 'ytm_root_scalar' : yas, 'ytm_brent' : ybs,
                       'iterations_root_scalar' : ias, 'iterations_brent' : ibs})
    df['diff'] = df['ytm_brent'] - df['ytm_root_scalar']
    return df


Use this function to calculate the yields to maturity corresponding to a list of different prices:

In [14]:
prices = [90.00, 95.00, 100.00, 105.00, 110.00]
df = calculate_yields_from_prices(prices, cf, t, nfreq=num_payments_per_year)    
display(df)    

Unnamed: 0,prices,ytm_root_scalar,ytm_brent,iterations_root_scalar,iterations_brent,diff
0,90.0,0.09615,0.09615,5,7,-2.320366e-14
1,95.0,0.067133,0.067133,6,7,-1.515871e-13
2,100.0,0.04,0.04,6,7,-5.035763e-13
3,105.0,0.014544,0.014544,5,7,-1.064622e-12
4,110.0,-0.009413,-0.009413,5,7,-1.595675e-12


## Points about above example

Notice the following about the example in the previous cells:

1. To set up an objective function to solve for yield, the yield `ytm` must be the first argument to the function.

2. The solver finds a value of `ytm` which makes the function `fnpvy` equal to zero.  Therefore the function `fnpvy` is defined so that the NPV is zero when the correct value of `ytm` is used. When the solver finds the value of `ytm` which makes the net present value `fnpvy(ytm, cf, t, nfreq)` equal to zero, it is finding the value of `ytm` which gives the future cash flows a present value equal to the initial price `-cf[0]`.

3. The example illustrates two solvers. The function `scipy.optimize.root_scalar` will pick an algorithm automatically, depending on which arguments are provided. The function `scipy.optimize.brentq` requires an initial pair of guesses which bracket the solution. If the pair of initial guesses does not bracket the solution (i.e., does not give one positive and one negative function value, since the desired function value is zero), a `ValueError` is raised. (You can verify this by changing the bracketing interval to make it very small.)

4. I have built up the columns of a dataframe by appending items to lists. In practice, this technique may work better (and be more computationally efficient) than first defining an empty dataframe with correct size, then filling in the values later.



## Notes on root finding algorithms

There are many algorithms for finding the roots of scalar functions.  Of course, the function needs to be monotonic to guarantee that there is at most one root.  If so, then an initial guess which **brackets** the root is guaranteed to work (theoretically), assuming that the function is continuous. Given that a solution makes the value of the function zero, a pair of guesses brackets the root if one guess makes the function positive and the other makes it negative.

The algorithms work by improving guesses iteratively, eventually converging to a solution, typically by improving a **bracketing interval**.

Here is a quick summary of how different scalar root-finding algorithms work:

1. The **bisection method** starts with two guesses which bracket the solution, then selects the next guess as the midpoint, obtains a smaller bracketing interval, and continues until the two guesses which bracket the solution are equal up to a given tolerance.

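2. The **secant method** pretends that the function is the linear function defined by its values at two guesses, solves the linear function for the value which makes it zero, then uses that value as an improved guess. When the two guesses are always chosen to bracket the root, so that each step produces a smaller bracketing interval, the method is called **false position** (*regula falsi*).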

3. **Newton's method** assumes that the function is linear at a guessed value, with a slope inferred from the (perhaps numerical) derivative at the guessed value, then solves the linear function for the theoretical solution. This method converges very fast when the initial guess is close to being correct; it has **quadratic convergence**, which roughly doubles the number of significant digits in the solution on each iteration. If the guess is not close enough to the solution, one iteration of Newton's method might produce a result worse than the guess itself.

4. Other methods use both first and second derivatives at a guessed point.

5. One can also solve equations by using an **optimizer** to minimize the value of, say, the squared error from a guessed solution.  Using an optimizer is typically not the most numerically efficient way to solve scalar equations.

There is more information about how optimizers and root finding algorithms work in the documentation for `scipy.optimize`. I recommend reading the Scipy User's Guide first, then the documentation in the Scipy API Reference.

In my personal experience, Scipy root-finding algorithms generally work well most of the time, especially when a tight bracketing interval is known, but there are occasionally problems associated with (A) the initial bracketing interval not actually bracketing the solution or (B) the function itself not being defined for all real numbers. These problems, even if they arise rarely, can be a nuisance to deal with.
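
To make the summary above concrete, here is a minimal sketch of the bisection method and Newton's method applied to the two-year, 4 percent coupon bond from the earlier cells, priced at 95. The functions `bisection` and `newton` are simplified illustrations written for this note, not the algorithms Scipy actually uses, and continuous compounding is assumed to keep the derivative simple.

```python
import numpy as np

# Cash flows of the two-year, 4 percent semiannual coupon bond, priced at 95.
cf = np.array([-95.0, 2.0, 2.0, 2.0, 102.0])
t = np.array([0.0, 0.5, 1.0, 1.5, 2.0])

def npv(y):
    """Net present value with continuously compounded yield y."""
    return cf @ np.exp(-y * t)

def npv_prime(y):
    """Analytical derivative of npv with respect to y."""
    return -(cf * t) @ np.exp(-y * t)

def bisection(f, a, b, tol=1.0e-12, maxiter=200):
    """Halve the bracketing interval [a, b] until it is shorter than tol."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("a and b do not bracket a root")
    for i in range(maxiter):
        mid = 0.5 * (a + b)
        fm = f(mid)
        if fa * fm <= 0:      # root lies in [a, mid]
            b, fb = mid, fm
        else:                 # root lies in [mid, b]
            a, fa = mid, fm
        if b - a < tol:
            break
    return 0.5 * (a + b), i + 1

def newton(f, fprime, x0, tol=1.0e-12, maxiter=50):
    """Follow the tangent line to zero until the step is smaller than tol."""
    x = x0
    for i in range(maxiter):
        step = f(x) / fprime(x)
        x = x - step
        if abs(step) < tol:
            break
    return x, i + 1

y_bis, n_bis = bisection(npv, -0.20, 0.20)
y_newt, n_newt = newton(npv, npv_prime, x0=0.04)
print(f"{y_bis=:.10f} after {n_bis} iterations")
print(f"{y_newt=:.10f} after {n_newt} iterations")
```

Both methods should agree on the yield, with Newton's method needing far fewer iterations because the initial guess is already close. Converting the continuously compounded result with `2 * np.expm1(y / 2)` should recover the semiannually compounded yield reported for a price of 95 in the table above.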

# Duration

The **duration** of a fixed income security is defined as its **present-value-weighted average time to maturity**. Duration is expressed as an interval of time.  The duration of a **pure discount security** (**zero coupon bond**) is exactly the maturity of the security. For example, the duration of a 91-day Treasury bill is 91 days.  The duration of a promised cash flow five years from now is exactly five years.  The duration of a coupon bond is less than its maturity because some of its payments (the coupons) occur before the maturity date.

In practice, there are two distinct ways of defining **duration**: a theoretically correct way based on a "correct" yield curve and an approximation based on yield to maturity.

If we think of yield to maturity as an implied average interest rate on a security with cash flows at potentially different dates, then duration is an implied average maturity of the cash flows, which occur at different dates.


## Exact duration = Theoretically correct Macaulay duration 

The theoretically correct definition is often called **Macaulay duration**, but I will call it **exact duration**, **theoretical duration**, **theoretically correct duration** or **theoretically correct Macaulay duration**. Suppose we are given a vector of cash flows $\vec{c} := (\vec{c}[1], \ldots, \vec{c}[J])$, a vector of time intervals $\vec{m} := (\vec{m}[1], \ldots, \vec{m}[J])$ defining when the cash flows are paid, and a "correct" continuously compounded yield curve function $r_\infty(t_0, m)$. The yield curve defines a vector of interest rates $\vec{r} := (\vec{r}[1], \ldots, \vec{r}[J])$ for each cash flow date: $\vec{r}[j] = r_\infty(t_0, \vec{m}[j])$. Since the yield curve may be upward sloping, downward sloping, hump shaped, etc., potentially different interest rates are used to calculate a discounted value for each cash flow.

The **exact duration** of the fixed income security is defined by the formula

$$
\text{Exact Duration} = D_{\text{exact}}(\vec{r}, \vec{c}, \vec{m})
:= \frac{\sum_{j=1}^{J} \vec{m}[j] \sm \vec{c}[j] \sm \e^{-\vec{r}[j] \sm \vec{m}[j]}}{\sum_{j=1}^{J} \vec{c}[j] \sm \e^{-\vec{r}[j] \sm \vec{m}[j]}}.
$$

We can see that this is the present-value-weighted average time interval to maturity by noticing that the present value of cash flow $j$ is $\vec{c}[j] \sm \e^{-\vec{r}[j] \sm \vec{m}[j]}$. Now write the duration as a weighted sum of the "maturities" $\vec{m}[j]$, where the vector of weights $\vec{w} := (\vec{w}[1], \ldots, \vec{w}[J])$ are the fractions of the total present value corresponding to the various cash flows at date $j$: 

$$
\text{Exact Duration} = D_{\text{exact}}(\vec{r}, \vec{c}, \vec{m})
:= \sum_{j=1}^{J} \vec{w}[j] \sm \vec{m}[j],
$$

where

$$
\vec{w}[j] = \frac{\vec{c}[j] \sm \e^{-\vec{r}[j] \sm \vec{m}[j]}}{\sum_{j=1}^{J} \vec{c}[j] \sm \e^{-\vec{r}[j] \sm \vec{m}[j]}},
\quad 
\sum_{j=1}^{J} \vec{w}[j] = 1.
$$

Exact duration cannot be calculated from the price of the security because information about the shape of the entire yield curve is needed.
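
A minimal sketch of the exact duration formula follows. The function name `exact_duration` is hypothetical, and the upward-sloping yield curve is made up purely for illustration.

```python
import numpy as np

def exact_duration(r, c, m):
    """Present-value-weighted average time to payment.
    r = continuously compounded zero rates, one for each cash flow date
    c = cash flows
    m = times to payment in years"""
    v = c * np.exp(-r * m)   # present value of each cash flow
    w = v / v.sum()          # weights, which sum to one
    return w @ m

# Two-year bond with a 4 percent coupon paid semiannually.
c = np.array([2.0, 2.0, 2.0, 102.0])
m = np.array([0.5, 1.0, 1.5, 2.0])

# Hypothetical upward-sloping yield curve (made-up numbers).
r = np.array([0.030, 0.035, 0.040, 0.045])

d_coupon = exact_duration(r, c, m)
d_zero = exact_duration(np.array([0.045]), np.array([100.0]), np.array([2.0]))
print(f"{d_coupon=:.4f} years, {d_zero=:.4f} years")
```

The coupon bond's duration should be somewhat less than two years, while the duration of the two-year zero coupon bond should equal its maturity.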



## Yield duration = Approximate Macaulay duration

An approximation to exact duration is obtained when the interest rate from the correct yield curve $r_\infty(t_0, m)$ is replaced by the yield to maturity $y(t_0)$ corresponding to the price of the bond at date $t_0$. Using a single yield in place of the yield curve changes the weight given to each cash flow.

In practice, the terminology for this concept is inconsistent. It is sometimes called **modified duration**; it is also sometimes called **Macaulay duration**, in which case the term **modified duration** is applied to something else. I like to call it **yield duration**, **yield-to-maturity duration**, or **approximate duration**. Why is the terminology inconsistent? There are two reasons:

1. Macaulay himself gave two definitions, which correspond to exact duration (**exact Macaulay duration**) and yield duration (**approximate Macaulay duration**).

2. Many finance professionals have a sloppy way of thinking about finance concepts. The sloppy thinking often reflects a practice of turning a simple mathematical concept into something complicated when the simple mathematical concept is not understood.

The formula for **yield duration** is

$$
\text{Yield Duration} = D_{\text{yield}}(y, \vec{c}, \vec{m})
:= \frac{\sum_{j=1}^{J} \vec{m}[j] \sm \vec{c}[j] \sm \e^{-y \sm \vec{m}[j]}}{\sum_{j=1}^{J} \vec{c}[j] \sm \e^{-y \sm \vec{m}[j]}}.
$$

Yield duration can also be expressed as an approximate present-value-weighted average time to maturity by using yield to maturity to obtain approximate discount factors $\e^{-y \sm \vec{m}[j]}$, which replace the exact discount factors $\e^{-\vec{r}[j] \sm \vec{m}[j]}$: 

$$
\text{Yield Duration} = D_{\text{yield}}(y, \vec{c}, \vec{m})
:= \sum_{j=1}^{J} \vec{w}[j] \sm \vec{m}[j],
$$

where

$$
\vec{w}[j] = \frac{\vec{c}[j] \sm \e^{-y \sm \vec{m}[j]}}{\sum_{j=1}^{J} \vec{c}[j] \sm \e^{-y \sm \vec{m}[j]}},
\quad 
\sum_{j=1}^{J} \vec{w}[j] = 1.
$$

Implementation of exact duration potentially requires knowledge about the entire yield curve. The advantage of yield duration is that it can be calculated from the security's price. Given a price for a security, first calculate the yield to maturity corresponding to that price.  Then calculate the duration from this yield.  This logic emphasizes that the duration changes with the security's price. If the price of a security falls, its yield rises.  This decreases the discount factor used to value each cash flow, but the discount factor for more distant cash flows falls percentage-wise by more than the discount factor for more nearby cash flows.  *The yield duration of the security becomes less when its price falls* because the more distant cash flows are a smaller percentage of its present value. (An exception, of course, is the duration of a pure discount security, which is always a constant equal to the maturity of the security.)
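
The two-step logic (price to yield, then yield to duration) can be sketched as follows. The helper names `yield_from_price` and `yield_duration` are hypothetical, the bracketing interval of plus or minus 50 percent is an assumption which suits this bond, and continuous compounding is used for the yield.

```python
import numpy as np
import scipy.optimize

c = np.array([2.0, 2.0, 2.0, 102.0])   # future cash flows
m = np.array([0.5, 1.0, 1.5, 2.0])     # times to payment in years

def yield_from_price(price):
    """Continuously compounded yield to maturity implied by price."""
    return scipy.optimize.brentq(lambda y: c @ np.exp(-y * m) - price, -0.50, 0.50)

def yield_duration(y):
    """Present-value-weighted average time to payment, discounting at yield y."""
    v = c * np.exp(-y * m)
    return (v @ m) / v.sum()

prices = [110.0, 100.0, 90.0]
durations = [yield_duration(yield_from_price(p)) for p in prices]
for p, d in zip(prices, durations):
    print(f"price={p:6.2f}  y={yield_from_price(p):.6f}  duration={d:.6f}")
```

The printed durations should decline as the price falls, although the effect is small for a bond with such a short maturity.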


## Price sensitivity duration

Recall that the formula for the present value of a security as a function of its yield to maturity $y$ is

$$
PV(y, \vec{c}, \vec{m})
:= \sum_{j=1}^{J} \vec{c}[j] \sm \e^{-y \sm \vec{m}[j]}.
$$

If we differentiate this formula with respect to yield $y$, we obtain

$$
\frac{\partial PV}{\partial y} = -\sum_{j=1}^{J} \vec{m}[j] \sm \vec{c}[j] \sm \e^{-y \sm \vec{m}[j]}.
$$

If we divide by the present value we obtain an expression exactly equivalent to **yield duration**:

$$
-\frac{1}{PV} \sm \frac{\partial PV}{\partial y} 
= \frac{\sum_{j=1}^{J} \vec{m}[j] \sm \vec{c}[j] \sm \e^{-y \sm \vec{m}[j]}}{\sum_{j=1}^{J} \vec{c}[j] \sm \e^{-y \sm \vec{m}[j]}}.
$$

The minus sign in the previous two formulas reflects the inverse relationship between present value and yield to maturity: When the yield to maturity rises, the calculated present value falls.

Thus, yield duration can be equivalently defined using the derivative as in the above equation!
When yield duration is defined with the previous equation, it is often called **modified duration**. I will call it **price sensitivity duration**. Thus, I define **price sensitivity duration** as this alternative way of defining yield duration:

$$
\begin{aligned}
\text{Price Sensitivity Duration} 
&= -\frac{1}{PV} \sm \frac{\partial PV}{\partial y}  \\
&= -\frac{\partial \log(PV)}{\partial y} \\
&\approx -\frac{\text{Percentage change in price}}{\text{Change in yield}}.
\end{aligned}
$$

Why do we need two apparently different names, **yield duration** and **price sensitivity duration**, for the same concept?  If we dig a little deeper, we do find some differences:

1. The definition of price sensitivity duration is based on a continuously compounded yield. When calculating yield duration, the same duration number is obtained regardless of the compounding convention used because the present value of each cash flow will be unaffected by the compounding convention used.  When calculating price sensitivity duration, the compounding convention does matter because the derivative of one yield compounding convention with respect to another is not exactly one. The adjustment due to the derivative of one yield compounding convention with respect to another adds clutter to the definition of price sensitivity duration and adds confusion  to the discussion of these concepts.  The clutter is minimized when continuous compounding is the compounding convention.  This is yet another mathematical advantage of working with continuously compounded interest rates.

2. While yield duration is based on the hypothetical assumption that interest rates are constant and never change, the concept of price sensitivity duration can be implemented when the cash flows do change as a function of the yield. For example, cash flows on mortgages are not exactly fixed because the homeowner might **default** on the mortgage or **prepay** it. If defaults and prepayments change with yield in a predictable way, it is possible to apply the price sensitivity duration concept to a fixed income security whose cash flows are not precisely fixed but are instead assumed to be functions of the yield to maturity. This way of thinking can easily become unwieldy or internally inconsistent. A related approach is to use the empirically observed relationship between change in price and change in yields on, say, Treasury securities to define an **empirical duration** concept applicable to "fixed income" securities whose cash flows are not exactly fixed. This approach is a variation on price sensitivity duration.
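
The first point can be checked numerically. The sketch below compares yield duration with a finite-difference estimate of $-\partial \log(PV) / \partial y$ under two compounding conventions. The bond, the yield, and the step size `h` are arbitrary choices for illustration; the adjustment factor $1 / (1 + y_N / N)$ is the derivative of the continuously compounded yield with respect to the periodically compounded yield.

```python
import numpy as np

c = np.array([2.0, 2.0, 2.0, 102.0])
m = np.array([0.5, 1.0, 1.5, 2.0])
N = 2                        # semiannual compounding
yN = 0.04                    # yield compounded N times per year
y = N * np.log1p(yN / N)     # equivalent continuously compounded yield

def log_pv_continuous(y):
    return np.log(c @ np.exp(-y * m))

def log_pv_periodic(yN):
    return np.log(c @ (1.00 + yN / N) ** (-m * N))

# Yield duration: the same number under either compounding convention,
# because the discount factors are identical.
v = c * np.exp(-y * m)
d_yield = (v @ m) / v.sum()

# Price sensitivity duration by central finite difference.
h = 1.0e-6
d_sens_continuous = -(log_pv_continuous(y + h) - log_pv_continuous(y - h)) / (2 * h)
d_sens_periodic = -(log_pv_periodic(yN + h) - log_pv_periodic(yN - h)) / (2 * h)

print(f"{d_yield=:.6f}")
print(f"{d_sens_continuous=:.6f}")
print(f"{d_sens_periodic=:.6f}")
print(f"d_yield / (1 + yN / N) = {d_yield / (1.00 + yN / N):.6f}")
```

With continuous compounding the price sensitivity should match yield duration; with semiannual compounding it should be smaller by the factor $1 / (1 + y_N / N)$.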




## (Optional) Compact mathematical notation

Sometimes it is useful to use compact vector or matrix notation to express financial concepts. For arbitrary vectors $\vec{x}$ and $\vec{y}$, let $\vec{x} \mm \vec{y}$ denote the inner product, let $\vec{x} \sm \vec{y}$ denote the element-by-element product, let $\e^{\vec{x}}$ denote element-by-element exponentiation, let $\vec{1}$ (bold 1) denote a vector of ones.  

Exact duration can be written

$$
\text{Exact Duration} = D_{\text{exact}}(\vec{r}, \vec{c}, \vec{m})
:= \vec{w} \t \mm \vec{m},
$$

where

$$
\vec{v} := \vec{c} \sm \e^{-\vec{r} \sm \vec{m}},
\quad
\vec{w} = \frac{\vec{v}}{\vec{v} \t \mm \vec{1}},
\quad 
\vec{w} \t \mm \vec{1} = 1.
$$

Yield duration can be written

$$
\text{Yield Duration} = D_{\text{yield}}(y, \vec{c}, \vec{m})
:= \vec{w} \t \mm \vec{m},
$$

where

$$
\vec{v} := \vec{c} \sm \e^{-y \sm \vec{m}},
\quad
\vec{w} = \frac{\vec{v}}{\vec{v} \t \mm \vec{1}},
\quad 
\vec{w} \t \mm \vec{1} = 1.
$$

In this notation, $\vec{v} := \vec{c} \sm \e^{-\vec{r} \sm \vec{m}}$ is the vector of present values for each cash flow; the present value of all the cash flows, which defines the present value of the asset itself, is the scalar $\vec{c} \t \mm \e^{-\vec{r} \sm \vec{m}} = \vec{v} \t \mm \vec{1}$.

Price sensitivity duration can be defined as

$$
\begin{aligned}
\text{Price Sensitivity Duration} 
&= D_{\text{sensitivity}}(y, \vec{c}, \vec{m}) \\
&:= -\frac{1}{\vec{c} \t \mm \e^{-y \sm \vec{m}}} 
\sm \frac{\partial (\vec{c} \t \mm \e^{-y \sm \vec{m}})}{\partial y} \\
&= \frac{\vec{c} \t \mm (\vec{m} \sm \e^{-y \sm \vec{m}})}{\vec{c} \t \mm \e^{-y \sm \vec{m}}} \\
&= \frac{\vec{m} \t \mm (\vec{c} \sm \e^{-y \sm \vec{m}})}{\vec{c} \t \mm \e^{-y \sm \vec{m}}} 
\end{aligned}
$$

To understand the formulas above, you need to apply some concepts from linear algebra and multi-variable calculus.  The Python package Numpy has functions which implement element-by-element array functions and inner products in a manner consistent with this notation.
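
As a rough guide to how the notation maps onto Numpy, the element-by-element product $\sm$ corresponds to `*` and the inner product $\mm$ corresponds to `@`. The numbers in this sketch are arbitrary.

```python
import numpy as np

c = np.array([2.0, 2.0, 2.0, 102.0])   # cash flows
m = np.array([0.5, 1.0, 1.5, 2.0])     # times to payment in years
y = 0.04                               # continuously compounded yield
ones = np.ones_like(c)                 # the vector of ones

v = c * np.exp(-y * m)    # element-by-element product and exponentiation
w = v / (v @ ones)        # inner product with ones gives total present value
d_yield = w @ m

# The two equivalent expressions for price sensitivity duration:
d1 = (c @ (m * np.exp(-y * m))) / (c @ np.exp(-y * m))
d2 = (m @ (c * np.exp(-y * m))) / (c @ np.exp(-y * m))

print(f"{d_yield=:.6f}, {d1=:.6f}, {d2=:.6f}, weights sum to {w @ ones:.6f}")
```

For one-dimensional arrays, Numpy does not distinguish row vectors from column vectors, so the transpose in the formulas has no counterpart in the code.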

## Exercise for self-study: Describe how duration of a bond varies with coupon and maturity

Idea: High coupons shorten duration.  What about mortgages with constant monthly payments?

Compare duration as a present-value-weighted average time to maturity with duration as the elasticity of price with respect to yield.  The two should be the same if cash flows are fixed.  The comparison becomes complicated with variable cash flows, such as those arising from defaults, prepayments, and defaultable bonds.



# How to implement yield curve modeling

These notes have emphasized that yield to maturity and yield duration are intuitive approximations to more theoretically correct models which take into account the shape of the entire yield curve.  This same idea of using approximations to more theoretically correct concepts also applies to how the yield curve is used to mark securities to market, do risk management, or make trading decisions.

The basic idea of both approaches is to compare actual market prices or yields with the market prices or yields implied by a yield curve model. If the observed yield or price is different from the yield or price implied by the model, either the security is mispriced by the market or mispriced by the model. 

1. The security might be mispriced by the market because market participants lack capital to engage in arbitrage activities; capital constraints, taxes, or government regulations impede arbitrage activities; the central bank is intervening in the market; or market participants have an incorrect understanding about yields.

2. The security might be mispriced by the model because the model is "too smooth", the model is "not smooth enough", the model fails to consider relevant institutional details (taxes, regulations), or the model specification lacks the flexibility necessary to fit the term structure accurately.

Now let us consider how the following two approaches work:

1. Use an exact model of the term structure.

2. Use yield to maturity to construct an approximate model of the term structure.



## Exact yield curve model

1. An exact yield curve model is estimated by defining interest rates for different maturities as a function of model parameters, then estimating the model parameters to fit observed market prices. Care must be taken that the estimated model is neither too inflexible nor too flexible. A "good" estimated yield curve probably has a "smooth" shape.

2. Using the estimated yield curve, the model-implied present value of each security (calculated by discounting each cash flow at an interest rate which may depend on the timing of the cash flow) is compared with market prices. This approach can be used to identify mispriced securities.

3. To make the comparison of market prices with model-implied prices more intuitive, the yields to maturity corresponding to model-implied prices and market prices can be compared.

This approach is difficult to implement because an entire estimated yield curve is needed.
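The steps above can be sketched with synthetic data. The example below assumes a hypothetical quadratic zero-rate curve $r(m) = \theta_0 + \theta_1 m + \theta_2 m^2$, generates bond prices from it, perturbs one price by 50 cents to mimic a mispriced security, and fits the parameters with `scipy.optimize.least_squares`. The functional form, the bonds, and the perturbation are all assumptions made for illustration; a real application would use observed prices and a more carefully chosen curve specification.

```python
import numpy as np
from scipy.optimize import least_squares

def zero_rate(theta, m):
    """Hypothetical smooth curve: r(m) = theta0 + theta1*m + theta2*m**2."""
    return theta[0] + theta[1] * m + theta[2] * m**2

def model_price(theta, c, m):
    """Discount each cash flow at the zero rate for its own payment date."""
    return c @ np.exp(-zero_rate(theta, m) * m)

def coupon_bond(coupon_rate, maturity_years, face=100.0):
    """Hypothetical helper: annual coupons at years 1..T plus face value at T."""
    m = np.arange(1.0, maturity_years + 1.0)
    c = np.full(m.shape, coupon_rate * face)
    c[-1] += face
    return c, m

# Synthetic "market" (assumption): ten bonds priced off a known curve.
coupon_rates = [0.02, 0.06, 0.03, 0.07, 0.02, 0.06, 0.03, 0.07, 0.02, 0.06]
bonds = [coupon_bond(cr, T) for cr, T in zip(coupon_rates, range(1, 11))]
theta_true = np.array([0.02, 0.004, -0.0002])
market = np.array([model_price(theta_true, c, m) for c, m in bonds])

bump_idx = 4                 # the 5-year bond
market[bump_idx] += 0.50     # make it 50 cents "rich" for illustration

def residuals(theta):
    """Model-implied price minus market price for each bond."""
    return np.array([model_price(theta, c, m) for c, m in bonds]) - market

fit = least_squares(residuals, x0=np.array([0.03, 0.0, 0.0]))
pricing_error = market - np.array([model_price(fit.x, c, m) for c, m in bonds])

print("estimated parameters:", fit.x)
print("market minus model prices:", np.round(pricing_error, 3))
```

In this synthetic setup the perturbed bond shows a positive market-minus-model error, although part of the 50-cent perturbation is absorbed by the fitted parameters and spills over into the errors of neighboring bonds. That spillover is one reason the flexibility of the curve specification matters.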



## Approximate yield curve model

1. An approximate yield curve model is estimated by trying to fit the yields to maturity on many assets to a smooth curve, which defines approximate-model yield to maturity as a function of the maturity of the asset.

2. The yields to maturity corresponding to model-implied and market prices are then compared to identify mispriced securities.

This approach is easier to implement than an exact yield curve model because the yield curve can be fit to market yields as a function of maturity.

A potentially substantial improvement to this simple approach is to define yield as a function of yield duration rather than as a function of maturity.  Yield duration is a better estimate of the timing of the cash flows than maturity precisely because it estimates the timing as a present-value-weighted average time to maturity.

In practice, the approximate yield curve may fit prices almost as closely as the exact yield curve, especially if yield is estimated to be a function of duration.
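A minimal sketch of the approximate approach follows, again with synthetic data. It solves for each bond's continuously compounded yield to maturity with `scipy.optimize.brentq`, computes yield duration, and fits a quadratic with `np.polyfit` to yield as a function of maturity and, alternatively, as a function of duration. The zero curve used to generate prices, the bond specifications, and the quadratic form are assumptions made for illustration; which regressor fits better depends on the data.

```python
import numpy as np
from scipy.optimize import brentq

def coupon_bond(coupon_rate, maturity_years, face=100.0):
    """Hypothetical helper: annual coupons at years 1..T plus face value at T."""
    m = np.arange(1.0, maturity_years + 1.0)
    c = np.full(m.shape, coupon_rate * face)
    c[-1] += face
    return c, m

def ytm(p, c, m):
    """Continuously compounded yield y solving c @ exp(-y * m) = p."""
    return brentq(lambda y: c @ np.exp(-y * m) - p, -0.5, 1.0)

def yield_duration(y, c, m):
    """Present-value-weighted average time, discounting at the yield y."""
    v = c * np.exp(-y * m)
    return (m @ v) / v.sum()

# Synthetic prices (assumption) from a hypothetical zero-rate curve.
def zero_rate(m):
    return 0.02 + 0.004 * m - 0.0002 * m**2

coupon_rates = [0.02, 0.06, 0.03, 0.07, 0.02, 0.06, 0.03, 0.07, 0.02, 0.06]
bonds = [coupon_bond(cr, T) for cr, T in zip(coupon_rates, range(1, 11))]
prices = np.array([c @ np.exp(-zero_rate(m) * m) for c, m in bonds])

y_mkt = np.array([ytm(p, c, m) for p, (c, m) in zip(prices, bonds)])
dur = np.array([yield_duration(y, c, m) for y, (c, m) in zip(y_mkt, bonds)])
mat = np.array([m[-1] for _, m in bonds])

# Fit a smooth (quadratic) curve to yields, first against maturity, then duration.
err_mat = y_mkt - np.polyval(np.polyfit(mat, y_mkt, deg=2), mat)
err_dur = y_mkt - np.polyval(np.polyfit(dur, y_mkt, deg=2), dur)

print("RMSE, yield vs maturity (bp):", 1e4 * np.sqrt(np.mean(err_mat**2)))
print("RMSE, yield vs duration (bp):", 1e4 * np.sqrt(np.mean(err_dur**2)))
```

The residuals `err_mat` and `err_dur` play the role of the yield comparison in step 2: a large residual flags a security whose market yield is far from the fitted curve.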



In [15]:
timestamp = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
tfinish = timeit.default_timer()
print(f"Finished: {timestamp = }\nExecution time = {tfinish - tstart} seconds")


Finished: timestamp = '2023-09-22 23:45:30'
Execution time = 3.4046415999999997 seconds
