### Imports

In [22]:
import numpy as np
from scipy.interpolate import CubicSpline, interp1d
from scipy.optimize import newton
import matplotlib.pyplot as plt
import bond_pricing as bp

### Generalized Bootstrap Method to Determine the Yield Curve

#### Model Introduction

Given a sufficient quantity of reliable, friction-free government bond data, a technique known as the bootstrap derives the spot yield curve directly. This technique is based on the notion that individual coupon-paying bonds can be viewed as “packages” of pure discount bonds. For example, a three-year bond with semi-annual coupons is composed of six pure discount bonds: the five coupon payments paid every six months, and the final payment, which is the sum of the final coupon and the return of principal. This suggests that a bond’s value can be viewed either as the present value of future cash flows discounted at the yield to maturity, or as the sum of the values of individual pure discount bonds, each of which is the present value of a cash flow discounted at its own time-specific yield.
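The two views of a bond's value can be checked numerically. The sketch below prices a three-year bond as six pure discount bonds and then backs out the single yield to maturity that reproduces the same price. The coupon and the spot rates are hypothetical values chosen for illustration; they do not come from the examples in this notebook.

```python
import numpy as np
from scipy.optimize import brentq

# Hypothetical inputs, for illustration only
face, annual_coupon, freq = 100.0, 6.0, 2
times = np.arange(1, 7) / freq                       # 0.5, 1.0, ..., 3.0 years
spot = np.array([0.040, 0.042, 0.044, 0.046, 0.048, 0.050])  # assumed c.c. spot rates

cash_flows = np.full(times.size, annual_coupon / freq)
cash_flows[-1] += face                               # final coupon plus principal

# View 1: each cash flow is a pure discount bond valued at its own spot rate
zero_values = cash_flows * np.exp(-spot * times)
price = zero_values.sum()

# View 2: one yield to maturity that discounts every cash flow to the same price
ytm = brentq(lambda y: np.sum(cash_flows * np.exp(-y * times)) - price, 0.0, 1.0)

print("Price =", price)
print("Yield to maturity (c.c.) =", ytm)
```

The yield to maturity lands between the lowest and highest spot rate, since it acts as a single blended rate across all six payments.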

The classic “textbook” explication of the bootstrap usually begins by assuming the existence of a set of perfectly spaced bonds: for example, a 6-month, a 12-month, an 18-month and so on. If these bonds have frictionless market prices, the bootstrap renders the correct yield curve in a straightforward fashion, as will be illustrated below.

The problem with the bootstrap is that it relies heavily on the existence of a suitable body of data. In particular, there are two problems: illiquidity and missing data points. Quotes coming from a thin market may diverge from true market prices because of spreads and asynchronous trading. This is why applying the bootstrap to a raw sample of bonds is likely to produce an unreasonably “choppy” yield curve. One approach is to perform some form of curve fitting to arrive at a reasonably smooth representation of the yield curve. This can be done either before or after applying the bootstrap. Any smoothing procedure, however, has an unavoidable problem: the final curve will always be a function of an assumed functional form. For example, the Nelson and Siegel procedure allows for a single hump, while the Svensson approach allows for two humps.
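To make the idea of an assumed functional form concrete, here is a minimal sketch of the Nelson and Siegel curve in its usual textbook parameterization. The function name, the parameter names and the sample values are illustrative assumptions rather than anything taken from this notebook.

```python
import numpy as np

def nelson_siegel(t, beta0, beta1, beta2, tau):
    """Nelson-Siegel yield at maturity t (t > 0), standard three-factor form."""
    t = np.asarray(t, dtype=float)
    x = t / tau
    loading = (1 - np.exp(-x)) / x
    # beta0: long-run level, beta1: short-end slope, beta2: the single hump
    return beta0 + beta1 * loading + beta2 * (loading - np.exp(-x))

# Hypothetical parameters, for illustration only
maturities = np.array([0.5, 1.0, 2.0, 5.0, 10.0])
curve = nelson_siegel(maturities, beta0=0.05, beta1=-0.02, beta2=0.03, tau=2.0)
print(curve)
```

Whatever the data look like, a curve fitted this way can only take shapes that these four parameters allow, which is the limitation the paragraph above describes.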

Another approach for eliminating inappropriate shapes is to use averaged yield data. For example, one could take all bonds in the neighborhood of the five-year maturity, average their yields, and then use the resulting average as the yield on a hypothetical five-year par bond, in the belief that some of the bonds in the average will be discount bonds and others will be premium bonds. This is the approach we employ in some of the examples discussed later.
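A minimal sketch of the averaging step follows. The quotes, the half-year window around the target maturity, and the dictionary layout are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical quotes, for illustration only: maturity in years and yield to maturity
maturities = np.array([4.6, 4.8, 5.0, 5.1, 5.3, 6.5])
ytm = np.array([0.0410, 0.0418, 0.0421, 0.0425, 0.0430, 0.0465])

target, window = 5.0, 0.5          # the window half-width is an assumption
nearby = np.abs(maturities - target) <= window
par_yield = ytm[nearby].mean()

# Treat the average as the coupon rate of a hypothetical par bond (price = face = 100)
par_bond = {"maturity": target, "annual_coupon": 100 * par_yield, "price": 100.0}
print(par_bond)
```

The hypothetical par bond can then enter the bootstrap like any other bond, with its price fixed at 100.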

As for the second problem with the bootstrap — the lack of a full set of data — this must be solved by imposing conditions on the intermediate points. Of course, one can never avoid the arbitrariness of the choice of various interpolation procedures. We argue below that a cubic spline is an appropriate choice.

The procedure described below is a generalized bootstrapping method. Using symbolic manipulation of algebraic expressions, it handles any data set regardless of time spacing, and it can accommodate different interpolation assumptions. The illiquidity issue is dealt with by using highly liquid T-bills at the short end of the maturity spectrum and average yields beyond one year.

More specifically, beginning with a series of bond value expressions for a given set of fixed income securities, one cannot solve for the yields corresponding to all payment dates except under ideal conditions, since there are more unknowns than equations. Each maturity date is accounted for by a single bond value expression. Generating equations for the points corresponding to the coupon dates, however, requires an interpolation approach that manipulates symbolic quantities, because the yields at the maturity dates are still unknown when the interpolation is written down. In Python, SymPy can carry out this symbolic interpolation and generate the required algebraic equations, while NumPy and SciPy handle the numerical work. Once a sufficient number of algebraic equations are obtained, we solve the resulting system numerically to obtain the points on the yield curve.
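One way to write this system down, in notation that is assumed here rather than taken from the text: let bond $i$ have price $P_i$ and pay cash flows $c_{ij}$ at times $t_{ij}$, let $r_1, \dots, r_n$ be the unknown yields at the maturities $T_1 < \dots < T_n$, and let $r(t)$ denote the chosen interpolant through the points $(T_k, r_k)$. Each bond then contributes one equation:

$$P_i = \sum_{j} c_{ij}\, e^{-r(t_{ij})\, t_{ij}}, \qquad i = 1, \dots, n$$

At a coupon date that is not a maturity date, $r(t_{ij})$ is an expression in the unknown $r_k$, so the result is $n$ nonlinear equations in $n$ unknowns.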

A key advantage of our approach is its simple “one-shot” nature. Unlike the methods used by McCulloch, Vasicek and Fong, and Svensson, who make *a priori* assumptions about the form of the yield curve, we let the available data determine the exact form of the yield curve by solving a system of nonlinear equations. It is worth stressing that our simpler, yet general, method requires the processing of certain symbolic quantities to interpolate the points on the yield curve corresponding to coupon dates. Such symbolic processing requires a computer algebra system (SymPy, in a Python setting).

#### The Textbook Bootstrap

The bootstrap method — as discussed in standard texts — is used to solve sequentially a system of nonlinear equations which has at least one equation whose solution for a single unknown can be obtained in a straightforward manner. In this section we provide a brief review of the bootstrap method by utilizing a numerical example. The data on the maturity, coupons and prices of four bonds are given below.

**First Example**

| Bond No. | Time to Maturity (Years) | Annual Coupon (Dollars) | Bond Price (Dollars) |
| -------- | ------------------------ | ----------------------- | -------------------- |
| 1        | 0.50                     | 0                       | 94.90                |
| 2        | 1.00                     | 0                       | 90.00                |
| 3        | 1.50                     | 8                       | 96.00                |
| 4        | 2.00                     | 12                      | 101.60               |

* **Note**: We use continuously compounded yields in all cases
* **Coupons**: Coupons are paid semi-annually

$$PV = FV \times e^{-rt}$$
$$r = \ln\left(\frac{FV}{PV}\right) \times \frac{1}{t}$$

Denoting the discount rate for a maturity of $t$ years by $r_{t}$, we can use $t = 0.50$ and $t = 1.00$ to solve $94.9 = 100e^{-r_{0.50}\times 0.5}$ and $90.0 = 100e^{-r_{1.00}\times 1.0}$ and obtain $r_{0.50} = 0.1047$ and $r_{1.00} = 0.1054$, respectively.

In [54]:
years = np.array([0.5, 1.0, 1.5, 2.0])
coupon = np.array([0, 0, 8, 12])
price = np.array([94.90, 90.00, 96.00, 101.60])
face = np.full(4, 100)
freq = 2

r1 = np.log(100/price[0]) / 0.5
r2 = np.log(100/price[1]) / 1.0

In [51]:
# Securities maturing in 0.5y and 1y are zero-coupons
cc_ytm = [r1, r2]

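for i in range(2, len(coupon)):
    # PV of the coupons paid before maturity, discounted at the spot rates found so far
    pv_cf = np.sum(np.full(int(years[i]*freq)-1, coupon[i]/freq) * np.exp(-np.array(cc_ytm) * years[:i]))
    # What remains is the PV of the final payment (principal + last coupon); invert it for the spot rate
    cc_ytm.append(np.log((100 + coupon[i]/freq)/(price[i] - pv_cf)) / years[i])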

cc_ytm = np.array(cc_ytm)
print("Implied Spot Rates (c.c.) = ",cc_ytm)

Implied Spot Rates (c.c.) =  [0.10469296 0.10536052 0.10680926 0.10808028]


In [57]:
pv_cf = (4*np.exp(-r1*0.5)) + (4*np.exp(-r2*1))
np.log(104 / (price[2] - pv_cf)) / 1.5

0.10680926388170525

In [48]:
# Using Discount Factors
df = [price[0]/100, price[1]/100]

for i in range(2, len(coupon)):
    pv_cf = np.sum(np.full(int(years[i]*freq)-1, coupon[i]/freq) * np.array(df))
    df.append((price[i] - pv_cf) / (100 + coupon[i]/freq))
    
df = np.array(df)
print("Discount Factors = ",df)

spots = df**-(1/years) - 1  # annually compounded rate s, from (1 + s)**-t = df
print("Implied Spot Rates (annual) = ",spots)

spots_cc = np.log(1/df) / years
print("Implied Spot Rates (c.c.) = ",spots_cc)

Discount Factors =  [0.949      0.9        0.85196154 0.80560595]
Implied Spot Rates (annual) =  [0.11036963 0.11111111 0.112722   0.11413718]
Implied Spot Rates (c.c.) =  [0.10469296 0.10536052 0.10680926 0.10808028]


For the third bond, which matures in 1.5 years, there are three payments of \$4, \$4 and \$104 at $t = 0.50$, $1.00$ and $1.50$, respectively. Since the discount rates at $t = 0.50$ and $1.00$ are already available from the previous calculations, the rate $r_{1.50}$ can be computed by solving $96 = 4e^{-0.1047\times 0.5} + 4e^{-0.1054\times 1.0} + 104e^{-r_{1.50}\times 1.5}$, which gives $r_{1.50} = 0.1068$. In a similar fashion, it is straightforward to calculate $r_{2.00}= 0.1081$.
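As a cross-check on this step, the same equation can be handed to a root finder instead of being inverted by hand. The sketch below uses `scipy.optimize.newton`, which falls back to the secant method when no derivative is supplied; the starting guess of 0.10 and the helper name `pricing_error` are assumptions.

```python
import numpy as np
from scipy.optimize import newton

# Spot rates already known from the two zero-coupon bonds
r_050 = np.log(100 / 94.90) / 0.5
r_100 = np.log(100 / 90.00) / 1.0

def pricing_error(r):
    """Model value of the 1.5-year bond minus its market price of 96."""
    return (4 * np.exp(-r_050 * 0.5)
            + 4 * np.exp(-r_100 * 1.0)
            + 104 * np.exp(-r * 1.5)
            - 96.0)

r_150 = newton(pricing_error, x0=0.10)
print(f"r_1.50 = {r_150:.4f}")  # r_1.50 = 0.1068
```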

**Second Example**

| Bond No. | Time to Maturity (Years) | Annual Coupon (Dollars) | Bond Price (Dollars) |
| -------- | ------------------------ | ----------------------- | -------------------- |
| 1        | 0.50                     | 0                       | 99.00                |
| 2        | 1.00                     | 0                       | 97.80                |
| 3        | 1.50                     | 4                       | 102.50               |
| 4        | 2.00                     | 5                       | 105.00               |

In [58]:
years = np.array([0.5, 1.0, 1.5, 2.0])
coupon = np.array([0, 0, 4, 5])
price = np.array([99, 97.80, 102.50, 105.00])
face = np.full(4, 100)
freq = 2

r1 = np.log(100/price[0]) / 0.5
r2 = np.log(100/price[1]) / 1.0

In [59]:
# Securities maturing in 0.5y and 1y are zero-coupons
cc_ytm = [r1, r2]

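for i in range(2, len(coupon)):
    # PV of the coupons paid before maturity, discounted at the spot rates found so far
    pv_cf = np.sum(np.full(int(years[i]*freq)-1, coupon[i]/freq) * np.exp(-np.array(cc_ytm) * years[:i]))
    # What remains is the PV of the final payment (principal + last coupon); invert it for the spot rate
    cc_ytm.append(np.log((100 + coupon[i]/freq)/(price[i] - pv_cf)) / years[i])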

cc_ytm = np.array(cc_ytm)
print("Implied Spot Rates (c.c.) = ",cc_ytm)

Implied Spot Rates (c.c.) =  [0.02010067 0.02224561 0.02284449 0.02416379]


Note that the bootstrap succeeded in both cases because there were four equations and four unknown yields. What if our example included a 2.75-year bond (with coupon payments at 0.25, 0.75, 1.25, 1.75 and 2.25 years before the final payment at 2.75), or what if the 0.50-year T-bill did not exist? The textbook bootstrap can no longer be applied in such cases, but the generalized bootstrap method characterized next deals with them easily. The method also automates the task of symbolically generating the interpolation equations that are crucial in determining the yield curve. Naturally, its generality implies that simpler problems, easily amenable to the textbook bootstrap, can also be solved in essentially one step instead of sequentially.

This methodology can also deal with extrapolation. This would be necessary, for example, if we had a 2.75-year bond, since its first cash flow at 0.25 years would fall before the shortest maturity with a known yield.
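The missing T-bill case can be sketched numerically. The code below takes the First Example without the 0.50-year bill, treats the yields at the three remaining maturities as unknowns, and solves all three pricing equations at once with `scipy.optimize.fsolve`. The interpolation rule is an assumption made for this sketch: linear between maturities and flat before the first one, which is what `np.interp` does. It is not the natural cubic spline the text argues for, and the helper names are hypothetical.

```python
import numpy as np
from scipy.optimize import fsolve

# First Example data with the 0.50-year T-bill removed
maturity = np.array([1.0, 1.5, 2.0])
coupon = np.array([0.0, 8.0, 12.0])
price = np.array([90.00, 96.00, 101.60])
freq, face = 2, 100.0

def cash_flows(T, c):
    """Payment times and amounts for a bond maturing at T with annual coupon c."""
    times = np.arange(T, 0, -1 / freq)[::-1]
    amounts = np.full(times.size, c / freq)
    amounts[-1] += face
    return times, amounts

def pricing_errors(r_knots):
    """Model value minus market price for every bond, given yields at the maturities."""
    errors = []
    for T, c, p in zip(maturity, coupon, price):
        t, cf = cash_flows(T, c)
        # Assumption: linear interpolation, flat extrapolation below the first maturity
        r_t = np.interp(t, maturity, r_knots)
        errors.append(np.sum(cf * np.exp(-r_t * t)) - p)
    return errors

r_knots = fsolve(pricing_errors, x0=np.full(maturity.size, 0.10))
print("Yields at 1.0, 1.5, 2.0 years (c.c.) =", r_knots)
```

The yield at 0.50 years is never solved for directly; it is implied by the extrapolation rule, so a different rule would give slightly different yields at 1.5 and 2.0 years.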

In the general model, natural spline interpolation requires the symbolic manipulation of certain quantities, because the yields at the knots are themselves unknowns. The tools used here are `scipy.interpolate.CubicSpline` and NumPy's `interp`, which interpolate numerically between known values, and SymPy's `interpolating_spline`, which performs the symbolic computations needed to develop the model. Note that `CubicSpline` defaults to not-a-knot end conditions; a natural spline requires `bc_type='natural'`.
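The difference between SciPy's default end conditions and a natural spline can be seen on the spot rates computed in the First Example. This is a small sketch; the choice of 1.25 years as the evaluation point is arbitrary.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Continuously compounded spot rates from the First Example
t = np.array([0.5, 1.0, 1.5, 2.0])
r = np.array([0.10469296, 0.10536052, 0.10680926, 0.10808028])

natural = CubicSpline(t, r, bc_type="natural")   # second derivative is zero at both ends
default = CubicSpline(t, r)                      # bc_type="not-a-knot" is the default

print("r(1.25), natural    =", natural(1.25))
print("r(1.25), not-a-knot =", default(1.25))
print("Natural spline second derivative at the ends =", natural(t[[0, -1]], 2))
```

Both splines pass through the four known points; they differ only in how they behave between and beyond them.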

### Polynomial Spline & Exponential Spline Method

### Nelson & Siegel - Parsimonious Modeling of Yield Curves

### References

- [A generalized bootstrap method to determine the yield curve](https://www.tandfonline.com/doi/abs/10.1080/13504860010021162)
- [Bootstrapping Zero Curves - Rebrained!](https://rebrained.com/?p=23)
- [Bootstrapping (Finance) - Wikipedia](https://en.wikipedia.org/wiki/Bootstrapping_%28finance%29)

In [166]:
t = [0.25, 1.00, 1.50, 2.00, 2.75]

In [None]:
x = np.array([1,2,3])
y = np.array([3,1,2])

print(CubicSpline(x,y)(1.6))
print(interp1d(x,y)(1.6))
print(np.interp(1.6, x, y))

1.4399999999999995
1.7999999999999998
1.7999999999999998


In [83]:
from sympy import symbols, interpolating_spline  # explicit names instead of a star import
X = [1,2,3]
Y =  [3,1,2]
x = symbols('x')
li = interpolating_spline(1, x, X, Y)
li

Piecewise((5 - 2*x, (x >= 1) & (x <= 2)), (x - 1, (x >= 2) & (x <= 3)))

In [82]:
from sympy import symbols, interpolating_spline  # explicit names instead of a star import
X = [1,2,3]
Y =  symbols('y1 y2 y3')
x = symbols('x')
li = interpolating_spline(1, x, X, Y)
li

Piecewise((y1*(2 - x) + y2*(x - 1), (x >= 1) & (x <= 2)), (y2*(3 - x) + y3*(x - 2), (x >= 2) & (x <= 3)))
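A symbolic interpolant like the one above can be plugged straight into the bond value expressions. The sketch below assumes the 1.5-year bond is missing from the First Example, so the yield at 1.5 years must be interpolated from the unknown yields at 1.0 and 2.0 years. It uses a degree-1 spline for brevity rather than a cubic one, and the helper `bond_value` and the starting guess of 0.1 are hypothetical choices.

```python
from sympy import Rational, exp, interpolating_spline, nsolve, symbols

x = symbols("x")
y1, y2, y3 = symbols("y1 y2 y3")            # unknown spot rates at the knots
knots = [Rational(1, 2), 1, 2]              # 0.5, 1.0 and 2.0 years
curve = interpolating_spline(1, x, knots, [y1, y2, y3])

def bond_value(n_periods, annual_coupon, face=100):
    """Symbolic value of a semi-annual bond with n_periods half-year payments."""
    value = 0
    for k in range(1, n_periods + 1):
        t = Rational(k, 2)
        cf = Rational(annual_coupon, 2) + (face if k == n_periods else 0)
        # The rate at time t is an expression in y1, y2, y3
        value += cf * exp(-curve.subs(x, t) * t)
    return value

# One equation per bond: 0.5-year bill, 1.0-year bill, 2.0-year 12% bond
equations = [
    bond_value(1, 0) - 94.90,
    bond_value(2, 0) - 90.00,
    bond_value(4, 12) - 101.60,
]

solution = nsolve(equations, [y1, y2, y3], [0.1, 0.1, 0.1])
print(solution.T)
```

All three yields come out of a single call to `nsolve`, which is the one-step character of the generalized method; the rate at 1.5 years follows from `curve.subs(x, Rational(3, 2))` once the knot yields are known.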