## Modelling Notebook

**Layer**: Silver

**Domain**: Risk-free

**Action**: Modelling of the zero coupon bond forward rates using observable market data

In [0]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pyspark.sql import functions as F

from scipy.interpolate import CubicSpline
from scipy.optimize import minimize
from scipy.integrate import cumulative_trapezoid

In [0]:
# Import the user-defined modelling functions class
from src.utils import ModellingFunctions

## Section 1 - Load Inputs

In [0]:
# Set as parameters here for now
as_of_date = '2025-06-30'
knots = [0.25, 1.0, 11.0625, 22.5833]
OCR_weight = 4e9
govtbond_cap_weight = 4e9
error_optimisation_method = 'L-BFGS-B'
long_term_rate = 0.048
bridging_max_slope = 0.0005
bridging_min_extension_years = 10
bridging_extension_freq = 0.25
curve_max_extension = 100

In [0]:
# Select bonds to include based on series id
series_id_incl = [
    'INM.DP1.N',
    'INM.DG19.NS2605',
    'INM.DG19.NS2704',
    'INM.DG19.NS2805',
    'INM.DG19.NS2904',
    'INM.DG19.NS3005',
    'INM.DG19.NS3105',
    'INM.DG19.NS3205',
    'INM.DG19.NS3304',
    'INM.DG19.NS3405',
    'INM.DG19.NS3505',
    'INM.DG19.NS3605',
    'INM.DG19.NS3704',
    'INM.DG19.NS4105',
    'INM.DG19.NS5105',
    'INM.DG19.NS5405'
]

In [0]:
# Read cleaned and enriched inputs from silver layer
df_yields = spark.table("workspace.riskfree_silver.003_rbnz_yields_transformed") \
    .filter((F.col("date") == as_of_date) & (F.col("series_id").isin(series_id_incl)))

df_amounts = spark.table("workspace.riskfree_silver.002_nzdm_govtbonds_onissue_enriched") \
    .filter((F.col("as_of_date") == as_of_date) & (F.col("series_id").isin(series_id_incl)))

In [0]:
display(df_yields)
display(df_amounts)


## Section 2 - Pre-processing of data

In [0]:
# Convert spark to pandas for modelling
pdf_yields = df_yields.toPandas()
pdf_amounts = df_amounts.toPandas()

# Merge the dataframes based on series_id
merged_pdf = pdf_yields.merge(pdf_amounts, on='series_id', how='left')

# Recompute term_yr from the whole number of months to maturity
merged_pdf['term_yr'] = merged_pdf['term_mth_whole'] / 12

# Replace a term_yr of 0 (the OCR) with 1/12, as a zero term cannot be used as a divisor
merged_pdf["term_yr"] = merged_pdf["term_yr"].replace(0, 1/12)

## Section 3 - Bootstrapping of zero-coupon forward rates

The one-month forward rate is determined from the one-month Treasury bill.

Nominal Government bonds are decomposed into their maturity payment and individual coupon payments to produce a set of equivalent zero-coupon nominal bonds maturing on the 15th of the month.

A forward rate is first determined for the period up to the maturity of the shortest nominal Government bond. For the period between the maturities of the first and second nominal bonds, a forward rate is then determined so that, together with the forward rates already solved, it reproduces the market value of the second bond. This process is repeated to solve for each successive forward rate until all nominal bonds have been valued.
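
The recursion described above can be illustrated with a minimal sketch. It is not the `ModellingFunctions.bootstrap_forward_columns` implementation: it assumes the inputs are already zero-coupon spot rates and that rates are continuously compounded, so coupon decomposition is skipped. The function name `bootstrap_forwards` and the sample values are hypothetical.

```python
import numpy as np

def bootstrap_forwards(terms, spots):
    """Piecewise-constant forward rates implied by zero-coupon spot rates.

    Assumption: continuous compounding, so spot * term equals the
    integral of the forward rate from 0 to that term.
    """
    terms = np.asarray(terms, dtype=float)
    spots = np.asarray(spots, dtype=float)
    # Cumulative "rate x time" at each maturity, starting from zero at time 0
    cumulative = np.concatenate(([0.0], spots * terms))
    # Length of each interval between successive maturities
    interval = np.diff(np.concatenate(([0.0], terms)))
    # Each forward rate is solved given all earlier forward rates
    return np.diff(cumulative) / interval

# Illustrative values only, not market data
terms = [1 / 12, 1.0, 2.0]
spots = [0.030, 0.035, 0.040]
forwards = bootstrap_forwards(terms, spots)
print(forwards)
```

Each forward rate applies from the previous maturity up to its own, which is why the notebook plots the bootstrapped rates as a step function.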

In [0]:
df_bootstrapped = ModellingFunctions.bootstrap_forward_columns(merged_pdf, term_col = "term_yr", spot_col = "yield_decimal")

display(df_bootstrapped)

In [0]:
# Plotting
plt.figure(figsize=(15, 6))
plt.scatter(df_bootstrapped["term_yr"], df_bootstrapped["yield_decimal"], color='blue', label='Spot Rate (pa)')
plt.step(df_bootstrapped["term_yr"], df_bootstrapped["fwd_rate_bootstrapped"], where='post', color='red', label='Forward Rate (bootstrapped)')

plt.xlabel('Term Year')
plt.ylabel('Rate')
plt.title('Spot Rates and Bootstrapped Forward Rates')
plt.legend()
plt.grid(True)
plt.show()

## Section 4 - Curve fitting and interpolation

The process is to fit a curve of forward rates to the zero-coupon portfolio of available bonds. The parameters of the fitted curve are determined by minimising the weighted sum of squared differences between the fitted spot rates and the actual market spot rates. Two-, three- and six-month Treasury bill rates are used in addition to nominal Government bonds.

Each market yield is weighted by the lesser of $4 billion and the amount available in the market. The available amount excludes holdings of the Reserve Bank of New Zealand (RBNZ) and the Earthquake Commission, which are not usually traded. As a result, the fitted forward rates automatically give less weight to bonds that represent a smaller proportion of the tradeable market.
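
One way to write the fitting problem down, assuming continuously compounded rates (the source does not state the compounding convention), is the following. Here $f_\theta$ is the fitted forward curve with parameters $\theta$, $s_\theta(t)$ is the spot rate it implies, $s_i^{\text{mkt}}$ is the market spot rate at term $t_i$, and $w_i$ is the capped market amount; all symbols are introduced here for illustration.

$$
s_\theta(t) = \frac{1}{t}\int_0^t f_\theta(u)\,du,
\qquad
\hat{\theta} = \arg\min_{\theta} \sum_i w_i \left( s_\theta(t_i) - s_i^{\text{mkt}} \right)^2,
\qquad
w_i = \min\left(\text{amount}_i,\ \$4\text{ billion}\right)
$$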

The curve fitted is a cubic spline on the forward rates with 4 knots. This is fairly standard methodology with enough flexibility to fit most yield curves. There is some judgment involved in selecting the position of the knots, but this also gives a little flexibility to cope with any anomalies that may be present in the yield curve without changing the fundamental principles.
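
A self-contained sketch of this kind of fit is shown below. It is an illustration under stated assumptions, not the code behind `ModellingFunctions.optimize_forward_curve_spline`: the parameters are taken to be the forward rates at the knots, the spline uses natural boundary conditions, rates are continuously compounded, and the data are synthetic. The helper names `spot_from_forward_spline` and `weighted_error` are hypothetical.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.optimize import minimize

def spot_from_forward_spline(knot_values, knots, terms):
    """Spot rates implied by a cubic spline of instantaneous forward rates."""
    spline = CubicSpline(knots, knot_values, bc_type="natural")
    integral = spline.antiderivative()
    # Average forward rate over [0, t]; the spline is extrapolated outside the knots
    return (integral(terms) - integral(0.0)) / terms

def weighted_error(knot_values, knots, terms, spots, weights):
    # Residuals in basis points keep the objective well scaled for the optimiser
    resid_bp = 1e4 * (spot_from_forward_spline(knot_values, knots, terms) - spots)
    return np.sum(weights * resid_bp**2)

knots = np.array([0.25, 1.0, 11.0625, 22.5833])
terms = np.array([1 / 12, 0.25, 0.5, 1, 2, 3, 5, 7, 10, 15, 20, 25], dtype=float)
spots = np.full_like(terms, 0.04)  # synthetic flat curve, illustration only

# Synthetic amounts on issue; NaN marks instruments with no market amount
amounts = np.array([np.nan, np.nan, np.nan, 3e9, 5e9, 6e9, 2e9, 4.5e9, 7e9, 1e9, 3e9, 2e9])
weights = np.where(np.isnan(amounts), 4e9, np.minimum(amounts, 4e9))
weights = weights / weights.sum()  # normalising does not change the minimiser

res = minimize(
    weighted_error,
    x0=np.full(len(knots), 0.03),
    args=(knots, terms, spots, weights),
    method="L-BFGS-B",
    bounds=[(0.0001, 0.1)] * len(knots),
)
fitted_spots = spot_from_forward_spline(res.x, knots, terms)
print(res.x)
```

Because the implied spot rates are linear in the knot values, the objective is a convex quadratic, which is one reason a bounded quasi-Newton method such as L-BFGS-B is a reasonable choice here.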

In [0]:
# Sort by term_yr and keep only values needed
df = merged_pdf.sort_values(by='term_yr')[["term_yr", "yield_decimal", "market_bonds_m"]].rename(columns={"yield_decimal": "spot_rate_pa"})

# Rows with no market amount (T-bills and the OCR) get the default weight; bonds get the lesser of their market amount and the cap
df["weight"] = np.where(
    df["market_bonds_m"].isnull(),
    OCR_weight,
    np.minimum(df["market_bonds_m"], govtbond_cap_weight)
)

df

Solve for Knot Parameters and Spline Function

In [0]:
spline, fitted_rates, res = ModellingFunctions.optimize_forward_curve_spline(
    df = df,
    knots = knots,
    error_func = ModellingFunctions.fit_cubic_forward_curve_error,
    bounds=(0.0001, 0.1),
    method=error_optimisation_method,
    options={"maxiter": 200}
)

Output Full Forward Curve

In [0]:
# Generate dense output
max_fit_term = df["term_yr"].max()
output_terms = np.arange(0, max_fit_term + 1/12, 1/12)
fwd_rates = spline(output_terms)

df_cubic = pd.DataFrame({
    "term_yr": output_terms,
    "fwd_rate_cubic": fwd_rates
})

df_all = pd.merge_asof(df_cubic, df_bootstrapped, on="term_yr", direction="nearest")
df_all = df_all.sort_values(by="term_yr").bfill()

# Merge on market observed spot rates
df_all = df_all.merge(df[["term_yr", "spot_rate_pa"]], on="term_yr", how="left")

display(df_all)

In [0]:
# Plotting
plt.figure(figsize=(15, 6))
plt.scatter(df_all["term_yr"], df_all["spot_rate_pa"], color='blue', label='Spot Rate (pa)')
plt.step(df_all["term_yr"], df_all["fwd_rate_bootstrapped"], where='post', color='red', label='Forward Rate (bootstrapped)')
plt.plot(df_all["term_yr"], df_all["fwd_rate_cubic"], color='green', label='Forward Rate (cubic)')

plt.xlabel('Term Year')
plt.ylabel('Rate')
plt.title('Spot Rates, Bootstrapped Forward Rates and Cubic Forward Rates')
plt.legend()
plt.grid(True)
plt.show()

## Section 5 - Bridging to long-term riskfree rate

Bridging is required from the last observable market data point out to a long-term assumption. The methodology applies linear interpolation over at least 10 years from the maturity date of the last nominal Government bond, subject to a maximum slope of 0.05% pa.
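
The bridging rule can be sketched as follows. This is one possible reading, not the `ModellingFunctions.bridge_forward_curve_to_longterm` implementation: it assumes the bridge is lengthened beyond the minimum whenever a shorter bridge would exceed the slope cap. The function name `bridge_to_long_term` and the sample inputs are hypothetical.

```python
import numpy as np

def bridge_to_long_term(last_term, last_fwd, long_term_rate,
                        max_slope=0.0005, min_years=10.0, freq=0.25):
    """Linear bridge from the last fitted forward rate to the long-term rate."""
    gap = long_term_rate - last_fwd
    # Assumption: extend the bridge if the minimum length would breach the slope cap
    years = max(min_years, abs(gap) / max_slope)
    n_steps = int(np.ceil(years / freq))
    steps = np.arange(n_steps + 1)
    terms = last_term + freq * steps
    rates = last_fwd + gap * steps / n_steps
    return terms, rates

# Small gap: the 10-year minimum applies
terms_a, rates_a = bridge_to_long_term(25.0, 0.052, 0.048)

# Larger gap: the slope cap forces a longer bridge
terms_b, rates_b = bridge_to_long_term(25.0, 0.060, 0.048)
print(terms_a[-1], terms_b[-1])
```

In the first case a gap of 0.4% is closed over 10 years, giving a slope below the cap. In the second a gap of 1.2% needs roughly 24 years at the maximum slope of 0.05% pa.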

In [0]:
df_extended = ModellingFunctions.bridge_forward_curve_to_longterm(
    df_curve=df_all,
    term_col= "term_yr",
    fwd_col= "fwd_rate_cubic",
    long_term_rate=long_term_rate,
    max_slope=bridging_max_slope,
    min_extension_years=bridging_min_extension_years,
    extension_freq=bridging_extension_freq
)

# Expand to 100 years at 1/12 intervals
full_terms = np.arange(0, curve_max_extension, 1/12)

# Linearly interpolate onto the monthly grid; np.interp holds the last rate flat beyond the final term
full_fwd_rates = np.interp(full_terms, df_extended["term_yr"], df_extended["fwd_rate_extended"])

# Final output
df_extended = pd.DataFrame({
    "term_yr": full_terms,
    "fwd_rate_extended": full_fwd_rates,
})

display(df_extended)
