# Model Comparison for Dynamic Nelson-Siegel Estimation

This notebook compares estimation methods for the Dynamic Nelson-Siegel (DNS) model on monthly US yield curve data from 1972 to 2000.

In [1]:
import pandas as pd

import sys
import os

# Add the src/ directory to the Python path
sys.path.append(os.path.abspath("../src"))
# Load the data
url = 'https://www.dropbox.com/s/inpnlugzkddp42q/bonds.csv?dl=1'
df = pd.read_csv(url, sep=';', index_col=0)

# Display the first few rows of the dataset
df.head()

date,M3,M6,M9,M12,M15,M18,M21,M24,M30,M36,M48,M60,M72,M84,M96,M108,M120
1972:01,3.382,3.782,3.995,4.12,4.442,4.595,4.714,4.804,5.1,5.331,5.479,5.718,5.971,6.007,6.026,6.041,6.088
1972:02,3.47,3.831,3.983,4.291,4.391,4.443,4.626,4.769,5.04,5.246,5.443,5.665,5.897,5.96,6.028,6.082,6.283
1972:03,3.874,4.463,4.663,4.937,5.141,5.317,5.466,5.528,5.59,5.774,5.875,5.999,6.119,6.11,6.096,6.084,6.269
1972:04,3.648,4.113,4.355,4.527,4.568,4.78,4.969,5.109,5.354,5.526,5.644,5.798,5.941,5.98,6.063,6.128,6.24
1972:05,3.835,4.232,4.446,4.631,4.63,4.76,4.855,4.947,5.178,5.382,5.563,5.715,5.894,5.937,5.996,6.042,6.249


## Approach 1: Cross-Sectional DNS Parameter Estimation

In this section, we estimate the DNS parameters (βs) cross-sectionally at each date, then model their dynamics over time with a vector autoregression (VAR).
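
The two steps described above can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the `CSVAR` implementation: the function names (`ns_loadings`, `fit_betas`, `fit_var1`, `forecast_yields`) are hypothetical, the decay parameter `lam` is passed in by the caller rather than taken from the library, and the factor dynamics are assumed to follow a VAR(1) fitted by least squares.

```python
import numpy as np


def ns_loadings(maturities, lam):
    """Nelson-Siegel loadings for level, slope, and curvature.

    y(tau) = b1 + b2 * (1 - exp(-lam*tau)) / (lam*tau)
                + b3 * ((1 - exp(-lam*tau)) / (lam*tau) - exp(-lam*tau))
    """
    tau = np.asarray(maturities, dtype=float)
    x = lam * tau
    slope = (1.0 - np.exp(-x)) / x
    curvature = slope - np.exp(-x)
    return np.column_stack([np.ones_like(tau), slope, curvature])


def fit_betas(yields, maturities, lam):
    """OLS of each date's yields on the loadings; returns a (T, 3) array."""
    X = ns_loadings(maturities, lam)
    Y = np.asarray(yields, dtype=float)
    # Solve X @ B = Y.T for every date at once (one column of B per date)
    B, *_ = np.linalg.lstsq(X, Y.T, rcond=None)
    return B.T


def fit_var1(betas):
    """Least-squares VAR(1): beta_t = c + A @ beta_{t-1} + error."""
    betas = np.asarray(betas, dtype=float)
    Z = np.column_stack([np.ones(len(betas) - 1), betas[:-1]])
    coef, *_ = np.linalg.lstsq(Z, betas[1:], rcond=None)
    return coef[0], coef[1:].T  # intercept c, transition matrix A


def forecast_yields(betas, maturities, lam, steps):
    """Iterate the VAR(1) forward and map forecast betas back to yields."""
    betas = np.asarray(betas, dtype=float)
    c, A = fit_var1(betas)
    X = ns_loadings(maturities, lam)
    b = betas[-1]
    curves = []
    for _ in range(steps):
        b = c + A @ b
        curves.append(X @ b)
    return np.array(curves)  # shape (steps, number of maturities)
```

With `lam` held fixed the loadings are the same for every date, so the cross-sectional step reduces to a linear regression per date. That is why a single `lstsq` call can handle all dates at once.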

In [2]:
import sys
import os
import matplotlib.pyplot as plt

# Add the src directory to sys.path
sys.path.append(os.path.abspath("../src"))
from dnss.models.cross_sectional_var import CSVAR

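# Chronological split: first 80% of months for training, the rest for testing
train_size = int(len(df) * 0.8)
train_df = df.iloc[:train_size]
test_df = df.iloc[train_size:].copy()  # copy so the index can be reassigned below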

model = CSVAR(fix_lambda=True)
dates = pd.to_datetime(train_df.index, format='%Y:%m')
maturities = [3, 6, 9, 12, 15, 18, 21, 24, 30, 36, 48, 60, 72, 84, 96, 108, 120]

model.fit(dates=dates, maturities=maturities, data=train_df)

yield_curves = model.forecast(steps=len(test_df), return_param_estimates=False)

# Give the test set the forecast's index so the two frames line up row by row in the concat below
test_df.index = yield_curves.index

# print(yield_curves.head())

# Combine dfs
act_and_pred = pd.concat([test_df, yield_curves], axis=1)

# print(act_and_pred.head())

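# Per-maturity MSE: column i holds the actual yield,
# column i + len(maturities) holds the forecast for the same maturity
total_mse = 0
for i, maturity in enumerate(maturities):
    mse = ((act_and_pred.iloc[:, i] - act_and_pred.iloc[:, i + len(maturities)]) ** 2).mean()
    total_mse += mse
    print(f"MSE for maturity {maturity} months: {mse:.4f}")

# Average the per-maturity MSEs to get a single summary figure
total_mse /= len(maturities)
print(f"Total MSE: {total_mse:.4f}")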


[2025-04-21 15:35:35] [INFO] [dnss.models.cross_sectional_var] Starting parameter estimation with fixed lambda=0.4...
[2025-04-21 15:35:36] [INFO] [dnss.models.cross_sectional_var] Fitting VAR model...
[2025-04-21 15:35:36] [INFO] [dnss.models.cross_sectional_var] Forecasting 70 steps ahead...
MSE for maturity 3 months: 3.0248
MSE for maturity 6 months: 5.2413
MSE for maturity 9 months: 6.2631
MSE for maturity 12 months: 6.2639
MSE for maturity 15 months: 6.1345
MSE for maturity 18 months: 6.2493
MSE for maturity 21 months: 6.3357
MSE for maturity 24 months: 6.4994
MSE for maturity 30 months: 6.3869
MSE for maturity 36 months: 6.3346
MSE for maturity 48 months: 6.1805
MSE for maturity 60 months: 6.2471
MSE for maturity 72 months: 5.9663
MSE for maturity 84 months: 5.8528
MSE for maturity 96 months: 5.7506
MSE for maturity 108 months: 5.8353
MSE for maturity 120 months: 6.1185
Total MSE: 5.9226
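
The loop that produced these figures pairs actual and forecast columns by their position in the concatenated frame. A small helper can make that pairing explicit and check that the two frames have the same shape first. This is a sketch only: `mse_by_maturity` is a hypothetical name, and it assumes both frames list the maturities in the same column order.

```python
import pandas as pd


def mse_by_maturity(actual: pd.DataFrame, predicted: pd.DataFrame) -> pd.Series:
    """Mean squared error per column, pairing columns by position."""
    if actual.shape != predicted.shape:
        raise ValueError(
            f"shape mismatch: actual {actual.shape} vs predicted {predicted.shape}"
        )
    # Work on raw arrays so differing index or column labels cannot misalign rows
    sq_err = (actual.to_numpy(dtype=float) - predicted.to_numpy(dtype=float)) ** 2
    # DataFrame.mean() skips NaN, matching the behaviour of Series.mean() in the loop
    return pd.DataFrame(sq_err, columns=actual.columns).mean().rename("mse")
```

Calling `mse_by_maturity(test_df, yield_curves).mean()` would then give the same kind of summary as the "Total MSE" line, namely the average of the per-maturity errors.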


## Approach 2: Kalman Filter Estimation

In [2]:
import sys
import os
import matplotlib.pyplot as plt

# Add the src directory to sys.path
sys.path.append(os.path.abspath("../src"))
from dnss.models.kalman_filter import TEMP_KALMAN

# split df into train and test sets
train_size = int(len(df) * 0.8)
train_df = df[:train_size]
test_df = df[train_size:]
dates = pd.to_datetime(train_df.index, format='%Y:%m')
maturities = [3, 6, 9, 12, 15, 18, 21, 24, 30, 36, 48, 60, 72, 84, 96, 108, 120]

result = TEMP_KALMAN(dates=dates, maturities=maturities, data=train_df)

Shape of y: (278, 17)
Type of y: <class 'numpy.ndarray'>
[[3.382 3.782 3.995 ... 6.026 6.041 6.088]
 [3.47  3.831 3.983 ... 6.028 6.082 6.283]
 [3.874 4.463 4.663 ... 6.096 6.084 6.269]
 ...
 [5.662 6.41  6.864 ... 7.741 7.743 7.742]
 [5.932 6.304 6.566 ... 7.514 7.501 7.56 ]
 [5.868 6.074 6.231 ... 7.162 7.148 7.133]]
Initial parameters: [-6.13418189e-16 -5.11181824e-17 -3.19488640e-17  5.00000000e-01
  9.50000000e-01  9.00000000e-01  8.00000000e-01  9.00000000e-01
  4.96714153e-03  0.00000000e+00  0.00000000e+00  0.00000000e+00
 -2.34153375e-03 -2.34136957e-03  0.00000000e+00  0.00000000e+00
 -4.69474386e-03  5.42560044e-03 -4.63417693e-03  0.00000000e+00
  2.41962272e-03 -1.91328024e-02 -1.72491783e-02 -5.62287529e-03
  1.43207616e+00  1.42359627e+00  1.40387056e+00  1.36117768e+00
  1.32771065e+00  1.31300994e+00  1.30056547e+00  1.27036863e+00
  1.21921018e+00  1.20034497e+00  1.15429064e+00  1.12311978e+00
  1.10236721e+00  1.07523498e+00  1.06224636e+00  1.06037989e+00
  1.02789

## Conclusion

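This notebook has demonstrated the first approach to estimating the Dynamic Nelson-Siegel model: cross-sectional parameter estimates with VAR dynamics. On the held-out 20% of the sample, a 70-month forecast horizon, this approach gave an average MSE of 5.9226 across the 17 maturities. Further comparisons with other methods will be conducted in subsequent sections.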