# Testing YieldFactorModels Filter Functionality

This notebook tests the filter functions in the YieldFactorModels.jl package.

## 1. Setup and Import

In [1]:
# Add the package to the environment
using Pkg
Pkg.activate(".")
Pkg.instantiate()


[32m[1m  Activating[22m[39m project at `~/Library/CloudStorage/OneDrive-VrijeUniversiteitAmsterdam/surfdrive/JuliaProjects/YieldFactorModels.jl`
[92m[1mPrecompiling[22m[39m project...
   2201.2 ms[32m  ✓ [39mYieldFactorModels
  1 dependency successfully precompiled in 4 seconds. 432 already precompiled.


## 1.5 Test Performance Improvements

The model has been optimized based on profiling data:
- ✅ Pre-allocated buffers for network outputs
- ✅ Pre-computed maturity transformations  
- ✅ In-place operations in `update_factor_loadings!`
- ✅ Reduced allocations in hot loops

Expected improvements over the pre-optimization version: 15-30% faster and 50% fewer allocations.
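
The buffer and in-place items above can be illustrated with a minimal sketch. The names `loadings_alloc` and `loadings_inplace!` are hypothetical stand-ins and are not the package's `update_factor_loadings!`; the point is only to show how writing into a pre-allocated buffer removes per-call allocations.

```julia
using LinearAlgebra

# Allocating version: creates temporary arrays on every call.
loadings_alloc(W, x) = tanh.(W * x)

# In-place version: writes into the pre-allocated buffer `out`.
function loadings_inplace!(out, W, x)
    mul!(out, W, x)      # out = W * x, no temporary array
    out .= tanh.(out)    # fused broadcast, written back into `out`
    return out
end

# Measure inside a function so global-scope dispatch is not counted.
function measure_allocations(W, x, out)
    bytes_alloc   = @allocated loadings_alloc(W, x)
    bytes_inplace = @allocated loadings_inplace!(out, W, x)
    return bytes_alloc, bytes_inplace
end

W   = randn(24, 3)   # hypothetical sizes: 24 maturities, 3 factors
x   = randn(3)
out = zeros(24)

measure_allocations(W, x, out)   # warm-up call, triggers compilation
bytes_alloc, bytes_inplace = measure_allocations(W, x, out)
println("allocating: $bytes_alloc bytes, in-place: $bytes_inplace bytes")
```

Both versions return the same values; only the memory behaviour differs, which is what matters in a loop that runs once per time step.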

In [2]:
# Import the package and required dependencies
using Revise
using YieldFactorModels
using LinearAlgebra
using ForwardDiff
using Random

# NOTE: `export VAR=...` is a shell command and is not valid Julia syntax in a code cell.
# For runtime settings that can be changed from within Julia use `ENV` or library APIs.
# Set BLAS / native libraries thread knobs where possible:
ENV["OPENBLAS_NUM_THREADS"] = "1"
ENV["OMP_NUM_THREADS"] = "1"
ENV["MKL_NUM_THREADS"] = "1"  # If MKL.jl is used, prefer MKL.set_num_threads(1)
# Also set BLAS threads from Julia (affects LinearAlgebra.BLAS):
LinearAlgebra.BLAS.set_num_threads(1)

# Important: `JULIA_NUM_THREADS` controls Julia's worker threads and must be
# set before the Julia process / kernel is started. You cannot change the number
# of Julia threads from inside a running kernel. To run the kernel with 1 thread,
# start Jupyter / the kernel with the env var already set in the shell,
# for example `JULIA_NUM_THREADS=1 jupyter notebook`.

Random.seed!(123)  # For reproducibility


TaskLocalRNG()
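
To confirm that the thread settings took effect, the current values can be queried from within Julia. This is a small sketch that assumes Julia 1.6 or later, where `BLAS.get_num_threads` is available.

```julia
using LinearAlgebra

# Limit BLAS to a single thread, as in the cell above.
LinearAlgebra.BLAS.set_num_threads(1)

# Julia's own thread count is fixed at startup and can only be read here.
julia_threads = Threads.nthreads()
blas_threads  = LinearAlgebra.BLAS.get_num_threads()

println("Julia threads: ", julia_threads)
println("BLAS threads:  ", blas_threads)
```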

In [3]:
cd("..")

## 3. Test Individual Functions

### 3.1 Test `initialize_filter!`

In [4]:
println(pwd())

/Users/siccokooiker/Library/CloudStorage/OneDrive-VrijeUniversiteitAmsterdam/surfdrive/JuliaProjects


In [5]:
data, maturities = load_data("YieldFactorModels.jl/data/", "6")

([7.83403892981863 8.30247299311406 … 4.3268632601382855 4.155682521352292; 7.96778641185185 8.46780405699119 … 4.3328289043004515 4.159368653729067; … ; 10.9519681799619 11.657231239239 … 4.125693391252582 4.537209692896116; 11.371582727334 12.0250709896095 … 4.324326305225288 4.7829716342137685], [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0  …  30.0, 36.0, 48.0, 60.0, 72.0, 84.0, 96.0, 108.0, 120.0, 180.0])

### 3.2 Test `get_β_OLS!`

In [6]:
# Test the OLS estimation function
Z_test = randn(24, 3)
y_test = randn(24)
beta_test = zeros(3)
YieldFactorModels.get_β_OLS!(beta_test, Z_test, y_test)
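
The cell above only checks that the call runs. One way to check the numbers, assuming `get_β_OLS!` writes the ordinary least-squares coefficients into its first argument, is to compare against a reference solution. The sketch below builds that reference without the package; the final comparison is left commented out because it depends on that assumption.

```julia
using LinearAlgebra, Random

Random.seed!(123)
Z = randn(24, 3)
y = randn(24)

# Reference least-squares solution (backslash uses a QR factorization here).
β_ref = Z \ y

# At the OLS solution the residuals are orthogonal to the regressors.
residuals = y - Z * β_ref
orthogonality_error = norm(Z' * residuals)

# Normal-equations solution, shown for comparison only.
β_normal = (Z' * Z) \ (Z' * y)

println("orthogonality error: ", orthogonality_error)

# With the package loaded, the in-place result could then be compared:
# β = zeros(3)
# YieldFactorModels.get_β_OLS!(β, Z, y)
# @assert β ≈ β_ref
```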

### 3.3 Test Optimization


In [22]:
float_type = Float64
model, model_type = YieldFactorModels.create_model("3SSD-NNS", maturities, 24, 3, float_type, "YieldFactorModels.jl/results/thread_id__6/")
param_groups = YieldFactorModels.get_param_groups(model, String[])
all_params = YieldFactorModels.load_initial_parameters!(model, model_type, float_type)
YieldFactorModels.set_params!(model, all_params[:, 1])
# Load static parameters if applicable
all_params[:,1] = YieldFactorModels.load_static_parameters!(model, model_type, "YieldFactorModels.jl/results/", "6", all_params[:,1])
# Convert parameters to appropriate float type
all_params = convert(Matrix{float_type}, all_params)
results = predict(model, data[:, 1:end])


Optimisers.Restructure{Flux.Chain{Tuple{Flux.Dense{typeof(tanh), Matrix{Float64}, Vector{Float64}}, Flux.Dense{typeof(identity), Matrix{Float64}, Bool}}}, @NamedTuple{layers::Tuple{@NamedTuple{weight::Int64, bias::Int64, σ::Tuple{}}, @NamedTuple{weight::Int64, bias::Tuple{}, σ::Tuple{}}}}}
Default param groups assigned.


(preds = [6.836002938501072 7.0992843469451365 … 4.150331377141063 3.92760502542217; 7.929025938268129 8.189768005127775 … 4.119564537265799 3.897860697809863; … ; 10.845713973357347 11.63996258595033 … 3.858842547786029 4.22242469315709; 8.675261774378091 9.271342205410845 … 3.9259835662780045 4.3173510246318685], factors = [8.675261774378091 9.271342205410845 … 3.9259835662780045 4.3173510246318685; -1.8392588358770194 -2.1720578584657075 … 0.22434781086305824 -0.38974599920969866; 4.282504198060763 4.6735084642928735 … -0.13247547846904148 -0.18729848703345295], states = [-0.00869043760087783 -0.00869043760087783 … -0.00869043760087783 -0.00869043760087783; 0.007003267748465669 0.007003267748465669 … 0.007003267748465669 0.007003267748465669; … ; 0.3535718110369586 0.13674046637611081 … -0.2072905707676452 -0.14996731268904825; 0.04603488150938351 0.0005651201363267022 … 0.008286494568556213 0.009839559000467396], factor_loadings_1 = [1.0 1.0 … 1.0 1.0; 0.8053996830722817 0.86728523

In [24]:
# Alternative: Use BenchmarkTools for detailed timing
using BenchmarkTools

println("Benchmarking with BenchmarkTools...")
println("(This may take a minute...)\n")

benchmark_result = @benchmark YieldFactorModels.get_loss(
    $model, 
    $data, 
) samples=50 evals=3

display(benchmark_result)

println("\n" * "="^60)
println("Summary:")
println("  Minimum time: $(minimum(benchmark_result.times) / 1e9) seconds")
println("  Median time:  $(median(benchmark_result.times) / 1e9) seconds")
println("  Mean time:    $(mean(benchmark_result.times) / 1e9) seconds")
println("  Allocations:  $(benchmark_result.allocs)")
println("  Memory:       $(benchmark_result.memory / 1e6) MB")
println("="^60)

BenchmarkTools.Trial: 50 samples with 3 evaluations per sample.
 Range (min … max):  10.691 ms …  16.269 ms  ┊ GC (min … max):  0.00% … 31.17%
 Time  (median):     12.158 ms               ┊ GC (median):    18.48%
 Time  (mean ± σ):   12.237 ms ± 794.305 μs  ┊ GC (mean ± σ):  18.71% ±  4.03%

 [timing histogram truncated in the saved output]

Benchmarking with BenchmarkTools...
(This may take a minute...)


Summary:
  Minimum time: 0.010690569333333334 seconds
  Median time:  0.0121581945 seconds
  Mean time:    0.0122366436 seconds
  Allocations:  73302
  Memory:       38.949712 MB


### 3.4 Test Handling Missing Data

In [None]:
# Assumes the working directory is the parent of YieldFactorModels.jl (set by `cd("..")` above)

YieldFactorModels.run("6", 231, 12, false, "3SSD-NNS", Float64; window_type = "expanding",  max_group_iters=10, run_optimization=true, reestimate=false )
# vcat(fill("1", 22), fill("2", 12) )

Optimisers.Restructure{Flux.Chain{Tuple{Flux.Dense{typeof(tanh), Matrix{Float64}, Vector{Float64}}, Flux.Dense{typeof(identity), Matrix{Float64}, Bool}}}, @NamedTuple{layers::Tuple{@NamedTuple{weight::Int64, bias::Int64, σ::Tuple{}}, @NamedTuple{weight::Int64, bias::Tuple{}, σ::Tuple{}}}}}
Default param groups assigned.
The param groups are : ["1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2"]
✓ Found valid initial parameters after 0 perturbations

└ @ YieldFactorModels /Users/siccokooiker/Library/CloudStorage/OneDrive-VrijeUniversiteitAmsterdam/surfdrive/JuliaProjects/YieldFactorModels.jl/src/YieldFactorModels.jl:193




Starting block-coordinate optimization

--- Starting point 1/1 ---
Iter     Function value    √(Σ(yᵢ-ȳ)²)/n 
------   --------------    --------------
     0     1.737757e-01     3.038141e-03
 * time: 0.0001251697540283203
    10     1.737757e-01     6.713232e-04
 * time: 2.6737160682678223
    20     1.737757e-01     5.643142e-04
 * time: 5.680793046951294
    30     1.723667e-01     6.788667e-04
 * time: 7.72992205619812
    40     1.723667e-01     4.888665e-04
 * time: 10.412940979003906
    50     1.723667e-01     2.064324e-04
 * time: 11.15041995048523
    60     1.723667e-01     1.964524e-04
 * time: 11.910125970840454
    70     1.723667e-01     1.990757e-04
 * time: 12.330580949783325
    80     1.723667e-01     2.144394e-04
 * time: 12.397964000701904
    90     1.723667e-01     2.164436e-04
 * time: 12.457633972167969
   100     1.723667e-01     1.943745e-04
 * time: 12.52399206161499
   110     1.723667e-01     1.533252e-04
 * time: 12.587254047393799
   120     1.719992e

## 4. Test `get_mse` Function

Test the mean squared error calculation over the full dataset.
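
As a point of reference for such a test, here is a minimal sketch of a mean squared error over an N × T panel that skips missing observations. The helper `mse_skip_nan` is hypothetical, and encoding missing values as `NaN` is an assumption; the package's `get_mse` may work differently.

```julia
using Statistics

# Hypothetical helper: MSE over all entries where both inputs are observed.
function mse_skip_nan(actual::AbstractMatrix, predicted::AbstractMatrix)
    size(actual) == size(predicted) || throw(DimensionMismatch("shapes differ"))
    sq_err = (actual .- predicted) .^ 2
    valid = .!isnan.(sq_err)      # NaN marks a missing observation (assumption)
    return mean(sq_err[valid])
end

# Tiny 2 × 3 example with one missing yield.
actual    = [1.0 2.0 3.0; 4.0 NaN 6.0]
predicted = [1.5 2.0 2.0; 4.0 5.0 6.5]

mse_value = mse_skip_nan(actual, predicted)
println("MSE over observed entries: ", mse_value)
```

The squared errors on the five observed entries are 0.25, 0, 1, 0 and 0.25, so the result is 1.5 / 5 = 0.3.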

In [11]:
using CairoMakie
using DelimitedFiles

InterruptException: InterruptException:

In [12]:
model_name = "SD-NS"
# Project root; adjust to your local checkout
project_dir = "/Users/siccokooiker/Library/CloudStorage/OneDrive-VrijeUniversiteitAmsterdam/surfdrive/JuliaProjects/YieldFactorModels.jl"
# Filtered out-of-sample factors written by the model run
filtered_data = readdlm(joinpath(project_dir, "results", model_name, "$(model_name)__thread_id__6__factors_filtered_outofsample.csv"), ',')
# Observed yields, transposed so that rows are time periods and columns are maturities
data = readdlm(joinpath(project_dir, "data", "thread_id__6__data.csv"), ',')
data = data'
# Print the shapes as a sanity check
println(size(data))
println(size(filtered_data))

# plot first 3 factors over time using CairoMakie

f = Figure(resolution = (900, 500))
ax = Axis(f[1, 1], xlabel = "Time", ylabel = "Value", title = "Factors and Yield Curve Components")
lines!(ax, filtered_data[:, 1], label = "Factor 1")
lines!(ax, filtered_data[:, 2], label = "Factor 2")
lines!(ax, filtered_data[:, 3], label = "Factor 3")

# plot level, slope and curvature from data
# level is last column
lines!(ax, data[1:end-1, end], label = "Level")
# slope is difference between first and last column
lines!(ax, data[1:end-1, 1] .- data[1:end-1, end], label = "Slope")
# curvature: 2 x 14th column - (1st + last)
lines!(ax, 2 .* data[1:end-1, 14] .- (data[1:end-1, 1] .+ data[1:end-1, end]), label = "Curvature")

axislegend(ax, position = :rb)
f

UndefVarError: UndefVarError: `readdlm` not defined in `Main`
Suggestion: check for spelling errors or missing imports.
Hint: a global variable of this name may be made accessible by importing DelimitedFiles in the current active module Main

In [13]:


fac_path = "/Users/siccokooiker/Library/CloudStorage/OneDrive-VrijeUniversiteitAmsterdam/surfdrive/JuliaProjects/YieldFactorModels.jl/results/$(model_name)/$(model_name)__thread_id__6__factor_loadings_2_filtered_outofsample.csv"
mat_path = "/Users/siccokooiker/Library/CloudStorage/OneDrive-VrijeUniversiteitAmsterdam/surfdrive/JuliaProjects/YieldFactorModels.jl/data/thread_id__6__maturities.csv"

Z = readdlm(fac_path, ',', Float64)      # (T × M)
Y = vec(readdlm(mat_path, ',', Float64)) # (M)
T, M = size(Z)
t = 1:T

# --- Create wireframe ---
fig = Figure(resolution = (900, 600))
ax = Axis3(fig[1, 1];
    xlabel="Time",
    ylabel="Maturity",
    zlabel="Loading",
    aspect=(2.0, 1.0, 1.0),  # Make time axis 2× as long as Y-axis
    elevation=25 * π / 180,
    azimuth=45 * π / 180,
    xreversed=true,
    backgroundcolor=:gray20,
    xgridcolor=(:gray60, 0.5),
    ygridcolor=(:gray60, 0.5),
    zgridcolor=(:gray60, 0.5),
    xticklabelsize=10,
    yticklabelsize=10,
    zticklabelsize=10,
    xlabelsize=12,
    ylabelsize=12,
    zlabelsize=12,

)

# draw lines for each maturity
for m in 1:M
    lines!(ax, t, fill(Y[m], T), Z[:, m], color=:cyan)
end
fig

UndefVarError: UndefVarError: `readdlm` not defined in `Main`
Suggestion: check for spelling errors or missing imports.
Hint: a global variable of this name may be made accessible by importing DelimitedFiles in the current active module Main

In [14]:
# Plot the fitted yields (filtered, out-of-sample) for the selected model
fit_data = readdlm("/Users/siccokooiker/Library/CloudStorage/OneDrive-VrijeUniversiteitAmsterdam/surfdrive/JuliaProjects/YieldFactorModels.jl/results/$(model_name)/$(model_name)__thread_id__6__fit_filtered_outofsample.csv", ',')

# size is (480, 24)
println(size(fit_data))

# plot all 24 maturities as separate lines over time
f2 = Figure(resolution = (900, 500))
ax2 = Axis(f2[1, 1], xlabel = "Time", ylabel = "Yield", title = "Yield Curve Fits Over Time")
for i in 1:size(fit_data, 2)
    lines!(ax2, 1:size(fit_data, 1), fit_data[:, i])
end
f2


UndefVarError: UndefVarError: `readdlm` not defined in `Main`
Suggestion: check for spelling errors or missing imports.
Hint: a global variable of this name may be made accessible by importing DelimitedFiles in the current active module Main

## 5. Test `predict` Function

Test the full prediction over all time periods.

In [15]:
# Test predict function
println("Testing predict function...")
println("In the run above, predict returned a NamedTuple with fields including:")
println("  - preds   (N × T matrix of predictions)")
println("  - factors (M × T matrix)")
println("  - states  (L × T matrix)")
println("  - factor_loadings_1")

# Note: Call with your actual model
# results = predict(model, data)
# println("\nFields: ", keys(results))
# println("Size of predictions: ", size(results.preds))

## 6. Visualization (Optional)

Visualize the predictions vs actual data.

In [None]:
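# Using Plots.jl for visualization. `import` plus qualified calls avoids
# name clashes with CairoMakie, which also exports `plot`.
import Plots

# Plot actual yield data (assumes `data` is N × T: yields in rows, time in columns)
n_periods = size(data, 2)
Plots.plot(1:n_periods, data[1, :], label="Yield 1 (actual)",
     xlabel="Time", ylabel="Yield",
     title="Yield Curve Data", linewidth=2)
Plots.plot!(1:n_periods, data[3, :], label="Yield 3 (actual)", linewidth=2)
Plots.plot!(1:n_periods, data[5, :], label="Yield 5 (actual)", linewidth=2)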

In [None]:
# After running predict, you can plot predictions vs actuals
# (predict returned a NamedTuple in the run above, so use the `preds` field)
# predictions = results.preds

# plot(1:T, data[1, :], label="Actual Yield 1", linewidth=2)
# plot!(1:T, predictions[1, :], label="Predicted Yield 1", 
#       linestyle=:dash, linewidth=2)
# title!("Actual vs Predicted Yields")

## 7. Performance and Diagnostics

In [None]:
# Calculate prediction errors (requires `using Statistics` for `mean`)
# residuals = data .- predictions
# mse_per_yield = mean(residuals.^2, dims=2)

# println("MSE per yield:")
# for i in axes(mse_per_yield, 1)
#     println("  Yield $i: ", mse_per_yield[i])
# end

In [None]:
# Plot factor evolution
# (predict returned a NamedTuple in the run above, so use the `factors` field)
# factors = results.factors

# plot(1:T, factors[1, :], label="Factor 1 (Level)", linewidth=2)
# plot!(1:T, factors[2, :], label="Factor 2 (Slope)", linewidth=2)
# plot!(1:T, factors[3, :], label="Factor 3 (Curvature)", linewidth=2)
# title!("Factor Evolution Over Time")
# xlabel!("Time")
# ylabel!("Factor Value")

## 8. Summary

This notebook demonstrates testing of the filter functionality. To fully run it, you'll need to:

1. Ensure your model type implements `AbstractYieldFactorModel`
2. Create a concrete model instance with appropriate parameters
3. Uncomment and run the actual function calls
4. Verify that all functions work correctly with your model structure
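
As a rough illustration of steps 1 and 2, the sketch below defines a toy static Nelson-Siegel model. Everything in it is hypothetical: `AbstractToyModel` stands in for the package's `AbstractYieldFactorModel`, and `ToyNelsonSiegel`, `loadings` and `fitted_yields` are illustrative names, not the interface the package requires. The decay value 0.0609 is a commonly used choice for maturities in months.

```julia
# Stand-in for the package's abstract type (hypothetical).
abstract type AbstractToyModel end

# Toy static Nelson-Siegel model: N maturities, M = 3 factors.
struct ToyNelsonSiegel{T<:Real} <: AbstractToyModel
    maturities::Vector{T}
    λ::T
end

# N × 3 loading matrix with columns level, slope, curvature.
function loadings(m::ToyNelsonSiegel)
    x = m.λ .* m.maturities
    slope = (1 .- exp.(-x)) ./ x
    curvature = slope .- exp.(-x)
    return hcat(ones(length(x)), slope, curvature)
end

# Fitted yields (N × T) for a 3 × T factor matrix.
fitted_yields(m::ToyNelsonSiegel, factors::AbstractMatrix) = loadings(m) * factors

toy_model = ToyNelsonSiegel([3.0, 12.0, 60.0, 120.0], 0.0609)
Λ = loadings(toy_model)
toy_fit = fitted_yields(toy_model, randn(3, 5))
println("loadings: ", size(Λ), ", fitted yields: ", size(toy_fit))
```

A real model would additionally need whatever methods the package's filter functions call on it, which this sketch does not attempt to reproduce.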