# GOVERNMENT BOND YIELD BIVARIATE ANALYSIS AND YIELD CURVE ANALYSIS
MODULE 1 | LESSON 4


---
|  |  |
|:---|:---|
|**Reading Time** 60 minutes |   |
|**Prior Knowledge** U.S. Treasury Bonds, Yield Curve, Linear Algebra, Basic Python |   |
|**Keywords** Bond price-yield curve, Risk free interest rate, Nelson Siegel, Cubic Spline|  |

---

*In the previous lesson, we introduced the U.S. Treasury price-yield curve and discussed the shape of the curve. We also introduced methods to fit U.S. Treasury curves using polynomial fitting techniques. In this lesson, we will continue to learn new analytical methods to understand U.S. Treasury yields. First, we will explore bivariate relationships between different yields, including covariance and correlation. Then, we will learn a new technique to extract key features of the U.S. Treasury yield curve to analyze the yield curve's behavior.*

## **1. Bivariate Analysis**
**Bivariate analysis** is a collection of methods to analyze the relationship between two variables. In finance, we are particularly interested in understanding how two variables interact with each other. In this lesson, we'll look at the yields of different U.S. Treasury bonds and how they move in relation to each other. The first bivariate analysis tool we will learn is covariance.
<br>
<br>
### **1.1 Covariance**
**Covariance** is a metric that measures how two variables move together. Specifically, it is the expected product of the two variables' deviations from their means. Here is the covariance formula:
<br>
$$Cov(X,Y)=E[(X-E[X])(Y-E[Y])]$$
<br>
Here are some of the properties of covariance:
1. If the covariance has a positive sign, it means the two variables move in the same direction.
2. If the covariance has a negative sign, the two variables move in opposite directions.
3. If the covariance is 0, the two variables are linearly uncorrelated.
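
The formula and sign properties above can be checked on a small example. The sketch below is illustrative only: the two yield series are made-up numbers, and `sample_covariance` is a hypothetical helper that uses the $n-1$ divisor (the same convention as `np.cov` and pandas' `.cov()`).

```python
import numpy as np

# Hypothetical daily yields (in percent) for two maturities; the values are made up.
x = np.array([4.10, 4.15, 4.05, 4.20, 4.25])
y = np.array([4.40, 4.42, 4.38, 4.47, 4.50])


def sample_covariance(a, b):
    """Sum of products of deviations from the mean, divided by n - 1."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.sum((a - a.mean()) * (b - b.mean())) / (len(a) - 1)


cov_xy = sample_covariance(x, y)         # positive: the two series move together
cov_x_neg_y = sample_covariance(x, -y)   # flipping one series flips the sign
cov_numpy = np.cov(x, y)[0, 1]           # NumPy's estimate, for comparison

print(f"Cov(x, y)  = {cov_xy:.6f}")
print(f"Cov(x, -y) = {cov_x_neg_y:.6f}")
print(f"np.cov     = {cov_numpy:.6f}")
```

The hand-written estimate and `np.cov` should agree, since both divide by $n-1$ by default.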

When the two variables are measured in the same units, the higher the absolute value of the covariance, the stronger the (positive or negative) co-movement of the two variables tends to be.
<br>

Covariance is a bivariate measure for two variables. However, when we have several variables and want to know their pairwise relationships, we need a covariance matrix. A **covariance matrix** is a square matrix that represents the pairwise covariances between multiple variables in a dataset. Now, let's pull U.S. Treasury yield data and investigate their covariance matrix.

In [1]:
import pandas as pd
from fredapi import Fred
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()

In [2]:
# Initialize the FRED API with your key
fred = Fred(api_key="YOUR_API_KEY")  # Replace "YOUR_API_KEY" with your own FRED API key

# List of Treasury yield series IDs
series_ids = ['DGS1MO', 'DGS3MO', 'DGS6MO', 'DGS1', 'DGS2', 'DGS3', 'DGS5', \
              'DGS7', 'DGS10', 'DGS20', 'DGS30']

# Function to get data for a single series
def get_yield_data(series_id):
    data = fred.get_series(series_id, observation_start="1975-01-01", observation_end="2024-05-03")
    return data

# Get data for all series
yields_dict = {series_id: get_yield_data(series_id) for series_id in series_ids}

# Combine into a single DataFrame
yields = pd.DataFrame(yields_dict)

# Rename columns for clarity
yields.columns = ['1 Month', '3 Month', '6 Month', '1 Year', '2 Year', '3 Year', '5 Year', \
                  '7 Year', '10 Year', '20 Year', '30 Year']

In [3]:
# Make datetime as the index
yields.index = pd.to_datetime(yields.index)

In [4]:
# Drop NaN in the dataset
yields = yields.dropna()

In [None]:
# Calculate covariance matrix for US Treasury yields in the dataset
covariance_matrix = yields.cov()
print("Covariance Matrix:")
print(covariance_matrix)

From the covariance matrix above, we can see that all pairs of yields have positive covariances. This means that when the yield at one maturity increases, the yields at the other maturities tend to increase as well. We can visualize the covariance matrix with a heatmap in Python.

In [None]:
#Make a heatmap for covariance matrix
plt.figure(figsize=(8, 6))
sns.heatmap(covariance_matrix, annot=True, cmap='coolwarm', fmt=".1f")
plt.title('Covariance Heat Map of Treasury Bond Yields')
plt.show()

The heatmap above makes the differences among the covariances easier to read. However, one downside of covariance is that its value changes when the scales of the two variables change. We therefore cannot simply compare covariances across different pairs of variables to evaluate how closely they move together. Because of this issue, we use correlation more often in data analysis.
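
To make the scale issue concrete, the sketch below re-expresses two yield series in basis points instead of percent. The series are synthetic (generated from a shared random factor), so the numbers are assumptions rather than market data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "yields" in percent, driven by one shared factor plus noise.
common = rng.normal(size=500)
y2 = 3.0 + 0.8 * common + 0.2 * rng.normal(size=500)
y10 = 4.0 + 0.5 * common + 0.2 * rng.normal(size=500)

# Covariance and correlation with yields quoted in percent
cov_pct = np.cov(y2, y10)[0, 1]
corr_pct = np.corrcoef(y2, y10)[0, 1]

# The same series quoted in basis points (1 percent = 100 basis points)
cov_bp = np.cov(y2 * 100, y10 * 100)[0, 1]
corr_bp = np.corrcoef(y2 * 100, y10 * 100)[0, 1]

print(f"Covariance:  percent = {cov_pct:.4f}, basis points = {cov_bp:.1f}")
print(f"Correlation: percent = {corr_pct:.4f}, basis points = {corr_bp:.4f}")
```

Multiplying both series by 100 multiplies the covariance by 10,000, while the correlation is unaffected by the change of units.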
<br>
<br>
### **1.2 Correlation**
**Correlation** is also a metric to measure the co-movement of two variables. However, it eliminates the scale issue mentioned above by dividing the covariance by the square root of the product of the two variables' variances (equivalently, the product of their standard deviations). Here is the correlation formula:
<br>
<br>
$$Corr(X,Y)=\frac{Cov(X,Y)}{\sqrt{Var(X)*Var(Y)}}$$<br>
where $Cov(X,Y)$ is the covariance of $X$ and $Y$, $Var(X)$ and $Var(Y)$ are variances of $X$ and $Y$.
<br>
<br>
Here are some properties of correlation:

1. Unlike covariance, the value of correlation is limited between −1 and 1.
2. If the correlation of two variables is greater than 0, the two variables are positively correlated.
3. If the correlation of two variables is less than 0, the two variables are negatively correlated.
4. If two variables are perfectly positively correlated, the correlation will be 1.
5. If two variables are perfectly negatively correlated, the correlation will be −1.
6. If the correlation is 0, the two variables are linearly uncorrelated.
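
The properties above can be illustrated with a few constructed series. This is a minimal sketch: `correlation` is a hypothetical helper that applies the formula directly, and the inputs are chosen by hand to hit the boundary cases. The last case is a reminder that zero correlation only rules out a *linear* relationship.

```python
import numpy as np


def correlation(a, b):
    """Covariance divided by the square root of the product of the variances."""
    cov = np.cov(a, b)[0, 1]
    return cov / np.sqrt(np.var(a, ddof=1) * np.var(b, ddof=1))


x = np.linspace(-1.0, 1.0, 201)  # symmetric around zero

corr_up = correlation(x, 2.0 * x + 1.0)     # increasing linear function of x
corr_down = correlation(x, -3.0 * x + 0.5)  # decreasing linear function of x
corr_square = correlation(x, x ** 2)        # fully determined by x, but not linearly

print(f"y = 2x + 1    -> {corr_up:.4f}")
print(f"y = -3x + 0.5 -> {corr_down:.4f}")
print(f"y = x^2       -> {corr_square:.4f}")
```

Here `x ** 2` depends entirely on `x`, yet its correlation with `x` is essentially zero because the relationship is symmetric rather than linear.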

Like with a covariance matrix, if we have several variables, we can use a **correlation matrix** to present correlation metrics for pairwise variables. Let's calculate the correlation matrix for our U.S. Treasury yield data.

In [None]:
# Calculate correlation matrix for US Treasury yields in the dataset
correlation_matrix = yields.corr()
print("Correlation Matrix:")
print(correlation_matrix)

Next, let's make a heatmap for the correlation matrix to better read the result.

In [None]:
# Make a heatmap for correlation matrix
plt.figure(figsize=(8, 6))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('Correlation Heat Map of Treasury Bond Yields')
plt.show()

From the correlation matrix heatmap above, we can compare correlations among different pairs of yields. First, we can see that the correlations on the diagonal are all 1. This is easy to understand because every yield moves perfectly in line with itself. Second, yields with close maturities have higher correlations. As the difference between the maturities of two yields widens, their correlation gets lower.
<br>
<br>
Because it is easy to interpret, correlation is one of the most commonly used measurements in bivariate analysis. Understanding the correlation between different Treasury bond yields is crucial for risk management, portfolio construction, and understanding market dynamics. It helps quantify how different parts of the yield curve move in relation to each other, which is essential for strategies like yield curve trades and immunization in fixed-income portfolios.

## **2. Feature Extractions and Principal Component Analysis**
In this section, we are going to introduce a method to extract key features from a dataset. It is called **principal component analysis (PCA)**. Oftentimes, we try to discover key common factors or features that can explain the movement of all variables in a dataset. If we can find key common factors that represent most of the variation in the dataset, we won't need to use the whole dataset for analysis. Instead, we can focus on analyzing these key factors. This is a data dimension reduction technique. It matters in practice because a smaller dataset lets many algorithms run more efficiently.
<br>
In this section, we will continue to use the U.S. Treasury yield dataset to demonstrate this technique. We will show a step-by-step process for conducting PCA. Our first step is to standardize the scales of variables in our dataset.
<br>
<br>
### **2.1 Standardizing Variables**
**Standardizing (or normalizing)** a variable is a common statistical technique used to transform a variable into a standard scale. It converts a variable so that it has a mean of 0 and a standard deviation of 1. Here is the formula to standardize a variable:

$$Z=\frac{X-\mu}{\sigma}$$

Where:
<br>
$Z$ = standardized variable with mean = 0 and standard deviation = 1
<br>
$X$ = original variable
<br>
$\mu$ = mean of the original variable
<br>
$\sigma$ = standard deviation of the original variable

Here are the benefits of standardizing variables:
1. Making all variables comparable on the same scale
2. Facilitating calculation of the other analyses requiring same-scale variables
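
As a quick check of the formula, the sketch below standardizes a small synthetic dataset and confirms that each column ends up with mean 0 and standard deviation 1. The data and the `standardize` helper are illustrative assumptions; `ddof=1` is used so that the result matches pandas' default `.std()`.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic dataset: three columns with different means and scales.
raw = rng.normal(loc=[1.0, 3.0, 5.0], scale=[0.5, 1.5, 3.0], size=(250, 3))


def standardize(data):
    """Subtract each column's mean and divide by its sample standard deviation."""
    mean = data.mean(axis=0)
    std = data.std(axis=0, ddof=1)  # ddof=1 matches pandas' default
    return (data - mean) / std


z = standardize(raw)
z_means = z.mean(axis=0)
z_stds = z.std(axis=0, ddof=1)

print("Column means after standardizing:", np.round(z_means, 6))
print("Column stds after standardizing: ", np.round(z_stds, 6))
```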

Now let's standardize the U.S. Treasury yield dataset. We first need to calculate the mean and standard deviation of each yield. Then, we will apply the above formula to our dataset to create a standardized (normalized) dataset.



In [None]:
#Calculate means for all yields in the dataset
yield_means = yields.mean()
print("Yield Means:")
print(yield_means)


In [None]:
#Calculate standard deviations for all yields in the dataset
yield_stds = yields.std()
print("Yield Standard Deviations:")
print(yield_stds)

In [None]:
# Now create a standardized US Treasury yield dataset
standardized_data = (yields - yield_means) / yield_stds
print("Standardized Yield (first 5 rows):")
print(standardized_data.head())


Usually, preparing a standardized dataset is part of the data preparation step before running the analysis. Make sure to standardize your dataset if the analysis you are going to run requires variables to be on the same scale.
<br>
Before moving on to the next step of the PCA, we need to learn new tools from linear algebra. In the next two sections, we will introduce eigenvectors and eigenvalues.
<br>
### **2.2 Eigenvectors and Eigenvalues**
In this section, we will learn the definitions of eigenvectors and eigenvalues and how to calculate them. Eigenanalysis is a very important linear algebra method in finance. Please review the required reading for this lesson to understand this topic. In the next section, we will use visualization to enhance our understanding of eigenvectors and eigenvalues.
<br>
### **2.3 Visualization of Eigenvectors and Eigenvalues**
In this section, we will use graphs to explain the concepts of eigenvectors and eigenvalues.
<br>

Vectors in a two-dimensional coordinate system are described by their magnitude and direction. A linear transformation occurs when we multiply a vector by a matrix. The transformation can change both the vector's magnitude and direction.
However, certain special vectors keep their direction, or just switch to the opposite direction, under a linear transformation. Only their magnitude changes. These special vectors are called eigenvectors. Let's first revisit the equation relating eigenvectors and eigenvalues from the last section:
<br>
$$Ax=\lambda x$$
<br>
where:
<br>
$A$ is the linear transformation matrix
<br>
$x$ is an eigenvector
<br>
$\lambda$ is the eigenvalue, the scalar by which the eigenvector's magnitude changes
<br>
<br>
Let's first look at the left side of the equation. It corresponds to the linear transformation of a vector we described above. On the right side, the same vector is multiplied by a scalar. When the two sides are equal, the linear transformation of the vector amounts to nothing more than a change in the vector's magnitude. Based on what we described above, the vector in this equation is therefore an eigenvector. Figure 1 below demonstrates this concept visually.
<br>
<br>
**Figure 1: Visual Demonstration of Eigenvectors and Eigenvalues**

![Graph showing a linear transformation of an (x, y) system with a circle transformed into an ellipse, highlighting eigenvector v_1 and its scaled version \lambda v_1](images/FD_M1_L4_fig1.jpg)
<br>
<br>


From the above figure, we can see the purple eigenvector preserves its direction after the linear transformation, which is shown by the orange vector. The transformation merely stretches or compresses the eigenvector, without changing its direction.
<br>
<br>
The eigenvalue associated with an eigenvector is the number by which the eigenvector is scaled during the transformation.
   - If the eigenvalue is positive, the eigenvector keeps its original direction (stretched if the eigenvalue is greater than 1, compressed if it is between 0 and 1).
   - If the eigenvalue is negative, the eigenvector is flipped to the opposite direction and scaled by the absolute value of the eigenvalue.
   - If the eigenvalue is 1, the eigenvector remains unchanged.
   - If the eigenvalue is 0, the eigenvector is collapsed to a point.

When representing eigenvalues as vector lengths, the length of each eigenvector arrow can be scaled to represent its corresponding eigenvalue.
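
The relationship $Ax=\lambda x$ and the cases listed above can be verified numerically. The sketch below uses three small symmetric matrices chosen by hand so that they have positive, negative, unit, and zero eigenvalues. `eigenpairs` is a hypothetical helper built on `np.linalg.eigh`, which is intended for symmetric matrices and returns eigenvalues in ascending order.

```python
import numpy as np


def eigenpairs(matrix):
    """Eigenvalues (descending) and matching eigenvectors (as columns) of a symmetric matrix."""
    values, vectors = np.linalg.eigh(matrix)
    order = np.argsort(values)[::-1]
    return values[order], vectors[:, order]


matrices = {
    "A": np.array([[2.0, 1.0], [1.0, 2.0]]),  # eigenvalues 3 and 1
    "B": np.array([[1.0, 2.0], [2.0, 1.0]]),  # eigenvalues 3 and -1
    "C": np.array([[1.0, 1.0], [1.0, 1.0]]),  # eigenvalues 2 and 0
}

# Largest gap between A @ v and lambda * v over all eigenpairs of each matrix
residuals = {}
for name, M in matrices.items():
    values, vectors = eigenpairs(M)
    gaps = [np.max(np.abs(M @ v - lam * v)) for lam, v in zip(values, vectors.T)]
    residuals[name] = max(gaps)
    print(name, "eigenvalues:", np.round(values, 6), "max residual:", residuals[name])
```

The sign of each returned eigenvector is arbitrary: if $x$ satisfies $Ax=\lambda x$, so does $-x$.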

### **2.4 Derive Principal Components**
The next step of PCA is to find the covariance matrix of the standardized dataset. For a standardized dataset, its covariance matrix will be the same as the correlation matrix of the pre-standardized dataset. In the following Python code, we first import the necessary packages for matrix manipulation and for eigenvector/eigenvalue calculation. Then, we will get the covariance matrix for the standardized data and draw a heatmap for the covariance matrix.
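
The claim that the covariance matrix of standardized data equals the correlation matrix of the original data can be checked directly. The sketch below does so on synthetic data; the three columns and their coefficients are assumptions chosen only to give the columns different scales.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic dataset: three columns on very different scales, sharing one factor.
base = rng.normal(size=(300, 1))
data = np.hstack([
    5.0 + 2.0 * base + rng.normal(scale=0.5, size=(300, 1)),
    1.0 + 0.3 * base + rng.normal(scale=0.1, size=(300, 1)),
    3.0 - 1.0 * base + rng.normal(scale=1.0, size=(300, 1)),
])

# Standardize each column (ddof=1 to match the covariance estimator)
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

cov_of_standardized = np.cov(z, rowvar=False)
corr_of_original = np.corrcoef(data, rowvar=False)
max_gap = np.max(np.abs(cov_of_standardized - corr_of_original))

print("Largest absolute difference between the two matrices:", max_gap)
```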

In [None]:
import numpy as np
from numpy import linalg as LA

In [None]:
# Calculate covariance matrix of the standardized dataset
std_data_cov = standardized_data.cov()

In [None]:
# Draw a heatmap of the covariance matrix
plt.figure(figsize=(8, 6))
sns.heatmap(std_data_cov, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('Covariance Heat Map of Standardized Treasury Bond Yields')
plt.show()

Now let's calculate the eigenvectors and eigenvalues of the covariance matrix of the standardized yield dataset.

In [None]:
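# Calculate eigenvalues and eigenvectors of the covariance matrix of the standardized yield dataset
eigenvalues, eigenvectors = LA.eig(std_data_cov)
# LA.eig does not guarantee any ordering, so sort explicitly by descending eigenvalue
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
eigenvalues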

In [None]:
eigenvectors

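From the above output, we can see that the eigenvalues are in descending order and that each column of the eigenvector matrix corresponds to the eigenvalue in the same position. Note that `LA.eig` does not guarantee this ordering in general, so it is worth checking or sorting explicitly. In PCA, we also call eigenvectors **loadings**. We will see why this ordering is important later.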
<br>

We can view this collection of eigenvectors as a linear transformation matrix applied to the standardized dataset. The transformed data will have a very interesting feature that we will introduce soon. Let's transform the standardized data with the eigenvectors first.


In [None]:
# Transform standardized data with Loadings
principal_components = standardized_data.dot(eigenvectors)
principal_components.columns = ["PC_1","PC_2","PC_3","PC_4","PC_5","PC_6","PC_7","PC_8","PC_9","PC_10","PC_11"]
principal_components.head()

The above result shows 11 transformed variables. In PCA, they are called **principal components**. Remember we mentioned earlier that the eigenvalues from PCA are in descending order. These principal components are also presented in the same order as their corresponding eigenvalues. For example, PC_1 corresponds to the first eigenvalue, which is 9.22. PC_2 corresponds to the second highest eigenvalue, which is 1.63.
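
Two properties of principal components are worth checking numerically: the components are uncorrelated with each other, and the variance of each component equals its eigenvalue. The sketch below verifies both on a small synthetic dataset (four variables driven by one common factor, an assumption made purely for illustration). It uses `np.linalg.eigh`, which is designed for symmetric matrices, and sorts the result in descending order.

```python
import numpy as np

rng = np.random.default_rng(3)
n_obs, n_vars = 400, 4

# Synthetic dataset: one common factor plus idiosyncratic noise.
factor = rng.normal(size=(n_obs, 1))
data = factor @ np.array([[1.0, 0.9, 0.8, 0.7]]) + 0.3 * rng.normal(size=(n_obs, n_vars))
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

# Eigen-decomposition of the covariance matrix, sorted by descending eigenvalue
cov = np.cov(z, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh returns ascending order
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Principal components (scores) and their covariance matrix
scores = z @ eigenvectors
score_cov = np.cov(scores, rowvar=False)
explained = eigenvalues / eigenvalues.sum()

print("Eigenvalues:         ", np.round(eigenvalues, 4))
print("Variances of the PCs:", np.round(np.diag(score_cov), 4))
print("Explained proportion:", np.round(explained, 4))
```

For standardized data, the eigenvalues sum to the number of variables, because each standardized variable contributes a variance of 1.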
<br>

The most important feature of PCA is that **the leading principal components explain higher portions of the variance of the dataset than the rest of the principal components**. For example, PC_1 explains more variance in the standardized data than PC_2. But by how much? The eigenvalue corresponding to PC_1 is the amount of the data's total variance explained by PC_1. It is 9.22 in this case. By the same logic, PC_2 explains 1.63 of the total variance of the dataset. We can sum up all the eigenvalues to get the total variance of the data. From the total variance, we can also calculate the percentage of variance that each principal component captures. Here is the Python code.


In [None]:
# Put data into a DataFrame
df_eigval = pd.DataFrame({"Eigenvalues":eigenvalues}, index=range(1,12))

# Work out explained proportion
df_eigval["Explained proportion"] = df_eigval["Eigenvalues"] / np.sum(df_eigval["Eigenvalues"])
#Format as percentage
df_eigval.style.format({"Explained proportion": "{:.2%}"})

From the above table, we can see that PC_1 explains almost 84% of the variance in the standardized data and PC_2 explains almost 15%. Together, the first two principal components explain almost 99% of the variance in the dataset. The remaining 9 principal components explain only about 1% of the variance. Hence, to make data analysis more efficient without losing too much information, we can use just the first two principal components instead of all 11. Because of this feature, we also call PCA a data-dimension reduction technique.
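
One way to see what "without losing too much information" means is to rebuild the standardized data from only the first $k$ principal components and measure what is lost. The sketch below does this on synthetic data with two dominant factors; the factor structure and the helpers `reconstruct` and `relative_error` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n_obs, n_vars = 500, 5

# Synthetic dataset: a "level" factor, a smaller "slope" factor, and a little noise.
level = rng.normal(size=(n_obs, 1))
slope = 0.4 * rng.normal(size=(n_obs, 1))
data = (level @ np.ones((1, n_vars))
        + slope @ np.linspace(-1.0, 1.0, n_vars).reshape(1, n_vars)
        + 0.05 * rng.normal(size=(n_obs, n_vars)))
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

eigenvalues, eigenvectors = np.linalg.eigh(np.cov(z, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]


def reconstruct(z_data, vectors, k):
    """Project onto the first k eigenvectors, then map back to the original variables."""
    top = vectors[:, :k]
    return (z_data @ top) @ top.T


def relative_error(z_data, approx):
    """Share of the total sum of squares that the approximation misses."""
    return np.sum((z_data - approx) ** 2) / np.sum(z_data ** 2)


errors = {k: relative_error(z, reconstruct(z, eigenvectors, k)) for k in range(1, n_vars + 1)}
unexplained = {k: 1.0 - eigenvalues[:k].sum() / eigenvalues.sum() for k in range(1, n_vars + 1)}

for k in errors:
    print(f"k={k}: reconstruction error = {errors[k]:.4%}, unexplained variance = {unexplained[k]:.4%}")
```

The reconstruction error with $k$ components matches the share of variance left unexplained by those components, which is why the explained-proportion table is a reasonable guide for choosing $k$.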
<br>

PCA is widely applied in financial analysis and machine learning. For example, in image analysis, the original collection of images may be large, but by using PCA, we can reduce the data size while still preserving the key features of the images.
<br>
<br>


### **2.5 Principal Component Analysis for U.S. Treasury Yield**
The great thing about using PCA to analyze the U.S. Treasury yield curve is that it can decompose the yield curve into a number of principal components (Oprea). The first three leading principal components describe the following features of the yield curve:
<br>
1. PC_1: the parallel shift of the yield curve (shift).
2. PC_2: the flattening or steepening of the yield curve (tilt).
3. PC_3: the curvature change of the yield curve (twist).
<br>
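
The shift and tilt interpretations come from the shape of the loadings (eigenvectors). The sketch below illustrates this under a strong assumption: the yield curves are simulated from a known level factor and a known slope factor rather than taken from market data. With such data, the first loading vector has the same sign at every maturity, and the second changes sign once along the curve. `sign_changes` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(5)
maturities = np.array([1, 2, 3, 5, 7, 10])  # years, for labeling only
n_obs = 5000

# Simulated curves: every maturity loads equally on "level";
# short maturities load negatively and long maturities positively on "slope".
level_loading = np.ones(len(maturities))
slope_loading = np.linspace(-1.0, 1.0, len(maturities))
level = rng.normal(scale=1.0, size=(n_obs, 1))
slope = rng.normal(scale=0.5, size=(n_obs, 1))
noise = rng.normal(scale=0.05, size=(n_obs, len(maturities)))
curves = level * level_loading + slope * slope_loading + noise

z = (curves - curves.mean(axis=0)) / curves.std(axis=0, ddof=1)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(z, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

pc1_loadings = eigenvectors[:, 0]
pc2_loadings = eigenvectors[:, 1]


def sign_changes(v):
    """Number of times consecutive entries of v switch sign."""
    signs = np.sign(v)
    return int(np.sum(signs[1:] != signs[:-1]))


print("PC_1 loadings:", np.round(pc1_loadings, 3), "sign changes:", sign_changes(pc1_loadings))
print("PC_2 loadings:", np.round(pc2_loadings, 3), "sign changes:", sign_changes(pc2_loadings))
```

Because the overall sign of an eigenvector is arbitrary, the loadings may come out all positive or all negative for PC_1; it is the pattern across maturities that carries the interpretation.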

First, let's draw the yield curve as the benchmark for better comparison (Bjerring).

In [None]:
# Treasury Yield Curve
yields.plot(figsize=(12, 8), title='Figure 2, Treasury Yields', alpha=0.7) # Plot the yields
plt.legend(bbox_to_anchor=(1.03, 1))
plt.show()

Now, let's draw principal components. Let's start from PC_1.

In [None]:
# Plot PC_1
principal_components["PC_1"].plot(figsize=(12, 8), title='Figure 3, Principal Component 1', alpha=0.7)
plt.show()

Figure 3 shows that PC_1 follows a pattern similar to the yields in Figure 2 (this graph starts from 2001 because we dropped the rows with NaN values). PC_1 represents the parallel shift of the yield curve. Next, let's analyze PC_2. PC_2 is usually used to analyze how tilted the yield curve is. The following demonstration in Python code explains why PC_2 is used to analyze tilt.
<br>

First, let's calculate the difference between the standardized 2-year yield and the standardized 10-year yield as the yield curve tilt and draw a graph.

In [None]:
# Calculate the tilt: standardized 2-year Treasury yield minus standardized 10-year Treasury yield
df_s = standardized_data[["2 Year", "10 Year"]].copy()  # copy so we do not modify a slice of standardized_data
df_s["Tilt"] = df_s["2 Year"] - df_s["10 Year"]
df_s.head()

In [None]:
# Draw the graph of Slope of 2-Year Treasury Yield - 10-Year Treasury Yield
df_s["Tilt"].plot(figsize=(12, 8), title='Figure 4, Tilt of 2-Year Treasury Yield - 10-Year Treasury Yield', alpha=0.7) # Plot the yields difference
plt.show()

After drawing Figure 4 for the yield difference between the 2-year Treasury yield and 10-year Treasury yield, let's draw a graph for PC_2.

In [None]:
# Draw the graph for PC_2
principal_components["PC_2"].plot(figsize=(12, 8), title='Figure 5, Principal Component 2', alpha=0.7) # Plot PC_2 over time
plt.show()

From Figure 4 and Figure 5, we can see that the tilt of the 2-year Treasury yield and 10-year Treasury yield graph has a pattern similar to the PC_2 graph. Let's calculate the correlation to confirm this observation.

In [None]:
np.corrcoef(principal_components["PC_2"], df_s["Tilt"])

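The above code shows that the correlation of PC_2 and Tilt is 97%, which is very high. Hence, we can use PC_2 as a proxy to analyze how tilted the yield curve is. Keep in mind that the sign of an eigenvector is arbitrary, so the sign of this correlation can flip depending on the eigenvector returned by the solver.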
<br>
<br>
Now let's draw PC_3, the change of curvature of the yield curve.

In [None]:
# Draw the graph for PC_3
principal_components["PC_3"].plot(figsize=(12, 8), title='Figure 6, Principal Component 3', alpha=0.7) # Plot PC_3 over time
plt.show()

From Figure 6, we can see that the change in curvature of the yield curve oscillates around 0.

## **3. Value at Risk for a Fixed-Income Portfolio**
In this section, we are going to use the feature extraction method we learned in the last section to calculate the Value at Risk of a bond portfolio.
<br>
<br>
### **3.1 Value at Risk (VaR)**
**Value at Risk (VaR)** is a statistical metric that measures the potential maximum loss of an investment or a portfolio over a given time period at a certain confidence level. For example, when a stock portfolio has a 1-day VaR of \$1 million at the 95% confidence level, it means that there is a 95% probability that this stock portfolio will not lose more than \$1 million in a day.
<br>
VaR is an easy-to-understand risk metric and can be used to compare risks across different asset classes. VaR is commonly used in risk management for portfolio management and regulatory reporting. It is also one of the metrics used to set risk caps for traders. VaR can also be applied to capital allocation.
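
One common way to estimate VaR is to take a percentile of a distribution of profit-and-loss (P&L) outcomes. The sketch below illustrates that idea on synthetic, normally distributed daily P&L. Both the data and the `historical_var` helper are assumptions for illustration, not a production risk model.

```python
import numpy as np


def historical_var(pnl, confidence_level=0.95):
    """Loss threshold that is exceeded in roughly (1 - confidence_level) of the observations."""
    return -np.percentile(pnl, 100 * (1 - confidence_level))


rng = np.random.default_rng(6)
# Synthetic daily P&L in dollars: mean 0, standard deviation 10,000.
pnl = rng.normal(loc=0.0, scale=10_000.0, size=100_000)

var_95 = historical_var(pnl, 0.95)
var_99 = historical_var(pnl, 0.99)
share_worse = np.mean(pnl < -var_95)  # fraction of days with a loss larger than the 95% VaR

print(f"1-day 95% VaR: {var_95:,.0f}")
print(f"1-day 99% VaR: {var_99:,.0f}")
print(f"Share of days with a loss beyond the 95% VaR: {share_worse:.2%}")
```

A higher confidence level looks further into the loss tail, so the 99% VaR is larger than the 95% VaR for the same P&L distribution.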
<br>
<br>
### **3.2 VaR for a Simple Treasury Bond Portfolio**
We will demonstrate how to calculate VaR for a simple Treasury bond portfolio in this section. This simple bond portfolio will consist only of 2-year Treasury bonds, 5-year Treasury bonds, and 10-year Treasury bonds. First, let's create a dataset with the yields of these three bonds and calculate the daily percentage change in each yield.

In [None]:
# Create a dataset with 3 Treasury bond yields and calculate the yield changes
var_dataset = yields[["2 Year","5 Year","10 Year"]]
var_yield_chng_dataset = var_dataset.pct_change()
var_yield_chng_dataset = var_yield_chng_dataset.dropna()
var_yield_chng_dataset.head()



Next, we will prepare the dataset. We need to standardize the dataset first.


In [None]:
# Standardize the dataset
var_yield_chng_dataset_means = var_yield_chng_dataset.mean()
var_yield_chng_dataset_stds = var_yield_chng_dataset.std()
var_yld_chng_stnd_data = (var_yield_chng_dataset - var_yield_chng_dataset_means) / var_yield_chng_dataset_stds

Now we can calculate the eigenvectors and eigenvalues of the standardized dataset.

In [None]:
# Calculate eigenvectors and eigenvalues and rank both by descending eigenvalue
var_cov_matrix = var_yld_chng_stnd_data.cov()
eigenvalues, eigenvectors = np.linalg.eig(var_cov_matrix)
sorted_indices = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[sorted_indices]  # keep eigenvalues aligned with the sorted eigenvectors
pca_components = eigenvectors[:, sorted_indices]

Let's check the eigenvalues and how much of the variance in the data is explained by each eigenvector.

In [None]:
# Put data into a DataFrame
df_eigval = pd.DataFrame({"Eigenvalues":eigenvalues}, index=range(1,4))

# Work out explained proportion
df_eigval["Explained proportion"] = df_eigval["Eigenvalues"] / np.sum(df_eigval["Eigenvalues"])
#Format as percentage
df_eigval.style.format({"Explained proportion": "{:.2%}"})

From the above table, we can see the first two eigenvectors account for 97% of the variance in the dataset. Hence, we are going to select the first two eigenvectors for analysis.

In [None]:
# Choose number of components (e.g., 2)
n_components = 2
selected_components = pca_components[:, :n_components]

Next, let's assume that our simple bond portfolio consists of \$2 million in 2-year Treasury bonds, \$2 million in 5-year Treasury bonds, and \$1 million in 10-year Treasury bonds.

In [None]:
# Define a simple portfolio
portfolio = {
    2: 2000000,  # $2M in 2-year bond
    5: 2000000,  # $2M in 5-year bond
    10: 1000000  # $1M in 10-year bond
}

Next, we will calculate the bond sensitivities in the portfolio. For simplicity, we assume each bond's duration is the same as its maturity. Then, we can calculate the portfolio value changes and VaR.
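
As background for the sensitivities, the standard first-order approximation links a bond position's value change to its duration $D$, its market value $P$, and the yield change $\Delta y$ (in decimal):

$$\Delta P \approx -D \times P \times \Delta y$$

The sketch below applies this approximation to the portfolio described above for a one-basis-point move in all yields. It keeps the lesson's simplifying assumption that duration equals maturity and ignores convexity; `dollar_change` is a hypothetical helper.

```python
# Positions from the lesson's simple portfolio (duration assumed equal to maturity)
durations = [2, 5, 10]                        # years
positions = [2_000_000, 2_000_000, 1_000_000]  # dollars


def dollar_change(durations, positions, yield_change):
    """First-order P&L: minus (duration * position * yield change), summed over positions."""
    return -sum(d * p * yield_change for d, p in zip(durations, positions))


one_bp = 0.0001  # one basis point expressed in decimal

dollar_duration = sum(d * p for d, p in zip(durations, positions))
pnl_up_1bp = dollar_change(durations, positions, one_bp)     # yields rise, prices fall
pnl_down_1bp = dollar_change(durations, positions, -one_bp)  # yields fall, prices rise

print(f"Dollar duration:        {dollar_duration:,}")
print(f"P&L for +1bp in yields: {pnl_up_1bp:,.2f}")
print(f"P&L for -1bp in yields: {pnl_down_1bp:,.2f}")
```

Under these assumptions, the 10-year position contributes as much sensitivity as the 5-year position even though it is half the size, because its assumed duration is twice as long.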

In [None]:
# Calculate portfolio sensitivities (assuming duration = maturity for simplicity)
sensitivities = np.array([maturity * amount for maturity, amount in portfolio.items()])

# Calculate portfolio value changes
portfolio_changes = (var_yield_chng_dataset*sensitivities) @ selected_components

# Calculate VaR
confidence_level = 0.95  # 95% VaR
var = -np.percentile(portfolio_changes, 100 * (1 - confidence_level))

print(f"1-day 95% VaR: ${var:,.2f}")

# Display summary statistics
print("\nSummary Statistics:")
print(f"Portfolio Value: ${sum(portfolio.values()):,.2f}")
print(f"VaR as % of Portfolio Value: {var / sum(portfolio.values()) * 100:.3f}%")

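The above result shows that the 1-day VaR at the 95% confidence level for our simple Treasury bond portfolio is \$458,249, or about 9% of the total portfolio value. This figure reflects the simplifying assumptions in the code (percentage rather than absolute yield changes, and duration set equal to maturity), so it is best read as an illustration of the workflow rather than a realistic risk estimate. The above example demonstrates how to use the feature extraction method to reduce the portfolio dataset and use the smaller dataset to calculate VaR.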
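
For comparison, here is a sketch of one alternative way to combine PCA with a duration-based VaR. It differs from the cell above in several respects, all of which are assumptions of this sketch: it uses synthetic data, it works with absolute yield changes in decimal, it runs PCA on the unstandardized covariance matrix to keep the units, and it maps the two retained components back to yield changes before computing P&L.

```python
import numpy as np

rng = np.random.default_rng(7)
n_days = 2000

# Synthetic daily yield changes in decimal (0.0001 = 1bp) for the 2Y, 5Y, and 10Y points,
# driven by a level factor, a slope factor, and a little noise.
level = rng.normal(scale=0.0006, size=(n_days, 1))
slope = rng.normal(scale=0.0003, size=(n_days, 1))
noise = rng.normal(scale=0.00005, size=(n_days, 3))
yield_changes = level * np.ones(3) + slope * np.array([-1.0, 0.0, 1.0]) + noise

durations = np.array([2.0, 5.0, 10.0])  # duration = maturity, as in the lesson
amounts = np.array([2e6, 2e6, 1e6])      # dollars held at each maturity
dollar_durations = durations * amounts

# PCA on the covariance matrix of the (unstandardized) yield changes
means = yield_changes.mean(axis=0)
centered = yield_changes - means
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(centered, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
top = eigenvectors[:, order][:, :2]      # keep the first two components

# Yield changes rebuilt from the first two components only
approx_changes = (centered @ top) @ top.T + means


def var_from_changes(changes, confidence_level=0.95):
    """Duration-based P&L for each day, then the loss percentile."""
    pnl = -(changes @ dollar_durations)
    return -np.percentile(pnl, 100 * (1 - confidence_level))


var_full = var_from_changes(yield_changes)
var_pca = var_from_changes(approx_changes)

print(f"1-day 95% VaR, all yield changes:   {var_full:,.0f}")
print(f"1-day 95% VaR, two-component model: {var_pca:,.0f}")
```

When the first two components capture most of the variance, the two estimates should be close, which is the practical argument for working with the reduced dataset.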
<br>
<br>
## **4. Conclusion**
In this lesson, we first went through the basics of bivariate analysis. We explained what bivariate analysis is and introduced the concepts of covariance and correlation. We then applied a feature extraction method, principal component analysis, to the Treasury yield curve. Along the way, we learned how to standardize a dataset and what eigenvectors and eigenvalues are. Finally, we applied the feature extraction method to calculate the Value at Risk of a simple Treasury bond portfolio. These tools are fundamental for understanding more advanced financial theories.

**References**
<br>
* Bjerring, Thomas T. "The Yield Curve and Its Components." *Github*, 16 October 2019, https://bjerring.github.io/bonds/2019/10/16/the-yield-curve-and-its-components.html.
<br>
* Oprea, Andreea. "The Use of Principal Component Analysis (PCA) in Building Yield Curve Scenarios and Identifying Relative-Value Trading Opportunities on the Romanian Government Bond Market." *Journal of Risk and Financial Management*, vol. 15, no. 6, 2022. https://www.mdpi.com/1911-8074/15/6/247.


---
Copyright 2024 WorldQuant University. This
content is licensed solely for personal use. Redistribution or
publication of this material is strictly prohibited.
