# Systemic Risk Index (SRI) Creation

This notebook constructs the Systemic Risk Index (SRI) by applying Principal Component Analysis (PCA) to a set of key financial stress indicators.

The process involves the following steps:

1.  **Load Data**: The cleaned, weekly market data from the previous phase is loaded.
2.  **Select Inputs**: Three indicators are selected as inputs for the index:
    *   `VIX`: CBOE Volatility Index
    *   `MOVE`: Merrill Lynch Option Volatility Estimate (Treasury Volatility)
    *   `BAMLC0A0CMEY`: ICE BofA US Corporate Index Effective Yield
3.  **Standardize Inputs**: The selected indicators are standardized using `StandardScaler` to convert them to z-scores. This ensures each variable has a mean of 0 and a standard deviation of 1, preventing any single indicator from dominating the analysis due to its scale.
4.  **Apply PCA**: Principal Component Analysis is performed on the standardized data to identify the primary axis of shared variance among the indicators.
5.  **Extract & Orient PC1**: The first principal component (PC1) is extracted. Because the sign of a principal component is arbitrary, we check its loadings to ensure that a higher index value corresponds to higher risk (i.e., VIX loads positively). If the orientation is inverted, we multiply the component by -1.
6.  **Rescale to 0-100**: The final, oriented index is rescaled to a more intuitive 0-100 range using `MinMaxScaler`, where 0 represents the lowest systemic risk in the sample period and 100 represents the highest.

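As a compact summary of steps 3 to 6, the transformations can be written as follows. The symbols ($x_{t,j}$ for indicator $j$ in week $t$, $w_j$ for the PC1 loadings, $s_t$ for the raw PC1 score) are notation introduced here for illustration, not names used elsewhere in the notebook. `StandardScaler` uses the population standard deviation (no degrees-of-freedom correction) for $\sigma_j$.

$$
z_{t,j} = \frac{x_{t,j} - \mu_j}{\sigma_j},
\qquad
s_t = \sum_{j} w_j \, z_{t,j},
\qquad
\mathrm{SRI}_t = 100 \cdot \frac{s_t - \min_u s_u}{\max_u s_u - \min_u s_u}
$$

Here $w$ is the unit-length direction of maximum variance in the standardized data. Its sign is not determined by PCA, so if $w_{\mathrm{VIX}} < 0$ both $w$ and $s_t$ are multiplied by $-1$ before rescaling.
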
In [None]:
import pandas as pd
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.decomposition import PCA

In [3]:
# Load Data
df = pd.read_csv("../data/cleaned_market_data.csv", index_col=0, parse_dates=True)

In [None]:
# Get some basic information about the DataFrame
df.info()

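`PCA` in scikit-learn raises an error on missing values, so it can be worth confirming that the three input columns are complete before fitting. The sketch below runs on a small made-up frame; `report_missing` and `demo` are hypothetical names used only for illustration, and on the already cleaned file this check may simply report zeros.

```python
import numpy as np
import pandas as pd

INPUT_COLUMNS = ["VIX", "MOVE", "BAMLC0A0CMEY"]


def report_missing(frame, columns):
    """Return the count of missing values for each of the given columns."""
    return frame[columns].isna().sum()


# Tiny made-up weekly frame with two gaps (values are illustrative only).
demo = pd.DataFrame(
    {
        "VIX": [15.0, np.nan, 22.0],
        "MOVE": [80.0, 95.0, 110.0],
        "BAMLC0A0CMEY": [3.1, 3.4, np.nan],
    },
    index=pd.date_range("2020-01-03", periods=3, freq="W-FRI"),
)

missing = report_missing(demo, INPUT_COLUMNS)
print(missing)

# Keep only the weeks where all three indicators are present.
complete = demo.dropna(subset=INPUT_COLUMNS)
print(f"{len(complete)} of {len(demo)} rows are complete")
```

Dropping incomplete rows is only one option; forward-filling or interpolating may suit weekly market data better, depending on how the gaps arose.
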
In [None]:
# STEP 1: Select the data for the index
risk_factors = df[["VIX", "MOVE", "BAMLC0A0CMEY"]]

# STEP 2: Standardize the data
scaler = StandardScaler()
scaled_factors = scaler.fit_transform(risk_factors)

# STEP 3: Apply PCA
pca = PCA(n_components=1)
principal_component = pca.fit_transform(scaled_factors)

# STEP 4: Put the new index back into a DataFrame
sri_raw = pd.Series(
    principal_component.flatten(), index=risk_factors.index, name="SRI_raw"
)

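# STEP 5: Interpretation & Verification
# The sign of a principal component is arbitrary, so inspect the loadings and
# flip the index if VIX loads negatively (higher SRI should mean higher risk).
loadings = pd.Series(pca.components_[0], index=risk_factors.columns)
if loadings["VIX"] < 0:
    loadings, sri_raw = -loadings, -sri_raw
print("PCA Component Loadings:")
print(loadings)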

# STEP 6: Rescale the SRI to a 0-100 range
# 0 is the lowest and 100 the highest index value in the sample period.
min_max_scaler = MinMaxScaler(feature_range=(0, 100))
sri_scaled = min_max_scaler.fit_transform(sri_raw.to_numpy().reshape(-1, 1))

# Finally, add it to the main DataFrame as a 1-D column
df["SRI"] = sri_scaled.ravel()

# The 0-100 version is meant for dashboarding; the unscaled PC1 scores remain
# available in sri_raw for correlation analysis.
print("\nFirst 5 values of the new Systemic Risk Index:")
print(df["SRI"].head())

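A single component is only a reasonable summary if it captures most of the shared variance, and its sign has to be fixed by hand. The following sketch illustrates both checks on synthetic data driven by one made-up latent factor; the `demo_` names, the `stress` factor, and all numeric values are assumptions for illustration and do not come from the market data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_weeks = 300

# Hypothetical latent stress factor that drives all three synthetic indicators.
stress = rng.normal(size=n_weeks)
demo_inputs = np.column_stack(
    [
        20 + 8 * stress + rng.normal(scale=2.0, size=n_weeks),   # VIX-like
        90 + 25 * stress + rng.normal(scale=8.0, size=n_weeks),  # MOVE-like
        4 + 0.8 * stress + rng.normal(scale=0.3, size=n_weeks),  # yield-like
    ]
)

demo_scaled = StandardScaler().fit_transform(demo_inputs)
demo_pca = PCA(n_components=1)
demo_scores = demo_pca.fit_transform(demo_scaled).ravel()
demo_loadings = demo_pca.components_[0].copy()

# Orient the component so the first (VIX-like) column loads positively.
if demo_loadings[0] < 0:
    demo_scores, demo_loadings = -demo_scores, -demo_loadings

# Share of total variance captured by PC1.
pc1_share = demo_pca.explained_variance_ratio_[0]
# After orientation, the scores should move with the latent factor.
corr_with_stress = np.corrcoef(demo_scores, stress)[0, 1]

print("Loadings:", demo_loadings)
print(f"PC1 explained variance ratio: {pc1_share:.3f}")
print(f"Correlation with latent factor: {corr_with_stress:.3f}")
```

In the notebook itself, the equivalent check would be to inspect `pca.explained_variance_ratio_` after fitting. A low value would suggest the three indicators do not share a single dominant axis.
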
In [31]:
# Save the updated DataFrame
df.to_csv("../data/systemic_risk_index.csv")