# Guide Notebook

This notebook is a guide to the replication project. It shows a sample of the data structure, the key functionality of our code, the challenges we faced, and the replication results.


In [None]:
import config
from pathlib import Path
OUTPUT_DIR = Path(config.OUTPUT_DIR)

In [None]:
import numpy as np
from matplotlib import pyplot as plt

## Data Fetch:

We fetch two key data sets:

1) CDS rates. Source: Markit on WRDS

2) Risk-free rates. Source: the FRED website and the Federal Reserve website:

        3-Month Treasury Constant Maturity Rate: https://fred.stlouisfed.org/series/DGS3MO
        6-Month Treasury Constant Maturity Rate: https://fred.stlouisfed.org/series/DGS6MO 
        Swap Rates: https://www.federalreserve.gov/data/yield-curve-tables/feds200628_

The data fetch process is automated. The CDS rates are fetched via WRDS queries to the SAS database server that hosts the Markit tables.

The risk-free rates are fetched from the websites using pandas webreader.
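As a rough illustration of the risk-free rate fetch, the sketch below assumes that "pandas webreader" refers to the `pandas-datareader` package and its `DataReader` function with the `"fred"` source. The function names `fetch_treasury_rates` and `clean_rates`, the date range, and the forward-fill of missing quotes are illustrative assumptions, not necessarily what the project code does.

```python
import pandas as pd

# FRED series IDs taken from the URLs listed above.
FRED_SERIES = ["DGS3MO", "DGS6MO"]


def fetch_treasury_rates(start="2001-01-01", end="2024-12-31"):
    """Download the two Treasury series from FRED (needs network access).

    Hypothetical helper: assumes the pandas-datareader package is installed.
    """
    # Imported lazily so the rest of the notebook runs without the package.
    from pandas_datareader import data as web

    return web.DataReader(FRED_SERIES, "fred", start, end)


def clean_rates(raw):
    """Convert percent quotes to decimals and carry the last quote over gaps.

    Forward-filling missing days is an assumption made for this sketch.
    """
    rates = raw.astype(float) / 100.0
    return rates.ffill()
```

Calling `clean_rates(fetch_treasury_rates())` would then give a daily table of decimal rates with one column per series.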


### CDS Data

The snippet below shows how we fetch the CDS data from Markit.

The fetched data covers close to 6000 tickers. We fetch the par spread (`parspread`) for each available day from 2001 to 2024. The paper uses data through 2016; we extend the sample to the whole available period.

![Image Description](../assets/snip_1.png)


#### Processing Steps:

The paper states that 20 CDS portfolios are created, but it does not give details on how they are constructed from the roughly 6000 available tickers.

We propose the following process, based on our research into CDS return calculations and the He, Kelly, and Manela paper. This methodology was developed in consultation with Professor Jeremey.

1) Since returns are calculated on a monthly basis, we resample the CDS data to monthly frequency.

2) We construct the 20 portfolios by splitting the monthly CDS rates of the 6000 tickers into 20 quantiles. Because the split is repeated every month, the portfolios are rebalanced monthly.

3) Once all 6000 tickers are assigned to one of the 20 quantiles for each month, we compute a single value per quantile, which serves as the CDS spread for that portfolio.

4) We combine the CDS spreads of all tickers within a quantile using three approaches and compare them to see which works best:

        a) Mean: A simple mean of the spreads within each portfolio.
        b) Median: Probably a more accurate representation of CDS returns within a portfolio of CDS products.
        c) Weighted: A weighted mean of the CDS spreads within each portfolio.
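The four steps above can be sketched in pandas as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: the input is assumed to be a long table with `date`, `ticker`, and `parspread` columns, the month-end value is taken as the last available quote, ties are broken by rank so that every month yields exactly 20 groups, and the `weight_col` used for the weighted mean is a hypothetical column, since the text does not say which weights are used.

```python
import numpy as np
import pandas as pd

N_PORTFOLIOS = 20


def build_portfolios(daily, how="mean", weight_col=None):
    """Return a month x portfolio table of aggregated CDS spreads."""
    # Step 1: one observation per ticker per month (last available quote).
    df = daily.sort_values("date").copy()
    df["month"] = df["date"].dt.to_period("M")
    monthly = df.groupby(["ticker", "month"], as_index=False).last()

    # Step 2: assign each ticker to one of 20 spread quantiles, month by month.
    # Ranking first avoids duplicate bin edges when spreads are tied.
    monthly["portfolio"] = monthly.groupby("month")["parspread"].transform(
        lambda s: pd.qcut(s.rank(method="first"), N_PORTFOLIOS, labels=False) + 1
    )

    # Steps 3 and 4: collapse each quantile to a single spread per month.
    keys = [monthly["month"], monthly["portfolio"]]
    if how == "mean":
        out = monthly["parspread"].groupby(keys).mean()
    elif how == "median":
        out = monthly["parspread"].groupby(keys).median()
    elif how == "weighted":
        # Hypothetical weights: the source does not specify them.
        w = monthly[weight_col]
        out = (monthly["parspread"] * w).groupby(keys).sum() / w.groupby(keys).sum()
    else:
        raise ValueError(f"unknown aggregation: {how}")

    return out.unstack("portfolio")
```

Running the function once per aggregation method (`"mean"`, `"median"`, `"weighted"`) gives three candidate sets of 20 portfolio spread series that can be compared side by side.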

Challenges:

1) Lack of clarity on portfolio construction. The paper gives no definite method for combining the CDS spreads of all tickers into 20 CDS portfolios. Fix: Try multiple methods of constructing the portfolios.

2) High volatility in the 20th quantile. We observe very high volatility in the 20th quantile of CDS spreads. We expected the 20th quantile to be more volatile than the other 19, given how the quantiles are constructed, but the values we observed were extremely high. Fix: Apply a smoothing method to the 20th quantile.
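The text does not say which smoothing method is applied to the 20th quantile. As one possible sketch, the helper below replaces that portfolio's series with a trailing rolling median. The function name `smooth_top_portfolio`, the column label `20`, and the three-month window are all assumptions chosen for illustration.

```python
import pandas as pd


def smooth_top_portfolio(portfolios, column=20, window=3):
    """Smooth one portfolio column with a trailing rolling median.

    Hypothetical helper: the rolling median and the window length are
    assumptions, not the method confirmed by the project.
    """
    out = portfolios.copy()
    # min_periods=1 keeps the first months instead of turning them into NaN.
    out[column] = out[column].rolling(window, min_periods=1).median()
    return out
```

A rolling median damps isolated spikes while leaving the other 19 portfolios untouched; a longer window smooths more at the cost of responsiveness.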