# Homework 2

## FINM 37400 - 2025

### UChicago Financial Mathematics

* Mark Hendricks
* hendricks@uchicago.edu

***

In [1]:
# Import standard modules
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from sklearn.decomposition import PCA

import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

import os
import sys
from pathlib import Path


# BASE_DIR = Path(__file__).absolute().parent.parent # Uncomment for python files
BASE_DIR = os.path.dirname(os.getcwd()) # Comment for python files
sys.path.insert(0, str(Path(BASE_DIR) / 'utils'))

import config
import scheffer_quant.treasuries as tr
import scheffer_quant.bond_pair_trading as bpt

In [2]:
# Global variables
DATA_DIR = Path(config.DATA_DIR)
DATA_DIR.mkdir(parents=True, exist_ok=True)
COMPOUNDING_FREQ = 2

pd.options.display.max_columns = 30
pd.options.display.max_colwidth = 100
pd.set_option('display.float_format', lambda x: '%.4f' % x)

# 1 HBS Case: Fixed-Income Arbitrage in a Financial Crisis (A): US Treasuries in November 2008

## Data
* Use the data file `treasury_ts_2015-08-15.xlsx`.
* Examine the treasure issues with `kytreasno` of `204046` and `204047`. These are the bond and note (respectively) which mature on 2015-08-15.
* Look at the data on 2008-11-04.

In [3]:
FILE_DATE = '2015-08-15'
FILE_PATH = Path(config.DATA_DIR) / f"treasury_ts_{FILE_DATE}.xlsx"
ID_BOND = 204046
ID_NOTE = 204047
DATE0 = '2008-11-04'
DAYS_Y = 365.25

# Extract data:
data = pd.read_excel(FILE_PATH, sheet_name='database')
data_info = pd.read_excel(FILE_PATH, sheet_name='info', index_col=0)

data_filtered = data[(data['kytreasno'].isin([ID_BOND, ID_NOTE])) & (data['caldt'] == DATE0)].sort_values(by='caldt')
data_filtered.head(2)

Unnamed: 0,kytreasno,kycrspid,caldt,tdbid,tdask,tdnomprc,tdnomprc_flg,tdsourcr,tdaccint,tdretnua,tdyld,tdduratn,tdpubout,tdtotout,tdpdint,tdidxratio,tdidxratio_flg
4178,204047,20150815.2043,2008-11-04,105.9531,105.9844,105.9688,M,X,0.9355,0.0116,0.0001,2168.0166,20998.0,32470.0,0.0,,
5834,204046,20150815.1106,2008-11-04,141.8594,141.8906,141.875,M,X,2.3387,0.0097,0.0001,1910.3079,2852.0,4024.0,0.0,,


## 1.1 The situation

Make a chart comparing the issues in the following features, (as of Nov 4, 2008.)
* coupon rate
* bid
* ask
* accrued interest
* dirty price
* duration (quoted in years, not days, assuming 365.25 days per year.)
* modified duration
* YTM

In [4]:
summary_data = tr.get_summary_table(info=data_info, database=data, date=DATE0)
summary_data.drop(index=['clean price', 'maturity date', 'issue date'], inplace=True)
print("Summary of bond and note data: on 2008-11-04")
summary_data

Summary of bond and note data: on 2008-11-04


kytreasno,204046,204047
coupon rate,10.6250,4.2500
type,bond,note
bid,141.8594,105.9531
ask,141.8906,105.9844
accrued interest,2.3387,0.9355
dirty price,144.2137,106.9042
duration,5.2301,5.9357
ytm,0.0358,0.0324
modified duration,5.1383,5.8412


## 1.2 Hedge Ratio

Suppose you are building a trade to go long $n_i$ bonds (`204046`) and short $n_j$ notes (`204047`).

We can find an equation for $n_j$ in terms of $n_i$ such that the total holdings will have duration equal to zero. (Having zero duration also means have zero dollar duration, if helpful.)

Notation:
* $n_i$: number of bonds purchased (or sold)
* $D_i$: duration of bond $i$
* $D_{\$,i}$: dollar duration of bond $i$, equal to $p_iD_i$

If we want the total duration of our holdings to be zero, then we need to size the trade such that $n_i$ and $n_j$ satisfy,

$$0 = n_iD_{\$,i} + n_jD_{\$,j}$$

$$n_j = -n_i\frac{D_{\$,i}}{D_{\$,j}}$$

Suppose you will use $\$1mm$ of capital, leveraged 50x to buy $\$50mm$ of the bonds (`204046`).

Use the ratio above to short a number of notes (`204047`) to keep zero duration.

Report the number of bonds and notes of your position, along with the total dollars in the short position.

In [5]:
ID_LONG = 204046
ID_SHORT = 204047
SIZE_ASSET_LONG = 50e6
EQUITY_RATIO = 1/50

financing = pd.DataFrame(dtype=float,index=[ID_LONG,ID_SHORT])
financing['haircut'] = [EQUITY_RATIO, EQUITY_RATIO]
# financing['repo'] = [.0015,.0010]
financing

Unnamed: 0,haircut
204046,0.02
204047,0.02


In [6]:
balance_sheet, fmt = bpt.trade_balance_sheet(
                                prices=summary_data.loc['dirty price'],
                                durations=summary_data.loc['duration'],
                                haircuts=financing['haircut'],
                                long_asset=SIZE_ASSET_LONG,
                                key_long=ID_LONG,
                                key_short=ID_SHORT)
balance_sheet.style.format(fmt)

Unnamed: 0,equity,assets,contracts
204046,"$1,000,000.00","$50,000,000.00",346707.81
204047,"$-881,131.58","$-44,056,578.93",-412112.66


## 1.3 Profit Opportunity

Using the concept of **modified duration**, how much profit or loss (PnL) would you expect to make for every basis point of convergence in the spread? Specifically, assume the convergence is symmetric: the bond's (`204046`) ytm goes down 0.5bp and the note (`204047`) ytm goes up 0.5bp.

Describe the PnL you would expect to achieve on your position should this happen. Specify the PnL of the long position, the short position, and the net total.

Suppose the spread in YTM between the two securities disappears, due to a symmetric move of roughly ~17bps in each security's YTM. What is the PnL? (This is just a linearly scaling of your prior answer for a 1bp convergence.)

In [7]:
spread_convergence = 0.5 / 10000
pnl_spread_converges, fmt_dict = bpt.pnl_spread_convergence(
                                    spread_convergence=spread_convergence,
                                    modified_duration=summary_data.loc['modified duration'],
                                    price=summary_data.loc['dirty price'],
                                    n_contracts=balance_sheet['contracts'])
pnl_spread_converges.style.format(fmt_dict, na_rep='')



Unnamed: 0_level_0,ytm change,modified duration,price,DV01,num contracts,pnl
kytreasno,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
204046,-0.0025%,5.14,$144.21,0.07,346707.81,"$6,422.86"
204047,0.0025%,5.84,$106.90,0.06,-412112.66,"$6,433.57"
total,,,,,,"$12,856.42"


## 1.5 Examining the Trade through June 2009

Calculate the pnl of the trade for the following dates:
* 2009-01-27
* 2009-03-24
* 2009-06-16

Did the trade do well or poorly in the first six months of 2009?

Calculate the YTM spreads on these dates. Does the YTM spread correspond to pnl roughly as we would expect based on the calculation in 1.3?

In [8]:
SELECTED_DATES = ['2009-01-27', '2009-03-24', '2009-06-16']
data_backtest = data[(data['kytreasno'].isin([ID_BOND, ID_NOTE])) & (data['caldt'] >= '2009-01-01') & (data['caldt'] <= '2009-06-30')]
data_backtest = data_backtest.assign(dirty_price=data_backtest['tdnomprc'] + data_backtest['tdaccint'])

prices_ts = data_backtest.pivot_table(index='caldt', columns='kytreasno', values='dirty_price')
durations_ts = data_backtest.pivot_table(index='caldt', columns='kytreasno', values='tdduratn')  / 365.25
cpn_cash_flows_ts = tr.get_cash_flows(data=data_backtest, data_info=data_info, keys_list=[ID_BOND, ID_NOTE])

results_df, fmt = bpt.trade_backtest(
                prices_ts=prices_ts,
                durations_ts=durations_ts,
                key_long=ID_BOND,
                key_short=ID_NOTE,
                financing=financing,
                long_asset=SIZE_ASSET_LONG,
                frequency_rebal=1,
                cpn_cash_flows_ts=cpn_cash_flows_ts
            )
# FIlter indexes caldt = 2009-01-01, 2009-03-24, 2009-06-16:
display(results_df.loc[results_df.index.isin(SELECTED_DATES), ].style.format(fmt, na_rep=''))

Unnamed: 0_level_0,price long,price short,long cpn,short cpn,long,hedge ratio,short,price dff long,price dff short,long pnl,short pnl,net pnl,long ($),short ($),long equity,short equity,equity,margin call,capital paid in,return (init equity),return (avg equity)
caldt,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2009-01-27,155.23,116.12,0.0,0.0,322097.13,1.18,-379189.48,0.716372,0.542799,"$231,811.23","$-206,722.49","$25,088.73","$50,000,000.00","$-44,033,120.50","$1,000,000.00","$-880,662.41","$119,337.59","$-14,697.44","$94,672.82",21.10%,27.86%
2009-03-24,150.97,113.91,0.0,0.0,331194.28,1.18,-391916.63,-0.228462,-0.292947,"$-75,550.87","$114,526.77","$38,975.90","$50,000,000.00","$-44,643,598.00","$1,000,000.00","$-892,871.96","$107,128.04","$-55,934.25","$56,366.55",32.78%,41.92%
2009-06-16,146.3,108.09,0.0,0.0,341760.11,1.2,-410439.69,0.341851,0.238303,"$117,104.61","$-98,012.45","$19,092.16","$50,000,000.00","$-44,365,434.00","$1,000,000.00","$-887,308.68","$112,691.32","$5,073.25","$87,376.86",16.06%,21.21%


In [9]:
spread_ytm_ts = bpt.get_spread_bps(database=data_backtest, id_ref=ID_LONG)
spread_diff_ts = spread_ytm_ts.diff()
ytm_ts = data_backtest.pivot_table(index='caldt', columns='kytreasno', values='tdyld')  / 365.25
modified_duration_ts = durations_ts * (1 + ytm_ts)

pnl_comparison = pd.DataFrame(results_df.loc[SELECTED_DATES, 'net pnl'])

for date in SELECTED_DATES:
    pnl_spread, fmt_dict = bpt.pnl_spread_convergence(
                                    spread_convergence=-spread_diff_ts.loc[date].values[0] / 2 / 10000,
                                    modified_duration=modified_duration_ts.loc[date],
                                    price=prices_ts.loc[date],
                                    n_contracts=results_df.loc[date, ['long', 'short']].rename(index={'long': ID_LONG, 'short': ID_SHORT}))
    pnl_comparison.loc[date, 'spread pnl'] = pnl_spread.loc['total', 'pnl']

pnl_comparison['$ difference'] = pnl_comparison['spread pnl'] - pnl_comparison['net pnl']

pnl_comparison.style.format('${:,.2f}')    

Unnamed: 0_level_0,net pnl,spread pnl,$ difference
caldt,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
2009-01-27,"$25,088.73","$12,097.48","$-12,991.25"
2009-03-24,"$38,975.90","$19,223.04","$-19,752.87"
2009-06-16,"$19,092.16","$9,338.36","$-9,753.80"


In [10]:
results_df.index = pd.to_datetime(results_df.index, format='%Y-%m-%d')
prices_ts.index = pd.to_datetime(prices_ts.index, format='%Y-%m-%d')
cpn_cash_flows_ts.index = pd.to_datetime(cpn_cash_flows_ts.index, format='%Y-%m-%d')

results_df['unhedged pnl'] = results_df['long'].shift(1) * (prices_ts[ID_LONG].diff() + results_df['long cpn']).fillna(0)
results_df.rename(columns={'net pnl': 'net hedged pnl'}, inplace=True)

# Create figure with secondary y-axis
fig = go.Figure()

# -- Cumulative Hedged P&L Trace
fig.add_trace(
    go.Scatter(
        x=results_df.index,
        y=results_df['net hedged pnl'].cumsum(),
        mode='lines',
        name='Cumulative Hedged P&L',
        yaxis='y'
    )
)

# -- Cumulative Unhedged P&L Trace
fig.add_trace(
    go.Scatter(
        x=results_df.index,
        y=results_df['unhedged pnl'].cumsum(),
        mode='lines',
        name='Cumulative Unhedged P&L',
        yaxis='y'
    )
)

# -- Spread YTM Trace (on a secondary y-axis)
fig.add_trace(
    go.Scatter(
        x=spread_ytm_ts.index,
        y=spread_ytm_ts.iloc[:, 0],  # or specify the exact column if needed
        mode='lines',
        name='Spread YTM',
        yaxis='y2'
    )
)

# Update layout with dual y-axes
fig.update_layout(
    title=f'Cumulative P&L and Spread YTM between Bonds and Notes (ID: {ID_LONG} and {ID_SHORT})',
    xaxis=dict(
        title='Date',
        type='date'
    ),
    yaxis=dict(
        title='Cumulative P&L ($)',
        tickformat=",0f",
        zeroline=True,
        zerolinecolor='darkgray',
        zerolinewidth=1,
        gridcolor='lightgrey'
    ),
    yaxis2=dict(
        title='Spread YTM',
        overlaying='y',   # overlay the second y-axis on the first
        side='right'      # place it on the right
    ),
    legend=dict(
        x=0.5,
        y=-0.2,
        xanchor='center',
        yanchor='top',
        orientation='h',
        bordercolor="Gray",
        borderwidth=1
    )
)

# Show the figure
fig.show()


***

# 2 Hedging Duration

Use data from `../data/treasury_ts_duration_2024-10-31.xlsx`.

The file contains time-series information on two treasuries. Observe the info of the securities with the following code:


In [11]:
QUOTE_DATE = '2024-10-31'
FILE_PATH = Path(config.DATA_DIR) / f"treasury_ts_duration_{QUOTE_DATE}.xlsx"

# Extract data:
data = pd.read_excel(FILE_PATH, sheet_name='database')
data_info =  data.drop_duplicates(subset='KYTREASNO', keep='first').set_index('KYTREASNO')
data_info[['type','issue date','maturity date','cpn rate']]

Unnamed: 0_level_0,type,issue date,maturity date,cpn rate
KYTREASNO,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
207391,note,2019-08-15,2029-08-15,1.625
207392,bond,2019-08-15,2049-08-15,2.25


You will largely focus on the sheets which give the timeseries of prices and durations for each of the two securities, as shown in the following code.

In [12]:
price = pd.read_excel(FILE_PATH,sheet_name='price').set_index('quote date')
duration = pd.read_excel(FILE_PATH,sheet_name='duration').set_index('quote date')
first_date = duration.index[0]

display(price)
display(duration)

Unnamed: 0_level_0,207391,207392
quote date,Unnamed: 1_level_1,Unnamed: 2_level_1
2019-08-09,98.8828,99.7891
2019-08-12,99.7969,102.5547
2019-08-13,99.2812,101.8672
2019-08-14,100.4062,105.1797
2019-08-15,100.8828,106.2344
...,...,...
2024-11-22,88.7402,63.7227
2024-11-25,89.2871,65.3789
2024-11-26,89.2148,65.1758
2024-11-27,89.4375,65.6562


Unnamed: 0_level_0,207391,207392
quote date,Unnamed: 1_level_1,Unnamed: 2_level_1
2019-08-09,9.2895,22.0001
2019-08-12,9.2855,22.1185
2019-08-13,9.2803,22.0843
2019-08-14,9.2828,22.2285
2019-08-15,9.2822,22.2709
...,...,...
2024-11-22,4.5394,17.2055
2024-11-25,4.5320,17.3123
2024-11-26,4.5291,17.2955
2024-11-27,4.5267,17.3254


### 2.1.

Suppose you have a portfolio of `10,000` USD long in security `207391` on the first day of the sample.

If you want to manage interest rate exposure using duration, how large of a short position should you hold in `207392`?

In [13]:
SIZE_EQUITY_LONG = 10000
ID_LONG = 207391
ID_SHORT = 207392

In [14]:
short_position = SIZE_EQUITY_LONG * (duration.loc[first_date, ID_LONG] / duration.loc[first_date, ID_SHORT])
print(f'To hedge ${SIZE_EQUITY_LONG:,} in {ID_LONG}, short ${short_position:,.2f} in {ID_SHORT}.')

To hedge $10,000 in 207391, short $4,222.48 in 207392.


### 2.2.

Step through the time-series, doing the following:

* Starting at the end of the first day, set the hedged position according to the relative given durations.
* Use the second day's price data to evaluate the net profit or loss of the hedged position.
* Reset the the hedged position using the end-of-second-day durations. Again fix the long position of security `207391` to be `10,000`.
* Repeat throughout the timeseries.

Calculate the daily profit and loss (PnL) for the
* dynamically hedged position constructed above.
* long-only position, (still at `10,000` throughout.)

(You might check to verify that the net duration is zero at all dates.)

Report...
* the cumulative PnL of both strategies via a plot.
* the (daily) mean, standard deviation, min, and max of the PnL in a table.

In [15]:
# Create position dataframe for the strategy
position = pd.DataFrame(index=duration.index, dtype=float)
position['long'] = SIZE_EQUITY_LONG / price[ID_LONG]
position['hedge ratio'] = (duration[ID_LONG] / duration[ID_SHORT]) * (price[ID_LONG] / price[ID_SHORT])
position['short'] = - position['hedge ratio'] * position['long']


# Check that duration is indeed hedged as of end of day
position[['long ($)','short ($)']] = position[['long','short']] * price[[ID_LONG,ID_SHORT]].values
position['net ($)'] = position[['long ($)','short ($)']].sum(axis=1)
wts = position[['long ($)','short ($)']].div(position[['long ($)','short ($)']].sum(axis=1),axis=0)

position['duration'] = (wts * duration[[ID_LONG,ID_SHORT]].values).sum(axis=1)
position['duration'].describe().to_frame().T.style.format('{:.1%}')

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
duration,133000.0%,0.0%,0.0%,-0.0%,-0.0%,0.0%,0.0%,0.0%


In [16]:
# P&L of the Hedged Position:
position['long p&l']  = position['long'].shift(1)  * price[ID_LONG].diff()
position['short p&l'] = position['short'].shift(1) * price[ID_SHORT].diff()
position['net hedged p&l']   = position['long p&l'] + position['short p&l']

In [17]:
# P&L of the Unhedged Position:
position['unhedged p&l'] = position['long'].shift(1) * price[ID_LONG].diff()

In [18]:
fig = px.line(position[['net hedged p&l','unhedged p&l']].cumsum(), title='Cumulative P&L of the Hedged Position')
fig.update_layout(
    xaxis_title='Date',
    yaxis_title='Cumulative P&L ($)',
    yaxis=dict(
        tickformat=",0f",
        zeroline=True,
        zerolinecolor='darkgray', 
        zerolinewidth=1,
        gridcolor='lightgrey',
        showgrid=True
    )
)
fig.show()

### 2.3.

Give two reasons that the daily PnL is not always zero for the hedged position given that we have perfectly hedged the duration.

<span style="color:lightblue"> 

* The PnL is not zero because the two securities have different convexity:
    - Duration is only a linear approximation of the sensitivity of a bond to variation in the interest rates.
    - A portfolio hedged using only duration will experience residual gains or losses due to differences in convexity—especially during larger market moves.
* The PnL is not zero because the two securities have different coupon rates (and thus different cash flows).
    - Bonds with different coupons have different cash flow distributions, leading to differences in their reinvestment risk and price behavior over time. 
    - A higher-coupon bond, for instance, will distribute more cash earlier, which can lead to a shorter effective duration compared to a low-coupon bond.
    - Since bond prices incorporate expectations about future cash flows, differences in coupons will lead to mismatches in total returns over time, causing nonzero PnL even in a hedged position.

### 2.4.
The PnL above doesn't account for the coupons.

Calculate a dataframe indexed by dates with columns for the two treasuries with values of coupon payments. 
* Recall that the stated coupon rate is semiannual, so at any give coupon date, it pays half the stated rate.
* Figure out the coupon dates by using the `data` tab and looking for dates where `acc int` goes down. Recall that accrued interest measures the portion of the coupon period that has passed. So when this resets, it is because the coupon has been paid.

Report the first 5 dates that a coupon is paid (by either bond).

In [19]:
coupons_ts = tr.get_cash_flows(data, data_info.T, [ID_LONG, ID_SHORT])
coupons_ts

Unnamed: 0_level_0,207391,207392
quote date,Unnamed: 1_level_1,Unnamed: 2_level_1
2020-02-18,0.8125,1.125
2020-08-17,0.8125,1.125
2021-02-16,0.8125,1.125
2021-08-16,0.8125,1.125
2022-02-15,0.8125,1.125
2022-08-15,0.8125,1.125
2023-02-15,0.8125,1.125
2023-08-15,0.8125,1.125
2024-02-15,0.8125,1.125
2024-08-15,0.8125,1.125


### 2.5.
Account for the coupons in the PnL calculations of `2.2`. Report the updated PnL in a plot and a table, similar to the reporting in `2.2`.

In [20]:
# P&L of the Hedged Position with Coupons:
coupons_ts = coupons_ts.rename(columns={ID_LONG: 'long coupon', ID_SHORT: 'short coupon'})
position = pd.merge(position, coupons_ts, how='outer', left_index=True, right_index=True)
position = position.fillna(0)

position['long p&l w/ coupon']  = position['long p&l'] + position['long'].shift(1) * position['long coupon'] 
position['short p&l w/ coupon'] = position['short p&l'] + position['short'].shift(1) * position['short coupon']
position['net hedged p&l w/ coupon'] = position['long p&l w/ coupon'] + position['short p&l w/ coupon']

# P&L of the Unhedged Position with Coupons:
position['unhedged p&l w/ coupon'] = position['long'].shift(1) * (price[ID_LONG].diff() + position['long coupon'])

fig = px.line(position[['net hedged p&l w/ coupon','unhedged p&l w/ coupon']].cumsum(), title='Cumulative P&L of the Hedged Position with Coupons')
fig.update_layout(
    xaxis_title='Date',
    yaxis_title='Cumulative P&L ($)',
    yaxis=dict(
        tickformat=",0f",
        zeroline=True,
        zerolinecolor='darkgray', 
        zerolinewidth=1,
        gridcolor='lightgrey',
        showgrid=True
    )
)
fig.show()

***