In [1]:
import numpy as np
import pandas as pd

from sklearn.linear_model import LinearRegression
from scipy.stats import norm
from sklearn.preprocessing import QuantileTransformer
from scipy.stats import multivariate_normal


#### [Section 1 – Multivariate Statistics]

1. [10 points] Generate a data set of two independent and identically distributed (i.i.d.) variables of N(0,1), of length 5,000. How do you impose a correlation of, say, 0.5, to the data set? You can code up your work in your preferred programming language(s) or you can work it out in Excel.

In [70]:
np.random.seed(0)

x = np.random.normal(0, 1, 5000)
y = np.random.normal(0, 1, 5000)
independent_vars = np.vstack([x, y]).T

corr = 0.5

# Cholesky decomposition of the correlation matrix
corr = np.array([[1, corr], [corr, 1]])
L = np.linalg.cholesky(corr)

# Mix the independent columns with the Cholesky factor to build the correlated pair
correlated_vars = independent_vars @ L
correlated_x, correlated_y = correlated_vars[:, 0], correlated_vars[:, 1]

corr_new = np.corrcoef(correlated_x, correlated_y)
corr_new

array([[1.        , 0.44988415],
       [0.44988415, 1.        ]])
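
The sample correlation above (about 0.45) sits below the 0.5 target. With one sample per row, right-multiplying by `L` produces the columns (x + 0.5y, 0.866y), whose theoretical correlation is about 0.447. Right-multiplying by `L.T` produces (x, 0.5x + 0.866y), whose correlation is exactly 0.5. A minimal sketch of that form, assuming a fresh NumPy generator (`rng`, `z` and `target_corr` are illustrative names, not taken from the cell above):

```python
import numpy as np

rng = np.random.default_rng(0)
target_corr = 0.5

# Two independent N(0, 1) columns, one sample per row
z = rng.standard_normal((5000, 2))

# The Cholesky factor L satisfies L @ L.T == corr_matrix
corr_matrix = np.array([[1.0, target_corr], [target_corr, 1.0]])
L = np.linalg.cholesky(corr_matrix)

# For row-wise samples, Cov(z @ L.T) = L @ Cov(z) @ L.T = L @ L.T = corr_matrix
correlated = z @ L.T

sample_corr = np.corrcoef(correlated, rowvar=False)[0, 1]
print(round(sample_corr, 3))
```

The first column is left unchanged and only the second is rebuilt as a mix of both, so the marginals stay N(0, 1).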

2. [20 points] Following the above result, how would you apply a correlation of 0.5 to an empirical data set of two variables that start with a different correlation. That is, without modifying the marginal distributions of the two variables, convert the correlation of the data set to 0.5. Download S&P 500 index and USD/CAD FX rate historical data from 2019-12-31 to 2023-12-31 and apply your method to the historical data set. Yahoo Finance or other alternative data sources are all acceptable. You can code up your work in your preferred programming language(s) or work it out in Excel. Either way, you are also expected to explain your work in writing.

In [3]:
# Load the data
ASSET_LIST = ['US10Y_%', 'CA10Y_%', 'SPX_US$', 'TSX_CAD$', 'SPGSCI_USD$', 'Gold_USD$', 'USDCAD_CAD$']
DF_LIST = []
for asset in ASSET_LIST:
    df = pd.read_excel('Data.xlsx', sheet_name=asset)
    df['time'] = pd.to_datetime(df['time'])
    df[asset] = df['close']
    df.set_index('time', inplace=True)
    DF_LIST.append(df[[asset]])

DATA = pd.concat(DF_LIST, axis=1).dropna()


In [4]:
DATA.describe().round(2)

Unnamed: 0,US10Y_%,CA10Y_%,SPX_US$,TSX_CAD$,SPGSCI_USD$,Gold_USD$,USDCAD_CAD$
count,5457.0,5457.0,5457.0,5457.0,5457.0,5457.0,5457.0
mean,3.23,3.07,2080.74,13361.13,446.0,1110.77,1.25
std,1.35,1.49,1120.9,4038.62,159.3,553.72,0.17
min,0.51,0.44,682.55,5695.33,130.29,252.1,0.92
25%,2.15,1.78,1229.13,10485.2,337.76,560.1,1.11
50%,3.03,2.94,1530.95,13407.01,437.54,1225.9,1.28
75%,4.26,4.22,2743.07,15730.79,579.93,1557.9,1.35
max,6.79,6.6,5254.34,22361.78,890.29,2371.61,1.61


In [71]:
DATA_Q2 = DATA[(DATA.index >= '2019-12-31') & (DATA.index <= '2023-12-31')][['SPX_US$', 'USDCAD_CAD$']].copy()

In [77]:


# Step 1: Transform to uniform using ECDF
X = np.array(DATA_Q2['SPX_US$'])
Y = np.array(DATA_Q2['USDCAD_CAD$'])
quantile_transformer = QuantileTransformer(output_distribution='uniform', random_state=0, n_quantiles=len(X))
U, V = quantile_transformer.fit_transform(np.column_stack((X, Y))).T

# Step 2: Apply a Gaussian copula with the desired correlation
desired_corr = np.array([[1, 0.5], [0.5, 1]])
mvn = multivariate_normal(mean=[0, 0], cov=desired_corr)
copula_samples = mvn.rvs(size=1000)
new_U = norm.cdf(copula_samples[:, 0])
new_V = norm.cdf(copula_samples[:, 1])

# Step 3: Transform back to original scales
X_new, Y_new = quantile_transformer.inverse_transform(np.column_stack((new_U, new_V))).T

new_corr = np.corrcoef(X_new, Y_new)
print("Original Correlation:", np.corrcoef(X, Y)[0, 1])
print("New Correlation:", new_corr[0, 1])

Original Correlation: -0.5019001388081297
New Correlation: 0.5077164775034696
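
The cell above draws 1,000 new points from a Gaussian copula and maps them back through the empirical quantiles, so the result is a resampled data set rather than the original observations. If every observed value has to be kept, one alternative is to reorder the existing values so that their ranks follow a Gaussian-copula draw. This is a rank-reordering approach in the spirit of Iman-Conover, swapped in here for illustration and not part of the original answer. A minimal sketch on synthetic data (`reorder_to_target` and the toy series are hypothetical); the resulting Pearson correlation is close to the target but depends on the marginals:

```python
import numpy as np

def reorder_to_target(x, y, rho, seed=0):
    """Permute x and y so their ranks follow a Gaussian copula with correlation rho."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    z = rng.multivariate_normal(np.zeros(2), cov, size=len(x))
    # A double argsort turns each column of z into ranks 0..n-1
    ranks = z.argsort(axis=0).argsort(axis=0)
    # Place the k-th smallest observed value where z has its k-th smallest draw
    return np.sort(x)[ranks[:, 0]], np.sort(y)[ranks[:, 1]]

# Toy data: negatively correlated, non-normal marginals (illustrative only)
rng = np.random.default_rng(1)
x = rng.lognormal(mean=0.0, sigma=0.3, size=2000)
y = -0.5 * x + rng.normal(scale=0.2, size=2000)

x_new, y_new = reorder_to_target(x, y, rho=0.5)

print("before:", round(np.corrcoef(x, y)[0, 1], 3))
print("after: ", round(np.corrcoef(x_new, y_new)[0, 1], 3))
```

Because `x_new` and `y_new` are permutations of the inputs, the marginal distributions are preserved exactly; the trade-off is that the time ordering of the original series is lost.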


3. [10 points] How would you extend the above bivariate framework to multivariate distributions? Show your work in code or Excel and you can work with a set of 4 or 5 variables.

In [79]:
DATA_Q3 = DATA[(DATA.index >= '2019-12-31') & (DATA.index <= '2023-12-31')][
    ['SPX_US$', 'USDCAD_CAD$', 'US10Y_%', 'SPGSCI_USD$', 'Gold_USD$']]
DATA_Q3 = DATA_Q3.to_numpy()
# Step 1: Transform marginals to uniform
quantile_transformer = QuantileTransformer(output_distribution='uniform', random_state=0, n_quantiles=len(DATA_Q3))
uniform_data = quantile_transformer.fit_transform(DATA_Q3)

# Step 2: Create the desired correlation matrix
desired_corr = np.full((5, 5), 0.5)
np.fill_diagonal(desired_corr, 1)

# Step 3: Apply Gaussian Copula
mvn = multivariate_normal(mean=np.zeros(5), cov=desired_corr)
copula_samples = mvn.rvs(size=1000)

# Step 4: Convert copula samples to uniform
uniform_samples = norm.cdf(copula_samples)

# Step 5: Transform back to original scales
transformed_data = quantile_transformer.inverse_transform(uniform_samples)

# Step 6: Check the correlation of the transformed data
transformed_corr = np.corrcoef(transformed_data.T)
transformed_corr

array([[1.        , 0.49875481, 0.47121812, 0.49098476, 0.45255403],
       [0.49875481, 1.        , 0.51287709, 0.53309411, 0.46951191],
       [0.47121812, 0.51287709, 1.        , 0.47587352, 0.45413942],
       [0.49098476, 0.53309411, 0.47587352, 1.        , 0.4680466 ],
       [0.45255403, 0.46951191, 0.45413942, 0.4680466 , 1.        ]])

#### [Section 2 – Investment/Total Fund]

4. [30 points] Imagine OTPP’s CIO and the Investment Committee decided to increase tactical allocation to the S&P 500 Index by CAD$1 billion, how would you execute the allocation and what implications would your execution decision have on the total fund besides the obvious impact on the equity asset class? Any other considerations should be examined? Explore the execution alternatives as many as possible and explain the impact to the total fund from asset risk and liquidity risk perspectives. The intention is to explore basic understanding of financial products (cash vs derivative) and its implications in the context of asset allocation in a portfolio such as OTPP.

How would you execute the allocation and what implications would your execution decision have on the total fund besides the obvious impact on the equity asset class?

My execution decision would be to buy S&P 500 futures contracts.

Implications on the total fund:
- Futures are marked to market daily, so variation margin generates daily cash flows and affects liquidity.
- Counterparty credit risk exists but is minimal because of the daily mark-to-market.
- Less capital is required up front.
- Positions need to be rolled as contracts approach expiry.
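
To make the capital and liquidity points concrete, the contract count and the daily variation margin can be sized roughly as follows. All inputs are illustrative assumptions (the index level, FX rate, margin rate and shock size are placeholders, not market quotes); the USD 50 multiplier is that of the CME E-mini S&P 500 contract.

```python
# Illustrative inputs (assumptions, not market data)
allocation_cad = 1_000_000_000   # CAD 1 billion tactical allocation
usdcad = 1.35                    # assumed CAD per USD
index_level = 5000.0             # assumed S&P 500 level
multiplier = 50                  # USD per index point, E-mini S&P 500
initial_margin_rate = 0.05       # hypothetical margin as a share of notional
daily_shock = 0.02               # hypothetical one-day index decline

notional_usd = allocation_cad / usdcad
contract_value = index_level * multiplier
contracts = round(notional_usd / contract_value)

# Cash tied up as initial margin, versus funding the full notional in a cash purchase
initial_margin_usd = contracts * contract_value * initial_margin_rate

# Variation margin that must be paid in cash if the index falls by daily_shock in one day
variation_margin_usd = contracts * contract_value * daily_shock

print(contracts, round(initial_margin_usd), round(variation_margin_usd))
```

The futures position also leaves the fund with USD-denominated gains and losses on the contracts rather than a fully funded USD holding, which matters for the currency exposure of the total fund.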


Explore the execution alternatives as many as possible and explain the impact to the total fund from asset risk and liquidity risk perspectives.

Alternative 1: Buy S&P 500 ETFs
Implications on the total fund:
- Large ETF trades require slow execution to avoid moving the market.
- The position is less responsive because it cannot be liquidated quickly.
- Much higher capital is required.

Alternative 2: Synthetic Long using S&P 500 Index Options (buy call option and sell put option)
Implications on the total fund:
- This approach takes volatility risk in addition to price risk.
- There is a cash flow impact whenever the options roll over or expire.
- Dividends are not captured in the synthetic long position.
- Less capital is required.

Alternative 3: Buy S&P 500 Index Total Return Swaps
Implications on the total fund:
- Leveraged exposure to the index with a smaller amount of capital.
- Counterparty risk, mitigated by collateral posting and daily mark-to-market.
- Liquidity risk due to the OTC nature of the swaps.
- There is a cash flow impact whenever the swaps roll over or expire.
- Upfront margin must be posted as collateral.



#### [Section 3 – Data]

5. [10 points] You are given two sets of time series. The lengths are slightly different and the dates are mostly the same but not exact. Below is an example. Can you come up with a strategy to align them so that the two time series are lined up for dates that are common to both sets, i.e., the intersection of the two date sets? You can choose to solve/guess the missing values in writing or in Excel.

The strategy is to join the two datasets on the 'Date' column, keeping only the rows where the date is common to both datasets.

In [80]:
# Prepare Dataset
df1 = pd.DataFrame(columns=['Date', 'SP500'])
df2 = pd.DataFrame(columns=['Date', 'USDCAD'])

df1.loc[len(df1)] = {'Date': '12/31/2019', 'SP500': 3230.78}
df1.loc[len(df1)] = {'Date': '01/02/2020', 'SP500': 3257.85}
df1.loc[len(df1)] = {'Date': '01/03/2020', 'SP500': 3234.85}
df1.loc[len(df1)] = {'Date': '01/06/2020', 'SP500': 3246.28}

df2.loc[len(df2)] = {'Date': '12/31/2019', 'USDCAD': 1.30606}
df2.loc[len(df2)] = {'Date': '01/01/2020', 'USDCAD': 1.3002}
df2.loc[len(df2)] = {'Date': '01/02/2020', 'USDCAD': 1.2973}
df2.loc[len(df2)] = {'Date': '01/03/2020', 'USDCAD': 1.2983}
df2.loc[len(df2)] = {'Date': '01/06/2020', 'USDCAD': 1.29866}

df_merge = df1.merge(df2, how='inner', on='Date')
df_merge

Unnamed: 0,Date,SP500,USDCAD
0,12/31/2019,3230.78,1.30606
1,01/02/2020,3257.85,1.2973
2,01/03/2020,3234.85,1.2983
3,01/06/2020,3246.28,1.29866


The inner join drops 01/01/2020, which appears only in the USD/CAD series because the equity market was closed that day. If that date needs to be kept instead, the missing SP500 value can be filled with the last available data point (12/31/2019). The implication is that the SP500 return on 01/01/2020 will be 0.
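
The fill-forward variant can be sketched as follows, assuming the union of dates is wanted instead of the intersection. The dates are parsed first because `MM/DD/YYYY` strings do not sort chronologically; `df_union` and `df_filled` are illustrative names.

```python
import pandas as pd

df1 = pd.DataFrame({'Date': ['12/31/2019', '01/02/2020', '01/03/2020', '01/06/2020'],
                    'SP500': [3230.78, 3257.85, 3234.85, 3246.28]})
df2 = pd.DataFrame({'Date': ['12/31/2019', '01/01/2020', '01/02/2020', '01/03/2020', '01/06/2020'],
                    'USDCAD': [1.30606, 1.3002, 1.2973, 1.2983, 1.29866]})

# Parse dates so sorting is chronological ('01/02/2020' sorts before '12/31/2019' as text)
for df in (df1, df2):
    df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%Y')

# Union of dates; the indicator column records which source each row came from
df_union = df1.merge(df2, how='outer', on='Date', indicator=True).sort_values('Date')

# Carry the last observed value forward over gaps such as market holidays
df_filled = df_union.set_index('Date')[['SP500', 'USDCAD']].ffill()
print(df_filled)
```

Forward filling only makes sense when the gap is a non-trading day; a genuinely missing quote on a trading day would be better flagged than filled.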

6. [10 points] Below is a table storing the Market Value (MV), DV01 and DV01 Convexity for a Fixed Income portfolio, FICA. However, there are a few missing values for the DV01 Convexity column. How would you come up with an estimation for the missing DV01 Convexity values? How do you validate your method? You are welcome to try multiple methods for bonus points. Describe your thought process and the intent here is to test “number literacy” under a business context.

In [84]:
# Prepare Dataset
df_FI = pd.DataFrame(columns=['Date', 'Portfolio', 'MV', 'DV01', 'DV01Convexity'])
df_FI.loc[len(df_FI)] = {'Date': '10/27/2023', 'Portfolio': 'FICA', 'MV': 67454088, 'DV01': 158800}
df_FI.loc[len(df_FI)] = {'Date': '10/27/2023', 'Portfolio': 'FICA', 'MV': 89304605, 'DV01': 230212}
df_FI.loc[len(df_FI)] = {'Date': '10/27/2023', 'Portfolio': 'FICA', 'MV': 113440499, 'DV01': 252102}
df_FI.loc[len(df_FI)] = {'Date': '10/27/2023', 'Portfolio': 'FICA', 'MV': 185055847, 'DV01': 404225,
                         'DV01Convexity': 1042}
df_FI.loc[len(df_FI)] = {'Date': '10/27/2023', 'Portfolio': 'FICA', 'MV': 182724762, 'DV01': 398461,
                         'DV01Convexity': 1029}
df_FI.loc[len(df_FI)] = {'Date': '10/27/2023', 'Portfolio': 'FICA', 'MV': 182731405, 'DV01': 398644,
                         'DV01Convexity': 1029}
df_FI.loc[len(df_FI)] = {'Date': '10/27/2023', 'Portfolio': 'FICA', 'MV': 183099003, 'DV01': 399745,
                         'DV01Convexity': 1032}
df_FI.sort_values(by=['MV'], inplace=True)
df_FI

Unnamed: 0,Date,Portfolio,MV,DV01,DV01Convexity
0,10/27/2023,FICA,67454088,158800,
1,10/27/2023,FICA,89304605,230212,
2,10/27/2023,FICA,113440499,252102,
4,10/27/2023,FICA,182724762,398461,1029.0
5,10/27/2023,FICA,182731405,398644,1029.0
6,10/27/2023,FICA,183099003,399745,1032.0
3,10/27/2023,FICA,185055847,404225,1042.0


In [98]:

# Sample data with DV01 Convexity
dfDataforModel = df_FI.dropna()

LR = LinearRegression()
X = dfDataforModel[['MV', 'DV01']]
y = dfDataforModel['DV01Convexity']

LR.fit(X, y)

# Rows where DV01Convexity is missing
dfDataforPrediction = df_FI[df_FI['DV01Convexity'].isna()]
dfDataforPrediction = dfDataforPrediction[['MV', 'DV01']]

# Predict the missing DV01Convexity (kept under its own name so the feature frame is not overwritten)
predicted_convexity = LR.predict(dfDataforPrediction)

predicted_convexity

array([492.96554092, 661.39280282, 700.80135567])
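
The regression above is fitted on four nearly identical rows and then extrapolated to market values less than half as large, so a simpler cross-check is useful. In the four complete rows, DV01Convexity divided by DV01 is almost constant (about 0.00258), which suggests scaling DV01 by that ratio. Leave-one-out errors on the known rows give a rough validation. A minimal sketch, with hypothetical variable names:

```python
import numpy as np

# Rows with known DV01Convexity (from the table above)
dv01_known = np.array([398461, 398644, 399745, 404225], dtype=float)
conv_known = np.array([1029, 1029, 1032, 1042], dtype=float)
# Rows where DV01Convexity is missing
dv01_missing = np.array([158800, 230212, 252102], dtype=float)

# Ratio method: convexity is assumed proportional to DV01
ratios = conv_known / dv01_known
estimates = ratios.mean() * dv01_missing

# Leave-one-out check: predict each known row from the other three
loo_errors = np.array([
    np.delete(ratios, i).mean() * dv01_known[i] - conv_known[i]
    for i in range(len(ratios))
])

print(ratios.round(6))
print(estimates.round(1))
print(np.abs(loo_errors).max().round(2))
```

The ratio estimates come out noticeably lower than the regression output for the smallest row, which is a sign that the two-feature regression is unstable when MV and DV01 are almost collinear and the prediction lies far outside the fitted range.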

7. [15 points] What is data normalization in database design? Why and when do you normalize data? Make sure you include an example to illustrate.

What is data normalization in database design?
Normalization organizes the fields and tables of a database to reduce redundancy and improve data integrity.

Why Normalize Data?
- It avoids duplicate information, which saves storage space and ensures that changes to data are reflected consistently across the database.
- Well-structured tables reduce the complexity of database operations and can improve the performance of the system, particularly for inserts, updates and deletes.

When to Normalize Data
Normalization is typically beneficial in the following scenarios:
- The dataset is structured: when the data is organized into tables with rows and columns, normalization helps reduce redundancy and improve data integrity.
- The data is subject to frequent updates.
- The data is shared across multiple tables.


In [99]:
# Example
data = {
    'OrderID': [1, 2, 3, 4],
    'CustomerName': ['Alice', 'Bob', 'Alice', 'Charlie'],
    'Product': ['Apple', 'Banana', 'Cherry', 'Date'],
    'Quantity': [5, 10, 15, 20],
    'Price': [1.0, 0.5, 2.0, 0.25]
}
df = pd.DataFrame(data)
df

Unnamed: 0,OrderID,CustomerName,Product,Quantity,Price
0,1,Alice,Apple,5,1.0
1,2,Bob,Banana,10,0.5
2,3,Alice,Cherry,15,2.0
3,4,Charlie,Date,20,0.25


In [87]:
# Normalized Tables
df_customers = pd.DataFrame({'CustomerID': [1, 2, 3], 'CustomerName': ['Alice', 'Bob', 'Charlie']})
df_products = pd.DataFrame({'ProductID': [1, 2, 3, 4], 'Product': ['Apple', 'Banana', 'Cherry', 'Date']})
df_orders = pd.DataFrame(
    {'OrderID': [1, 2, 3, 4], 'CustomerID': [1, 2, 1, 3], 'ProductID': [1, 2, 3, 4], 'Quantity': [5, 10, 15, 20],
     'Price': [1.0, 0.5, 2.0, 0.25]})


Customer Table (df_customers) with CustomerID as the primary key

In [88]:
df_customers

Unnamed: 0,CustomerID,CustomerName
0,1,Alice
1,2,Bob
2,3,Charlie


Product Table (df_products) with ProductID as the primary key

In [89]:
df_products

Unnamed: 0,ProductID,Product
0,1,Apple
1,2,Banana
2,3,Cherry
3,4,Date


Orders Table (df_orders) with OrderID as the primary key and CustomerID and ProductID as foreign keys

In [90]:
df_orders

Unnamed: 0,OrderID,CustomerID,ProductID,Quantity,Price
0,1,1,1,5,1.0
1,2,2,2,10,0.5
2,3,1,3,15,2.0
3,4,3,4,20,0.25


Once the data is normalized, we can establish relationships between the tables using foreign keys.

We can create views to get the same information as the original table by joining the normalized tables.
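
As a sketch of such a view in pandas, the normalized tables can be joined back along their foreign keys to reproduce the original flat table (`df_view` is an illustrative name):

```python
import pandas as pd

df_customers = pd.DataFrame({'CustomerID': [1, 2, 3], 'CustomerName': ['Alice', 'Bob', 'Charlie']})
df_products = pd.DataFrame({'ProductID': [1, 2, 3, 4], 'Product': ['Apple', 'Banana', 'Cherry', 'Date']})
df_orders = pd.DataFrame({'OrderID': [1, 2, 3, 4], 'CustomerID': [1, 2, 1, 3],
                          'ProductID': [1, 2, 3, 4], 'Quantity': [5, 10, 15, 20],
                          'Price': [1.0, 0.5, 2.0, 0.25]})

# Follow the foreign keys from orders to customers and products
df_view = (
    df_orders
    .merge(df_customers, on='CustomerID', how='left')
    .merge(df_products, on='ProductID', how='left')
    [['OrderID', 'CustomerName', 'Product', 'Quantity', 'Price']]
)
print(df_view)
```

A customer's name now lives in one row of `df_customers`, so renaming a customer is a single update instead of one per order.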

#### [Section 4 – Value-at-Risk]

8. [40 points] You are mandated to construct a portfolio of equities (S&P 500 and S&P/TSX indices), S&P GSCI Index, Gold, and US 10Y Treasuries and CA 10Y Treasuries. The asset mix is not determined here. (1) Keep in mind that you are a Canadian investor and you need to make up your decision on currency exposure. Explain how you arrive at your currency exposure decision and how you would implement your target currency exposure. (2) Build a process to calculate the 1-day Value-at-Risk (VaR) of the portfolio for any given asset mix. For Treasuries, you can use a simple linear approximation to calculate PnLs. (3) In your VaR calculation process, decompose VaR by the products, i.e., calculate the contributional VaR’s from the products, as well as the incremental VaR’s. Download market data from Yahoo Finance or any other reputable vendors. This can be done in code of your preferred programming language(s) or in Excel.

1) Currency Exposure Decision:

The currency exposure decision depends on the economic outlook. The current outlook indicates that the US economy is stronger than the Canadian economy.
Canada is expected to cut rates sooner than the US, which could lead to a weaker Canadian dollar.

Based on the above, I would hold a higher exposure to the US dollar than to the Canadian dollar.
To implement this, I can invest more in US assets and leave the currency exposure on USD-denominated commodities such as Gold and the S&P GSCI Index unhedged.
FX exposure can then be tilted with currency futures or forwards.
To reach the target currency exposure, I would first calculate the currency exposure without hedging and then use currency futures or forwards to adjust the exposure to the desired level.
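
The last step can be sketched as simple arithmetic. The asset mix, portfolio size and target exposure below are hypothetical values chosen for illustration.

```python
# Illustrative asset mix (assumption)
weights = {'SPX': 0.2, 'TSX': 0.2, 'SPGSCI': 0.2, 'Gold': 0.2, 'US10Y': 0.1, 'CA10Y': 0.1}
# Assets whose returns are earned in USD
usd_denominated = {'SPX', 'SPGSCI', 'Gold', 'US10Y'}

portfolio_value_cad = 1_000_000_000   # hypothetical portfolio size
target_usd_exposure = 0.49            # hypothetical target, as a share of the portfolio

unhedged_usd_exposure = sum(w for name, w in weights.items() if name in usd_denominated)

# Share of the USD exposure to sell forward, and the forward notional in CAD terms
hedge_ratio = 1 - target_usd_exposure / unhedged_usd_exposure
forward_notional_cad = hedge_ratio * unhedged_usd_exposure * portfolio_value_cad

print(round(unhedged_usd_exposure, 2), round(hedge_ratio, 2), round(forward_notional_cad))
```

With these inputs, 70% of the portfolio is in USD assets, so a 49% target means selling forward 30% of the USD exposure, or about CAD 210 million of notional.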


2) VaR Calculation Process:

In [107]:
# Returns calculation
Q8_DATA = DATA[(DATA.index >= '2019-12-31') & (DATA.index <= '2023-12-31')].copy()
Q8_DATA['US10Y_%'] = Q8_DATA['US10Y_%'].diff() / 100
Q8_DATA['CA10Y_%'] = Q8_DATA['CA10Y_%'].diff() / 100
Q8_DATA[['SPX_US$', 'TSX_CAD$', 'SPGSCI_USD$', 'Gold_USD$', 'USDCAD_CAD$']] = Q8_DATA[
    ['SPX_US$', 'TSX_CAD$', 'SPGSCI_USD$', 'Gold_USD$', 'USDCAD_CAD$']].pct_change()
Q8_DATA.dropna(inplace=True)
# Approximate Treasury price returns linearly from yield changes, assuming a constant modified duration of 8
modified_duration = 8
Q8_DATA['US10Y_%'] = -Q8_DATA['US10Y_%'] * modified_duration
Q8_DATA['CA10Y_%'] = -Q8_DATA['CA10Y_%'] * modified_duration

# Hedge ratio used to adjust portfolio-level currency exposure to the target
hedge_ratio = 0.3

# Convert USD returns to CAD returns, keeping (1 - hedge_ratio) of the USD/CAD move. Hedging applies to the principal amount only.
Q8_DATA['US10Y_CAD%'] = Q8_DATA['US10Y_%'] + Q8_DATA['USDCAD_CAD$'] * (1 - hedge_ratio)
Q8_DATA['SPX_CAD$'] = Q8_DATA['SPX_US$'] + Q8_DATA['USDCAD_CAD$'] * (1 - hedge_ratio)
Q8_DATA['SPGSCI_CAD$'] = Q8_DATA['SPGSCI_USD$'] + Q8_DATA['USDCAD_CAD$'] * (1 - hedge_ratio)
Q8_DATA['Gold_CAD$'] = Q8_DATA['Gold_USD$'] + Q8_DATA['USDCAD_CAD$'] * (1 - hedge_ratio)

Q8_RETURNS = Q8_DATA[['SPX_CAD$', 'TSX_CAD$', 'SPGSCI_CAD$', 'Gold_CAD$', 'US10Y_CAD%', 'CA10Y_%']]


In [101]:
def cal_historical_var(pf_returns, pf_weights, var_confidence=0.95):
    # Calculate portfolio returns
    pf_returns = pf_returns.dot(pf_weights)

    # Calculate VaR
    var = np.percentile(pf_returns, 100 * (1 - var_confidence))
    return var

In [102]:
PF_WEIGHTS = np.array([0.2, 0.2, 0.2, 0.2, 0.1, 0.1])
VAR_CONFIDENCE = 0.95

HISTORICAL_VAR = cal_historical_var(Q8_RETURNS, PF_WEIGHTS, VAR_CONFIDENCE)
HISTORICAL_VAR

-0.012388939869001482

In [103]:
# Calculate contribution to VaR
def cal_contribution_to_var(asset_returns, pf_weights, var):
    # Calculate portfolio returns
    pf_returns = asset_returns.dot(pf_weights)

    portfolio_std = pf_returns.std()
    marginal_VaR = asset_returns.cov().dot(pf_weights) / portfolio_std
    contributional_VaR = marginal_VaR * var / portfolio_std
    contributional_VaR = contributional_VaR / contributional_VaR.sum() * var

    return contributional_VaR


In [104]:
CONTRIBUTIONAL_VAR = cal_contribution_to_var(Q8_RETURNS, PF_WEIGHTS, HISTORICAL_VAR)
CONTRIBUTIONAL_VAR

SPX_CAD$      -0.002436
TSX_CAD$      -0.003993
SPGSCI_CAD$   -0.004159
Gold_CAD$     -0.002373
US10Y_CAD%     0.000461
CA10Y_%        0.000111
dtype: float64
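
For reference, the usual variance-based (Euler) decomposition allocates the portfolio VaR by each asset's share of portfolio variance, w_i (Σw)_i / σ_p², so the weight w_i enters explicitly. The function above normalizes (Σw)_i without that weight factor, which matters when the weights are unequal. A minimal sketch on simulated returns (the data, the weights and the name `component_var` are illustrative assumptions):

```python
import numpy as np
import pandas as pd

def component_var(asset_returns, weights, var):
    """Allocate a portfolio VaR to assets in proportion to w_i * (Sigma w)_i."""
    cov = asset_returns.cov().to_numpy()
    cov_w = cov @ weights                             # covariance of each asset with the portfolio
    portfolio_variance = weights @ cov_w
    shares = weights * cov_w / portfolio_variance     # sums to 1 by construction
    return pd.Series(shares * var, index=asset_returns.columns)

# Simulated daily returns for three assets (illustrative only)
rng = np.random.default_rng(0)
returns = pd.DataFrame(rng.normal(0, [0.01, 0.012, 0.004], size=(1000, 3)),
                       columns=['A', 'B', 'C'])
weights = np.array([0.5, 0.3, 0.2])

portfolio_returns = returns.to_numpy() @ weights
var_95 = np.percentile(portfolio_returns, 5)   # 1-day 95% historical VaR, as a negative return

contributions = component_var(returns, weights, var_95)
print(contributions)
print(contributions.sum(), var_95)
```

The contributions add up to the portfolio VaR, and an asset with zero weight gets a zero contribution, which is a quick sanity check on any decomposition.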

In [105]:
def cal_incremental_var(asset_returns, pf_weights, var_confidence=0.95):
    incremental_VaR = np.zeros(len(pf_weights))
    pf_var = cal_historical_var(asset_returns, pf_weights, var_confidence=var_confidence)
    for i in range(len(pf_weights)):
        new_weights = pf_weights.copy()
        new_weights[i] = 0
        new_weights = new_weights / new_weights.sum()
        incremental_VaR[i] = pf_var - cal_historical_var(asset_returns, new_weights, var_confidence=var_confidence)
    df = pd.DataFrame({'Asset': asset_returns.columns, 'Incremental VaR': incremental_VaR})
    return df

In [106]:
INCREMENTAL_VAR = cal_incremental_var(Q8_RETURNS, PF_WEIGHTS, VAR_CONFIDENCE)
INCREMENTAL_VAR

Unnamed: 0,Asset,Incremental VaR
0,SPX_CAD$,0.000864
1,TSX_CAD$,-0.000831
2,SPGSCI_CAD$,-0.001273
3,Gold_CAD$,0.000504
4,US10Y_CAD%,0.001891
5,CA10Y_%,0.001461


9. [10 points] Following the setup in Q8 above, how would you calculate the 10-day VaR? What about 1-year VaR? Here, you don’t have to do the actual work but you are expected to put down your thought process and explain your assumptions and/or new assumptions and/or breaking assumptions, etc. On the other hand, you are more than welcome to work it out for bonus points.

To calculate the 10-day VaR, we can first compute non-overlapping 10-day returns for each asset in the portfolio. Then we can calculate the portfolio return for each 10-day period and calculate the VaR from these returns. The cost is sample size: four years of daily data give only about 100 non-overlapping 10-day periods.

To calculate the 1-year VaR, non-overlapping windows would leave only a handful of observations, so we can use rolling (overlapping) 1-year returns for each asset instead. Overlapping windows are heavily autocorrelated, so we then need to adjust for that autocorrelation to avoid overestimating VaR. Finally, we can calculate the portfolio return for each 1-year window and calculate the VaR from these returns.
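
The non-overlapping approach can be sketched as follows, using simulated i.i.d. daily portfolio returns as a stand-in for the Q8 portfolio series (`horizon_var` and the simulated data are assumptions for illustration). The square-root-of-time scaling of the 1-day figure is printed alongside for comparison.

```python
import numpy as np

def horizon_var(daily_returns, horizon, confidence=0.95):
    """Historical VaR from non-overlapping `horizon`-day compounded returns."""
    r = np.asarray(daily_returns, dtype=float)
    n_blocks = len(r) // horizon
    # Drop the incomplete tail, then compound each block of `horizon` daily returns
    blocks = r[:n_blocks * horizon].reshape(n_blocks, horizon)
    block_returns = np.prod(1.0 + blocks, axis=1) - 1.0
    return np.percentile(block_returns, 100 * (1 - confidence))

# Simulated i.i.d. daily portfolio returns (illustrative assumption)
rng = np.random.default_rng(0)
daily = rng.normal(0.0, 0.01, size=10_000)

var_1d = horizon_var(daily, 1)
var_10d = horizon_var(daily, 10)
print(var_1d, var_10d, var_1d * np.sqrt(10))
```

On i.i.d. data the two 10-day figures should be close; on real returns with volatility clustering or autocorrelation they can differ, which is exactly the assumption being tested.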


10. [5 points] Say, you are given an estimate of 3-month VaR of $3.6B. How would you estimate the 1-year VaR without additional details? State your assumption(s).

1-year VaR = 3-month VaR * sqrt(4) = 3.6B * 2 = 7.2B

Assumptions:
- returns are normally distributed
- constant volatility over the time period
- returns are independent and identically distributed (i.i.d.) over time



#### [Section 5 – Quantitative Finance]

11. [5 points] How much would you pay for a call option on a single name equity with infinity volatility? Explain your answer.

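The current price of the underlying, S.

In the Black-Scholes formula, C = S * N(d1) - PV(K) * N(d2). With infinite volatility, d1 goes to +infinity and d2 goes to -infinity, so N(d1) goes to 1 and N(d2) goes to 0, and the call price tends to S. A call can never be worth more than the stock itself, so S is also its upper bound.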
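
The limit can be checked numerically under Black-Scholes: as volatility grows, N(d1) tends to 1 and N(d2) tends to 0, so the call price approaches the spot price. A minimal sketch with illustrative inputs (spot, strike, rate and maturity are assumptions):

```python
import numpy as np
from scipy.stats import norm

def bs_call(spot, strike, rate, maturity, vol):
    """Black-Scholes price of a European call on a non-dividend-paying stock."""
    d1 = (np.log(spot / strike) + (rate + 0.5 * vol**2) * maturity) / (vol * np.sqrt(maturity))
    d2 = d1 - vol * np.sqrt(maturity)
    return spot * norm.cdf(d1) - strike * np.exp(-rate * maturity) * norm.cdf(d2)

# Illustrative inputs (assumptions)
spot, strike, rate, maturity = 100.0, 120.0, 0.03, 1.0
for vol in (0.2, 1.0, 5.0, 50.0):
    print(vol, round(bs_call(spot, strike, rate, maturity, vol), 4))
```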


12. [5 points] How would you estimate the volatility of S&P 500 index? Based on the historical data from Q2 above, what’s your estimate? Explain your answer.

In [67]:
DATA_Q12 = DATA[(DATA.index >= '2019-12-31') & (DATA.index <= '2023-12-31')][['SPX_US$']].copy()
DATA_Q12['SPX_US$_logDiff'] = np.log(DATA_Q12['SPX_US$'] / DATA_Q12['SPX_US$'].shift(1))
daily_volatility = DATA_Q12['SPX_US$_logDiff'].std()

# There are approximately 252 trading days in a year
annualized_volatility = daily_volatility * np.sqrt(252)
annualized_volatility

0.23391149064378153

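The estimate is the standard deviation of daily log returns, scaled to an annual figure by multiplying by sqrt(252). On the 2019-12-31 to 2023-12-31 sample this gives an annualized volatility of about 23.4%.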

13. [5 points] Estimate the default probability of a credit name quoted with a 6M CDS spread of 50 basis points assuming 50% recovery rate. Explain your answer.

cds spread ≈ (1 - recovery rate) * hazard rate

The hazard rate is the annualized default intensity, which is approximately the default probability per year when it is small.

hazard rate = cds spread / (1 - recovery rate) = 0.5% / (1 - 0.5) = 1%

6M default probability ≈ 1% * 0.5 = 0.5%
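
The same arithmetic in code, alongside the exponential-survival version 1 - exp(-h * t) that the linear figure approximates. This is a sketch under a flat hazard rate, which is an assumption:

```python
import math

spread = 0.0050        # 50 bp CDS spread
recovery = 0.50        # recovery rate given in the question
horizon_years = 0.5    # 6 months

# Credit-triangle approximation: spread ≈ (1 - recovery) * hazard rate
hazard_rate = spread / (1 - recovery)

# Linear approximation and the exponential-survival version
pd_linear = hazard_rate * horizon_years
pd_exponential = 1 - math.exp(-hazard_rate * horizon_years)

print(f"{hazard_rate:.4%} {pd_linear:.4%} {pd_exponential:.4%}")
```

At a hazard rate this small the two versions differ only in the fourth decimal place of the percentage.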


14. [15 points] What would the delta of a put option on ETF SPY move if SPY moves up? What purpose would a put option on SPY serve in a portfolio like OTPP? Is there a way to reduce the premium cost while not losing the purpose?

- The delta of a put option is always negative. As the price of the underlying increases, the put becomes less ITM or more OTM, so its delta increases toward zero (it becomes less negative).

- A put on SPY provides downside protection. Strategies to reduce the premium: a put spread (also sell a lower-strike put, which caps the protection) or a collar (also sell an OTM call, which gives up some upside).
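
The behaviour of the put delta can be illustrated with the Black-Scholes formula, where the delta of a European put without dividends is N(d1) - 1. The strike, maturity, volatility and rate below are illustrative assumptions, and dividends are ignored for simplicity:

```python
import numpy as np
from scipy.stats import norm

def bs_put_delta(spot, strike, rate, maturity, vol):
    """Black-Scholes delta of a European put (no dividends): N(d1) - 1."""
    d1 = (np.log(spot / strike) + (rate + 0.5 * vol**2) * maturity) / (vol * np.sqrt(maturity))
    return norm.cdf(d1) - 1.0

# Illustrative inputs (assumptions): strike 450, 3 months, 20% vol, 3% rate
spots = np.array([400.0, 425.0, 450.0, 475.0, 500.0])
deltas = bs_put_delta(spots, strike=450.0, rate=0.03, maturity=0.25, vol=0.20)
print(deltas.round(3))
```

The deltas stay between -1 and 0 and rise toward 0 as the spot price increases, so the hedge provided by the put weakens as the market rallies.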

15. [10 points] A colleague of yours provided an estimate of risk change of 5.8B risk off for OTPP’s total portfolio upon a transaction where a 1.7B credit bond exposure is sold and a 1.7B equity exposure is bought for tactical allocation purposes. Do you think this is a sensible estimate? How would you suggest validating the result? Explain your answer.

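Most likely wrong. Credit bonds are generally less volatile than equities, unless they are high-yield bonds with very high default risk, and such bonds are unlikely to be used for tactical allocation. Selling credit to buy equity should therefore be risk on rather than risk off. The size also looks off: 5.8B is more than three times the 1.7B being traded.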

#### [Section 6 – Artificial Intelligence]

16. [10 points] Describe an approach for an organization such as OTPP or your current workplace to utilize ChatGPT or other LLM models for extracting and understanding internal organizational data in the form of PDFs, Excel Tables and Database Tables, in terms of specific techniques/implementation? Enumerate key methods for utilizing LLM(s) for such purposes, and point out available models and their pros and cons. This is a bonus question and candidates are not required to answer it.

Store the data in vectorized form (embeddings), and have the LLM retrieve the relevant content and answer from that data first rather than from what it learned in training.

Github: https://github.com/krocellx/case