In [86]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import scipy
import os
import statsmodels.api as sm
import scipy.stats as stats
import statsmodels.formula.api as smf
    
import portfolio_management_helper as pmh

# Final Exam

## FINM 36700 - 2023

### UChicago Financial Mathematics

* Mark Hendricks
* hendricks@uchicago.edu

# Instructions

## Please note the following:

Points
* The exam is 155 points.
* You have 180 minutes to complete the exam.
* For every minute late you submit the exam, you will lose one point.


Submission
* You will upload your solution to the `Final Exam` assignment on Canvas, where you downloaded this. (Be sure to **submit** on Canvas, not just **save** on Canvas.)
* Your submission should be readable (the graders must be able to understand your answers), and it should **include all code used in your analysis in a file format from which the code can be executed.**

Rules
* The exam is open-material, closed-communication.
* You do not need to cite material from the course github repo--you are welcome to use the code posted there without citation.

Advice
* If you find any question to be unclear, state your interpretation and proceed. We will only answer questions of interpretation if there is a typo, error, etc.
* The exam will be graded for partial credit.

## Data

**All data files are found in the class github repo, in the `data` folder.**

This exam makes use of the following data files:
* `final_exam_data.xlsx`

This file has sheets for...
* `portfolio` (weekly) - Part 2
* `forecasting` (monthly) - Part 3
* `fx_carry` (daily) - Part 4

## Scoring

| Problem | Points |
|---------|--------|
| 1       | 40     |
| 2       | 25     |
| 3       | 50     |
| 4       | 40     |

In [87]:
dfs_raw = pd.read_excel(r"final_exam_data.xlsx" , sheet_name=None)
for key in dfs_raw.keys():
    print(f"{key}: {dfs_raw[key].shape}")
    
# ticker_mapping = {tick: name 
#                   for tick, name in zip(dfs_raw['descriptions'].iloc[:, 0], 
#                                                 dfs_raw['descriptions'].iloc[:, 1])}

# ticker_mapping


portfolio: (392, 8)
forecasting: (175, 4)
fx_carry: (1380, 4)


### Each numbered question is worth 5 points unless otherwise specified.

### Notation
(Hidden LaTeX commands)

$$\newcommand{\betamkt}{\beta^{i,\text{MKT}}}$$
$$\newcommand{\betahml}{\beta^{i,\text{HML}}}$$
$$\newcommand{\betaumd}{\beta^{i,\text{UMD}}}$$
$$\newcommand{\Eri}{E\left[\tilde{r}^{i}\right]}$$
$$\newcommand{\Emkt}{E\left[\tilde{r}^{\text{MKT}}\right]}$$
$$\newcommand{\Ehml}{E\left[\tilde{r}^{\text{HML}}\right]}$$
$$\newcommand{\Eumd}{E\left[\tilde{r}^{\text{UMD}}\right]}$$

$$\newcommand{\frn}{\text{MXN}}$$
$$\newcommand{\frnrate}{\text{MXSTR}}$$
$$\newcommand{\FXspot}{S}$$
$$\newcommand{\fxspot}{\texttt{s}}$$
$$\newcommand{\rflogusd}{\texttt{r}^{\text{USD}}}$$
$$\newcommand{\rflogfrn}{\texttt{r}^{\frn}}$$

$$\newcommand{\wintt}{t,t+1}$$

$$\newcommand{\targ}{\text{USO}}$$

# 1. Short Answer

#### No Data Needed

These problems do not require any data file. Rather, analyze them conceptually. 

### 1.

Consider a Linear Factor Pricing Model (LFPM).

Which metric do we examine to understand its fit, (or errors)...
* given the estimated **time-series (TS)** test?
* given the estimated **cross-sectional (CS)** test?

* **time-series (TS)**: Significance of intercepts
* **cross-sectional (CS)**: $R^2$

### 2.

Consider the Arbitrage Pricing Theory (APT). Is it fair to say that it is more likely to work for sets of assets with low cross-correlation? Why or why not?

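Yes, because the APT assumes that the factor-model residuals are uncorrelated across assets (the factors themselves need not be uncorrelated), and this is more plausible for assets with low cross-correlation. todo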

### 3.

In constructing momentum portfolios, we discussed selecting the top and bottom 10% of stocks, ranked by past returns. How do you think the strategy would be impacted if we were more extreme in the selection, and went long-short just the top / bottom 1% of total stocks?

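We would have much more turnover in the strategy and thus incur much higher trading costs. Each leg would also hold far fewer stocks, so the portfolio would be less diversified and carry more idiosyncratic volatility. Because momentum's returns net of trading costs are already modest, going this extreme would likely make the strategy unprofitable.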

### 4.

Over longer horizons, do investments have higher Sharpe ratios? How is this issue relevant to long-term asset allocators such as Barnstable?

todo

### 5.

Before it crashed, how did LTCM's performance compare to the S&P (SPY)? Was it an attractive investment? Be specific.

todo

### 6.

Suppose investors are **not** mean-variance investors. If we find an investment with a Sharpe ratio higher than the "market", would this be inconsistent with the CAPM?

todo

### 7.

What causes us concern about the performance of classic mean-variance optimization out-of-sample?

What is one of the potential solutions we discussed?

todo

### 8.

True or False: Uncovered Interest Parity implies Covered Interest Parity, but not vice-versa.

Explain.

todo

***

# 2. Optimization

Use the data found in the `portfolio` tab. It is weekly data.

In [88]:
df_portfolio = dfs_raw["portfolio"].set_index("date")
annual_factor = 52  # Weeks

df_portfolio.head()

date,SPY,BTC,USO,TLT,IEF,IYR,GLD
2016-01-15,-0.0214,-0.1439,-0.1031,0.0194,0.0078,-0.0307,-0.0151
2016-01-22,0.0144,-0.0214,0.0546,-0.0033,-0.0011,0.0101,0.0088
2016-01-29,0.0168,-0.0101,0.041,0.0156,0.0115,0.0105,0.0186
2016-02-05,-0.0298,0.0258,-0.0767,0.0131,0.0073,-0.0268,0.0502
2016-02-12,-0.007,-0.0085,-0.0651,0.0216,0.0076,-0.042,0.0538


### 1.

Assume the provided data is in terms of **excess** returns.

Report the weights of the tangency portfolio.

Report the weights of the MV portfolio which achieves a mean weekly return of `0.0025`.

In [89]:
pmh.calc_target_ret_weights(0.0025, df_portfolio).iloc[:, :2]

Asset,Target 0.25% Weights,Tangency Weights
SPY,0.499,0.8576
BTC,0.0749,0.1417
USO,-0.0216,-0.0431
TLT,-0.245,-0.0405
IEF,0.7575,0.1674
IYR,-0.2536,-0.4408
GLD,0.1888,0.3577
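
For reference, a minimal sketch of the textbook formulas behind these two portfolios when the data are excess returns. The function names and the synthetic data are illustrative assumptions, not the internals of `portfolio_management_helper`. With excess returns, the target-mean MV portfolio is a rescaling of the tangency portfolio, so its weights need not sum to one (the remainder sits in the risk-free asset).

```python
import numpy as np
import pandas as pd

def tangency_weights(excess_returns: pd.DataFrame) -> pd.Series:
    """Tangency weights: Sigma^{-1} mu, rescaled to sum to one."""
    mu = excess_returns.mean().values
    sigma = excess_returns.cov().values
    raw = np.linalg.solve(sigma, mu)
    return pd.Series(raw / raw.sum(), index=excess_returns.columns)

def target_mean_weights(excess_returns: pd.DataFrame, target: float) -> pd.Series:
    """Rescale the tangency portfolio so its sample mean excess return equals `target`."""
    w_tan = tangency_weights(excess_returns)
    mean_tan = excess_returns.mean() @ w_tan
    return w_tan * (target / mean_tan)

# Synthetic weekly excess returns, for illustration only (not the exam data).
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    rng.normal(loc=0.002, scale=0.02, size=(300, 3)),
    columns=["A", "B", "C"],
)

w_tan = tangency_weights(demo)
w_target = target_mean_weights(demo, 0.0025)
print(w_tan.round(4))
print(w_target.round(4))
```

The rescaled portfolio has the same Sharpe ratio as the tangency portfolio; only its leverage changes.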


### 2.

Assume the provided data is in terms of **total** returns.

Report the weights of the GMV portfolio

Report the weights of the MV portfolio which achieves a mean weekly return of `0.0025`.

In [90]:
pmh.calc_gmv_weights(df_portfolio)

Asset,GMV Weights
SPY,0.0921
BTC,-0.001
USO,0.0027
TLT,-0.477
IEF,1.427
IYR,-0.0411
GLD,-0.0028


In [91]:
pmh.calc_target_ret_weights(0.0025, df_portfolio).iloc[:, :1]

Asset,Target 0.25% Weights
SPY,0.499
BTC,0.0749
USO,-0.0216
TLT,-0.245
IEF,0.7575
IYR,-0.2536
GLD,0.1888
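
A minimal sketch of the total-returns case, under the assumption that there is no risk-free asset and the weights must sum to one. The target-mean portfolio is then a mix of two frontier portfolios: the GMV portfolio and the portfolio proportional to $\Sigma^{-1}\mu$. The function names and synthetic data are illustrative, not taken from the helper library.

```python
import numpy as np
import pandas as pd

def gmv_weights(returns: pd.DataFrame) -> pd.Series:
    """Global minimum-variance weights: Sigma^{-1} 1, rescaled to sum to one."""
    sigma = returns.cov().values
    raw = np.linalg.solve(sigma, np.ones(len(sigma)))
    return pd.Series(raw / raw.sum(), index=returns.columns)

def mu_direction_weights(returns: pd.DataFrame) -> pd.Series:
    """Sigma^{-1} mu rescaled to sum to one (the second frontier fund)."""
    raw = np.linalg.solve(returns.cov().values, returns.mean().values)
    return pd.Series(raw / raw.sum(), index=returns.columns)

def fully_invested_target_weights(returns: pd.DataFrame, target: float) -> pd.Series:
    """Mix the two funds so the sample mean equals `target` and weights sum to one."""
    mu = returns.mean()
    w_gmv = gmv_weights(returns)
    w_mu = mu_direction_weights(returns)
    delta = (target - mu @ w_gmv) / (mu @ w_mu - mu @ w_gmv)
    return delta * w_mu + (1 - delta) * w_gmv

# Synthetic weekly total returns, for illustration only (not the exam data).
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    rng.normal(loc=0.002, scale=0.02, size=(300, 3)),
    columns=["A", "B", "C"],
)

w_gmv = gmv_weights(demo)
w_target = fully_invested_target_weights(demo, 0.0025)
print(w_gmv.round(4))
print(w_target.round(4))
```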


### 3.

Conceptually, what is the difference between the portfolios in part 1 and part 2?

Mathematically, what is the difference in their optimizations?

todo

### 4.

#### (10pts)

Consider the following:
* drop `BTC` from the sample
* target a weekly mean return of `.0025`.
* assume once again that the provided data is **excess** returns.

Using data only through 2021, 
* calculate the tangency weights
* compute the performance of this tangency portfolio in the out-of-sample (OOS) period of 2022-2023.

Report the
* mean
* vol
* Sharpe

Compare these three metrics with the equally-weighted portfolio for 2022-2023.

In [92]:
# pmh.calc_summary_statistics(
#     assets_excess_returns,
#     annual_factor=annual_factor,
#     keep_columns=['Annualized Vol', 'Annualized Mean', 'Annualized Sharpe']
# )
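
One possible shape for this out-of-sample comparison, sketched on synthetic data. The column names, the random returns, and the `performance` helper are assumptions for illustration; on the exam data one would first drop `BTC` and use `df_portfolio` in place of `demo`.

```python
import numpy as np
import pandas as pd

ANNUAL_FACTOR = 52  # weekly data

def performance(returns: pd.Series, annual_factor: int = ANNUAL_FACTOR) -> pd.Series:
    """Annualized mean, volatility, and Sharpe ratio of an excess-return series."""
    mean = returns.mean() * annual_factor
    vol = returns.std() * np.sqrt(annual_factor)
    return pd.Series(
        {"Annualized Mean": mean, "Annualized Vol": vol, "Annualized Sharpe": mean / vol}
    )

# Synthetic weekly excess returns, for illustration only (not the exam data).
rng = np.random.default_rng(1)
dates = pd.date_range("2016-01-15", "2023-12-29", freq="W-FRI")
demo = pd.DataFrame(
    rng.normal(0.002, 0.02, size=(len(dates), 4)), index=dates, columns=list("ABCD")
)

# Estimate on data through 2021, evaluate on 2022-2023.
in_sample = demo.loc[:"2021"]
oos = demo.loc["2022":]

# Tangency weights from in-sample moments, rescaled to the in-sample target mean.
raw = np.linalg.solve(in_sample.cov().values, in_sample.mean().values)
w_tan = pd.Series(raw / raw.sum(), index=demo.columns)
w_tan = w_tan * (0.0025 / (in_sample.mean() @ w_tan))

# Equal-weighted benchmark.
w_eq = pd.Series(1 / demo.shape[1], index=demo.columns)

summary = pd.DataFrame(
    {
        "tangency (OOS)": performance(oos @ w_tan),
        "equal weight (OOS)": performance(oos @ w_eq),
    }
).T
print(summary.round(4))
```

Rescaling to the target changes the mean and volatility but not the Sharpe ratio, so the Sharpe comparison with the equal-weighted portfolio does not depend on the target.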

***

# 3.

Forecast (total) returns on crude oil, as tracked by the ETF ticker $\targ$. This ETF holds crude oil.

As signals, use two interest rate signals, as seen in Treasury-notes. (No need to consider anything specific about Treasury notes, just read these as macroeconomic signals.)
* Tnote rate
* month-over-month change in the Tnote rate

Find all the data needed for this problem in the sheet `forecasting`.

In [93]:
df_forecast = dfs_raw["forecasting"].set_index("date")
annual_factor = 12  # Months

df_forecast.head()

date,USO,Tnote rate,Tnote rate change
2009-05-31,0.2714,3.465,0.341
2009-06-30,0.042,3.523,0.058
2009-07-31,-0.0295,3.501,-0.022
2009-08-31,-0.0206,3.401,-0.1
2009-09-30,0.0039,3.307,-0.094


### 1.

Estimate a forecasting regression of $\targ$ on the two (lagged) signals.

$$r_{t+1}^\targ = \alpha + \beta^{x}x_t + \beta^z z_t + \epsilon_{t+1}$$

where
* $x$ denotes the interest-rate signal.
* $z$ denotes the change in rate signal.

Report the r-squared, as well as the OLS estimates for the intercept and the two betas. (No need to annualize the stats.)

In [None]:
reg = pmh.calc_regression(
    df_forecast["USO"].shift(-1), 
    df_forecast[["Tnote rate", "Tnote rate change"]],
    annual_factor=annual_factor,
    keep_columns=['Alpha', 'Tnote rate Beta', 'Tnote rate change Beta', 'R-Squared'],
    warnings=False,
    # return_model=True,
    # return_fitted_values=True,
)
reg = reg.drop(columns=["Annualized Alpha"])
reg.transpose()

Statistic,USO
Alpha,0.0201
R-Squared,0.0281
Tnote rate Beta,-0.0096
Tnote rate change Beta,0.0741
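
The same lagged regression can be sketched without the helper library. This is a minimal illustration on synthetic data using plain least squares from NumPy; the column names `x`, `z`, `r` and the data-generating process are assumptions. The key step is pairing the signals at $t$ with the return at $t+1$.

```python
import numpy as np
import pandas as pd

# Synthetic monthly data, for illustration only (not the exam data).
rng = np.random.default_rng(2)
n = 120
demo = pd.DataFrame({"x": rng.normal(3.0, 0.5, n), "z": rng.normal(0.0, 0.1, n)})
demo["r"] = rng.normal(0.0, 0.08, n)
# Make returns depend weakly on last period's signals.
demo["r"] += (0.02 - 0.005 * demo["x"].shift(1) + 0.05 * demo["z"].shift(1)).fillna(0.0)

# Pair the signals at t with the return at t+1; the last row has no target.
demo["r_next"] = demo["r"].shift(-1)
data = demo.dropna()

# OLS with an intercept via least squares.
X = np.column_stack([np.ones(len(data)), data["x"], data["z"]])
y = data["r_next"].values
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef

estimates = pd.Series(coef, index=["alpha", "beta_x", "beta_z"])
r_squared = 1 - ((y - fitted) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print(estimates.round(4))
print(round(r_squared, 4))
```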


### 2.

Use your forecasted returns, $\hat{r}^{\targ}_{t+1}$ to build trading weights:

$$w_t = 0.50 + 50\;\hat{r}^{\targ}_{t+1}$$

(So the rule says to hold 50% in the ETF plus/minus 50x the forecast. Recall the forecast is a monthly percentage, so it is a small number.)

Calculate the return from implementing this strategy. Denote this as $r^x_t$.

Report the first and last 5 values.

In [None]:
r_hat = pmh.calc_regression(
    df_forecast["USO"].shift(-1), 
    df_forecast[["Tnote rate", "Tnote rate change"]],
    annual_factor=annual_factor,
    keep_columns=['Alpha', 'Tnote rate Beta', 'Tnote rate change Beta', 'R-Squared'],
    warnings=False,
    return_fitted_values=True,
    name_fitted_values="r_hat"
)
r_hat = r_hat["r_hat"]

w_t = .5 + 50 * r_hat

strat_return = w_t.shift(1) * df_forecast["USO"]
strat_return = strat_return.dropna()
display(strat_return)

date
2009-06-30    0.0467
2009-07-31   -0.0010
2009-08-31    0.0052
2009-09-30   -0.0019
2009-10-31   -0.0367
               ...  
2023-06-30    0.0217
2023-07-31    0.0534
2023-08-31    0.0034
2023-09-30    0.0034
2023-10-31   -0.0791
Length: 173, dtype: float64
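
The timing in this calculation is easy to get wrong, so here is a tiny worked example with made-up numbers. The weight set at $t$ from the forecast of $r_{t+1}$ earns the return realized at $t+1$, which is why the weights are shifted by one period before multiplying.

```python
import numpy as np
import pandas as pd

# Made-up numbers: r_hat[t] is the forecast, formed at t, of the return at t+1.
r_hat = pd.Series([0.01, -0.02, 0.00])
realized = pd.Series([0.10, 0.20, -0.10])

# Allocation rule: w_t = 0.50 + 50 * forecast.
weights = 0.5 + 50 * r_hat  # 1.0, -0.5, 0.5

# The weight chosen at t is applied to the return realized at t+1.
strategy = (weights.shift(1) * realized).dropna()
print(strategy)
```

Here the first strategy return is $1.0 \times 0.20 = 0.20$ and the second is $-0.5 \times -0.10 = 0.05$. The first date drops out because no weight was set before it.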

### 3.

Calculate the following (annualized) performance metrics for both the passive investment, $r^\targ$, as well as the strategy implemented in the previous problem, $r^x$.

* mean
* volatility
* max drawdown

In [96]:
pd.DataFrame([df_forecast["USO"], strat_return], index=["USO", "strategy"]).T

date,USO,strategy
2009-05-31,0.2714,
2009-06-30,0.0420,0.0467
2009-07-31,-0.0295,-0.0010
2009-08-31,-0.0206,0.0052
2009-09-30,0.0039,-0.0019
...,...,...
2023-07-31,0.1514,0.0534
2023-08-31,0.0258,0.0034
2023-09-30,0.0773,0.0034
2023-10-31,-0.0722,-0.0791


In [97]:
df_data = pd.DataFrame([df_forecast["USO"], strat_return], index=["USO", "strategy"])
df_data = df_data.T.dropna()

pmh.calc_summary_statistics(
    df_data,
    annual_factor=annual_factor,
    keep_columns=['Annualized Vol', 'Annualized Mean', 'Max Drawdown']
)

Assuming excess returns were provided to calculate Sharpe. If returns were provided (steady of excess returns), the column "Sharpe" is actually "Mean/Volatility"


Series,Annualized Mean,Annualized Vol,Max Drawdown
USO,-0.0208,0.3574,-0.9471
strategy,0.1701,0.3074,-0.6532
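
For readers without the helper library, a minimal sketch of the three statistics. It assumes one common definition of max drawdown (the largest peak-to-trough decline of cumulative wealth); the helper's exact convention may differ.

```python
import numpy as np
import pandas as pd

def annualized_mean(returns: pd.Series, annual_factor: int = 12) -> float:
    """Sample mean scaled by the number of periods per year."""
    return returns.mean() * annual_factor

def annualized_vol(returns: pd.Series, annual_factor: int = 12) -> float:
    """Sample standard deviation scaled by the square root of periods per year."""
    return returns.std() * np.sqrt(annual_factor)

def max_drawdown(returns: pd.Series) -> float:
    """Largest peak-to-trough decline of cumulative wealth (a negative number)."""
    wealth = (1 + returns).cumprod()
    running_peak = wealth.cummax()
    drawdown = wealth / running_peak - 1
    return drawdown.min()

# Made-up returns: wealth goes 1.10 -> 0.55 -> 0.66 -> 0.99.
example = pd.Series([0.10, -0.50, 0.20, 0.50])
print(max_drawdown(example))
```

In this example the peak is 1.10 and the trough is 0.55, so the max drawdown is a 50% loss.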


### 4.

#### (7pts)


Suppose we are assessing the returns to this active strategy, $r^x$, without knowing how it is generated. 

Use a regression (with an intercept) to report the optimal hedge ratio of passive $\targ$ to this active strategy. 

* Report the hedge ratio, being clear about whether you are going long or short $\targ$ in order to hedge.

* What is the mean return of the hedged active strategy?

In [None]:
reg = pmh.calc_regression(
    df_data["strategy"], 
    df_data[["USO"]],
    annual_factor=annual_factor,
    keep_columns=['USO Beta'],
    warnings=False,
    intercept=True,
    # return_model=True,
    # return_fitted_values=True,
)
hedge_ratio = -1 * reg.transpose().iloc[0,0]
hedge_ratio

-0.23513139304045447

I would short 0.235 units of USO for every unit of my strategy (that is, hold a position of -0.235 in USO).

In [99]:
(np.mean(df_data["strategy"]) + hedge_ratio * np.mean(df_data["USO"])) * 12

0.17498036679787982

The hedged strategy has an annualized mean return of about 17.5%.
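
A small sketch of why the two steps above are linked, on synthetic data (the series and coefficients are made up). When the regression includes an intercept, the mean of the hedged return equals the estimated intercept, so the hedged mean can be read directly from the regression.

```python
import numpy as np
import pandas as pd

# Synthetic monthly returns, for illustration only.
rng = np.random.default_rng(4)
passive = pd.Series(rng.normal(0.0, 0.10, 200))
active = 0.01 + 0.25 * passive + pd.Series(rng.normal(0.0, 0.05, 200))

# Regress active on passive with an intercept; polyfit returns [slope, intercept].
beta, alpha = np.polyfit(passive, active, deg=1)

# Hedge by holding -beta units of the passive asset per unit of the active strategy.
hedged = active - beta * passive

print(round(beta, 4), round(alpha, 4), round(hedged.mean(), 4))
```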

### 5.

#### (8pts)

For the rest of the problem, consider the out-of-sample (OOS) performance of the strategy.

Forecast values of $\targ$ for January 2018 through Dec 2023. (So we are using the data up until January 2018 as “burn-in” data.)
* Loop through time, estimating the forecast only using data through time $t$.
* At each step, calculate the next OOS forecast, $\hat{r}^{\targ}_{t+1}$.

Report the first and last 5 values of your OOS forecast, $\hat{r}^{\targ}_{t+1}$.

In [137]:
df_forecast.iloc[104]

USO                 0.0808
Tnote rate          2.7200
Tnote rate change   0.3150
Name: 2018-01-31 00:00:00, dtype: float64

In [161]:
# todo review

In [160]:
idx_start = 104
data = []
for idx in range(idx_start, df_forecast.shape[0]):
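    # Copy the expanding window so the shift below does not modify (or warn about) df_forecast.
    df_temp = df_forecast.iloc[:idx+1].copy()
    # Pair the signals at time s with the return at s+1; the last row has no target yet and is dropped.
    df_temp["USO"] = df_temp["USO"].shift(-1)
    df_temp = df_temp.dropna()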
    
    # Run reg
    reg = pmh.calc_regression(
        df_temp["USO"], 
        df_temp[["Tnote rate", "Tnote rate change"]],
        annual_factor=annual_factor,
        keep_columns=['Alpha', 'Tnote rate Beta', 'Tnote rate change Beta'],
        warnings=False,
    )
    alpha = reg["Alpha"].iloc[-1]
    beta = reg[["Tnote rate Beta", "Tnote rate change Beta"]].values[0]
    
    # Predict
    r_hat = alpha + beta.dot(df_forecast[["Tnote rate", "Tnote rate change"]].iloc[idx])
    date = df_forecast.index[idx]
    
    data.append({
        "date": date,
        "r_hat": r_hat,
    })
df_data = pd.DataFrame(data).set_index("date")
display(df_data)

date,r_hat
2018-01-31,0.0022
2018-02-28,-0.0051
2018-03-31,-0.0120
2018-04-30,-0.0023
2018-05-31,-0.0110
...,...
2023-07-31,-0.0064
2023-08-31,-0.0070
2023-09-30,0.0192
2023-10-31,-0.0023



todo

### 6. 

#### (8pts)

Report the out-of-sample r-squared, relative to a baseline forecast which is simply the mean of $\targ$ up to the point the forecast is made.

Does the forecast seem effective?
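
A minimal sketch of the out-of-sample $R^2$ calculation, on synthetic data with a single signal. The names `oos_r_squared`, `x`, `r`, and the burn-in length are illustrative assumptions. The statistic compares the squared errors of the model forecast with those of the expanding-mean baseline:

$$R^2_{OOS} = 1 - \frac{\sum_t \left(r_{t+1} - \hat{r}_{t+1}\right)^2}{\sum_t \left(r_{t+1} - \bar{r}_{t}\right)^2}$$

```python
import numpy as np
import pandas as pd

def oos_r_squared(realized, forecast, baseline) -> float:
    """1 - SSE(model) / SSE(baseline); positive means the model beat the baseline."""
    sse_model = ((realized - forecast) ** 2).sum()
    sse_base = ((realized - baseline) ** 2).sum()
    return 1 - sse_model / sse_base

# Synthetic data, for illustration only: r_t depends weakly on x_{t-1}.
rng = np.random.default_rng(3)
n, burn_in = 150, 100
x = pd.Series(rng.normal(size=n))
r = 0.02 * x.shift(1).fillna(0.0) + rng.normal(0.0, 0.05, n)

rows = []
for t in range(burn_in, n - 1):
    # Fit r_{s+1} on x_s using only pairs fully observed by time t.
    X = np.column_stack([np.ones(t), x.iloc[:t].values])
    y = r.iloc[1 : t + 1].values
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rows.append(
        {
            "forecast": coef[0] + coef[1] * x.iloc[t],  # forecast of r_{t+1}
            "baseline": r.iloc[: t + 1].mean(),         # mean of returns through t
            "realized": r.iloc[t + 1],
        }
    )
oos = pd.DataFrame(rows)

print(oos_r_squared(oos["realized"], oos["forecast"], oos["baseline"]))
```

A negative value means the signal-based forecast did worse than simply using the historical mean.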

### 7. 

Report the correlation between 
* OOS forecast
* realized value of $\targ$.

In light of this, how effective does the forecast seem?

### 8.

#### (7pts)

Convert your OOS forecast to a traded return strategy, using the same allocation rule as in part 2.

Report the following performance stats for the OOS forecast strategy.

* mean
* volatility
* max-drawdown

Compare these with the passive return, $r^\targ$ over the same OOS window.

***

# 4. 

We examine FX carry for trading the Mexican peso $\frn$.
* Find the FX and risk-free rate data for this problem on sheet `fx_carry`. As before, these are spot FX prices quoted as USD per $\frn$.
* SOFR is the risk-free rate on USD, and $\frnrate$ is the risk-free rate for $\frn$.
* As in Homework 8, the data is provided such that any row’s date, $t$, reports $S_t$ and $r^f_{t,t+1}$. That is, both of these are known at time $t$.

In [100]:
df_fx = dfs_raw["fx_carry"].set_index("date")
annual_factor = 252  # days

df_fx.head()

date,MXN,SOFR,MXSTR
2018-04-03,0.055,0.0001,0.0003
2018-04-04,0.0549,0.0001,0.0003
2018-04-05,0.0552,0.0001,0.0003
2018-04-06,0.0546,0.0001,0.0003
2018-04-09,0.0549,0.0001,0.0003


### 1.
#### (3pts)

Transform the data to **log** FX prices and **log** interest rates, just as we did in Homework 8.

$$\begin{align}
\fxspot_t & \equiv \ln\left(\FXspot_t\right)\\[3pt]
\rflogusd_{\wintt} & \equiv \ln\left(1+r^{\text{USD}}_{\wintt}\right)\\[3pt]
\rflogfrn_{\wintt} & \equiv \ln\left(1+r^{\frn}_{\wintt}\right)\\
\end{align}$$


Display the mean of all three series.

In [106]:
df_fx.isna().sum()

MXN      8
SOFR     0
MXSTR    0
dtype: int64

In [111]:
# Convert to log FX prices and log interest rates, per the definitions above.
# Missing MXN quotes are forward-filled before taking logs.

df_log_fx = (df_fx
             .copy()
             .ffill()
             .assign(MXN=lambda df: np.log(df["MXN"]))
             .assign(SOFR=lambda df: np.log(1 + df["SOFR"]))
             .assign(MXSTR=lambda df: np.log(1 + df["MXSTR"]))
             )
df_log_fx.mean()

MXN     -2.9813
SOFR     0.0001
MXSTR    0.0003
dtype: float64

### 2.

Calculate the excess log return to a USD investor of holding $\frn$. Report the following **annualized** stats...
* Mean
* Volatility
* Sharpe ratio.

Assume there are 252 reported days per year for purposes of annualization.

In [125]:
temp = df_log_fx["MXSTR"] + df_log_fx["MXN"].shift(-1) - df_log_fx["SOFR"] - df_log_fx["MXN"]

temp = temp.to_frame("Excess Log Return")
# temp["USD"] = df_log_fx["SOFR"]

pmh.calc_summary_statistics(
    temp,
    annual_factor=annual_factor,
    keep_columns=['Annualized Vol', 'Annualized Mean', 'Annualized Sharpe']
)


Assuming excess returns were provided to calculate Sharpe. If returns were provided (steady of excess returns), the column "Sharpe" is actually "Mean/Volatility"


Series,Annualized Mean,Annualized Vol,Annualized Sharpe
Excess Log Return,0.0635,0.128,0.4963


### 3. 

Over the sample, was it better to be long or short $\frn$ relative to USD?
* Did the interest spread help on average?
* Did the USD appreciate or depreciate relative to $\frn$ over the sample?

todo?

### 4.

#### (7pts)

Forecast the growth of the FX rate using the interest-rate differential:

$$\fxspot_{t+1} - \fxspot_t = \alpha + \beta\left(\rflogusd_{\wintt} - \rflogfrn_{\wintt}\right) + \epsilon_{t+1}$$

Report the following OLS stats, (no need to annualize or scale them.)
* $\alpha$
* $\beta$
* r-squared
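
A minimal sketch of this regression on synthetic data, using plain least squares from NumPy. The series below are made up and are constructed so that UIP holds exactly, which makes the expected output easy to check ($\alpha = 0$, $\beta = 1$, r-squared of one). Estimates on the actual `fx_carry` data will differ.

```python
import numpy as np
import pandas as pd

# Synthetic daily log rates, for illustration only (not the exam data).
rng = np.random.default_rng(5)
n = 500
log_r_usd = pd.Series(rng.uniform(0.00005, 0.00020, n))
log_r_mxn = pd.Series(rng.uniform(0.00020, 0.00040, n))
spread = log_r_usd - log_r_mxn  # known at time t

# Build a log spot series where s_{t+1} - s_t equals the spread at t exactly.
log_s = np.log(0.055) + spread.cumsum().shift(1).fillna(0.0)

# Dependent variable: FX growth from t to t+1, aligned with the spread at t.
fx_growth = (log_s.shift(-1) - log_s).rename("fx_growth")
data = pd.concat([fx_growth, spread.rename("spread")], axis=1).dropna()

# OLS with an intercept.
X = np.column_stack([np.ones(len(data)), data["spread"].values])
y = data["fx_growth"].values
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha, beta = coef
r_squared = 1 - ((y - X @ coef) ** 2).sum() / ((y - y.mean()) ** 2).sum()

print(alpha, beta, r_squared)
```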

### 5. 

If we assume the Uncovered Interest Parity to hold true, what would you expect to be true of the regression estimates?

Under UIP, we would expect $\alpha = 0$ and $\beta = 1$.
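
A brief sketch of where these benchmark values come from, using the notation defined earlier. UIP says that the expected growth of the log FX rate equals the log interest-rate differential:

$$E_t\left[\fxspot_{t+1}\right] - \fxspot_t = \rflogusd_{\wintt} - \rflogfrn_{\wintt}$$

Comparing this term by term with the forecasting regression gives $\alpha = 0$ and $\beta = 1$. Equivalently, the expected excess log return to a USD investor holding $\frn$, which is $\fxspot_{t+1} - \fxspot_t + \rflogfrn_{\wintt} - \rflogusd_{\wintt}$, is zero.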

### 6.

Based on the regression results, if we observe an increase in the interest rate on USD relative to $\frn$, should we expect the USD to get stronger (appreciate) or weaker (depreciate)?

### 7.

If the risk free rates in $\frn$ increase relative to risk-free rates in USD, do we expect the forward exchange rate to be higher than the spot exchange rate?

### 8.

Do you think the estimated forecast impact of rates on currency returns would be larger over an annual horizon instead of a daily horizon? Why?

***