[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1GtpixdCynQ1uCi45Nph794qOnx7pu-8Y?usp=sharing)

# <strong> Investment Management 1</strong>
---
#<strong> Assignment 2</strong>
**You have to use this Colab notebook to complete the assignment. To get started, create a copy of the notebook and save it on your Google drive.**

&nbsp;

**Deadline:** See C@mpus. 
The assignment must be completed individually. The TBS plagiarism rules apply.

&nbsp;

**Total:** 100 Points

&nbsp;

**Late submission penalty:** there is a penalty-free grace period of two hours past the deadline. Any work submitted between 2 hours and 24 hours past the deadline will receive a 20% grade deduction. No work will be accepted after that. C@mpus submission time will be used, not your local computer time. You can submit your completed assignment as many times as required before the deadline. Consider submitting your work early.

&nbsp;

**Learning outcomes**

In this assignment, you will consider and implement different approaches to comparing and evaluating the performance of actively managed investment funds.

You will be provided with a dataset containing historical returns on a sample of actively managed U.S. mutual funds. You are expected to
evaluate the performance of these funds over the analysed sample period using the key absolute and risk-adjusted performance measures discussed in class. You will also use regression analysis to estimate the alpha potential and the factor exposures of the analysed mutual funds. 

&nbsp;

By the end of this assignment, you should be able to:

* upload financial time series data into Python and perform basic operations with `pandas` dataframes

* perform EDA (exploratory data analysis) on a given financial dataset

* estimate single- and multi-factor models for return samples (e.g., CAPM, Fama-French, Carhart)

* test for market timing ability using regression models

&nbsp;


**Data to use**

The data necessary to complete this assignment are in the course GitHub repository - see the "assignment_2" folder. 

The “active_fund_returns” document contains historical monthly returns on 55 actively managed US mutual funds over the 1990 - 2018 period. You can find more information on each fund in the “fund_id” document. Monthly returns on the corresponding benchmark portfolio - the S&P 500 index - are in the “benchmark_returns” document. Note that all sample funds reported the S&P 500 index as their prospectus benchmark.

In addition to the fund and benchmark data available in the course repository, you should use the risk-free rate from the <a href="http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html"> Kenneth French data library</a>. To simplify and automate the process of importing the necessary data from the Kenneth French data library, you may want to consider <a href="https://pandas-datareader.readthedocs.io/en/latest/readers/famafrench.html"> this</a> Python library.

If you require any other data to complete the assignment, you can use a Python financial data library of your choice. There are several libraries to consider, such as `yfinance`, `pandas_datareader`, `yahoo_fin`, `ffn` (highly recommended), and `PyNance`.

You are also free to use any Python data visualisation library of your choice (default is `matplotlib`). Some of the available options include: `Seaborn`, `Bokeh`, `ggplot`, `pygal`, and `Plotly`.

##**What to submit**

Submit a PDF file containing your code, outputs, and write-up from parts 1-4. You can produce a PDF of your Colab file by going to `File >>> Print` and selecting `save as PDF`. See the <a href="https://github.com/mscouse/TBS_investment_management/blob/main/Python_workspace.pdf">Python Workspace</a> document in the course GitHub repository for more information. **Do not submit any other data files produced by your code.**

&nbsp;

You also need to provide a link to your completed Colab file in your submission - see the **"Colab link"** section below.

Please note that you have to use Google Colab to complete this assignment. If you want to use Jupyter Notebook, complete the assignment and upload your Jupyter Notebook file to Google Colab for submission. 

##**Colab link**
Before submitting your work, make sure to include a link to your Colab file below.

**Colab Link:** _ _ _ _ _ _ _ _ _ _ _ _

##**Part 1: Loading, visualising and summarising historical data [10 pt]**

###Part 1.1. Loading and summarising historical fund returns (6pt)

Import the available fund and benchmark data from the course GitHub repository and store them in `pandas` DataFrame objects named `fund_returns` and `benchmark_return`. Remember to import any required Python libraries before you start working with the data.

&nbsp;

Compute and display the descriptive statistics for all sample funds. These should include: count, mean, standard deviation, min, and max return values. Repeat the same for the benchmark portfolio.
What can you conclude from the computed absolute performance measures? [DISCUSS]
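As a rough illustration of the statistics requested above, the sketch below summarises a small synthetic return table. The fund names and numbers are made up for the example and are not taken from the course data; `demo_returns` and `summary` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Synthetic monthly returns for three hypothetical funds (NOT the course data)
rng = np.random.default_rng(0)
months = pd.period_range("1990-01", periods=24, freq="M")
demo_returns = pd.DataFrame(
    rng.normal(loc=0.008, scale=0.04, size=(24, 3)),
    index=months,
    columns=["fund_A", "fund_B", "fund_C"],
)

# One row per fund, one column per requested statistic
summary = demo_returns.agg(["count", "mean", "std", "min", "max"]).T
print(summary.round(4))
```

Transposing the result puts funds in rows, which tends to be easier to read when there are many funds and only a handful of statistics.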

In [None]:
# step 1: import required libraries using "import"
# YOUR CODE HERE

In [None]:
# step 2: import historical fund and benchmark data
# YOUR CODE HERE

In [None]:
# step 3: compute and display the descriptive statistics
# YOUR CODE HERE

###Part 1.2. Visualising and benchmarking performance (4pt)

Using the descriptive statistics from Part 1.1 above, identify the 3 funds with the highest and the 3 funds with the lowest average monthly returns over the sample period. Visualise their compounded performance over the entire period on the same diagram using a Python data visualisation library of your choice (default is matplotlib).

Plot the compounded performance of the benchmark portfolio on the same diagram.
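One possible way to obtain compounded performance is to take the cumulative product of one plus the monthly return. The sketch below uses a tiny made-up table (the column names and values are illustrative assumptions, not course data) and leaves the plotting call as a comment.

```python
import pandas as pd

# Hypothetical monthly returns in decimals (illustrative values only)
monthly = pd.DataFrame(
    {"fund_A": [0.02, -0.01, 0.03], "benchmark": [0.01, 0.00, 0.02]},
    index=pd.period_range("1990-01", periods=3, freq="M"),
)

# Growth of 1 unit invested at the start: cumulative product of (1 + r)
growth = (1 + monthly).cumprod()
print(growth)

# growth.plot(title="Growth of 1 invested") would draw all series on one chart
```

Applying `Series.nlargest(3)` and `Series.nsmallest(3)` to the average monthly returns would be one way to pick the funds to include in the chart.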

In [None]:
# step 4: visualise compounded fund performance & benchmark 
# YOUR CODE HERE

##**Part 2: Computing and analysing risk-adjusted performance measures [30pt]**

In this part of the assignment, you need to calculate and compare several risk-adjusted performance measures for the sample mutual funds. You should report all performance measures in one table to make the comparison easier. Furthermore, all reported measures should be annualised.

Make sure you import the required libraries first.

###Part 2.1. Computing Sharpe, M-squared and Treynor (25pt)


Compute and compare the risk-adjusted performance measures listed below for any 5 sample funds of your choice. You can compute these measures for all 55 funds, if desired. You should report all performance measures (annualised^) in one table (i.e., dataframe) to make the comparison easier:

1. Sharpe ratio (you can use the Sharpe function from Assignment 1) [5pt]
2. M-squared measure [10pt]
3. Treynor measure [10pt]

To compute the Treynor ratios, you will first need to estimate funds' betas, as follows:

$$
\beta_{i,M} = \frac{cov(r_{i}, r_M)}{var(r_M)}=\rho_{i,M} \times \frac{\sigma_{r_i}}{\sigma_{r_M}}
$$



^Hint:

**To annualise returns:** if the average monthly return is $0.01$, or $1\%$, the annualised average return can be approximated as $0.01 \times 12 = 0.12$. This approach ignores compounding. To account for compounding, you would instead use $(1+0.01)^{12} - 1 \approx 0.1268$. For this project, it is acceptable to ignore compounding and annualise by simply multiplying the average monthly return by 12.

**To annualise standard deviation:** if the standard deviation of monthly stock returns is $0.04$, or $4.0\%$, the annualised standard deviation is $0.04 \times \sqrt{12}$, or $13.86\%$.
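The sketch below shows how the beta formula and the annualisation hints might fit together on a few made-up excess returns. It assumes the M-squared convention $M^2 = (S_P - S_M) \times \sigma_M$ (the Sharpe ratio differential scaled by benchmark volatility); check this against the definition used in class. All variable names and numbers are illustrative.

```python
import numpy as np

# Illustrative monthly excess returns in decimals (NOT the course data)
fund_excess = np.array([0.020, -0.010, 0.030, 0.005, -0.020, 0.015])
mkt_excess = np.array([0.015, -0.005, 0.020, 0.000, -0.025, 0.010])


def annualised_sharpe(excess):
    """Mean scaled by 12, standard deviation scaled by sqrt(12)."""
    return (excess.mean() * 12) / (excess.std(ddof=1) * np.sqrt(12))


# Beta as cov(r_i, r_M) / var(r_M), using sample (ddof=1) moments throughout
beta = np.cov(fund_excess, mkt_excess, ddof=1)[0, 1] / mkt_excess.var(ddof=1)

sharpe_fund = annualised_sharpe(fund_excess)
sharpe_mkt = annualised_sharpe(mkt_excess)

# M-squared (assumed convention): Sharpe differential times market volatility
sigma_mkt_ann = mkt_excess.std(ddof=1) * np.sqrt(12)
m_squared = (sharpe_fund - sharpe_mkt) * sigma_mkt_ann

# Treynor: annualised mean excess return per unit of beta
treynor = fund_excess.mean() * 12 / beta

print(f"beta={beta:.3f}, Sharpe={sharpe_fund:.3f}, "
      f"M2={m_squared:.4f}, Treynor={treynor:.4f}")
```

Using the same `ddof` for the covariance and the variance keeps the two expressions for beta in the formula above numerically identical.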

In [None]:
# step 5: upload historical risk-free rate
# YOUR CODE HERE

In [None]:
# step 6: compute fund excess returns
# YOUR CODE HERE

In [None]:
# step 7: compute and report the required risk-adjusted measures
# YOUR CODE HERE

###Part 2.2. Results and conclusions (5pt)

Discuss the results presented in Part 2.1. What can you conclude from the computed risk-adjusted performance measures?


#### **Results and conclusions**

YOUR COMMENTS HERE 


##**Part 3: Multi-factor models [40pt]**

In this part of the assignment, you will implement the CAPM, Fama-French, and Carhart models in Python. In addition, you will compute and analyse two further risk-adjusted performance measures - Jensen's alpha and the Information Ratio.

The "factors" document in the course GitHub repository provides data on several time-series of returns you need to use in this section. In particular, it reports the return on the Fama-French SMB (small-minus-big), HML (value-minus-growth), RMW (robust-minus-weak), and CMA (conservative-minus-aggressive) factors. It also presents the return on the Carhart MOM (winners-minus-losers) factor, and the risk-free rate of return (RF). The SMB, HML, RMW, CMA and MOM factors are already expressed as differences in returns between two portfolios so you do not need to calculate excess returns on them. All factor returns are in decimals.

### 3.1. Implementing the CAPM & computing diversifiable and non-diversifiable risk of a portfolio (10pt)

Use the excess fund and benchmark returns computed in the previous sections to estimate the CAPM model as a linear regression. The model should be estimated for any 5 sample funds of your choice. You can estimate the CAPM for all 55 funds, if desired. You should report the coefficient estimates (i.e., alphas and betas) for all funds in one table to make the comparison easier. The table should also include the p-values for each estimated coefficient. 

You may want to use the <a href="https://www.statsmodels.org/stable/index.html">`statsmodels`</a> Python module to estimate the CAPM model. The `statsmodels` module provides classes and functions for the estimation of many different statistical models, as well as for conducting numerous statistical tests, and statistical data exploration. You can import and use the module as follows:

```
# Import statsmodels.formula.api
import statsmodels.formula.api as smf 

# Define the regression formula
capm_model = smf.ols(formula='portfolio_excess ~ market_excess', data=data)

# Fit the regression
capm_model = capm_model.fit()

# print results 
print(capm_model.summary())
```
Note that, in line with our discussion in class, the estimated **beta is a proxy for systematic, non-diversifiable risk**. The **diversifiable, fund-specific, risk is captured by the residual standard error** (or residual standard deviation) of the regression.
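To make the split between systematic and fund-specific risk concrete, the sketch below runs the same single-factor regression with plain `numpy` on simulated data and checks that total variance equals $\beta^2 \sigma_M^2$ plus residual variance. The simulated series and all names are assumptions for illustration only.

```python
import numpy as np

# Simulated monthly excess returns (illustrative, NOT the course data)
rng = np.random.default_rng(1)
mkt_excess = rng.normal(0.006, 0.04, size=120)
fund_excess = 0.001 + 1.1 * mkt_excess + rng.normal(0, 0.01, size=120)

# OLS of fund excess returns on a constant and the market excess return
X = np.column_stack([np.ones_like(mkt_excess), mkt_excess])
coef, *_ = np.linalg.lstsq(X, fund_excess, rcond=None)
alpha, beta = coef
resid = fund_excess - X @ coef

# Variance decomposition (population moments, ddof=0, on every term)
systematic_var = beta**2 * mkt_excess.var()
specific_var = resid.var()
total_var = fund_excess.var()

# Residual standard error with n - 2 degrees of freedom
resid_std = np.sqrt(resid @ resid / (len(resid) - 2))

print(f"beta={beta:.3f}, residual std={resid_std:.4f}")
print(f"systematic share of variance={systematic_var / total_var:.2%}")
```

With `statsmodels`, the residual standard error should correspond to the square root of the fitted results' `mse_resid` attribute.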

In [None]:
# step 8: import required modules and libraries
# YOUR CODE HERE

In [None]:
# step 9: estimate the CAPM model for the sample funds and report the results
# YOUR CODE HERE

### 3.2. Implementing the Fama-French 3-factor model (8pt)

Use the excess fund and benchmark returns computed in the previous sections to 
estimate the Fama-French model as a linear regression. The remaining factors are in the "factors" document (see the course GitHub page).

The model should be estimated for any 5 sample funds of your choice. You can estimate the Fama-French model for all 55 funds, if desired. You should report the coefficient estimates (i.e., alphas and all betas) for all funds in one table to make the comparison easier. The table should also include the p-values for each estimated coefficient.

You may want to use the <a href="https://www.statsmodels.org/stable/index.html">`statsmodels`</a> Python module to estimate the Fama-French model:

```
# Import statsmodels.formula.api
import statsmodels.formula.api as smf 

# Define the regression formula
FamaFrench_model = smf.ols(formula='portfolio_excess ~ market_excess + SMB + HML', data=data)

# Fit the regression
FamaFrench_fit = FamaFrench_model.fit()

# print results 
print(FamaFrench_fit.summary())
```

&nbsp;


**Optional (not part of the assignment):**

You can try constructing the [Fama-French](https://en.wikipedia.org/wiki/Fama%E2%80%93French_three-factor_model) factors - SMB and HML - from scratch, if desired, as follows: 

1. Divide stocks into Big (B) and Small (S) according to their market capitalisation; and divide into Low (L), Middle (M), High (H) according to their Book/Price ratio. This leads to six groups or portfolios: B/L, B/M, B/H, S/L, S/M, S/H.
2. Calculate historical value-weighted returns of these six portfolios.
3. Define factor returns using a long-short approach:

$$
\begin{aligned}
SMB&=\frac{S/L+S/M+S/H}{3}-\frac{B/L+B/M+B/H}{3} \\\\
HML&=\frac{S/H+B/H}{2}-\frac{S/L+B/L}{2}
\end{aligned}
$$

The SMB and HML factors are published on the [Kenneth R. French website](https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html).
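The long-short definitions above could be sketched as follows, assuming the returns of the six portfolios are already available as columns of a DataFrame. The numbers and the name `six` are invented for illustration.

```python
import pandas as pd

# Hypothetical value-weighted monthly returns of the six size/value portfolios
six = pd.DataFrame({
    "S/L": [0.010, -0.020], "S/M": [0.020, -0.010], "S/H": [0.030, 0.000],
    "B/L": [0.005, -0.015], "B/M": [0.010, -0.010], "B/H": [0.015, -0.005],
})

# SMB: average of the three small portfolios minus average of the three big ones
smb = six[["S/L", "S/M", "S/H"]].mean(axis=1) - six[["B/L", "B/M", "B/H"]].mean(axis=1)

# HML: average of the two high book/price portfolios minus the two low ones
hml = six[["S/H", "B/H"]].mean(axis=1) - six[["S/L", "B/L"]].mean(axis=1)

print(pd.DataFrame({"SMB": smb, "HML": hml}))
```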

In [None]:
# step 10: import required modules and libraries
# YOUR CODE HERE

In [None]:
# step 11: estimate and report the 3-factor Fama-French model for the sample funds
# YOUR CODE HERE

### 3.3. Implementing the Fama-French 5-factor model (5pt)

Follow the steps discussed in Part 3.2. and estimate the 5-factor Fama-French model. Use the excess fund and benchmark returns computed in the previous sections to estimate the model as a linear regression. The remaining factors are in the "factors" document (see the course GitHub page).

As before, the model should be estimated for any 5 sample funds of your choice. You can estimate the 5-factor model for all 55 funds, if desired. You should report the coefficient estimates (i.e., alphas and all betas) for all funds in one table to make the comparison easier. The table should also include the p-values for each estimated coefficient.


In [None]:
# step 12: import required modules and libraries
# YOUR CODE HERE

In [None]:
# step 13: estimate and report the 5-factor Fama-French model for the sample funds
# YOUR CODE HERE

### 3.4. Implementing the Fama-French-Carhart 4-factor model (5pt)

Follow the steps discussed in Part 3.2. and estimate the 4-factor Fama-French-Carhart model. Use the excess fund and benchmark returns computed in the previous sections to estimate the model as a linear regression. The remaining factors are in the "factors" document (see the course GitHub page). 

As before, the model should be estimated for any 5 sample funds of your choice. You can estimate the 4-factor model for all 55 funds, if desired. You should report the coefficient estimates (i.e., alphas and all betas) for all funds in one table to make the comparison easier. The table should also include the p-values for each estimated coefficient.


In [None]:
# step 14: import required modules and libraries
# YOUR CODE HERE

In [None]:
# step 15: estimate and report the 4-factor Fama-French-Carhart model for the sample funds
# YOUR CODE HERE

### 3.5. Computing and analysing the Information Ratio (5pt)

Compute Information Ratios using the alphas from the CAPM model and the estimated diversifiable (fund-specific) risk of each fund, i.e., the residual standard deviation of the CAPM regression - see Part 3.1. Note that you do not need to annualise the computed Information Ratios.

The Information ratios should be computed for any 5 sample funds of your choice. You can compute the ratios for all 55 funds, if desired. You should report all ratios in one table to make the comparison easier.
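A minimal sketch of the calculation, assuming the Information Ratio is defined as the CAPM alpha divided by the residual standard deviation of the CAPM regression (the fund-specific risk from Part 3.1). The data are simulated and the names are illustrative.

```python
import numpy as np

# Simulated monthly excess returns (illustrative, NOT the course data)
rng = np.random.default_rng(2)
mkt_excess = rng.normal(0.006, 0.04, size=120)
fund_excess = 0.002 + 0.9 * mkt_excess + rng.normal(0, 0.015, size=120)

# CAPM regression: constant plus market excess return
X = np.column_stack([np.ones_like(mkt_excess), mkt_excess])
coef, *_ = np.linalg.lstsq(X, fund_excess, rcond=None)
alpha, beta = coef
resid = fund_excess - X @ coef

# Residual standard error with n - 2 degrees of freedom (two coefficients)
resid_std = np.sqrt(resid @ resid / (len(resid) - 2))

# Monthly (non-annualised) Information Ratio
information_ratio = alpha / resid_std
print(f"alpha={alpha:.4f}, residual std={resid_std:.4f}, IR={information_ratio:.3f}")
```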



In [None]:
# step 16: compute the required Information ratios
# YOUR CODE HERE

### 3.6. Results and conclusions (7pt)

Discuss the results reported in Parts 3.1-3.5. What can you conclude from the regression estimations?

Report whether the estimated alphas and betas vary by model and discuss possible reasons for this.

**Your response/ short explanation:** ________HERE_________

##**Part 4: Market timing evaluation [20pt]**

Portfolio managers often claim to be able to generate abnormal returns through either superior security selection or market timing. There are several approaches to capturing and assessing managers' market timing skills. The Treynor and Mazuy (TM) and Henriksson-Merton (HM) models are the most widely used return-based approaches to isolating market timing skills. As part of this section, you will estimate both models for 5 sample funds of your choice and analyse the results.



### Part 4.1. Estimating and analysing the Treynor and Mazuy model [10pt]

Treynor and Mazuy (1966) suggested that a security characteristic line should be estimated by adding a squared term to the usual linear index model, as follows:


$$(r_{it}-r_{ft}) = \alpha_i + \beta_i(r_{mt}-r_{ft}) + \gamma_i(r_{mt}-r_{ft})^2 + \epsilon_{it}$$


where the coefficient estimate $\gamma_i$ indicates superior market timing ability if it is positive and significant.

Using previously computed fund and benchmark excess returns, estimate the TM model for any 5 sample funds of your choice and comment on the results. Is there statistically significant evidence of market timing ability?
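The quadratic regression above could be sketched with plain `numpy` as follows. The data are simulated with a built-in timing coefficient so that the estimate can be sanity-checked; the series and names are assumptions for illustration, and the sketch returns point estimates only, without p-values.

```python
import numpy as np

# Simulated data with a known timing coefficient of 2.0 (NOT the course data)
rng = np.random.default_rng(3)
mkt_excess = rng.normal(0.006, 0.04, size=240)
fund_excess = (0.001 + 1.0 * mkt_excess + 2.0 * mkt_excess**2
               + rng.normal(0, 0.002, size=240))

# Regressors: constant, market excess return, squared market excess return
X = np.column_stack([np.ones_like(mkt_excess), mkt_excess, mkt_excess**2])
coef, *_ = np.linalg.lstsq(X, fund_excess, rcond=None)
alpha_tm, beta_tm, gamma_tm = coef

print(f"alpha={alpha_tm:.4f}, beta={beta_tm:.3f}, gamma={gamma_tm:.3f}")
```

To judge significance, the same specification can be written as a `statsmodels` formula, where `I(market_excess**2)` adds the squared term and the fitted results report p-values.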

In [None]:
# step 17: import required libraries
# YOUR CODE HERE

In [None]:
# step 18: estimate the models
# YOUR CODE HERE

**Your response/ short explanation of results:** ________HERE_________

### Part 4.2. Estimating and analysing the Henriksson and Merton model [10pt]

Henriksson and Merton (1981) propose a model in which the portfolio beta takes one of two values, depending on whether the market return is above or below the risk-free rate. For a successful market timer, the characteristic line therefore has a higher slope in months in which the market excess return is positive than in months in which it is negative:

$$(r_{it}-r_{ft}) = \alpha_i + \beta_i(r_{mt}-r_{ft}) + \gamma_i(r_{mt}-r_{ft})\times D + \epsilon_{it}$$

where $D = 1$ if $r_{mt} \geq r_{ft}$ and $D = 0$ if $r_{mt} < r_{ft}$.


Therefore, the beta of the portfolio is $\beta_i$ in bear markets and $\beta_i + \gamma_i$ in bull markets. A positive and statistically significant value of $\gamma_i$ denotes superior market timing ability.

Using previously computed fund and benchmark excess returns, estimate the HM model for any 5 sample funds of your choice and comment on the results. Is there statistically significant evidence of market timing ability?
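A minimal sketch of the dummy-variable timing regression above, again on simulated data with known coefficients. Since $r_{mt} \geq r_{ft}$ is equivalent to a non-negative market excess return, the dummy is built directly from the excess return series. Names and numbers are illustrative assumptions, and no p-values are computed here.

```python
import numpy as np

# Simulated data: bear-market beta 0.8, extra bull-market slope 0.4
rng = np.random.default_rng(4)
mkt_excess = rng.normal(0.006, 0.04, size=240)
up = (mkt_excess >= 0).astype(float)  # D = 1 when r_m >= r_f, else 0
fund_excess = (0.001 + 0.8 * mkt_excess + 0.4 * mkt_excess * up
               + rng.normal(0, 0.002, size=240))

# Regressors: constant, market excess return, market excess return times D
X = np.column_stack([np.ones_like(mkt_excess), mkt_excess, mkt_excess * up])
coef, *_ = np.linalg.lstsq(X, fund_excess, rcond=None)
alpha_hm, beta_down, gamma_hm = coef
beta_up = beta_down + gamma_hm

print(f"bear beta={beta_down:.3f}, bull beta={beta_up:.3f}, gamma={gamma_hm:.3f}")
```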


In [None]:
# step 19: import required libraries
# YOUR CODE HERE

In [None]:
# step 20: estimate the models
# YOUR CODE HERE

**Your response/ short explanation of results:** ________HERE_________