# Purpose

I have two goals for this project:

1. To help you master data manipulation and visualization
1. To help you understand the risk-return tradeoff for several measures of risk

# Tasks

## Packages and Settings

In [4]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

In [5]:
%config InlineBackend.figure_format = 'retina'
%precision 4
pd.options.display.float_format = '{:.4f}'.format

## Data

I used the following code cell to download the data for this project.
Leave this code cell commented out and use the CSV files I provided with this notebook.

In [6]:
# import yfinance as yf
# import pandas_datareader as pdr
# import requests_cache
# session = requests_cache.CachedSession(expire_after=1)

In [7]:
# wiki = pd.read_html('https://en.wikipedia.org/wiki/Russell_1000_Index')

In [8]:
# (
#     yf.Tickers(
#         tickers=wiki[2]['Ticker'].str.replace(pat='.', repl='-', regex=False).to_list(),
#         session=session
#     )
#     .history(period='max', auto_adjust=False)
#     .assign(Date = lambda x: x.index.tz_localize(None))
#     .set_index('Date')
#     .rename_axis(columns=['Variable', 'Ticker'])
#     ['Adj Close']
#     .pct_change()
#     .loc['1962':'2022']
#     .to_csv('returns.csv')
# )

In [9]:
# (
#     pdr.DataReader(
#         name='F-F_Research_Data_Factors_daily',
#         data_source='famafrench',
#         start='1900',
#         session=session
#     )
#     [0]
#     .rename_axis(columns='Variable')
#     .div(100)
#     .loc['1962':'2022']
#     .to_csv('ff.csv')
# )

Run the following code cell to read the data for this project.
The `returns.csv` file contains daily returns for the Russell 1000 stocks from 1962 through 2022, and the `ff.csv` file contains daily Fama and French factors from 1962 through 2022.

In [10]:
returns = pd.read_csv('returns.csv', index_col='Date', parse_dates=True)
ff = pd.read_csv('ff.csv', index_col='Date', parse_dates=True)

In [11]:
returns

Date,A,AA,AAL,AAP,AAPL,ABBV,ABC,ABNB,ABT,ACGL,...,YUM,Z,ZBH,ZBRA,ZG,ZI,ZION,ZM,ZS,ZTS
1962-01-02,,,,,,,,,,,...,,,,,,,,,,
1962-01-03,,0.0153,,,,,,,,,...,,,,,,,,,,
1962-01-04,,0.0000,,,,,,,,,...,,,,,,,,,,
1962-01-05,,-0.0019,,,,,,,,,...,,,,,,,,,,
1962-01-08,,-0.0340,,,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2022-12-23,0.0015,0.0080,0.0119,0.0084,-0.0028,-0.0010,0.0044,0.0045,0.0014,0.0084,...,0.0006,-0.0099,0.0011,0.0029,-0.0099,-0.0052,0.0035,-0.0008,-0.0053,0.0050
2022-12-27,0.0021,0.0149,-0.0142,0.0121,-0.0139,-0.0007,-0.0059,-0.0206,0.0036,0.0038,...,0.0078,-0.0190,0.0047,0.0112,-0.0181,-0.0087,0.0080,-0.0065,-0.0111,-0.0032
2022-12-28,-0.0098,-0.0236,-0.0168,0.0019,-0.0307,-0.0047,-0.0097,-0.0120,-0.0068,-0.0160,...,-0.0045,-0.0372,-0.0101,-0.0166,-0.0365,-0.0042,-0.0178,-0.0015,0.0027,-0.0101
2022-12-29,0.0203,0.0630,0.0308,0.0070,0.0283,0.0020,-0.0078,0.0332,0.0230,0.0081,...,0.0053,0.0217,0.0146,0.0433,0.0234,0.0605,0.0231,0.0404,0.0372,0.0300


In [12]:
ff

Date,Mkt-RF,SMB,HML,RF
1962-01-02,-0.0080,0.0087,0.0058,0.0001
1962-01-03,0.0024,0.0021,0.0097,0.0001
1962-01-04,-0.0083,0.0001,0.0066,0.0001
1962-01-05,-0.0134,0.0008,0.0032,0.0001
1962-01-08,-0.0075,0.0021,0.0032,0.0001
...,...,...,...,...
2022-12-23,0.0051,-0.0060,0.0115,0.0002
2022-12-27,-0.0051,-0.0073,0.0142,0.0002
2022-12-28,-0.0123,-0.0025,-0.0029,0.0002
2022-12-29,0.0187,0.0127,-0.0107,0.0002


## Single Stocks

For this section, use the single stock returns in `returns`.
You may select years $t$ and $t+1$, but only use stocks with complete returns data for years $t$ and $t+1$.
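A minimal sketch of one way to find the stocks that qualify, assuming "complete" means no missing daily returns anywhere in years $t$ and $t+1$. The helper name `complete_stocks` is hypothetical and not part of the assignment.

```python
import pandas as pd

def complete_stocks(returns, year_t):
    """Return tickers with no missing daily returns in year_t and year_t + 1."""
    # Partial-string slicing on a DatetimeIndex selects both calendar years
    window = returns.loc[str(year_t):str(year_t + 1)]
    # Guard: an empty window would otherwise mark every ticker as complete
    if window.empty:
        return []
    # Keep columns where every daily return in the window is present
    return window.columns[window.notna().all()].tolist()
```

For example, `returns[complete_stocks(returns, 2016)].loc['2016':'2017']` would keep only the stocks with complete data for 2016 and 2017.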

### Task 1: Do mean returns in year $t$ predict mean returns in year $t+1$?

In [41]:
def mean_return(returns, ticker, start_date, end_date):
    t = returns[ticker].loc[start_date:end_date].dropna()
    t_mean = t.mean()
    return t_mean
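The per-ticker helper above can be extended to every stock at once. The following is a sketch under the assumption that `returns` has a `DatetimeIndex` and one column per ticker; `yearly_stat` and `compare_years` are hypothetical names.

```python
import pandas as pd

def yearly_stat(returns, stat='mean'):
    """Compute one statistic per calendar year for every ticker."""
    return returns.groupby(returns.index.year).agg(stat)

def compare_years(yearly, year_t):
    """Pair each ticker's value in year_t with its value in year_t + 1."""
    pair = pd.DataFrame({'t': yearly.loc[year_t], 't+1': yearly.loc[year_t + 1]})
    # Drop tickers that lack a value in either year
    return pair.dropna()
```

A scatter plot such as `compare_years(yearly_stat(returns), 2016).plot(kind='scatter', x='t', y='t+1')` shows whether year $t$ values line up with year $t+1$ values. Passing `stat='std'` gives the analogous comparison for volatility.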

### Task 2: Does volatility in year $t$ predict volatility in year $t+1$?

In [39]:
def vol_returns(returns, ticker, start_date, end_date):
    s = returns[ticker].loc[start_date:end_date].dropna()
    v = s.std()
    return v
vol_returns(returns, 'AAPL', '2016-01-01', '2016-12-31')

0.0147

### Task 3: Do Sharpe Ratios in year $t$ predict Sharpe Ratios in year $t+1$?

In [18]:
# adjclose_16_17 = returns['AAPL'].loc['2016':'2017'].pct_change().resample('Y')
# adjclose_16_17



In [40]:
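def sharpe(ri, rf=ff['RF']):
    # Excess returns; subtraction aligns on dates, and dropna() removes unmatched dates
    ri_rf = ri.sub(rf).dropna()
    # Group by calendar year to get one annualized Sharpe Ratio per year
    by_year = ri_rf.groupby(ri_rf.index.year)
    return np.sqrt(252) * by_year.mean() / by_year.std()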


In [31]:
sharpe(returns['AAPL'].loc['2016':'2017'])

### Task 4: Do CAPM betas in year $t$ predict CAPM betas in year $t+1$?
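One common way to estimate a CAPM beta is $\beta_i = \mathrm{Cov}(r_i - r_f, r_m - r_f) / \mathrm{Var}(r_m - r_f)$, which equals the slope from regressing a stock's excess returns on the market's excess returns. The following is a minimal sketch, assuming `ff` has the `Mkt-RF` and `RF` columns shown earlier; the function name `capm_beta` is hypothetical.

```python
import pandas as pd

def capm_beta(ri, ff):
    """Estimate the CAPM beta of one stock from daily returns."""
    # Align stock excess returns with market excess returns on dates
    df = pd.DataFrame({
        'ri_rf': ri.sub(ff['RF']),
        'mkt_rf': ff['Mkt-RF'],
    }).dropna()
    # Beta is the covariance with the market divided by the market variance
    return df['ri_rf'].cov(df['mkt_rf']) / df['mkt_rf'].var()
```

For example, `capm_beta(returns['AAPL'].loc['2016'], ff)` would estimate one stock's beta for 2016.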

### Task 5: Does volatility in year $t$ predict *mean returns* in year $t+1$?

### Task 6: Does CAPM beta in year $t$ predict *mean returns* in year $t+1$?

## Portfolios I

For this section, create 100 random portfolios of 50 stocks each from the daily returns in `returns`.
Equally weight these portfolios and rebalance them daily.
Use the same stocks and years $t$ and $t+1$ as the previous section.
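The portfolio construction described above can be sketched as follows. With equal weights and daily rebalancing, each day's portfolio return is the cross-sectional mean of its members' returns on that day. The function name `random_portfolios` and the `seed` argument are assumptions added here for reproducibility, not requirements of the assignment.

```python
import numpy as np
import pandas as pd

def random_portfolios(returns, n_portfolios=100, n_stocks=50, seed=42):
    """Build equally weighted, daily rebalanced portfolios of random stocks."""
    rng = np.random.default_rng(seed)
    portfolios = {}
    for i in range(n_portfolios):
        # Sample tickers without replacement so each portfolio holds distinct stocks
        tickers = rng.choice(returns.columns, size=n_stocks, replace=False)
        # Equal weights with daily rebalancing: the daily mean across members
        portfolios[i] = returns[tickers].mean(axis=1)
    return pd.DataFrame(portfolios)
```

The input should already be restricted to the stocks and years chosen in the previous section, so that no portfolio contains missing returns.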

### Task 7: Does volatility in year $t$ predict *mean returns* in year $t+1$?

### Task 8: Does CAPM beta in year $t$ predict *mean returns* in year $t+1$?

## Portfolios II

Calculate monthly volatility and total return for *every stock* and *every month* in `returns`.
Drop stock-months with fewer than 15 returns.
Each month, assign these stocks to one of five portfolios based on their volatility during the previous month.
Equally weight these portfolios and rebalance them monthly.
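The steps above can be sketched as two helpers. This is one possible approach, not the required one: it assumes months are consecutive in the data (so a one-row shift gives the previous month) and breaks volatility ties by column order. The names `monthly_vol_and_total_return` and `vol_sorted_portfolios` are hypothetical.

```python
import numpy as np
import pandas as pd

def monthly_vol_and_total_return(returns, min_obs=15):
    """Monthly volatility and total return per stock, dropping thin stock-months."""
    month = returns.index.to_period('M')
    count = returns.groupby(month).count()
    # Standard deviation of daily returns within each month
    vol = returns.groupby(month).std().where(count >= min_obs)
    # Compound daily returns into a monthly total return
    total = returns.add(1).groupby(month).prod().sub(1).where(count >= min_obs)
    return vol, total

def vol_sorted_portfolios(vol, total, n_portfolios=5):
    """Equally weighted portfolio returns, sorted on the previous month's volatility."""
    lagged = vol.shift(1)
    # Rank stocks within each month, then map ranks to portfolios 1..n_portfolios
    rank = lagged.rank(axis=1, method='first')
    n = lagged.notna().sum(axis=1)
    portfolio = np.ceil(rank.mul(n_portfolios).div(n, axis=0))
    # Equal weights: average the monthly total returns of each portfolio's members
    return pd.DataFrame({
        k: total.where(portfolio == k).mean(axis=1)
        for k in range(1, n_portfolios + 1)
    })
```

The first month has no previous-month volatility, so its portfolio returns are missing. Column 1 holds the lowest-volatility stocks and column 5 the highest.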

### Task 9: Do high volatility portfolios have high mean returns and Sharpe Ratios?

## Discussion

### Task 10: Discuss and explain any limitations of your analysis above

# Criteria

1. All tasks are worth ten points
1. Discuss and explain your findings for all ten tasks
1. Here are a few more tips
    1. ***Your goal is to convince me of your calculations and conclusions***
    1. I typically find figures most convincing
    1. If you use correlations, consider how a handful of outliers may affect your findings
    1. Remove unnecessary code, outputs, and print statements
    1. Write functions for calculations that you expect to use more than once
    1. ***I will not penalize code style, but I will penalize submissions that are difficult to follow or do not follow these instructions***
1. How to submit your project
    1. Restart your kernel, run all cells, and save your notebook
    1. Export your notebook to PDF (`File > Save And Export Notebook As ... > PDF` in JupyterLab)
        1. If this export does not work, you can either (1) install MiKTeX on your laptop with the default settings or (2) use DataCamp Workspace to export your notebook to PDF
        1. You do not need to re-run your notebook to export it because notebooks store output cells
    1. Upload your notebook and PDF to Canvas
    1. Upload your PDF only to Gradescope and tag your teammates
    1. Gradescope helps me give better feedback more quickly, but I do not consider it reliable for sharing and storing your submission files