### Sustainable and Entrepreneurial Finance

### Assignment 1 - Portfolio allocation

#### Group 8 - Energy Firms With Available Scope 1 to 3 emissions (TRUCOST)

Useful imports:


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import random
import prettytable
import plotly.graph_objects as go
import os
import math

%matplotlib inline


#### 0 - Importing and preparing datasets for calculation.

Importing the files and creating the raw pandas data frames (it might take a while...).


In [None]:
# Setting path names

github_path = 'https://github.com/percw/Sustainable_and_Entrepreneurial_Finance/blob/master/Data_Excel'

path_gics = f'{github_path}/Trucost_CO2emissions/GICS_map%202018.xlsx?raw=true'
path_sector = f'{github_path}/Trucost_CO2emissions/sector.xlsx?raw=true'
path_returns = f'{github_path}/MSCI_ESGscores/Returns/monthlyreturns.xlsx?raw=true'
path_caps = f'{github_path}/MSCI_ESGscores/Fundamentals/size.xlsx?raw=true'

# Scope paths
path_scope1 = f'{github_path}/Trucost_CO2emissions/scope1.xlsx?raw=true'
path_scope2 = f'{github_path}/Trucost_CO2emissions/scope2.xlsx?raw=true'
path_scope3 = f'{github_path}/Trucost_CO2emissions/scope3.xlsx?raw=true'

# Local paths in case of very slow loading...
#path_gics = './Data_Excel/Trucost_CO2emissions/GICS_map 2018.xlsx'
#path_sector = './Data_Excel/Trucost_CO2emissions/sector.xlsx'
#path_returns = './Data_Excel/MSCI_ESGscores/Returns/monthlyreturns.xlsx'
#path_caps = './Data_Excel/MSCI_ESGscores/Fundamentals/size.xlsx'

# Reading excel files and creating pandas data frames
df_gics_raw = pd.read_excel(path_gics)
df_sector_raw = pd.read_excel(path_sector)
df_returns_raw = pd.read_excel(path_returns)
df_caps_raw = pd.read_excel(path_caps)

df_scope1_raw = pd.read_excel(path_scope1)
df_scope2_raw = pd.read_excel(path_scope2)
df_scope3_raw = pd.read_excel(path_scope3)


Copying the raw dataframes under shorter names for convenience, so that if we need to rerun some code we don't have to wait for the Excel files to load again.


In [None]:
df_gics = df_gics_raw.copy()
df_sector = df_sector_raw.copy()
df_returns = df_returns_raw.copy()
df_caps = df_caps_raw.copy()
df_scope1 = df_scope1_raw.copy()
df_scope2 = df_scope2_raw.copy()
df_scope3 = df_scope3_raw.copy()

Renaming the date column of the returns and market caps from `Unnamed: 0` to `date`.


In [None]:
# Renaming index data column

df_returns.rename(columns={'Unnamed: 0': 'date'}, inplace=True)
df_caps.rename(columns={'Unnamed: 0': 'date'}, inplace=True)
display(df_returns)
display(df_caps)


Getting the ISIN codes for the energy companies.


In [None]:
industry_code = 1010.0  # based on Global Industry Classification Standard GICS
df_energy = df_sector.loc[df_sector['GICSIG'] == industry_code]
energy_isin = df_energy['ISIN'].values.tolist()

Getting the returns for the companies matching the ISIN codes in `energy_isin`.


In [None]:
# List of all ISIN codes to iterate through
return_cols = df_returns.columns.values.tolist()
display(len(return_cols))
display(len(return_cols) == df_returns.shape[1])

# Creating a list with all the ISIN Energy codes that the returns.xlsx datasheet contains
both = []
for c in return_cols:
    if c in energy_isin:
        both.append(c)


`return_cols` has 5141 entries, which matches the number of columns in `df_returns`.
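
The membership loop above can also be written with pandas' `Index.intersection`. The following is a minimal sketch on toy data; the labels and the names `toy_returns` and `toy_energy_isin` are made up for illustration and are not part of the dataset.

```python
import pandas as pd

# Toy stand-ins (hypothetical labels, not real ISINs from the dataset)
toy_returns = pd.DataFrame(
    {"date": ["2005-01-31"], "AAA": [0.01], "BBB": [0.02], "CCC": [0.03]}
)
toy_energy_isin = ["CCC", "AAA", "ZZZ"]

# Loop version, equivalent to the cell above
both_loop = [c for c in toy_returns.columns if c in toy_energy_isin]

# Vectorized version: labels present in both the columns and the list
both_index = toy_returns.columns.intersection(toy_energy_isin).tolist()

print(sorted(both_loop))
print(sorted(both_index))
```

Both versions select the same labels; only labels that appear in the returns columns and in the ISIN list survive.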


Checking the shape and general characteristics of our new list.


In [None]:
display(return_cols[:4])
display(energy_isin[:4])
display(both[:4])
display(len(return_cols))
display(len(energy_isin))
display(len(both))


Manually checking that the first four columns and companies match the ones displayed above in `df_returns`.


Inserting the date column.


In [None]:
if 'date' not in energy_isin:
    energy_isin.insert(0, 'date')

energy_isin[:4]


Now we will make sure that the energy companies have Scope 1-3 data. 

In [None]:
scope1_nrg = df_scope1[df_scope1['ISIN'].isin(energy_isin)]
scope2_nrg = df_scope2[df_scope2['ISIN'].isin(energy_isin)]
scope3_nrg = df_scope3[df_scope3['ISIN'].isin(energy_isin)]
display(scope1_nrg.shape)
display(scope2_nrg.shape)
display(scope3_nrg.shape)
scope3_nrg

In [None]:
scope1_isin = df_scope1['ISIN'].values.tolist()

In [None]:
# Count the number of NaNs in each row
n_nulls_1 = scope1_nrg.isna().sum(axis=1)

# Filter the rows with 22 NaN values
no_scope_1 = scope1_nrg[n_nulls_1 == 22]

display(no_scope_1)

In [None]:
# Count the number of NaNs in each row
n_nulls_2 = scope2_nrg.isna().sum(axis=1)

# Filter the rows with 22 NaN values
no_scope_2 = scope2_nrg[n_nulls_2 == 22]

display(no_scope_2)

In [None]:
# Count the number of NaNs in each row
n_nulls_3 = scope3_nrg.isna().sum(axis=1)

# Filter the rows with 22 NaN values
no_scope_3 = scope3_nrg[n_nulls_3 == 22]

display(no_scope_3)

So all firms in these datasets have at least one value for each scope 1, 2, and 3. Now we must make sure that the ISINs we work with are part of these ISINs.
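
The hard-coded count of 22 presumably equals the number of yearly columns in the scope files (an assumption, since the column layout is not shown here). A sketch of a variant that does not depend on that number, on toy data with a hypothetical layout:

```python
import numpy as np
import pandas as pd

# Toy scope table: one identifier column plus yearly emission columns (hypothetical layout)
toy_scope = pd.DataFrame({
    "ISIN": ["AAA", "BBB", "CCC"],
    2019: [1.0, np.nan, np.nan],
    2020: [2.0, np.nan, 5.0],
})

# Every column except the identifier is treated as an emissions column
value_cols = toy_scope.columns.drop("ISIN")

# A firm has no data at all when every emissions column is NaN
no_data = toy_scope[toy_scope[value_cols].isna().all(axis=1)]

print(no_data["ISIN"].tolist())
```

An empty result from this filter corresponds to the empty `no_scope_*` frames above.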

In [None]:
scope1_nrg_isin = scope1_nrg['ISIN'].values.tolist()
scope2_nrg_isin = scope2_nrg['ISIN'].values.tolist()
scope3_nrg_isin = scope3_nrg['ISIN'].values.tolist()

Checking if there are any differences between the companies in the respective Scope 1-3 list.

In [None]:
if set(scope1_nrg_isin) == set(scope2_nrg_isin):
    print("scope1_nrg_isin and scope2_nrg_isin have the same elements (order doesn't matter)")
else:
    print("The two lists are different")
    
if set(scope1_nrg_isin) == set(scope3_nrg_isin):
    print("scope1_nrg_isin and scope3_nrg_isin have the same elements (order doesn't matter)")
else:
    print("The two lists are different")

All companies in Scope 1 are also in Scope 2 and Scope 3. That's good.

Putting together the return data for the energy companies.


In [None]:
nrg_returns = df_returns[df_returns.columns.intersection(energy_isin)]
display(nrg_returns)

# Checking the datatypes.
display(nrg_returns.dtypes.unique())


The dataset looks good. Apart from the `date` column, all values are float64, as expected. Additionally, we have 223 columns, which is the same as the length of the ISIN list created in the code block above.


Now we can filter on the targeted dates which is from 01.01.2005 to 31.12.2020. We'll use a mask to get the observations in this timeframe.


In [None]:
start_date = '2005-01-01'
end_date = '2020-12-31'

# Greater than or equal to the start date and smaller than or equal the end date
mask = (nrg_returns['date'] >= start_date) & (nrg_returns['date'] <= end_date)

nrg_returns = nrg_returns.loc[mask]

# First and last date in the filtered sample
display(nrg_returns['date'].iloc[0])
display(nrg_returns['date'].iloc[-1])


Here we can see that the first and the last rows have the correct dates.


Dropping the companies that have more than 36 months of missing return data.


In [None]:
# Drop all companies that have more than 36 NaN values (3 years)
years = 3
months = 12
too_many_nans = years*months

nrg_returns = nrg_returns.dropna(
    thresh=len(nrg_returns) - too_many_nans, axis=1)
nrg_returns
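
The `thresh` argument of `dropna` is easy to misread: it is the minimum number of non-NaN values a column needs in order to be kept. A small sketch on toy data (the column names and `max_missing` are illustrative only):

```python
import numpy as np
import pandas as pd

# Toy monthly returns: 6 rows, allowing at most 2 missing months per column
toy = pd.DataFrame({
    "few_nans": [0.1, np.nan, 0.2, 0.0, np.nan, 0.1],      # 2 NaNs -> kept
    "many_nans": [np.nan, np.nan, np.nan, 0.3, 0.1, 0.2],  # 3 NaNs -> dropped
})
max_missing = 2

# thresh is the minimum number of non-NaN values a column needs to survive
kept = toy.dropna(thresh=len(toy) - max_missing, axis=1)

print(kept.columns.tolist())
```

A column with exactly `max_missing` missing values is therefore kept, and only columns with more missing values are removed.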


Saving a copy called `nrg_returns_date_column`, with `date` still a regular column rather than the index, for the later questions.


In [None]:
nrg_returns_date_column = nrg_returns.copy()  # For Q3

Here we set `date` as the index column on the nrg_returns dataset.


In [None]:
if 'date' in nrg_returns.columns.values.tolist():
    nrg_returns.set_index('date', inplace=True)
display(nrg_returns.isnull().sum().sum())
display(nrg_returns)


The next and last check is to see whether there are any differences between the company list of `nrg_returns` and the Scope 1-3 list.

In [None]:
# check if all columns are in scope1_nrg_isin to see if all of them have scope 1 to 3 emissions available
# check if all column names of the dataframe are in the list
if set(nrg_returns.columns).issubset(scope1_nrg_isin):
    print("All columns of the dataframe are in the list")
else:
    print("Not all columns of the dataframe are in the list")


Our first plot. Let's plot the monthly returns with date on the x-axis and the return rate on the y-axis.


In [None]:
# Define the plot (plot all lines)

ax = nrg_returns.plot(linewidth=0.5, figsize=(12, 8))
ax.get_legend().remove()  # Removing the legend
ax.set_title('All monthly returns')  # Title
ax.set_ylabel('Returns')  # y-axis label


Here we can see that a few huge outliers distort the plot. Let's set a threshold of 1000% (a monthly return above 10).


In [None]:
outlier_threshold = 10
# Keep the columns that contain at least one return above the threshold
outlier_companies = nrg_returns.loc[:, (nrg_returns > outlier_threshold).any(axis=0)]
outlier_list = outlier_companies.columns.values.tolist()
outlier_list


We have only one outlier, that's good. Let's remove it.


In [None]:
# errors='ignore' keeps this cell safe to re-run and also handles an empty outlier list
nrg_returns = nrg_returns.drop(columns=outlier_list, errors='ignore')

nrg_returns


Now we can create a new plot without the outlier.


In [None]:
# Define the plot (plot all lines)

ax = nrg_returns.plot(linewidth=0.5, figsize=(12, 8))
ax.get_legend().remove()  # removing the legend
ax.set_title('All monthly returns')  # Title
ax.set_ylabel('Returns')  # y-axis label


This plot looks much better.


Now we want to export the new dataframe to a csv file.


In [None]:
# To get the current working directory
directory = os.getcwd()

# Defining subfolder path
path = os.path.join(directory, 'Clean_Data')

# Creating the subfolder; exist_ok=True avoids errors when re-running
os.makedirs(path, exist_ok=True)

# Saving the clean dataframe as a csv in the subfolder
nrg_returns.to_csv(os.path.join(path, 'nrg_returns.csv'))
# files.download('nrg_returns.csv')


#### Getting Market Caps


Taking a closer look at the Market Cap data.


In [None]:
df_caps


Extracting the market caps (`df_caps`) of the energy companies that remain in the returns dataset (`returns_isin`).


In [None]:
returns_isin = nrg_returns.columns.values.tolist()
# Select from df_caps' own columns and keep 'date' so it can become the index below
nrg_caps = df_caps[df_caps.columns.intersection(['date'] + returns_isin)]
nrg_caps


We will for this dataset also set the `date` column as the index.


In [None]:
if 'date' in nrg_caps.columns.values.tolist():  # date as index
    nrg_caps = nrg_caps.set_index('date')

nrg_caps.isnull().sum().sum()  # Count NaNs in the market cap data


Now we are ready to start answering the questions.


#### Q.I - Annual average return and annualized volatility for all individual assets over the period 2005-2020. Correlation between individual average returns and volatility individually and between both metrics.


We will loop through the columns representing energy returns (`nrg_returns`). Using method `.std()` to get standard deviation.


In [None]:
df_q1 = pd.DataFrame([])

for a in nrg_returns.columns.values.tolist():

    # Get annualized average return
    avg_monthly = nrg_returns[a].mean()
    annualized_avg_return = avg_monthly*months

    # Get annualized volatility
    std_monthly = nrg_returns[a].std()
    annualized_volatility = std_monthly*math.sqrt(months)

    # Create series
    asset = {'AAR': annualized_avg_return, 'volatility': annualized_volatility}
    series = pd.Series(data=asset, index=['AAR', 'volatility'])

    # Concat
    df_q1 = pd.concat([df_q1, series.rename(a)], axis=1)

# Transpose df for readability
df_q1 = df_q1.T

# Show
display(df_q1.head())
display(df_q1.shape)
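
The loop above annualizes by simple scaling: the mean monthly return times 12 and the monthly standard deviation times the square root of 12. The same table can be built column-wise without a loop. A minimal sketch on toy data (the asset labels and the name `toy_q1` are hypothetical):

```python
import math
import pandas as pd

months = 12

# Toy monthly returns for two hypothetical assets
toy_returns = pd.DataFrame({
    "AAA": [0.01, 0.03, -0.02, 0.02],
    "BBB": [0.00, 0.01, 0.01, 0.02],
})

# Column-wise mean and standard deviation, scaled to annual figures
toy_q1 = pd.DataFrame({
    "AAR": toy_returns.mean() * months,
    "volatility": toy_returns.std() * math.sqrt(months),
})

print(toy_q1)
```

The result has one row per asset and the same two columns as `df_q1`, so it needs no transpose.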


Let's have a look at the descriptive statistics of the newly created dataset and plot the annualized average returns and the volatility.


In [None]:
display(df_q1.describe())
df_q1_plot = df_q1.copy()
df_q1_plot
ax = df_q1.reset_index().plot(kind='scatter', x='index',  y='AAR', figsize=(6, 4))
ax.set_xlabel('Energy companies')
ax.set_title('Annualized average Return for each Energy Company')  # Title

In [None]:
df_q1.corr()

In [None]:
display(df_q1.describe())
df_q1_plot = df_q1.copy()
df_q1_plot
ax = df_q1.reset_index().plot(kind='scatter', x='index',  y='volatility', figsize=(6, 4))
ax.set_xlabel('Energy companies')
ax.set_title('Annualized Volatility for each Energy Company')  # Title

In [None]:
display(df_q1.describe())
ax = df_q1.plot(kind='scatter', x='volatility', y='AAR', figsize=(6, 4))
ax.set_title('Annualized average Return against Volatility')  # Title

Let's make a plot that shows the correlation as well.

In [None]:
# correlation between individual average returns and volatility
corr_arr_vol_q1 = df_q1['AAR'].corr(df_q1['volatility'])
corr_arr_vol_q1

In [None]:
# Set the plot style before creating the figure so that it takes effect
sns.set_style("whitegrid")

# Set the figure size
plt.figure(figsize=(8, 6))

# Use the function regplot to make a scatterplot with a fitted regression line
sns.regplot(x=df_q1['volatility'], y=df_q1['AAR'], color='#4A90E2', seed=0)

# Add a title and axis labels
plt.title('Correlation between AAR and Volatility')
plt.xlabel('Volatility')
plt.ylabel('AAR')

# Customize the tick marks
plt.xticks(rotation=45)
plt.yticks(rotation=45)

plt.legend(labels=[f'Correlation: {round(corr_arr_vol_q1, 3)}'])

plt.show()

Let's check if our data is correct by manually calculating the AAR and volatility for the company with ISIN: `AU000000ERA9`.


In [None]:
nrg_returns['AU000000ERA9'].describe()


In [None]:
# Verify that this is correct by manually doing the same calculation for this asset

# AAR
check_aar = nrg_returns['AU000000ERA9'].mean()
check_aar_annualized = check_aar*months

# Volatility
check_std = nrg_returns['AU000000ERA9'].std()
check_std_annualized = check_std*math.sqrt(months)

rounds = 50

# Look the asset up by its ISIN label rather than by row position
display(round(check_aar_annualized, rounds) == round(df_q1.loc['AU000000ERA9', 'AAR'], rounds))
display(round(check_std_annualized, rounds) == round(df_q1.loc['AU000000ERA9', 'volatility'], rounds))


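We can see that the values are identical. Since a float64 carries only about 15 to 17 significant digits, rounding to 50 decimals leaves the values unchanged, so this is effectively an exact comparison.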


#### Q.II - Equally-weighted and value-weighted portfolio with monthly rebalancing over the period 2005-2020. Report the following statistics for both portfolios: annualized average return, annualized volatility, minimum return, maximum return, and Sharpe ratio. Plot the time series of return for both portfolios

#### Q.II.I Building dataset


In [None]:
risk_free_rate = 0.05
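
For reference, the portfolio statistics in this question follow the usual definitions, written here under the assumption that monthly figures are annualized by simple scaling as in Q.I. The symbols are introduced for this note only: $\bar{r}_m$ is the mean monthly portfolio return, $\sigma_m$ the standard deviation of the monthly portfolio returns, and $r_f$ the annual risk-free rate (`risk_free_rate`).

$$
\text{AAR} = 12\,\bar{r}_m, \qquad
\sigma_{\text{ann}} = \sqrt{12}\,\sigma_m, \qquad
\text{Sharpe} = \frac{\text{AAR} - r_f}{\sigma_{\text{ann}}}
$$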

In [None]:
# Building melted df with market caps

df_q2 = nrg_returns_date_column.drop(columns=['IE00BLNN3691']).copy()
df_q2 = df_q2.melt(id_vars=['date'], var_name='ISIN',
                   value_name='monthly_return')

df_q2['date'] = pd.to_datetime(df_q2['date'])
df_q2['year'] = df_q2.date.dt.year
df_q2['month'] = df_q2.date.dt.month
df_q2 = df_q2[['date', 'year', 'month', 'ISIN', 'monthly_return']].copy()

df_size = df_caps.melt(
    id_vars=['date'], var_name='ISIN', value_name='market_cap')
df_q2 = pd.merge(df_q2, df_size, how='left', on=('date', 'ISIN'))

df_q2


In [None]:
# Annual return (sum of monthly returns per year per ISIN),
# broadcast back to every row of the same year and ISIN
df_q2['annual_returns'] = df_q2.groupby(['year', 'ISIN'])[
    'monthly_return'].transform('sum')
df_q2


#### Q.II.II Building the Equally Weighted Portfolio


In [None]:
# Building equally weighted portfolio with monthly rebalancing
df_q2_e = df_q2.copy()

num_assets = df_q2_e['ISIN'].nunique()  # number of distinct assets (not the number of columns)
equal_weight = 1 / num_assets
df_q2_e.head(2)


In [None]:
# Monthly return of the equally-weighted portfolio:
# with equal weights this is the mean return across assets in each month
ew_monthly_return_df = df_q2_e[['year', 'month', 'monthly_return']].groupby(
    ['year', 'month']).mean()

# Computing the AAR of the equally-weighted portfolio for the 16 years under observation (2005-2020)
df_q2_e['ew_AAR'] = df_q1.AAR.mean()

# Computing the annualized volatility from the portfolio's monthly return series
df_q2_e['ew_annualized_volatility'] = ew_monthly_return_df['monthly_return'].std() * \
    math.sqrt(months)

# Computing the annual return of the equally-weighted portfolio
df_q2_e['ew_annual_return'] = df_q2_e.groupby(
    'year')['annual_returns'].transform('mean')

# Attaching the monthly return of the equally-weighted portfolio to every row
df_q2_e['ew_monthly_return'] = df_q2_e.groupby(['year', 'month'])[
    'monthly_return'].transform('mean')

# Computing portfolio statistics
ew_min = df_q2_e['ew_annual_return'].min()
ew_max = df_q2_e['ew_annual_return'].max()
ew_sharperatio = (df_q2_e['ew_AAR'].mean() - risk_free_rate) / \
    df_q2_e['ew_annualized_volatility'].mean()
# Showing dataframe
df_q2_e[190:196]


In [None]:
ew_AAR = df_q2_e['ew_AAR'].mean()
ew_volatility = df_q2_e['ew_annualized_volatility'].mean()

print(f'Equally-weighted portfolio statistics:\n\nAAR: {ew_AAR}')
print(f'Max yearly return: {ew_max}')
print(f'Min yearly return: {ew_min}')
print(f'Sharpe ratio: {ew_sharperatio}')
print(f'Annualized volatility: {ew_volatility}')
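
As a cross-check on the long-format computation, an equally-weighted portfolio with monthly rebalancing can also be sketched directly on a wide returns frame, where it is the row-wise mean. This is a toy illustration with hypothetical asset labels, not the notebook's data:

```python
import math
import pandas as pd

months = 12
risk_free_rate = 0.05  # annual rate, same value as in the notebook

# Toy wide frame: rows are months, columns are hypothetical assets
toy_returns = pd.DataFrame({
    "AAA": [0.02, -0.01, 0.03, 0.00],
    "BBB": [0.04, 0.01, -0.01, 0.02],
})

# Monthly rebalancing to equal weights = cross-sectional mean each month
ew_monthly = toy_returns.mean(axis=1)

# Annualized statistics of the portfolio return series
ew_aar = ew_monthly.mean() * months
ew_vol = ew_monthly.std() * math.sqrt(months)
ew_sharpe = (ew_aar - risk_free_rate) / ew_vol

print(ew_monthly.tolist())
print(ew_aar, ew_vol, ew_sharpe)
```

Note that `mean(axis=1)` skips NaN values, so in a month where some assets have no return the weight is spread equally over the assets that do.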


#### Q.II.III Building Value-weighted Portfolio


In [None]:
# Building value weighted portfolio with monthly rebalancing
df_q2_v = df_q2.copy()
df_q2_v.head()


In [None]:
# get monthly total market value
val_weight_df = df_q2_v[['year', 'month', 'market_cap']
                        ].groupby(['year', 'month']).sum().copy()

# function to get value-based weights
def val_weight_func(row):
    year = row['year']
    month = row['month']
    return row['market_cap']/val_weight_df.loc[(year, month)][0]


In [None]:
df_q2_v['value_weight'] = df_q2_v.apply(val_weight_func, axis=1)
df_q2_v

In [None]:
# Computing the monthly returns for each ISIN based on the market cap weight per month
df_q2_v['vw_asset_monthly_weighted_returns'] = df_q2_v['monthly_return'] * \
    df_q2_v['value_weight']

# Computing the AAR of the value-weighted portfolio by summing all the monthly weighted returns across the portfolio and dividing by 16 years of data
df_q2_v['vw_AAR'] = (df_q2_v['vw_asset_monthly_weighted_returns'].sum())/16

# Computing the annual returns of the value-weighted portfolio
vw_annual_return_df = df_q2_v[[
    'year', 'vw_asset_monthly_weighted_returns']].groupby('year').sum()


def vw_annual_return_func(row):
    year = row['year']
    return vw_annual_return_df.loc[(year)][0]


df_q2_v['vw_annual_return'] = df_q2_v.apply(vw_annual_return_func, axis=1)

# Computing the monthly return of the value-weighted portfolio
vw_monthly_return_df = df_q2_v[['year', 'month', 'vw_asset_monthly_weighted_returns']].groupby([
                                                                                               'year', 'month']).sum()


def vw_monthly_return_func(row):
    year = row['year']
    month = row['month']
    return vw_monthly_return_df.loc[(year, month)][0]


df_q2_v['vw_monthly_return'] = df_q2_v.apply(vw_monthly_return_func, axis=1)

# Computing portfolio statistics
df_q2_v['vw_annualized_volatility'] = df_q2_v['vw_monthly_return'].std() * \
    math.sqrt(months)
vw_min = df_q2_v['vw_annual_return'].min()
vw_max = df_q2_v['vw_annual_return'].max()
vw_sharperatio = (df_q2_v['vw_annual_return'].mean(
) - risk_free_rate)/df_q2_v['vw_annualized_volatility'].mean()
# Showing dataframe
df_q2_v


In [None]:
vw_AAR = df_q2_v['vw_AAR'].mean()
vw_volatility = df_q2_v['vw_annualized_volatility'].mean()

print(f'Value-weighted portfolio statistics:\n\nAAR: {vw_AAR}')
print(f'Max yearly return: {vw_max}')
print(f'Min yearly return: {vw_min}')
print(f'Sharpe ratio: {vw_sharperatio}')
print(f'Annualized volatility: {vw_volatility}')
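
The value weighting can likewise be sketched on wide frames: each month's weights are the market caps divided by that month's total cap, and the portfolio return is the weighted sum across assets. The frames and labels below are toy data, made up for illustration:

```python
import pandas as pd

# Toy wide frames with the same shape: rows are months, columns are hypothetical assets
toy_returns = pd.DataFrame({"AAA": [0.10, 0.00], "BBB": [0.00, 0.20]})
toy_caps = pd.DataFrame({"AAA": [300.0, 100.0], "BBB": [100.0, 100.0]})

# Each row of weights sums to 1: an asset's cap divided by the total cap that month
weights = toy_caps.div(toy_caps.sum(axis=1), axis=0)

# Portfolio return per month is the weighted sum across assets
vw_monthly = (weights * toy_returns).sum(axis=1)

print(weights)
print(vw_monthly.tolist())
```

Like the cells above, this sketch uses the market caps of the same month as the returns. A possible variant, not used in this notebook, is to lag the caps by one month with `toy_caps.shift(1)` so that the weights only use information available before the return is realized.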


#### Q.II.IV Compare the two portfolios


In [None]:
# Annual returns per year for both portfolios
x = df_q2.groupby('year').year.mean()
y1 = df_q2_v.groupby('year').vw_annual_return.mean()
y3 = df_q2_e.groupby('year').ew_annual_return.mean()

# Create figure
fig = go.Figure()

# Add time series traces
fig.add_trace(go.Scatter(x=x, y=y1, name='Value-weighted portfolio annual returns',
              line=dict(color='lightblue', width=5)))
fig.add_trace(go.Scatter(x=x, y=y3, name='Equally-weighted portfolio annual returns',
              line=dict(color='darkgreen', width=5)))

# Update layout
fig.update_layout(title='Portfolio Performance by Year',
                  xaxis_title='Year',
                  yaxis_title='Annual Return',
                  legend_title='Portfolio Type',
                  font=dict(size=16),
                  plot_bgcolor='white')

# Center legend title
fig.update_layout(legend=dict(title=dict(text='Portfolio Type', font=dict(size=18), side='top')),
                  legend_title_font=dict(size=18),
                  legend_title_side='top')

# Center plot title
fig.update_layout(title=dict(text='Portfolio Performance by Year',
                  font=dict(size=22), x=0.4, xanchor='center'))


# Customize axes
fig.update_xaxes(tickvals=x,
                 ticktext=[str(int(val)) for val in x],
                 tickangle=45,
                 dtick=1,
                 tickfont=dict(size=14),
                 gridcolor='lightgray',
                 zeroline=False)

fig.update_yaxes(tickfont=dict(size=14),
                 gridcolor='lightgray',
                 zeroline=False)

fig.show()


In [None]:
## TODO: REMOVE THIS CODE
display(df_q2_v)
#display(df_q2_e.count())

Note: The returns for 2009 are very high. See https://www.naturalgasintel.com/2009-called-terrific-year-for-energy-investors-2/


In [None]:
# Create a new table
table = prettytable.PrettyTable()

# Add the columns to the table
table.field_names = ['Portfolio',
                     'Value-weighted portfolio', 'Equally-weighted portfolio']

# Add the rows to the table
tabledf = {
    'Annualized average return': ['Annualized average return', round(vw_AAR, 4), round(ew_AAR, 4)],
    'Annualized Volatility': ['Annualized volatility', round(vw_volatility, 4), round(ew_volatility, 4)],
    'Minimum return': ['Minimum return', round(vw_min, 4), round(ew_min, 4)],
    'Maximum return': ['Maximum return', round(vw_max, 4), round(ew_max, 4)],
    'Sharpe Ratio': ['Sharpe Ratio', round(vw_sharperatio, 4), round(ew_sharperatio, 4)]
}

for row in tabledf:
    table.add_row(tabledf[row])

# Add borders to the table
table.hrules = prettytable.ALL
table.header = True
table.set_style(prettytable.SINGLE_BORDER)

# Save the table to a file
with open('value&equal.txt', 'w') as f:
    f.write(str(table))

# Display the table
print(table)

table


#### Q.III - For this question, limit your set of firms to 100 randomly selected firms. Pay particular attention to the construction of the covariance matrix. Build an optimal portfolio with minimum variance with monthly rebalancing over the period 2005-2020. Report the following statistics: annualized average return, annualized volatility, minimum return, maximum return, and Sharpe ratio. Comment on the reported statistics in comparison with the equally-weighted and value-weighted portfolio


#### Q.IV - For this question, keep the same randomly selected firms from the previous point. Build optimal portfolios with various target portfolio returns (e.g., from 2% to 16% with 2% increments). Plot the efficient frontier as well as the individual assets. Which portfolio is the most efficient in terms of Sharpe ratio?


#### Q.V - Choose an appropriate benchmark, which corresponds to the region of your dataset. Compare the performance of your portfolios (equally-weighted, value-weighted, and minimum variance) with the benchmark. Comment on the differences.


#### Q.VI - Compute and comment on the simple correlation between returns, volatility, size.


#### Q.VII - For this question, take the same 100 selected firms. You now create a minimum variance portfolio with monthly rebalancing with an additional constraint: you exclude the smallest firms (bottom tercile of the distribution of the firms’ market capitalization in month t − 1). Report summary statistics on the performance of this portfolio and comment on the differences with the minimum variance from point 3.
