# Happy Portfolio Analysis

In this project we are analysing a new investment portfolio, the Happy Portfolio, based on the scores of countries taken from the [World Happiness Report](https://worldhappiness.report/). We take the role of a fund manager in the early stages of launching a new investment fund. We compare the historic performance of the portfolio against a world equity index and present a decision on whether to progress to a deeper analysis for the new portfolio construction. 

## Country Based ETFs

The Happy Portfolio is made up of country based ETFs. 

The below DataFrame shows a sample of the data for the first 5 months.

In [1]:
# Import relevant libraries

import pickle
import pandas as pd
import plotly.express as px
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from pathlib import Path


In [2]:
# Read in etf data from pkl file

pkl_file = open(Path("./Resources/etf_df.pkl"), 'rb')
dataframe2 = pickle.load(pkl_file)
pkl_file.close()

# load etf data back into Dataframe
all_etf_df = pd.DataFrame(dataframe2)

# Convert month end prices to monthly percentage change and drop na values

all_etf_df = all_etf_df.pct_change().dropna()


In [3]:
# Review dataframe

all_etf_df.head()

Unnamed: 0_level_0,iShares MSCI Finland Capped,iShares MSCI Denmark Capped,iShares MSCI Norway Capped,iShares MSCI Netherlands,iShares MSCI Switzerland Capped,iShares MSCI Sweden Capped,iShares MSCI New Zealand Capped,iShares MSCI Canada,iShares MSCI Austria Capped,iShares MSCI Australia,iShares MSCI Israel Capped,iShares MSCI United Kingdom,ishares S&P 500,iShares MSCI Ireland,iShares Currency Hedged MSCI Germany,iShares MSCI Mexico Capped,iShares MSCI World
date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1
2015-02-01,0.051509,0.078402,0.037898,0.066694,0.049449,0.070284,0.062257,0.059758,0.122674,0.080292,0.032922,0.058758,0.056431,0.126511,0.070004,0.0704,0.066838
2015-03-01,-0.021739,0.028807,-0.058921,-0.012114,-0.003301,-0.035714,-0.021734,-0.029979,-0.031921,-0.034628,0.060809,-0.057068,-0.020621,-0.024607,0.043129,-0.037369,-0.014355
2015-04-01,-0.009481,0.050476,0.128748,0.032832,0.040048,0.016129,0.012232,0.071376,0.069753,0.020122,0.02115,0.06663,0.009684,0.053677,-0.051156,0.011387,0.01783
2015-05-01,0.01047,0.007616,-0.039453,0.017618,0.024899,-0.001764,-0.059926,-0.046016,-0.005928,-0.033448,-0.002129,0.006247,0.013027,0.007641,0.005908,-0.001876,0.004279
2015-06-01,-0.037892,-0.031852,-0.051647,-0.03312,-0.074859,-0.064488,-0.090241,-0.039957,-0.053667,-0.066992,-0.002328,-0.055872,-0.025201,-0.012133,-0.042217,-0.023928,-0.038615


## World Happiness Report

The weightings of the Happy Portfolio are based on the top 15 Happiness Scores for each year.

The Happiness Scores by year are shown below. 

In [4]:
# Read in country data from pkl file

pkl_file = open(Path("./Resources/wh_2015_2019.pkl"), 'rb')
dataframe3 = pickle.load(pkl_file)
pkl_file.close()

# load etf data back into Dataframe
WH_2015_2019 = pd.DataFrame(dataframe3)



In [5]:
# Transpose dataframe to allow mapping to ETFs by Country

WH_2015_2019 = WH_2015_2019.transpose()

# Review Dataframe 

WH_2015_2019.T

Country,Switzerland,Denmark,Norway,Canada,Finland,Netherlands,Sweden,New Zealand,Australia,Israel,Austria,Mexico,United States,Ireland,United Kingdom,Germany,Year
Happiness Score 2015,7.587,7.527,7.522,7.427,7.406,7.378,7.364,7.286,7.284,7.278,7.2,7.187,7.119,,,,2015.0
Happiness Score 2016,7.509,7.526,7.498,7.404,7.413,7.339,7.291,7.334,7.313,7.267,7.119,,7.104,,,,2016.0
Happiness Score 2017,7.494,7.522,7.537,7.316,7.469,7.377,7.284,7.314,7.284,7.213,7.006,,6.993,6.977,,,2017.0
Happiness Score 2018,7.487,7.555,7.594,7.328,7.632,7.441,7.314,7.324,7.272,,7.139,,,6.977,7.19,6.965,2018.0
Happiness Score 2019,7.48,7.6,7.554,7.278,7.769,7.488,7.343,7.307,7.228,7.139,7.246,,,,7.054,,2019.0


In [6]:
# Create a Figure using plotly graphs and create a custom layout
# Update menus to add buttons, create custom x and y axes, change title, 
# and update dimensions



## Portfolio Analysis
### Portfolio Mapping

Each country is mapped to the corresponding ETF. 

The mapping DataFrame is shown below.

In [7]:
# Create Map of ETFs to Countries

etf_country_map = {
    "Finland":"iShares MSCI Finland Capped",
    "Denmark":"iShares MSCI Denmark Capped",
    "Norway":"iShares MSCI Norway Capped",
    "Netherlands":"iShares MSCI Netherlands",
    "Switzerland":"iShares MSCI Switzerland Capped",
    "Sweden":"iShares MSCI Sweden Capped",
    "New Zealand":"iShares MSCI New Zealand Capped",
    "Canada":"iShares MSCI Canada",
    "Austria":"iShares MSCI Austria Capped",
    "Australia":"iShares MSCI Australia",
    "Israel":"iShares MSCI Israel Capped",
    "United Kingdom":"iShares MSCI United Kingdom",
    "United States":"ishares S&P 500",
    "Ireland":"iShares MSCI Ireland",
    "Germany":"iShares Currency Hedged MSCI Germany",
    "Mexico":"iShares MSCI Mexico Capped",
    "Benchmark":"iShares MSCI World"
}

# Create DataFrame of Countries and Corresponding ETFs

etf_country_map_df = pd.DataFrame.from_dict(etf_country_map, orient='index')
etf_country_map_df.columns = ["ETF"]
etf_country_map_df.index.name = "Country"
etf_country_map_df

Unnamed: 0_level_0,ETF
Country,Unnamed: 1_level_1
Finland,iShares MSCI Finland Capped
Denmark,iShares MSCI Denmark Capped
Norway,iShares MSCI Norway Capped
Netherlands,iShares MSCI Netherlands
Switzerland,iShares MSCI Switzerland Capped
Sweden,iShares MSCI Sweden Capped
New Zealand,iShares MSCI New Zealand Capped
Canada,iShares MSCI Canada
Austria,iShares MSCI Austria Capped
Australia,iShares MSCI Australia


The Happiness Scores are then mapped to the ETFs.

In [8]:
# Combine Dataframes to show Happiness Score by ETF (for Portfolio Weights)

country_etf_combined = pd.merge(etf_country_map_df, wh_2015_2019_df,on='Country', how='outer')

country_etf_combined.set_index('ETF', inplace=True)
columns = [2015,2016,2017,2018,2019]
country_etf_combined.columns = columns

country_etf_combined

NameError: name 'wh_2015_2019_df' is not defined

### Portfolio Weightings

For each year the score is weighted so that the total is 100.

The weighted scores are shown below with a temporary Total row to check they add up to 100. 

In [None]:
# Create dictionary of country_etf_combined columns for later calculation to weight the scores for each year.

country_dict = {}

# Loop through the country_etf_combined DataFrame to create a dictionary column for each year.

for col in country_etf_combined.columns:
    country_dict[col] = pd.DataFrame(country_etf_combined[col])
    


In [None]:
# Create new dataframe to receive the newly calculated portfolio weights

country_etf_weighted = pd.DataFrame()

# Loop through the country_dict (yearly Happiness Scores) to rebase the scores to equal 100.

for key in country_dict:
    country_etf_weighted[key] = (country_dict[key]/country_dict[key].sum()) * 100

# Round to 2 dp and review the dataframe
country_etf_weighted = country_etf_weighted.round(decimals=2)

country_etf_weighted.append(country_etf_weighted.sum().rename('Total'))

The portfolio weights for each year are displayed in an interactive donut chart to show the change over time.

Click on each fund to view which years that fund was included. 


In [None]:
# Create a new plot dictionary to loop through the country_etf_weighted dataframe, dropping the NA values for each year.

plot_dict = {}

for col in country_etf_weighted.columns:
    plot_dict[col] = pd.DataFrame(country_etf_weighted[col].dropna())

In [None]:
# Using Plotly graph_objects, plot a donut chart for each years' weightings

# Create dynamic columns - smaller of the number of charts or 3
cols = min(3, len(plot_dict.keys()))

# Create dynamic rows - number of charts / 3 
rows = int(3 * (int(len(plot_dict.keys())/3) + (len(plot_dict.keys()) % 3 > 0))/3)


# Create subplot grid based on dynamic rows and columns
fig = make_subplots(
    cols=cols, 
    rows=rows, 
    specs=[[{"type": "pie"}] * cols] * rows
)

# Set counter for row and column placement of donut charts
counter = 0

# Grid

grid = {
    1:1,
    2:2,
    3:3,
    4:1,
    5:2
}


# Loop through dictionary of dataframes to plot
for key in plot_dict.keys():

    counter +=1
           
    fig.add_trace(go.Pie(
        labels=list(plot_dict[key].index),
        textinfo="none",
        values=plot_dict[key][key],
        hoverinfo='label+value',
        hole=0.5,
        title=f"{key}"
        ), 
        row=-(-counter // cols), 
        col=grid[counter]
    )


for template in ["plotly_dark"]:
    fig.update_layout(
        template=template, 
        title="Portfolio Weightings by Year", 
        height=600,
        width = 1200,
        annotations=[dict(text='', x=0.2, y=0.5, font_size=20, showarrow=False)]
    )

fig

### Portfolio Return

The portfolio weightings table is converted to binary code and then mapped to the original ETF return dataframe. 

In [None]:
# Create a dataframe to mark countries included each year

etfs_yearly = country_etf_weighted.transpose()

# Drop the benchmark
etfs_yearly = etfs_yearly.drop(columns="iShares MSCI World")

# Convert to binary code
for value in etfs_yearly:
    etfs_yearly.loc[etfs_yearly[value] > 0, value] = 1

# Display    
etfs_yearly


The dataframe is expanded to cover the entire period.

In [None]:
# Match ETF returns to years included

etf_yearly_df = all_etf_df.copy().drop(columns='iShares MSCI World')

etf_yearly_df.loc['2015-02-01':'2015-12-31'] = [etfs_yearly.loc[2015]]*11
etf_yearly_df.loc['2016-01-01':'2016-12-31'] = [etfs_yearly.loc[2016]]*12
etf_yearly_df.loc['2017-01-01':'2017-12-31'] = [etfs_yearly.loc[2017]]*12
etf_yearly_df.loc['2018-01-01':'2018-12-31'] = [etfs_yearly.loc[2018]]*12
etf_yearly_df.loc['2019-01-01':'2019-12-31'] = [etfs_yearly.loc[2019]]*12


# Display dataframe
etf_yearly_df.head()

The dataframe is then multiplied by the monthly returns and consolidated into a single return for each month by summing the total and dividing by the number of etfs for the month. 

The totals assume an equal weighted portfolio.

A sample of the data is shown below.

In [None]:
# Multiply years included (etf_yearly) by the monthly returns (all_etf_df)

all_etf_yearly = all_etf_df * etf_yearly_df
all_etf_yearly = all_etf_yearly.sum(axis=1) / all_etf_yearly.count(axis='columns')
all_etf_yearly.head()

In [None]:
# Extract benchmark from all_etf_df dataframe

benchmark = pd.DataFrame(all_etf_df["iShares MSCI World"])


The benchmark monthly returns and Happy Portfolio returns are combined into one Dataframe. A sample is shown below.

In [None]:
# Create plot dataframe by combining benchmark and portfolio return dfs

plot_df_return = pd.concat([benchmark, all_etf_yearly], axis=1)

# Rename columns
columns = ["Happy Portfolio","Benchmark"]
plot_df_return.columns = columns

# Add back in the start value of 0 for the plot
new_row = pd.DataFrame({"Happy Portfolio":0, "Benchmark":0}, index=['2015-01-01'])
plot_df_return = pd.concat([plot_df_return, new_row]).sort_index()

# Review
plot_df_return.head()

In [None]:
# Plot monthly return 2015-2019

fig = px.line(
    plot_df_return,
    title="Monthly Return 2015-2019 (Equal Weighted)",
    height=500,
    width=1000,
    labels={
        "value":"Return",
        "index":"Date",
        "variable":"Happy Portfolio"
    }
)

for template in ["plotly_dark"]:
    fig.update_layout(template=template)

fig

### Summary statistics for the Happy Portfolio and the Benchmark

In [None]:
# Run summary statistics on the portfolio and the benchmark

plot_df_return.describe()

### Distribution of Returns

The distribution of the monthly return over the period is show in a boxplot.

In [None]:
# Plot the distribution on a boxplot

fig = go.Figure()

y0 = plot_df_return['Happy Portfolio']
y1 = plot_df_return['Benchmark']

fig.add_trace(go.Box(x=y0, name='Happy Portfolio'))
fig.add_trace(go.Box(x=y1, name= 'Benchmark'))

fig.update_layout(
    title='Distribution of Monthly Return 2015-2019: Happy Portfolio vs Benchmark (Equal Weights)',
    showlegend=False
)

for template in ["plotly_dark"]:
    fig.update_layout(
        template=template,
        xaxis_title='Distribution'
    )

fig

In [None]:
# Calculate Annual Return using 12 trading periods for monthly data

avg_ann_return = plot_df_return.mean() * 12 * 100
avg_ann_return

In [None]:
# Calculate Annualised Standard Deviation using 12 trading periods

ann_standard_dev = plot_df_return.std() * np.sqrt(12)
ann_standard_dev

In [None]:
# Calculate Sharpe Ratio

sharpe_ratios = avg_ann_return / ann_standard_dev
display(sharpe_ratios)

In [None]:
# Plot Sharpe Ratios

fig = px.bar(
    sharpe_ratios,
    title="Sharpe Ratios of Happy Portfolio vs Benchmark"
)

for template in ["plotly_dark"]:
    fig.update_layout(
        template=template,
        showlegend=False,
        xaxis_title="",
        yaxis_title="Sharpe Ratio"
    )

fig

### Cumulative Portfolio Return

Each year the portfolio is rebalanced according to the weights of the Happiness Score. The returns for each year are calculated below.

In [None]:

# Set starting values for 2015 based on weight
portfolio_start_2015 = country_etf_weighted.iloc[:,0]

# Check total = 100
portfolio_sum = portfolio_start_2015.sum()
print(f"Calcalation check: The portfolio starting value is ${portfolio_sum}.")

# Create new dataframe for 2015 return calc using the 2015 start values
portfolio_return_2015 = pd.DataFrame(portfolio_start_2015)
portfolio_return_2015 = portfolio_return_2015.transpose()

# Rename weighted value index to start date 01-01-2015
portfolio_return_2015 = portfolio_return_2015.reset_index(drop=True)
portfolio_return_2015 = portfolio_return_2015.rename(index={0:'2015-01-01'})

# Append the all_etf_df monthly returns
portfolio_return_2015 = portfolio_return_2015.append(all_etf_df.loc['2015-02-01':'2015-12-31']).sort_index()

# Calculate cumulative return and drop na row
portfolio_return_2015 = (1 + portfolio_return_2015.shift(1)).cumprod().dropna(how='all')

# Drop ETFs not included in portfolio for 2015
portfolio_return_2015 = portfolio_return_2015.dropna(axis=1)

# Combine return for portfolio 
portfolio_return_2015 = portfolio_return_2015.sum(axis=1)

# Review the Series
portfolio_return_2015

In [None]:
# Set starting values for 2016 based on 2015 year end value and 2016 weights
portfolio_start_2016 = portfolio_return_2015[-1] / 100 * country_etf_weighted.iloc[:,1]

# Check total = 2015 year end value $111.69
portfolio_sum = portfolio_start_2016.sum()
print(f"Calcalation check: The portfolio starting value is ${portfolio_sum: .2f}.")

# Create new dataframe for 2016 return calc using the 2016 start values
portfolio_return_2016 = pd.DataFrame(index=["2015-12-31"],data=[portfolio_start_2016])

# Append the all_etf_df monthly returns
portfolio_return_2016 = portfolio_return_2016.append(all_etf_df.loc['2016-01-01':'2016-12-31']).sort_index()

# Calculate cumulative return and drop na row
portfolio_return_2016 = (1 + portfolio_return_2016.shift(1)).cumprod().dropna(how='all')

# Drop ETFs not included in portfolio for 2015
portfolio_return_2016 = portfolio_return_2016.dropna(axis=1)

# Combine return for portfolio
portfolio_return_2016 = portfolio_return_2016.sum(axis=1)

# Review the Series
portfolio_return_2016

In [None]:
# Set starting values for 2017 based on 2016 year end value and 2017 weights
portfolio_start_2017 = portfolio_return_2016[-1] / 100 * country_etf_weighted.iloc[:,2]

# Check total = 2016 year end value $126.55
portfolio_sum = portfolio_start_2017.sum()
print(f"Calcalation check: The portfolio starting value is ${portfolio_sum: .2f}.")

# Create new dataframe for 2017 return calc using the 2017 start values
portfolio_return_2017 = pd.DataFrame(index=["2016-12-31"],data=[portfolio_start_2017])

# Append the all_etf_df monthly returns
portfolio_return_2017 = portfolio_return_2017.append(all_etf_df.loc['2017-01-01':'2017-12-31']).sort_index()

# Calculate cumulative return and drop na row
portfolio_return_2017 = (1 + portfolio_return_2017.shift(1)).cumprod().dropna(how='all')

# Drop ETFs not included in portfolio for 2015
portfolio_return_2017 = portfolio_return_2017.dropna(axis=1)

# Combine return for portfolio
portfolio_return_2017 = portfolio_return_2017.sum(axis=1)

# Review the Series
portfolio_return_2017

In [None]:
# Set starting values for 2018 based on 2017 year end value and 2018 weights
portfolio_start_2018 = portfolio_return_2017[-1] / 100 * country_etf_weighted.iloc[:,3]

# Check total = 2017 year end value $168.07
portfolio_sum = portfolio_start_2018.sum()
print(f"Calcalation check: The portfolio starting value is ${portfolio_sum: .2f}.")

# Create new dataframe for 2018 return calc using the 2018 start values
portfolio_return_2018 = pd.DataFrame(index=["2017-12-31"],data=[portfolio_start_2018])

# Append the all_etf_df monthly returns
portfolio_return_2018 = portfolio_return_2018.append(all_etf_df.loc['2018-01-01':'2018-12-31']).sort_index()

# Calculate cumulative return and drop na row
portfolio_return_2018 = (1 + portfolio_return_2018.shift(1)).cumprod().dropna(how='all')

# Drop ETFs not included in portfolio for 2018
portfolio_return_2018 = portfolio_return_2018.dropna(axis=1)

# Combine return for portfolio
portfolio_return_2018 = portfolio_return_2018.sum(axis=1)

# Review the Series
portfolio_return_2018

In [None]:
# Set starting values for 2019 based on 2018 year end value and 2019 weights
portfolio_start_2019 = portfolio_return_2018[-1] / 100 * country_etf_weighted.iloc[:,3]

# Check total = 2018 year end value $162.80
portfolio_sum = portfolio_start_2019.sum()
print(f"Calcalation check: The portfolio starting value is ${portfolio_sum: .2f}.")

# Create new dataframe for 2019 return calc using the 2019 start values
portfolio_return_2019 = pd.DataFrame(index=["2018-12-31"],data=[portfolio_start_2019])

# Append the all_etf_df monthly returns
portfolio_return_2019 = portfolio_return_2019.append(all_etf_df.loc['2019-01-01':'2019-12-31']).sort_index()

# Calculate cumulative return and drop na row
portfolio_return_2019 = (1 + portfolio_return_2019.shift(1)).cumprod().dropna(how='all')

# Drop ETFs not included in portfolio for 2019
portfolio_return_2019 = portfolio_return_2019.dropna(axis=1)

# Combine return for portfolio
portfolio_return_2019 = portfolio_return_2019.sum(axis=1)

# Review the Series
portfolio_return_2019

The benchmark is rebased to 100 to match the portfolio. 

In [None]:
# Extend benchmark to 2019 - once all consolidated use one code for Benchmark

# Create a new benchmark dataframe, rebase to 100 for comparible data to portfolio

benchmark = pd.DataFrame(index=["Benchmark"], data=[100], columns=["2015-01-01"])

# Append benchmark data from all_etf_df and combine into a single columns

benchmark = benchmark.append(all_etf_df["iShares MSCI World"])
benchmark = benchmark.transpose().sum(axis=1)

# Slice by date 
benchmark = benchmark.loc["2015-01-01":"2019-12-31"]

# Calculate cumulative return and drop na (first value)
benchmark = (1 + benchmark.shift(1)).cumprod().dropna()


In [None]:
# Create plot dataframe by combining benchmark and portfolio return dfs

plot_df_2019 = pd.concat([portfolio_return_2015, portfolio_return_2016, portfolio_return_2017, portfolio_return_2018, portfolio_return_2019], axis=0)

plot_df_2019 = pd.concat([plot_df_2019, benchmark.loc["2015-01-01":"2019-12-31"]], axis=1)

# Rename columns
columns = ["Happy Portfolio","Benchmark"]
plot_df_2019.columns = columns

# Add back in the start value of 100 for the plot
new_row = pd.DataFrame({"Happy Portfolio":100, "Benchmark":100}, index=['2015-01-01'])
plot_df_2019 = pd.concat([plot_df_2019, new_row]).sort_index()

# Review
plot_df_2019.head()

The chart below shows the growth of $100 invested in 2015 and rebalanced each year per the Happiness Score weightings.

In [None]:
# Plot 2015-2019

fig = px.line(
    plot_df_2019,
    title="Growth of $100 invested 2015-2019",
    height=500,
    width=1000,
    labels={
        "value":"Return",
        "index":"Date",
        "variable":"Happy Portfolio"
    }
)

for template in ["plotly_dark"]:
    fig.update_layout(template=template)

fig.show()

## Summary

Over the period 2015-2019 the Happy Portfolio outperformed the benchmark by 45%. 

The average annual returns were 2.7% higher and the standard deviation (risk) was lower. 

The distribution in returns was smaller than the benchmark and the sharpe ratio was higher, illustrating a better return on investment for the level of risk. 

We strongly recommend expanding the research into this new portfolio.