# Building Stock Indices

Note: __yfinance is currently unstable/unreliable__ when it comes to downloading fundamental data with the Ticker object. In particular, __ticker.info__ (also available as __ticker.get_info()__) is affected.

__Action required__: Check for the __latest yfinance version__ and upgrade with the following command (Anaconda Prompt / Terminal):

pip install yfinance --upgrade

In the following, I have added an __alternative Yahoo Finance API Wrapper__, which is more stable/reliable: __yahooquery__

__Action required: Please install yahooquery with the following command (Anaconda Prompt / Terminal):__

pip install yahooquery

## Getting started

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import yfinance as yf
pd.set_option('display.float_format', lambda x: '%.2f' % x)

In [None]:
df = pd.read_csv("DJI_Const.csv", header = [0, 1], index_col = 0, parse_dates=[0])
df

In [None]:
df.dropna(inplace = True)
df

In [None]:
close = df.Close.copy()
close

In [None]:
returns = close.pct_change() # simple returns
returns

In [None]:
index = "^DJI"

In [None]:
const = close.columns.drop([index])
const

## Building a Price-weighted Index

In [None]:
close[const]

In [None]:
close[const].sum(axis = 1).iloc[0] # sum of prices day 1 (.iloc: positional access on a date index)

__Index (Base Value = 100)__

In [None]:
pwi = close[const].sum(axis = 1).div(close[const].sum(axis = 1).iloc[0]).mul(100)
pwi

In [None]:
pwi.name = "pwi"
pwi
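
The construction above can be checked on a tiny example. The following is a minimal sketch with made-up prices for three hypothetical tickers (`AAA`, `BBB`, `CCC` and the name `pwi_demo` are assumptions for illustration, not part of the DJI data set):

```python
import pandas as pd

# Hypothetical closing prices for three made-up tickers (illustration only)
prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 121.0],
     "BBB": [50.0, 50.0, 45.0],
     "CCC": [50.0, 40.0, 44.0]},
    index=pd.date_range("2024-01-01", periods=3, freq="D"),
)

price_sum = prices.sum(axis=1)                          # sum of all prices per day
pwi_demo = price_sum.div(price_sum.iloc[0]).mul(100)    # rebase to 100 on day 1
print(pwi_demo)
```

The daily price sums are 200, 200 and 210, so the index stays at 100 on day 2 and rises to roughly 105 on day 3.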

__Cross-Check with DJI Data__

In [None]:
dji_norm = close["^DJI"].div(close["^DJI"].iloc[0]).mul(100)
dji_norm

In [None]:
pwi.plot(figsize = (12, 8))
dji_norm.plot()
plt.legend()
plt.show()

__Weights over time__

In [None]:
weights_PWI = close[const].div(close[const].sum(axis = 1), axis = "rows")
weights_PWI 

In [None]:
weights_PWI.plot(figsize = (15, 8), fontsize = 13)
plt.title("PWI - Weights", fontsize = 15)
plt.legend(fontsize = 13)
plt.show()

In [None]:
weights_PWI[["AAPL", "MSFT"]].plot(figsize = (15, 8), fontsize = 13)
plt.title("PWI - Weights", fontsize = 15)
plt.legend(fontsize = 13)
plt.show()
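
To see why a price-weighted index favors high-priced stocks, here is a minimal sketch of the same weight calculation on made-up prices (the tickers and the name `weights_demo` are hypothetical):

```python
import pandas as pd

# Hypothetical closing prices (illustration only)
prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 121.0],
     "BBB": [50.0, 50.0, 45.0],
     "CCC": [50.0, 40.0, 44.0]},
    index=pd.date_range("2024-01-01", periods=3, freq="D"),
)

# Weight of each stock = its price divided by the sum of all prices that day
weights_demo = prices.div(prices.sum(axis=1), axis="rows")
print(weights_demo)
print(weights_demo.sum(axis=1))  # each row should sum to 1 (up to rounding)
```

On day 1, `AAA` carries half of the index simply because its price is twice that of the other two stocks, regardless of company size.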

## Building an Equal-weighted Index

In [None]:
returns

In [None]:
mean_ret = returns[const].mean(axis = 1)
mean_ret

In [None]:
ewi = mean_ret.add(1).cumprod().mul(100)
ewi

In [None]:
ewi.iloc[0] = 100 # first value is NaN (no return on day 1) -> set base value

In [None]:
ewi.name = "ewi"
ewi
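
The steps above (average the daily returns, compound them, set the base value) can be sketched on made-up prices. The tickers and the name `ewi_demo` are hypothetical:

```python
import pandas as pd

# Hypothetical closing prices (illustration only)
prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 121.0],
     "BBB": [50.0, 50.0, 45.0],
     "CCC": [50.0, 40.0, 44.0]},
    index=pd.date_range("2024-01-01", periods=3, freq="D"),
)

rets = prices.pct_change()                       # simple returns, first row is NaN
mean_ret = rets.mean(axis=1)                     # equal-weighted average return per day
ewi_demo = mean_ret.add(1).cumprod().mul(100)    # compound and scale to 100
ewi_demo.iloc[0] = 100                           # replace the leading NaN with the base value
print(ewi_demo)
```

Note that every stock contributes one third of each day's return here, whatever its price level.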

## Building a Value-weighted Index (Part 1)

__Historical Market Caps__

- Hard to get from free web sources
- Still, we can calculate an approximation with Yahoo Finance data
- Course might cover a paid data source with historical market caps

In [None]:
import yfinance as yf
from yahooquery import Ticker

In [None]:
const

In [None]:
ticker = yf.Ticker(ticker = "AAPL") #yfinance
ticker

In [None]:
info = pd.Series(ticker.get_info()) # yfinance
info

In [None]:
shares = info["sharesOutstanding"] #yfinance
shares

In [None]:
shares = Ticker("AAPL").key_stats["AAPL"]["sharesOutstanding"] #yahooquery
shares

In [None]:
mcap = close.AAPL * shares
mcap

(Simplifying) Assumption: __Outstanding Shares remained constant__ over the sample period (no new share issues or buy-backs), so historical market cap is approximated as historical price times today's share count
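
Under this assumption, the approximate market caps are just prices scaled by one fixed number per stock. A minimal sketch with made-up prices and share counts (all tickers, numbers and the names `shares_outstanding` and `mcap_demo` are hypothetical, not downloaded data):

```python
import pandas as pd

# Hypothetical closing prices (illustration only)
prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 121.0],
     "BBB": [50.0, 50.0, 45.0],
     "CCC": [50.0, 40.0, 44.0]},
    index=pd.date_range("2024-01-01", periods=3, freq="D"),
)

# Hypothetical share counts, assumed constant over the whole period
shares_outstanding = pd.Series({"AAA": 1_000, "BBB": 5_000, "CCC": 2_000})

# Multiply each price column by the matching share count (aligned on ticker)
mcap_demo = prices.mul(shares_outstanding, axis="columns")
print(mcap_demo)
```

This is a vectorized alternative to filling a DataFrame symbol by symbol; it only works once all share counts have been collected in one Series.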

In [None]:
mcap = close[const].copy() # dummy df to insert mcaps
mcap

In [None]:
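# yfinance (run EITHER this cell OR the yahooquery cell below, not both: each one multiplies mcap by the share count)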
count = 1
for symbol in const:
    try:
        shares = yf.Ticker(ticker = symbol).get_info()["sharesOutstanding"]
        mcap[symbol] = mcap[symbol] * shares
        print(count, end = '\r')
        count += 1
    except Exception as e:
        print("{} not found".format(symbol))
print("Download complete.")

In [None]:
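# yahooquery (run EITHER this cell OR the yfinance cell above, not both: each one multiplies mcap by the share count)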
count = 1
for symbol in const:
    try:
        shares = Ticker(symbols = symbol).key_stats[symbol]["sharesOutstanding"]
        mcap[symbol] = mcap[symbol] * shares
        print(count, end = '\r')
        count += 1
    except Exception as e:
        print("{} not found".format(symbol))
print("Download complete.")

In [None]:
mcap.iloc[-1].sort_values(ascending = False)

In [None]:
plt.figure(figsize = (12, 8))
mcap.iloc[-1].sort_values().plot.pie()
plt.show()

## Building a Value-weighted Index (Part 2)

In [None]:
mcap

In [None]:
total_mcap = mcap.sum(axis = "columns") # total market cap
total_mcap

In [None]:
total_mcap.plot(figsize = (12, 8))
plt.title("Total Market Cap DJIA")
plt.show()

__Weights over time__

In [None]:
weights_VWI = mcap.div(total_mcap, axis = "rows")
weights_VWI

In [None]:
weights_VWI.sum(axis = "columns")

In [None]:
weights_VWI.plot(figsize = (15, 8), fontsize = 13)
plt.title("VWI - Weights", fontsize = 15)
plt.legend(fontsize = 13)
plt.show()

In [None]:
returns # daily return

In [None]:
weights_VWI # weights at the end of the day

In [None]:
mcwr = returns[const].mul(weights_VWI.shift()).sum(axis = "columns")
mcwr # simple returns vwi

In [None]:
vwi = mcwr.add(1).cumprod().mul(100)
vwi

In [None]:
vwi.name = "vwi"
vwi
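
The role of the lagged weights can be checked on a small example. This is a minimal sketch on made-up prices and share counts (all names and numbers are hypothetical); with constant share counts, the value-weighted index should coincide with total market cap rebased to 100:

```python
import pandas as pd

# Hypothetical prices and constant share counts (illustration only)
prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 121.0],
     "BBB": [50.0, 50.0, 45.0],
     "CCC": [50.0, 40.0, 44.0]},
    index=pd.date_range("2024-01-01", periods=3, freq="D"),
)
shares_outstanding = pd.Series({"AAA": 1_000, "BBB": 5_000, "CCC": 2_000})

mcap_demo = prices.mul(shares_outstanding, axis="columns")
total = mcap_demo.sum(axis=1)
weights = mcap_demo.div(total, axis="rows")      # end-of-day weights

rets = prices.pct_change()
# Today's return is earned on yesterday's weights, hence shift()
vw_ret = rets.mul(weights.shift()).sum(axis=1)
vwi_demo = vw_ret.add(1).cumprod().mul(100)

# Cross-check: total market cap rebased to 100
check = total.div(total.iloc[0]).mul(100)
print(pd.concat([vwi_demo, check], axis=1, keys=["vwi_demo", "check"]))
```

The two columns agree because market-cap weights drift with prices on their own, so no trades are needed to stay value-weighted as long as share counts do not change.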

## Analysis and Comparison (Part 1)

In [None]:
indices = pd.concat([vwi, pwi, ewi], axis = 1).iloc[:-1]
indices

In [None]:
indices.plot(figsize = (12, 8))
plt.title("VWI vs. PWI vs. EWI", fontsize = 15)
plt.show()

__Keep in mind:__ While VWI and PWI are (mostly) __self-rebalancing__, EWI requires/assumes __daily rebalancing__, which causes __Trading Costs__!
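
To get a feeling for how much trading daily rebalancing implies, the following sketch computes a one-way turnover figure for an equal-weighted portfolio. The returns are made up, and the turnover definition used here (half the sum of absolute weight deviations) is one common convention, chosen as an assumption for illustration:

```python
import pandas as pd

# Hypothetical daily simple returns for three made-up tickers
rets = pd.DataFrame(
    {"AAA": [0.10, 0.10],
     "BBB": [0.00, -0.10],
     "CCC": [-0.20, 0.10]},
    index=pd.date_range("2024-01-02", periods=2, freq="D"),
)
n = rets.shape[1]

# Start each day at 1/n per stock; after the day's returns the weights drift to
# (1 + r_i) / sum_j (1 + r_j)
growth = rets.add(1)
drifted = growth.div(growth.sum(axis=1), axis="rows")

# One-way turnover needed to trade back to 1/n at the end of each day
turnover = drifted.sub(1 / n).abs().sum(axis=1).div(2)
print(turnover)
```

Multiplying the turnover by a per-trade cost rate would give a rough estimate of the daily cost drag; the rate itself depends on the broker and market and is not part of this notebook.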

## Analysis and Comparison (Part 2)

In [None]:
indices

In [None]:
close

In [None]:
prices_m = pd.concat([close[const], indices], axis = 1)
prices_m

In [None]:
returns_m = prices_m.pct_change().dropna() # simple returns
returns_m

In [None]:
def ann_risk_return(returns_df): # assumes simple returns as input
    summary = pd.DataFrame(index = returns_df.columns)
    summary["ann. Risk"] = returns_df.std() * np.sqrt(252)
    log_returns = np.log(returns_df + 1)
    summary["CAGR"] = np.exp(log_returns.mean() * 252) - 1
    return summary
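
The CAGR line in the function above works through log returns. A minimal sketch, using made-up daily returns, to check that this is the same as annualizing the compounded (geometric) growth directly (the names `cagr_log` and `cagr_geo` are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical daily simple returns (illustration only)
r = pd.Series([0.01, -0.02, 0.015, 0.005])

# Approach used in ann_risk_return: mean log return, scaled to 252 days
cagr_log = np.exp(np.log(r + 1).mean() * 252) - 1

# Direct approach: total growth factor raised to (252 / number of days)
cagr_geo = (r + 1).prod() ** (252 / len(r)) - 1

print(cagr_log, cagr_geo)
```

Both numbers should agree up to floating-point error, since the mean of the log returns is the log of the geometric mean growth factor.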

In [None]:
summary = ann_risk_return(returns_m)
summary

In [None]:
summary.plot(kind = "scatter", x = "ann. Risk", y = "CAGR", figsize = (15,12), s = 50, fontsize = 15)
for i in summary.index:
    plt.annotate(i, xy=(summary.loc[i, "ann. Risk"]+0.005, summary.loc[i, "CAGR"]+0.005), size = 10)
plt.grid()
plt.xlabel("ann. Risk (std)", fontsize = 15)
plt.ylabel("CAGR", fontsize = 15)
plt.title("Risk-Return Analysis", fontsize = 20)
plt.show()

- All three indices benefit from the __Portfolio Diversification Effect__!
- __Concentrated Positions__ negatively affect __VWI__
- PWI and EWI are close together (before Trading Costs)
- But: EWI requires __daily rebalancing__! (Trading Costs)
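
The diversification effect mentioned in the first bullet can be illustrated with simulated data. This is a minimal sketch, assuming independent, normally distributed daily returns for ten hypothetical stocks (real constituents are correlated, so the effect is weaker in practice):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Simulated daily returns for 10 hypothetical stocks (illustration only)
rets = pd.DataFrame(
    rng.normal(loc=0.0005, scale=0.02, size=(1000, 10)),
    columns=[f"S{i}" for i in range(10)],
)

# Average annualized risk of the individual stocks
avg_std = rets.std().mean() * np.sqrt(252)

# Annualized risk of the equal-weighted portfolio of the same stocks
port_std = rets.mean(axis=1).std() * np.sqrt(252)

print(f"average single-stock risk: {avg_std:.3f}")
print(f"equal-weighted portfolio risk: {port_std:.3f}")
```

The portfolio risk comes out well below the average single-stock risk, which is why the index points sit to the left of most constituents in the risk-return scatter plot.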

## The DJIA Total Return Index

In [None]:
weights_PWI # based on Close Prices

In [None]:
total_returns = df["Adj Close"].pct_change() # Adj Close Prices!
total_returns

In [None]:
returns_tr = total_returns[const].mul(weights_PWI.shift()).sum(axis = "columns")
returns_tr # simple returns DJI Total Return Index

In [None]:
dji_tr = returns_tr.add(1).cumprod().mul(100)
dji_tr
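
The difference between the price index and the total return index comes only from the return series, not from the weights. A minimal sketch on made-up data, assuming `AAA` pays a dividend of 2 on day 2 and that adjusted prices are back-adjusted for it (all tickers, numbers and names are hypothetical):

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=3, freq="D")

# Hypothetical close prices; AAA goes ex-dividend (dividend = 2) on day 2
close_demo = pd.DataFrame({"AAA": [100.0, 99.0, 101.0],
                           "BBB": [50.0, 51.0, 52.0]}, index=idx)

# Hypothetical adjusted close: AAA's pre-dividend price scaled by (1 - 2/100)
adj_demo = pd.DataFrame({"AAA": [98.0, 99.0, 101.0],
                         "BBB": [50.0, 51.0, 52.0]}, index=idx)

# Price weights are based on close prices in both cases
w = close_demo.div(close_demo.sum(axis=1), axis="rows")

price_ret = close_demo.pct_change().mul(w.shift()).sum(axis=1)
total_ret = adj_demo.pct_change().mul(w.shift()).sum(axis=1)

price_index = price_ret.add(1).cumprod().mul(100)
tr_index = total_ret.add(1).cumprod().mul(100)
print(pd.concat([price_index, tr_index], axis=1, keys=["price", "total return"]))
```

The total return index ends above the price index because the dividend shows up in the adjusted-price return on the ex-dividend day.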

In [None]:
returns_m["DJI_TR"] = returns_tr
returns_m

In [None]:
summary = ann_risk_return(returns_m)
summary

In [None]:
summary.plot(kind = "scatter", x = "ann. Risk", y = "CAGR", figsize = (15,12), s = 50, fontsize = 15)
for i in summary.index:
    plt.annotate(i, xy=(summary.loc[i, "ann. Risk"]+0.005, summary.loc[i, "CAGR"]+0.005), size = 10)
plt.grid()
plt.xlabel("ann. Risk (std)", fontsize = 15)
plt.ylabel("CAGR", fontsize = 15)
plt.title("Risk-Return Analysis", fontsize = 20)
plt.show()