# Portfolio Allocation Using Hierarchical Risk Parity ([Explained Here](https://www.youtube.com/watch?v=9MSPeAYBYIY))
HRP portfolios tackle three primary issues commonly associated with quadratic optimizers, especially in the context of Markowitz's Critical Line Algorithm (CLA). These issues are instability, concentration, and underperformance. Monte Carlo simulations demonstrate that HRP achieves reduced out-of-sample variance compared to CLA, despite CLA's primary focus on minimizing variance. Furthermore, HRP generates portfolios with lower out-of-sample risk when compared to conventional risk parity techniques.



You have two options. You can manually create a CSV file with daily returns data from [NASDAQ Historical Data](https://www.nasdaq.com/market-activity/quotes/historical) (this is tedious), or you can use the first block of code to request monthly returns data from an API. Manually creating the CSV file is the better option because it gives you 10 years of daily data, while the API only provides a couple of years of monthly data.



# API Monthly Returns:
1. Run the first block of code. You'll have to enter 10 stocks. **Record the order you enter the stocks.** The function will gather monthly returns data from the API. The API can only handle 5 requests per minute, so the function will take about 2 minutes to run. If you pay me I can cop a better subscription 🙏
2. After a dataframe with monthly returns data is received, run the third block of code (HRP). The weights (percentages in decimal format) will be printed.
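
For reference, a simple monthly return is (current price - previous price) / previous price, with prices ordered oldest to newest. Here is a minimal sketch of that arithmetic using pandas `pct_change`. The dates and prices are made up for illustration and are not API output.

```python
import pandas as pd

# Hypothetical month-end closing prices, oldest first (illustrative values only)
prices = pd.Series(
    [100.0, 110.0, 99.0],
    index=pd.to_datetime(["2023-01-31", "2023-02-28", "2023-03-31"]),
)

# pct_change computes (current - previous) / previous for each row.
# The first row has no previous price, so it comes out as NaN and is dropped.
monthly_returns = prices.sort_index().pct_change().dropna()
print(monthly_returns)
```

With these numbers the two returns are roughly +10% and -10%.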

In [None]:
import pandas as pd
import requests
import time

# Insert Personal Key
apiKey = ""
symbols = []

def portfolio_input():
  for i in range(1, 11):
     stock = input(f"Enter Stock {i}: ")
     symbols.append(stock)
portfolio_input()

def calculate_monthly_returns(symbol):
    url = f'https://www.alphavantage.co/query?function=TIME_SERIES_MONTHLY&symbol={symbol}&apikey={apiKey}'

    r = requests.get(url)
    data = r.json()

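    series = data.get("Monthly Time Series")
    if series is None:
        # No price data came back (e.g. invalid symbol or API rate limit reached)
        print(f"No monthly data returned for {symbol}")
        return pd.DataFrame()

    # Closing prices in the order the API returns them (newest month first)
    closing_prices = [float(series[date]["4. close"]) for date in series]

    monthly_returns = []

    for i in range(len(closing_prices) - 1):
        current_price = closing_prices[i]
        previous_price = closing_prices[i + 1]  # the month before, since the list is newest-first
        monthly_return = (current_price - previous_price) / previous_price
        monthly_returns.append(monthly_return)

    # Label the column with the stock's position in the order it was entered
    df = pd.DataFrame({len(dataframes) + 1: monthly_returns})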

    return df

dataframes = []
for symbol in symbols:
    df = calculate_monthly_returns(symbol)
    if df.empty:
        break
    dataframes.append(df)
    if len(dataframes) % 5 == 0:
        print("Waiting for 62 seconds...")
        time.sleep(62)

if dataframes:
    merged_df = pd.concat(dataframes, axis=1)
    merged_df.dropna(inplace=True)

    csv_filename = 'monthly_returns_official.csv'
    merged_df.to_csv(csv_filename, index=True, header=False)
    print(f'Data saved as {csv_filename}')
    print(merged_df)
else:
    print("No data available for all stocks.")


Enter Stock 1: aapl
Enter Stock 2: msft
Enter Stock 3: tsla
Enter Stock 4: intc
Enter Stock 5: googl
Enter Stock 6: amzn
Enter Stock 7: lulu
Enter Stock 8: gme
Enter Stock 9: pwsc
Enter Stock 10: meta
Waiting for 62 seconds...
Waiting for 62 seconds...
Data saved as monthly_returns_official.csv
          1         2         3         4          5          6         7   \
0   0.078411 -0.024698 -0.048728 -0.092224  -0.003950  -0.047221 -0.019822   
1   0.045670  0.024896  0.036229  0.017928  -0.025336  -0.031375 -0.007160   
2  -0.012624  0.013753 -0.021164 -0.065138  -0.098101  -0.024835 -0.000079   
3  -0.086199 -0.035679 -0.220957 -0.059809   0.026483  -0.075023 -0.123038   
4  -0.042708 -0.064344 -0.194282 -0.012087  -0.126394  -0.125477  0.144609   
5  -0.028171 -0.061707  0.262613  0.051835  -0.033631  -0.020484 -0.041429   
6  -0.106064 -0.134860 -0.008435 -0.236915  -0.131784  -0.087714 -0.150993   
7  -0.021165 -0.006455 -0.157941  0.133574   0.097491   0.094450 -0.007503   
8 

# Merger
**The following code merges the ten CSV files you downloaded from NASDAQ into one CSV file for the HRP program.**

Put all ten CSV files you downloaded from NASDAQ into one folder and set the path to it here:


```
csv_directory = 'returnsFiles'
```
Then specify where you want the merged file to be saved:
```
merged_df.to_csv('output/diverseReturns.csv')
```
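
One thing to keep in mind: the merge cell in this section joins the files with `how='outer'`, which keeps every date that appears in any file and leaves `NaN` where a stock has no data for that date (for example, before it was listed). Here is a small sketch of the difference between an outer and an inner merge. The two frames and their column names are made up for illustration.

```python
import pandas as pd

# Two hypothetical return series with different start dates (illustrative values)
a = pd.DataFrame({"Date": ["2020-01-02", "2020-01-03", "2020-01-06"],
                  "Daily_Return_A": [0.01, -0.02, 0.005]})
b = pd.DataFrame({"Date": ["2020-01-03", "2020-01-06"],
                  "Daily_Return_B": [0.03, -0.01]})

outer = pd.merge(a, b, on="Date", how="outer")  # all 3 dates, NaN for B on the first one
inner = pd.merge(a, b, on="Date", how="inner")  # only the 2 dates both stocks share

print(outer)
print(inner)

# Fraction of missing values per column, handy for spotting stocks with short histories
print(outer.isna().mean())
```

If a stock in your portfolio has a much shorter history than the others, its column will be mostly `NaN` in the merged file, so it is worth checking before running HRP.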



In [None]:
import pandas as pd
import os

merged_df = pd.DataFrame()

csv_directory = '/content/seventyFivePercent'

for filename in os.listdir(csv_directory):
    if filename.endswith(".csv"):
        stock_data = pd.read_csv(os.path.join(csv_directory, filename))

        stock_name = filename.split('.')[0]
        # Note: this is the open-to-close return within each day
        stock_data['Daily_Return_' + stock_name] = (stock_data['Close'] - stock_data['Open']) / stock_data['Open']

        if merged_df.empty:
            merged_df = stock_data[['Date', 'Daily_Return_' + stock_name]]
        else:
            merged_df = pd.merge(merged_df, stock_data[['Date', 'Daily_Return_' + stock_name]], on='Date', how='outer')

merged_df['Date'] = pd.to_datetime(merged_df['Date'])
merged_df.set_index('Date', inplace=True)
merged_df.sort_index(inplace=True)  # make sure rows are in chronological order
merged_df.to_csv('seventyFivePercentMerged')

print(merged_df.head())


            Daily_Return_HCLTECH  Daily_Return_EWT  Daily_Return_DAL  \
Date                                                                   
2002-08-12              0.046230         -0.017977               NaN   
2002-08-13             -0.038144          0.001143               NaN   
2002-08-14             -0.009952          0.028603               NaN   
2002-08-15              0.000000          0.034141               NaN   
2002-08-16             -0.000794          0.006487               NaN   

            Daily_Return_INFY  Daily_Return_GD  Daily_Return_LULU  \
Date                                                                
2002-08-12                0.0         0.012070                NaN   
2002-08-13                0.0        -0.041843                NaN   
2002-08-14                0.0        -0.014295                NaN   
2002-08-15                0.0        -0.031726                NaN   
2002-08-16                0.0         0.031878                NaN   

           

# Hierarchical Risk Parity

Make sure you replace the path:


```
csv_path = '/content/sample_data/diverseReturns.csv'
```

with the correct relative path to your historical returns data.





In [None]:
import matplotlib.pyplot as mpl
import scipy.cluster.hierarchy as sch
import random
import numpy as np
import pandas as pd

def getIVP(cov, **kargs):
    # Compute the inverse-variance portfolio
    ivp = 1. / np.diag(cov)
    ivp /= ivp.sum()
    return ivp

def getClusterVar(cov, cItems):
    # Compute variance per cluster
    cov_ = cov.loc[cItems, cItems] # matrix slice
    w_ = getIVP(cov_).reshape(-1, 1)
    cVar = np.dot(np.dot(w_.T, cov_), w_)[0, 0]
    return cVar

def getQuasiDiag(link):
    # Sort clustered items by distance
    link = link.astype(int)
    sortIx = pd.Series([link[-1, 0], link[-1, 1]])
    numItems = link[-1, 3] # number of original items
    while sortIx.max() >= numItems:
        sortIx.index = range(0, sortIx.shape[0] * 2, 2) # make space
        df0 = sortIx[sortIx >= numItems]
        i = df0.index
        j = df0.values - numItems
        sortIx[i] = link[j, 0] # item 1
        df0 = pd.Series(link[j, 1], index=i + 1)
        sortIx = pd.concat([sortIx, df0]) # item 2 (modified to use concat)
        sortIx = sortIx.sort_index() # re-sort
        sortIx.index = range(sortIx.shape[0]) # re-index
    return sortIx.tolist()


def getRecBipart(cov, sortIx):
    # Compute HRP allocation
    w = pd.Series(1.0, index=sortIx) # start as floats so fractional weights can be stored
    cItems = [sortIx] # initialize all items in one cluster
    while len(cItems) > 0:
        cItems = [i[j:k] for i in cItems for j, k in ((0, len(i) // 2), (len(i) // 2, len(i))) if len(i) > 1]
        for i in range(0, len(cItems), 2): # parse in pairs
            cItems0 = cItems[i] # cluster 1
            cItems1 = cItems[i + 1] # cluster 2
            cVar0 = getClusterVar(cov, cItems0)
            cVar1 = getClusterVar(cov, cItems1)
            alpha = 1 - cVar0 / (cVar0 + cVar1)
            w[cItems0] *= alpha # weight 1
            w[cItems1] *= 1 - alpha # weight 2
    return w

def correlDist(corr):
    # Compute the correlation distance
    dist = ((1 - corr) / 2.)**.5 # distance matrix
    return dist

def plotCorrMatrix(path, corr, labels=None):
    # Heatmap of the correlation matrix
    if labels is None:
        labels = []
    mpl.pcolor(corr)
    mpl.colorbar()
    mpl.yticks(np.arange(.5, corr.shape[0] + .5), labels)
    mpl.xticks(np.arange(.5, corr.shape[0] + .5), labels)
    mpl.savefig(path)
    mpl.clf()
    mpl.close() # reset pylab
    return

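def generateDataFromCSV(csv_path):
    # Read the returns from the CSV file, using the first column (dates) as the index
    # so that only the numeric return columns go into the covariance and correlation
    data = pd.read_csv(csv_path, index_col=0, parse_dates=True)

    return data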

def main():
    nObs, size0, size1, sigma1 = 10000, 5, 5, .25

    # Provide the path to your CSV file
    csv_path = '/content/seventyFivePercentMerged'

    # Call the modified generateData function
    data = generateDataFromCSV(csv_path)
    # Random column pairs, printed for reference only; not used in the allocation below
    cols = [random.randint(0, size0 - 1) for i in range(size1)]
    print([(j + 1, size0 + i) for i, j in enumerate(cols, 1)])
    # 2) Compute and plot correl matrix
    cov, corr = data.cov(), data.corr()
    plotCorrMatrix('HRP3_corr0.png', corr, labels=corr.columns)
    # 3) Cluster
    dist = correlDist(corr)
    link = sch.linkage(dist, 'single')
    sortIx = getQuasiDiag(link)
    sortIx = corr.index[sortIx].tolist() # recover labels
    df0 = corr.loc[sortIx, sortIx] # reorder
    plotCorrMatrix('HRP3_corr1.png', df0, labels=df0.columns)
    # 4) Capital allocation
    hrp = getRecBipart(cov, sortIx)

    print(hrp)
    return hrp

if __name__ == '__main__':
    main()


[(2, 6), (4, 7), (5, 8), (1, 9), (2, 10)]



Daily_Return_HCLTECH    0.050082
Daily_Return_INFY       0.068557
Daily_Return_RMV        0.073394
Daily_Return_GD         0.087791
Daily_Return_DAL        0.023585
Daily_Return_UPS        0.110376
Daily_Return_V (1)      0.067870
Daily_Return_EWZ        0.031774
Daily_Return_MSFT       0.029497
Daily_Return_ESGV       0.075575
Daily_Return_EWT        0.066235
Daily_Return_IEMG       0.256164
Daily_Return_FL         0.028675
Daily_Return_LULU       0.015289
Daily_Return_UAA        0.015136
dtype: float64
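
To make the allocation step concrete: at each split, `getRecBipart` gives the first cluster the share `alpha = 1 - cVar0 / (cVar0 + cVar1)` and the second cluster the rest, so the lower-variance cluster receives more weight. Here is a minimal sketch of the simplest case, two clusters with one asset each. The variances are made up for illustration.

```python
# Hypothetical variances for two single-asset clusters (illustrative values)
var0, var1 = 0.04, 0.01

# Same split rule as in getRecBipart: the riskier cluster gets the smaller share
alpha = 1 - var0 / (var0 + var1)
w0, w1 = alpha, 1 - alpha

print(w0, w1)
```

Here the first asset has four times the variance of the second, so it ends up with roughly 20% of the weight and the second with roughly 80%.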


Use the [backtester](https://www.portfoliovisualizer.com/backtest-portfolio#analysisResults) to test performance.
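
The HRP weights come out in decimal format, so you will probably want them as percentages before typing them into the backtester. Here is a minimal sketch, assuming the tool expects allocations that total 100. The tickers and weights are made up for illustration.

```python
import pandas as pd

# Hypothetical HRP weights in decimal form (illustrative values that sum to 1)
weights = pd.Series({"AAA": 0.12345, "BBB": 0.54321, "CCC": 0.33334})

# Convert to percentages rounded to two decimals
pct = (weights * 100).round(2)

# Rounding can leave the total slightly off 100,
# so push any leftover onto the largest holding
pct[pct.idxmax()] += round(100 - pct.sum(), 2)
pct = pct.round(2)

print(pct)
print(pct.sum())
```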