# 0. Introduction

In this notebook, I show how to use principal component analysis (PCA) to analyze a portfolio of stocks.
PCA reduces the number of dimensions in a data set by re-expressing it along new, uncorrelated axes, the principal components.
It keeps the components that explain the most variance in the data.
Applied to a portfolio's returns, PCA isolates the statistical drivers of those returns, which can then be used to hedge risk.
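
To make the idea concrete before touching market data, here is a minimal sketch on synthetic returns. Everything in it is an assumption made for illustration: the two hidden `drivers`, the `loadings` matrix, and the noise level are invented and are not taken from the portfolio used later in this notebook.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_days = 500

# Two hypothetical hidden drivers of daily returns (illustrative only)
drivers = rng.normal(scale=0.01, size=(n_days, 2))

# Hypothetical sensitivities of six assets to the two drivers:
# the first three assets follow driver 0, the last three follow driver 1
loadings = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [1.1, -0.1],
    [0.0, 1.0],
    [0.1, 0.9],
    [-0.1, 1.1],
])

# Asset returns = driver returns mapped through the loadings, plus small noise
noise = rng.normal(scale=0.001, size=(n_days, 6))
toy_returns = drivers @ loadings.T + noise

# Two components should recover almost all of the variance of the six series
toy_pca = PCA(n_components=2).fit(toy_returns)
toy_explained = toy_pca.explained_variance_ratio_
print(toy_explained.round(3))
```

Because the six series were built from only two drivers, two components account for nearly all of the variance. Real returns are noisier, so the split is rarely this clean.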

In [None]:
# resources:
# https://pyquantnews.com/how-to-isolate-alpha-with-analysis/

In [None]:
# import libraries
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sn
import yfinance as yf
from sklearn.decomposition import PCA

# plot style
sn.set_style('whitegrid')


# 1. Get the data

In [None]:
symbols = [
    'IBM',
    'MSFT',
    'META',
    'INTC',
    'NEM',
    'AU',
    'AEM',
    'GFI'
]
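# auto_adjust=False keeps the 'Adj Close' column; recent yfinance versions adjust prices by default and omit it
data = yf.download(symbols, start="2020-01-01", end="2022-11-30", auto_adjust=False)
# daily simple returns; the first row has no previous price and is dropped
portfolio_returns = data['Adj Close'].pct_change().dropna()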

# 2. Fit a PCA model

In [None]:
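# keep the three components that explain the most variance
pca = PCA(n_components=3)
pca.fit(portfolio_returns)
# share of total variance explained by each component
pct = pca.explained_variance_ratio_
# component loadings, one row per component and one column per stock
pca_components = pca.components_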
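
One way to read a fitted model is to project the returns onto the components, which gives one return series per statistical driver. The sketch below uses random data so that it runs without a download; `demo_returns`, `demo_pca`, and the `f1`..`f3` labels are hypothetical names introduced here, and the same calls should apply to `portfolio_returns` and `pca` above.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)

# Stand-in for a returns table: 250 days x 8 assets (random, illustrative only)
demo_returns = pd.DataFrame(
    rng.normal(scale=0.01, size=(250, 8)),
    columns=[f"asset_{i}" for i in range(8)],
)

demo_pca = PCA(n_components=3).fit(demo_returns)
factor_names = ["f1", "f2", "f3"]

# Daily return of each statistical factor: returns projected onto the components
factor_returns = pd.DataFrame(
    demo_pca.transform(demo_returns),
    index=demo_returns.index,
    columns=factor_names,
)

# Exposure of each asset to each factor: the transposed component matrix
factor_exposures = pd.DataFrame(
    demo_pca.components_.T,
    index=demo_returns.columns,
    columns=factor_names,
)

# transform() is the mean-centered returns multiplied by the transposed components
manual = (demo_returns.to_numpy() - demo_pca.mean_) @ demo_pca.components_.T
print(np.allclose(factor_returns.to_numpy(), manual))
```

The exposures table shows how strongly each asset loads on each factor, which is the information a hedge against a given factor would start from.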

# 3. Visualize the explained variance

In [None]:
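# cumulative share of variance explained by the first k components
cum_pct = np.cumsum(pct)
# component numbers 1..n for the x-axis
x = np.arange(1, len(pct) + 1)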

plt.subplot(1, 2, 1)
plt.bar(x, pct * 100, align="center")
plt.title('Contribution (%)')
plt.xlabel('Component')
plt.xticks(x)
plt.xlim([0, 4])
plt.ylim([0, 100])

plt.subplot(1, 2, 2)
plt.plot(x, cum_pct * 100, 'ro-')
plt.title('Cumulative contribution (%)')
plt.xlabel('Component')
plt.xticks(x)
plt.xlim([0, 4])
plt.ylim([0, 100])
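
The cumulative curve can also be read programmatically, for example to find how many components are needed to reach a chosen share of the variance. The sketch below is only an illustration: the ratios in `demo_pct` and the 85% `threshold` are made-up values, not results from the portfolio above.

```python
import numpy as np

# Hypothetical explained-variance ratios, sorted from largest to smallest
demo_pct = np.array([0.55, 0.25, 0.10, 0.06, 0.04])
demo_cum_pct = np.cumsum(demo_pct)

# Illustrative target: explain at least 85% of the variance
threshold = 0.85

# Index of the first cumulative value at or above the threshold, plus one
n_needed = int(np.argmax(demo_cum_pct >= threshold)) + 1
print(n_needed)
```

With the real model, `pct` covers only the three fitted components, so a threshold above their combined share would require refitting with more components.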