<h1 style="text-align: center;">Portfolio Evaluation of the Nasdaq-100 Stock Index using the Markowitz Portfolio Theory</h1>

  <h2>Introduction</h2>
  <p>The Nasdaq-100 is an index maintained by Nasdaq, whose name originated as an acronym for National Association of Securities Dealers Automated Quotations. As stated on the official website, the index comprises "100 of the largest and most innovative non-financial companies listed on the Nasdaq Stock Market based on market capitalization."</p>
  <p>The Markowitz Portfolio Theory, in turn, is the work of Harry Markowitz, an American economist who developed a mathematical method for determining the optimal portfolio composition through mean-variance (quadratic) optimization.</p>
  <p>Here, we will conduct a brief analysis of the Nasdaq-100 index using the Markowitz Theory.</p>

  <h2>Markowitz Portfolio Theory</h2>
  <p>The Markowitz Portfolio Theory, introduced by Harry Markowitz in 1952, gave investors a systematic way to balance risk and return. It treats diversification as the main tool for reducing portfolio risk: combining assets with low correlations limits the damage that any single poorly performing investment can do. Markowitz introduced the efficient frontier, the set of portfolios that maximize expected return for a given level of risk or, equivalently, minimize risk for a given level of expected return.</p>
  <p>Later extensions of the theory add a risk-free asset, which lets investors build portfolios along the Capital Market Line, the line drawn from the risk-free rate tangent to the efficient frontier. Each point on this line combines the risk-free asset with a single risky portfolio, in proportions that depend on the investor's risk preferences. This work laid the foundation for modern portfolio management.</p>
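  <p>As a minimal sketch of the risk-minimizing end of the efficient frontier, the global minimum-variance portfolio has the closed form <var>w</var> = Σ<sup>-1</sup>1 / (1<sup>T</sup>Σ<sup>-1</sup>1), where Σ is the covariance matrix. The function name <code>min_variance_weights</code> and the two-asset covariance matrix below are assumptions made for illustration; the formula allows short positions and requires an invertible covariance matrix.</p>

```python
import numpy as np

def min_variance_weights(cov):
    """Closed-form global minimum-variance weights (short selling allowed)."""
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    # Solve cov @ x = 1 instead of forming the inverse explicitly
    x = np.linalg.solve(cov, ones)
    # Rescale so that the weights sum to one
    return x / x.sum()

# Hypothetical covariance matrix of two uncorrelated assets, for illustration only
toy_cov = np.array([[0.04, 0.00],
                    [0.00, 0.01]])
toy_weights = min_variance_weights(toy_cov)
toy_variance = toy_weights @ toy_cov @ toy_weights
print(toy_weights, toy_variance)
```

  <p>In this toy case the less volatile asset receives the larger weight (0.2 and 0.8), and the portfolio variance (0.008) is lower than that of either asset alone.</p>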
  <h2>Nasdaq-100</h2>
  <p>The Nasdaq-100 is a stock market index that tracks the performance of 100 of the largest non-financial companies listed on the Nasdaq Stock Market. It is known for its technology-heavy composition and includes prominent companies such as Apple, Amazon, Microsoft, and Google's parent company, Alphabet. Because it is a modified market-capitalization-weighted index, larger companies have a greater impact on its value. The index focuses on innovative, high-growth sectors such as technology, biotechnology, and telecommunications, and it serves as a benchmark for the technology and growth-oriented segments of the stock market. Investors often gain exposure to it through financial products such as exchange-traded funds (ETFs) that track the index.</p>
  <p>To obtain the portfolio composition of the Nasdaq-100, I use the holdings information of the <em>iShares NASDAQ-100 ETF.</em> The data is saved in the file "EXXT_holdings.csv".</p>

In [1]:
%%capture
import pandas as pd
import yfinance as yf
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

datosNasdaq = pd.read_csv("EXXT_holdings.csv")
# Keep only the tickers whose weight is different from 0%
datosNasdaq = datosNasdaq[datosNasdaq["Weight (%)"] != 0]


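<p>The tickers in the holdings file are the following. Note that an ETF holdings file can also contain non-equity rows such as cash positions (here, 'USD' and 'MLIFT' do not look like company tickers), which would ideally be filtered out before downloading prices.</p>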

In [2]:
# Store the ticker symbols of the holdings in Fund_Holding
Fund_Holding = datosNasdaq['Ticker']
print(list(Fund_Holding))

['AAPL', 'MSFT', 'AMZN', 'NVDA', 'META', 'GOOGL', 'GOOG', 'AVGO', 'TSLA', 'ADBE', 'COST', 'PEP', 'CSCO', 'NFLX', 'CMCSA', 'AMD', 'TMUS', 'AMGN', 'INTC', 'INTU', 'TXN', 'QCOM', 'HON', 'AMAT', 'SBUX', 'ADP', 'BKNG', 'GILD', 'ISRG', 'VRTX', 'MDLZ', 'REGN', 'ADI', 'LRCX', 'PANW', 'MU', 'SNPS', 'PDD', 'CDNS', 'CHTR', 'KLAC', 'CSX', 'PYPL', 'MELI', 'MAR', 'ORLY', 'MNST', 'ASML', 'CTAS', 'ABNB', 'LULU', 'NXPI', 'FTNT', 'WDAY', 'ADSK', 'ODFL', 'PCAR', 'MRVL', 'PAYX', 'CPRT', 'MCHP', 'CRWD', 'SGEN', 'KDP', 'ROST', 'EXC', 'KHC', 'AZN', 'AEP', 'BIIB', 'ON', 'CEG', 'IDXX', 'BKR', 'EA', 'VRSK', 'CTSH', 'DXCM', 'TTD', 'FAST', 'XEL', 'MRNA', 'CSGP', 'FANG', 'GFS', 'GEHC', 'TEAM', 'DDOG', 'WBD', 'DLTR', 'ANSS', 'ZS', 'EBAY', 'ALGN', 'ILMN', 'USD', 'WBA', 'SIRI', 'ZM', 'ENPH', 'JD', 'LCID', 'MLIFT']


<h2>Methodology</h2> 
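<p>First, the price data is downloaded from Yahoo Finance and used to calculate the daily logarithmic return of each holding with the following equation:</p>
<p style= "text-align: center;"><var>r<sub>t</sub> = ln(P<sub>t</sub> / P<sub>t-1</sub>)</var></p>
<p>where <var>P<sub>t</sub></var> is the adjusted closing price on day <var>t</var>.</p>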
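<p>As a small worked example of this formula, the sketch below computes log returns for a hypothetical three-day price series (the prices are made up for illustration). It also shows a convenient property: log returns add up across periods to the log return of the whole span.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices, for illustration only
toy_prices = pd.Series([100.0, 110.0, 99.0])
# ln(P_t) - ln(P_{t-1}) equals ln(P_t / P_{t-1}); the first value is NaN and is dropped
toy_log_ret = np.log(toy_prices).diff().dropna()
# The sum of the daily log returns equals ln(last price / first price)
toy_total = toy_log_ret.sum()
print(toy_log_ret.tolist(), toy_total)
```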

<p>The data will be downloaded from January 1, 2023 to October 20 of the same year.</p>


In [3]:
%%capture
start_date = "2023-01-01"
end_date = "2023-10-20"  # yfinance treats the end date as exclusive
# DataFrame that stores the logarithmic returns
ret_log_df = pd.DataFrame()
for i in Fund_Holding:
    descarga = yf.download(i, start=start_date, end=end_date)
    # Skip tickers for which no data could be downloaded
    if not descarga.empty:
        ret_log = np.log(descarga['Adj Close']).diff()
        ret_log_df[i] = ret_log

# Drop the rows in which every value is NaN
ret_log_df = ret_log_df.dropna(how='all')

1 Failed download:
['MLIFT']: Exception('%ticker%: No timezone found, symbol may be delisted')





<p>Next, I compute the covariance matrix of the log returns.</p>

In [4]:
cov_matrix = ret_log_df.cov()

<p>The Markowitz method is a quadratic optimization problem whose closed-form solution uses the inverse of the covariance matrix, so three conditions need to be verified. First, the covariance matrix must be symmetric. Second, it must be invertible. Third, its eigenvalues must all be greater than zero, which makes it positive definite. For a symmetric matrix the third condition already implies the second, so the separate invertibility check mainly serves as a numerical sanity check.</p>
<p>Symmetry check:</p>

In [5]:
es_simetrica = np.allclose(cov_matrix, cov_matrix.T)
print('The matrix is symmetric: ', es_simetrica)

The matrix is symmetric:  True


<p>The matrix is symmetric, so the process can continue.</p>
<p>Eigenvalue check:</p>

In [6]:
Vpropios = np.linalg.eigvals(cov_matrix)
Vprop = Vpropios>0
print(Vprop)

[ True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True]


<p>All the eigenvalues are greater than zero.</p>
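<p>Another way to test positive definiteness, shown here only as a hedged alternative to the eigenvalue check, is to attempt a Cholesky factorization, which succeeds only for positive-definite matrices. The helper name <code>is_positive_definite</code> and the two toy matrices are assumptions made for illustration.</p>

```python
import numpy as np

def is_positive_definite(matrix):
    """Return True if a symmetric matrix admits a Cholesky factorization."""
    try:
        np.linalg.cholesky(np.asarray(matrix, dtype=float))
        return True
    except np.linalg.LinAlgError:
        # Raised when the matrix is not positive definite
        return False

# Hypothetical symmetric matrices, for illustration only
toy_pd = np.array([[2.0, 1.0], [1.0, 2.0]])      # eigenvalues 1 and 3
toy_not_pd = np.array([[1.0, 2.0], [2.0, 1.0]])  # eigenvalues -1 and 3
print(is_positive_definite(toy_pd), is_positive_definite(toy_not_pd))
```

<p>On the notebook's data the call would be <code>is_positive_definite(cov_matrix)</code>.</p>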
<p>Finally, we check whether the matrix is invertible by computing its determinant.</p>

In [None]:
det = np.linalg.det(cov_matrix)
print('determinant: ', det)

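The determinant evaluates to zero, which would ordinarily mean that the covariance matrix has no inverse. This sits uneasily with the eigenvalue check above, because a matrix whose eigenvalues are all positive is invertible. The product of roughly 100 very small eigenvalues can underflow to zero in floating-point arithmetic, so the result more likely signals a badly scaled or ill-conditioned matrix than an exactly singular one. Either way, inverting it for the Markowitz optimization would be numerically unreliable, so the optimization is not pursued further here.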
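To see how a determinant can evaluate to zero even for an invertible matrix, here is a minimal sketch on a hypothetical covariance matrix of 102 uncorrelated assets, each with a daily variance of 1e-4 (both numbers are assumptions chosen for illustration). The log-determinant from `np.linalg.slogdet` and the condition number from `np.linalg.cond` are more informative diagnostics in floating point than the raw determinant.

```python
import numpy as np

# Hypothetical covariance matrix: 102 uncorrelated assets with variance 1e-4
toy_cov = np.eye(102) * 1e-4
# The exact determinant is 1e-408, far below the smallest positive float64
toy_det = np.linalg.det(toy_cov)
# slogdet returns the sign and the natural log of |det|, which avoids underflow
toy_sign, toy_logdet = np.linalg.slogdet(toy_cov)
# Condition number: ratio of the largest to the smallest singular value
toy_cond = np.linalg.cond(toy_cov)
print(toy_det, toy_sign, toy_logdet, toy_cond)
```

The toy matrix is perfectly well conditioned (condition number 1) and trivially invertible, yet its determinant is reported as 0.0. On the notebook's data, `np.linalg.slogdet(cov_matrix)` and `np.linalg.cond(cov_matrix)` would help distinguish underflow from genuine near-singularity.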

A singular or nearly singular covariance matrix may point to redundant information or linear dependencies within the data. To explore this, we can calculate the correlation matrix and sort the values of its upper triangle in descending order.

In [None]:
corr_matrix = ret_log_df.corr()
upper_triangle = corr_matrix.where(np.triu(np.ones(corr_matrix.shape), k=1).astype(bool))
highest_correlations = upper_triangle.stack().sort_values(ascending=False)
print(highest_correlations)

The output shows some very strong correlations. For example, let's plot AMAT against LRCX and SNPS against CDNS.

In [None]:
sns.scatterplot(x="AMAT", y="LRCX", data=ret_log_df, label="AMAT vs. LRCX")
sns.scatterplot(x="SNPS", y="CDNS", data=ret_log_df, label="SNPS vs. CDNS")
sns.regplot(x="AMAT", y="LRCX", data=ret_log_df, scatter=False)
sns.regplot(x="SNPS", y="CDNS", data=ret_log_df, scatter=False)
plt.title("Correlation Graph Between AMAT versus LRCX and SNPS versus CDNS")
plt.xlabel("AMAT, SNPS")
plt.ylabel("LRCX, CDNS")
plt.legend()
plt.show()

There are clearly some very strong correlations, but how many pairs have a correlation of at least 0.6?

In [None]:
strong_correlations=highest_correlations[highest_correlations>=0.6]
print(strong_correlations)
print(len(strong_correlations))

The output shows 137 pairs with a correlation of at least 0.6 out of 5151 pairs in total. This indicates a substantial degree of co-movement among the holdings, which is consistent with the numerical difficulties found in the covariance matrix, although strong correlations alone do not make a matrix exactly singular. The next step is to look into the causes of these strong correlations.
As noted at the beginning, the index is composed of non-financial companies listed on the Nasdaq Stock Market, so the strong correlations may stem from that composition. Let's plot the data grouped by different categories.

In [None]:
sector = datosNasdaq['Sector'].value_counts()
plt.subplot(1, 2, 1)
sector.plot(kind='bar', figsize=(10, 6), color='skyblue')
plt.title('Frequency by Sector')
plt.xlabel('Sector')
plt.ylabel('Frequency')
#plt.show()
location = datosNasdaq['Location'].value_counts()
plt.subplot(1, 2, 2)
location.plot(kind='bar', figsize=(10, 6), color='skyblue')
plt.title('Frequency by Location')
plt.xlabel('Location')
plt.ylabel('Frequency')
plt.tight_layout()
plt.show()

Risk reduction through diversification across assets with low correlations is a key principle of the Markowitz Portfolio Theory, and the Nasdaq-100 portfolio departs from it. Both graphs show a clear concentration: most companies belong to the Information Technology sector and are located in the United States. The index is therefore poorly diversified in terms of both sector and country risk, which leaves it more exposed to adverse events in the technology industry and in the United States and may amplify risk rather than mitigate it.
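The bar charts count companies, not portfolio weight. As a complementary view, here is a minimal sketch that compares the two measures on a hypothetical holdings table. The column names mirror those used above, and the toy values are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical holdings table, for illustration only
toy_holdings = pd.DataFrame({
    "Sector": ["Information Technology", "Information Technology", "Consumer Staples"],
    "Weight (%)": [50.0, 30.0, 20.0],
})
# Share of companies per sector (what a frequency bar chart reflects)
count_share = toy_holdings["Sector"].value_counts(normalize=True)
# Share of portfolio weight per sector, largest first
weight_share = (
    toy_holdings.groupby("Sector")["Weight (%)"].sum().sort_values(ascending=False)
)
print(count_share, weight_share, sep="\n")
```

In the toy table, Information Technology represents two thirds of the companies but 80% of the weight, so the two views can differ noticeably.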

Now, let's explore if the portfolio has considerable company concentration. 

In [None]:
print(datosNasdaq[["Name","Weight (%)"]].head())

Apple Inc and Microsoft Corp have weights of 11.02% and 9.9%, respectively. Two companies out of roughly 100 therefore account for about 21% of the portfolio, which is a significant concentration. But what about the remaining holdings?

In [None]:
#datosNasdaq[["Name","Weight (%)"]].head(11).plot(kind="bar", figsize=(10, 6), color='skyblue')
plt.bar(datosNasdaq["Name"].head(11), datosNasdaq["Weight (%)"].head(11))
plt.title('Companies Participation')
plt.xlabel('Companies')
plt.ylabel('Weight (%)')
plt.xticks(rotation=90)
plt.show()

In [None]:
print(datosNasdaq["Weight (%)"].head(11).sum())

The weights of the first 11 companies add up to 50.91%. In other words, about 10% of the companies in the index account for roughly half of its weight, so company concentration is high.

Investors seeking a diversified portfolio may therefore need to add complementary assets from other sectors and geographic regions to achieve a more balanced and resilient investment strategy.
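The concentration check above can be generalized with a small helper that counts how many of the largest holdings are needed to reach a given share of the portfolio. This is a minimal sketch: the function name `holdings_to_reach` and the toy weights are assumptions made for illustration, and the input is assumed to hold numeric percentages.

```python
import pandas as pd

def holdings_to_reach(weights_pct, threshold_pct):
    """Count the largest holdings needed for their combined weight to reach the threshold."""
    ordered = weights_pct.sort_values(ascending=False)
    cumulative = ordered.cumsum()
    reached = cumulative >= threshold_pct
    if not reached.any():
        # The threshold exceeds the total weight of all holdings
        return None
    # Position of the first True value, converted to a count
    return int(reached.to_numpy().argmax()) + 1

# Hypothetical weights in percent, for illustration only
toy_weights = pd.Series([40.0, 25.0, 15.0, 10.0, 10.0])
print(holdings_to_reach(toy_weights, 50.0))
```

On the notebook's data the call would be `holdings_to_reach(datosNasdaq["Weight (%)"], 50)`.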

<h2>Conclusion</h2>

In conclusion, applying the Markowitz Portfolio Theory to the Nasdaq-100 index did not produce an optimized portfolio, but the attempt exposed the structure of the index. The correlation analysis shows many strongly correlated pairs of stocks, a sign of interdependence among the holdings. The index is concentrated in the Information Technology sector and in the United States, which raises its exposure to sector-specific and country-specific risks. It is also concentrated in a few companies: Apple Inc and Microsoft Corp alone account for about a fifth of the portfolio, so a small number of holdings drives much of its performance. Investors who include the Nasdaq-100 in their portfolios should therefore consider additional diversification to manage risk. The Nasdaq-100 is a dynamic and influential index, but building a well-balanced and resilient portfolio requires recognizing how concentrated it is.