<a href="https://colab.research.google.com/github/prof-rossetti/intro-to-python/blob/main/exercises/calculating-beta/Calculating_Beta_to_the_Market_Fall_2023.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this notebook, we will analyze a set of selected stocks, and calculate a given stock's "beta to the market", as one way of assessing the risk of that stock.


## Setup



Installing packages:

In [29]:
%%capture
!pip install yahooquery

## Challenges

### Part 1: Fetching Stock Data

First, update the `symbols` list provided in the cell below, to choose your own list of 5-10 valid stock symbols.

Then run all cells in Part 1 to fetch historical price data for the designated symbols, as well as the market. This data will be stored in a dataframe variable called `histories_df`.

In [30]:
# choose your own stocks here (at least five):
symbols = ["AAPL", "GOOGL", "META",
           "MSFT", "NFLX", "AMZN", "NVDA",
           "BAC", "JPM"
]

In [31]:
# https://yahooquery.dpguthrie.com/guide/ticker/intro/
from yahooquery import Ticker

# adding market index as well (leave this as is):
all_symbols = symbols + ["SPY"]

companies = Ticker(all_symbols)
print(type(companies))

<class 'yahooquery.ticker.Ticker'>


In [32]:
# companies.history()

In [33]:
from pandas import to_datetime

# the prices data is accessible via history method
# ... but it has a multi-level index
# ... so we are simplifying the index to make our lives easier:
histories_df = companies.history()
histories_df["symbol"] = histories_df.index.get_level_values(0)
histories_df["date"] = to_datetime(histories_df.index.get_level_values(1)).date
histories_df.reset_index(drop=True, inplace=True) # use default index (0-based)

histories_df[["date", "symbol", "adjclose"]]

|      | date       | symbol | adjclose   |
|------|------------|--------|------------|
| 0    | 2023-01-02 | AAPL   | 124.538658 |
| 1    | 2023-01-03 | AAPL   | 125.823189 |
| 2    | 2023-01-04 | AAPL   | 124.488876 |
| 3    | 2023-01-05 | AAPL   | 129.069321 |
| 4    | 2023-01-08 | AAPL   | 129.597061 |
| ...  | ...        | ...    | ...        |
| 2025 | 2023-10-16 | SPY    | 436.019989 |
| 2026 | 2023-10-17 | SPY    | 430.209991 |
| 2027 | 2023-10-18 | SPY    | 426.429993 |
| 2028 | 2023-10-19 | SPY    | 421.190002 |


In [34]:
# checking number of rows and structure:
print("ROWS:", len(histories_df))
print(histories_df["symbol"].value_counts())

ROWS: 2030
AAPL     203
GOOGL    203
META     203
MSFT     203
NFLX     203
AMZN     203
NVDA     203
BAC      203
JPM      203
SPY      203
Name: symbol, dtype: int64


In [35]:
# quick check for null values (because in theory, some stocks may have different history lengths, for example recent IPO vs older company)
nulls_count = histories_df["adjclose"].isnull().sum()
assert nulls_count == 0 # no nulls found, so we can proceed without concern for missing prices

In [36]:
import plotly.express as px

# quick chart of the market (this chart df shouldn't need to be used for anything else later, just here for charting purposes)
# filtering because right now data is row per symbol per date, so we'd have many different rows for the same date unless we choose just a single symbol
chart_df = histories_df[histories_df["symbol"] == "SPY"]
px.line(chart_df, y="adjclose", x="date", title="Market (SPY)", height=350,
        labels={"date": "Date", "adjclose": "Adjusted Close"}
)

### Part 2: Restructuring the Data





Use the `histories_df` dataframe provided in Part 1, which is structured as: a row per stock symbol per date.

Reshape the data so that it instead has: a row per date, a column per symbol, and cell values equal to the adjusted closing price (`adjclose`) for that symbol on that day (see example table, below). Store this new dataframe in a variable called `prices_pivot`.

> HINT: use the [`DataFrame.pivot()` function](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pivot.html) or another pivot-based approach


<img width="1218" alt="Screenshot 2023-10-23 at 6 27 04 PM" src="https://github.com/prof-rossetti/intro-to-python/assets/1328807/ec3dd0f4-0049-40e4-a624-eedd29003846">
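
As a minimal sketch of the pivot hint above, the following example reshapes a tiny, made-up dataframe from long format to wide format. The symbols "AAA" and "BBB", the prices, and the variable names `long_df` and `wide_df` are hypothetical and used for illustration only. It is not the exercise solution, which should operate on `histories_df`.

```python
import pandas as pd

# hypothetical long-format data: one row per symbol per date
long_df = pd.DataFrame({
    "date": ["2023-01-03", "2023-01-04", "2023-01-03", "2023-01-04"],
    "symbol": ["AAA", "AAA", "BBB", "BBB"],
    "adjclose": [10.0, 11.0, 20.0, 19.0],
})

# reshape to wide format: one row per date, one column per symbol
wide_df = long_df.pivot(index="date", columns="symbol", values="adjclose")
print(wide_df)
```

After pivoting, the dates become the index and each symbol becomes its own column, so a single price can be looked up with `wide_df.loc[date, symbol]`.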


Questions:

  + 2-A) How many trading days (i.e. rows) are in the provided data?
  + 2-B) What is the earliest date and latest date in the provided data?
  + 2-C) For each stock: print the symbol, as well as that stock's minimum, maximum, and mean price.


### Part 3: Daily Returns

Each stock price is different, but we want to compare their performance, so let's compare them in relative terms.

Process the `prices_pivot` from Part 2, to calculate each stock's percent change in price since the previous day (see example table below). Store this new dataframe in a variable called `returns_df`.

> HINT: consider an approach that loops through the `all_symbols` variable, and creates a new returns column for each symbol

> HINT: leverage the [`DataFrame.pct_change()` function](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pct_change.html), as demonstrated in the professor's "Pandas Package Overview Mega Notebook"

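> NOTE: returns for each stock on the first / earliest day should be 0.0 (`pct_change()` leaves a null value there because there is no previous day to compare against, so replace it with 0.0)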

<img width="1223" alt="Screenshot 2023-10-24 at 12 47 02 AM" src="https://github.com/prof-rossetti/intro-to-python/assets/1328807/7ea0fa1c-7254-46f5-b08f-a64084ebf06d">
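
The two hints above can be sketched on a small, made-up prices table. The symbols "AAA" and "BBB", their prices, and the variable names `prices` and `returns` are hypothetical. The `_returns` column suffix is an assumption based on the column names used later in Part 6.

```python
import pandas as pd

# hypothetical wide-format prices: one row per day, one column per symbol
prices = pd.DataFrame({
    "AAA": [100.0, 110.0, 99.0],
    "BBB": [50.0, 50.0, 55.0],
})

returns = pd.DataFrame(index=prices.index)
for col in prices.columns:
    # pct_change() leaves NaN in the first row (there is no previous day),
    # so we fill that value with 0.0
    returns[f"{col}_returns"] = prices[col].pct_change().fillna(0.0)

print(returns)
```

Each returns value is the fractional change from the previous row, so a move from 100.0 to 110.0 is roughly 0.10 (ten percent).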

Questions:

  + 3-A) For each stock: print the symbol, as well as that stock's minimum, maximum, and mean daily returns. Also print the standard deviation of returns.

  + 3-B) Consider standard deviation of returns as a measure of how much a given stock may vary over time, and thus a basic measure of risk. Which stocks' returns have the highest (and lowest) standard deviations?


### Part 4: Cumulative Growth

Process the `prices_pivot` from Part 2, to calculate each stock's cumulative growth in prices since the earliest provided date (see example table below). Store this new dataframe in a variable called `growth_df`.

> HINT: consider an approach that loops through the `all_symbols` variable, and creates a new growth column for each symbol

> HINT: leverage the [`DataFrame.cumprod()` function](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.cumprod.html), as demonstrated in the professor's "Pandas Package Overview Mega Notebook"

> NOTE: growth for each stock on the first / earliest day should be 1.0


<img width="1288" alt="Screenshot 2023-10-23 at 6 27 20 PM" src="https://github.com/prof-rossetti/intro-to-python/assets/1328807/9ba9f25d-e6e1-48f3-a599-e813d078a85c">
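
One possible way to apply the `cumprod()` hint above is to compound the daily returns. This is a sketch on a single made-up price series, and the variable names `prices`, `returns`, and `growth` are hypothetical. Other approaches, such as dividing each price by the first price, should give equivalent values.

```python
import pandas as pd

# hypothetical daily prices for a single stock
prices = pd.Series([100.0, 110.0, 99.0])

# daily returns, with the first day set to 0.0
returns = prices.pct_change().fillna(0.0)

# compounding (1 + return) day over day gives growth relative to the first day
growth = (1 + returns).cumprod()
print(growth.tolist())
```

Because the first day's return is 0.0, the first growth value is 1.0, and a final value below 1.0 means the price ended lower than it started.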


Then plot the cumulative growth for all the stocks and the market on the same graph (see example plot below).

<img width="1306" alt="Screenshot 2023-10-23 at 6 27 32 PM" src="https://github.com/prof-rossetti/intro-to-python/assets/1328807/f04bc8a2-d7b8-4d58-90e4-3b3ff056d223">


Questions:

  + 4-A) Which company has the highest cumulative growth since the beginning of the period?
  + 4-B) Which companies have performed better (or worse) than the market ("SPY") over this period?




### Part 5: Correlation

Use the `returns_df` from Part 3 to calculate the Spearman correlation between each pair of stocks.

> HINT: use the [`DataFrame.corr()` function](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.corr.html)

Then plot this correlation matrix as a heatmap (see example plot below).

<img width="639" alt="Screenshot 2023-10-24 at 1 00 42 AM" src="https://github.com/prof-rossetti/intro-to-python/assets/1328807/1c040530-6923-48de-9a02-d8a9a3352373">
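
As a rough illustration of the `corr()` hint above, the sketch below computes a Spearman correlation matrix for three made-up returns columns. The column names and values are hypothetical, chosen so that the first two columns always rank their days in the same order and the third ranks them in the opposite order.

```python
import pandas as pd

# hypothetical daily returns for three stocks
returns = pd.DataFrame({
    "AAA_returns": [0.01, -0.02, 0.03, 0.00],
    "BBB_returns": [0.02, -0.01, 0.05, 0.01],
    "CCC_returns": [-0.01, 0.02, -0.03, 0.00],
})

# Spearman correlation compares the rank ordering of values,
# rather than the values themselves (the default method is Pearson)
corr_mat = returns.corr(method="spearman")
print(corr_mat)
```

The result is a square matrix with one row and one column per stock, which is the shape a heatmap function expects.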

Questions:

  + 5-A) The correlation matrix has all 1.0 values on the main diagonal. What does this mean?

  + 5-B) Which pair or pairs of companies are most positively (and negatively) correlated with each other?

  + 5-C) Which companies are most positively (and negatively) correlated with the market?

  + 5-D) Choose one of the stocks from your analysis. If you own that stock, which other stock from your analysis should you consider buying if you want to hedge your risk?


## Further Exploration

### Part 6: Calculating Beta to the Market (Optional)

Optionally tackle this further exploration challenge, related to calculating beta to the market.




#### Understanding Beta

https://www.investopedia.com/ask/answers/070615/what-formula-calculating-beta.asp


> Beta is a measure used in fundamental analysis to determine the volatility of an asset or portfolio in relation to the overall market. The overall market has a beta of 1.0, and individual stocks are ranked according to how much they deviate from the market.

> A stock that swings more than the market over time has a beta greater than 1.0. If a stock moves less than the market, the stock's beta is less than 1.0. High-beta stocks tend to be riskier but provide the potential for higher returns. Low-beta stocks pose less risk but typically yield lower returns.

> As a result, beta is often used as a risk-reward measure, meaning it helps investors determine how much risk they are willing to take to achieve the return for taking on that risk. A stock's price variability is important to consider when assessing risk. If you think of risk as the possibility of a stock losing its value, beta is useful as a proxy for risk.



> To calculate the beta of a security, the covariance between the return of the security and the return of the market must be known, as well as the variance of the market returns.



\begin{align}
        \text{Beta} = \frac{\text{Covariance}}{\text{Variance}}
\end{align}



Where:
  + Covariance = Measure of how a stock's returns move together with the market's returns
  + Variance = Measure of how far the market's returns spread around their mean


> **Covariance** measures how two stocks move together. A positive covariance means the stocks tend to move together when their prices go up or down. A negative covariance means the stocks move opposite of each other.

> **Variance**, on the other hand, refers to how far a stock moves relative to its mean. For example, variance is used in measuring the volatility of an individual stock's price over time. Covariance is used to measure the correlation in price moves of two different stocks.

> The formula for calculating beta is the covariance of the return of an asset with the return of the benchmark, divided by the variance of the return of the benchmark over a certain period.
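
To make the formula concrete, here is a minimal sketch on made-up returns. The "AAA" stock and all of the values are hypothetical, and the stock's returns are deliberately set to exactly twice the market's returns, so its beta should come out to about 2.0.

```python
import pandas as pd

# hypothetical daily returns: the stock moves exactly twice as much as the market
returns = pd.DataFrame({
    "AAA_returns": [0.02, -0.04, 0.06, 0.00],
    "SPY_returns": [0.01, -0.02, 0.03, 0.00],
})

# covariance between the stock's returns and the market's returns
cov = returns.cov().loc["AAA_returns", "SPY_returns"]

# variance of the market's returns
var = returns["SPY_returns"].var()

beta = cov / var
print("BETA:", round(beta, 3))
```

Applying the same calculation to the market itself divides the market's variance by itself, which is why the market has a beta of 1.0.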

#### Calculating Beta


We saw from the "Understanding Beta" section that we need to calculate the variance of the market's returns, as well as the covariance of each stock's returns with respect to the market's returns.

Luckily pandas makes this easy.

##### Variance

https://www.investopedia.com/terms/v/variance.asp

<img src="https://www.investopedia.com/thmb/_hIorwcVnDj-oKWhpTu_qnuUldM=/750x0/filters:no_upscale():max_bytes(150000):strip_icc():format(webp)/Variance-TAERM-ADD-Source-464952914f77460a8139dbf20e14f0c0.jpg" height=300>

> FYI: standard deviation is the square root of the variance!

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.var.html



In [37]:
#returns_df.var()

In [38]:
#returns_df.std() ** 2 # squaring the standard deviation, is equivalent to the variance

##### Covariance

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.cov.html

> Computes the pairwise covariance among the series of a DataFrame. The returned data frame is the covariance matrix of the columns of the DataFrame.

> This method is generally used for the analysis of time series data to understand the relationship between different measures across time.

In [39]:
#cov_mat = returns_df.cov()
#cov_mat

If we want to calculate the covariance of "this with respect to that", we can access the specific value from this matrix. For example, the covariance of NFLX with respect to the market:

In [40]:
# if we have well defined index and columns, we can use the loc method and specify the name of the row, then the name of the column
# ... df.loc[row_name, col_name]

#cov_mat.loc["NFLX_returns", "SPY_returns"]

##### Beta

Calculating beta to the market (choose your own symbol as desired):

In [41]:
## calculating beta to market for a given company:
#symbol = "NVDA"
#
## get covariance between this stock and the market
#cov_mat = returns_df.cov()
#cov = cov_mat.loc[symbol + "_returns", "SPY_returns"] # using loc method to access a given [row, col] combo
#print(f"COVARIANCE OF {symbol} WITH RESPECT TO THE MARKET:", cov)

In [42]:
#var = returns_df["SPY_returns"].var()
#print("VARIANCE OF THE MARKET:", var)

In [43]:
#beta = cov / var
#print(f"BETA OF {symbol} WITH RESPECT TO THE MARKET:", round(beta,3))

https://www.investopedia.com/investing/beta-gauging-price-fluctuations/

Questions:

  + 6-A) How can we interpret this beta value? What does it tell us about the company's stock, and the risk involved?