# Project Introduction

In this project we will explore the power of Python for financial analysis. Python is a terrific tool for analyses of any type, and financial analysis is no exception. Using only code, we will analyze and plot the returns of Tesla (NASDAQ: TSLA) and Amazon (NASDAQ: AMZN) and compare those returns to the broader market as approximated by the S&P 500. We will also calculate two common measures of risk and volatility, Beta and the Sharpe Ratio, both of which are widely used by financial analysts when evaluating a stock.

We will approximate the performance of the S&P 500 using the SPDR S&P 500 ETF, which has the ticker "SPY". This is a commonly used benchmark in real-world financial analysis.

On top of it all, we will design our code to be reproducible so that, once complete, we can load different data in a similar format and run the same analysis on that new data!

In [1]:
# import packages
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
%matplotlib inline

import warnings
warnings.filterwarnings("ignore")

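# the 'seaborn' style was renamed 'seaborn-v0_8' in matplotlib 3.6; use whichever name this version provides
plt.style.use('seaborn-v0_8' if 'seaborn-v0_8' in plt.style.available else 'seaborn')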

## 1. Import and Look at the Data
Before starting any sort of analysis or project, it is always important to look at your data to get a feel for what you are dealing with. Below you will read in your data from a CSV (comma-separated values) file, take a peek at a few rows of the data, and generate some summary statistics to give a better picture of what is contained in your file.
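As a minimal sketch of this read-and-inspect pattern, the snippet below parses a tiny made-up CSV from an in-memory string rather than from `Stock_Data.csv`. The column layout (a `Date` column followed by one price column per ticker), the tickers `AAA` and `BBB`, the prices, and the name `demo_df` are all assumptions for illustration; your file may be laid out differently.

```python
import io

import pandas as pd

# Hypothetical stand-in for a price file: one Date column, one column per ticker.
csv_text = """Date,AAA,BBB
2021-01-04,10.0,100.0
2021-01-05,11.0,99.0
2021-01-06,12.1,101.0
"""

# parse_dates converts the Date strings to timestamps; index_col makes Date the index.
demo_df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

print(demo_df.head())      # first rows (head() shows up to 5 by default)
print(demo_df.describe())  # count, mean, std, min, quartiles, and max per column
print(demo_df.iloc[-1])    # most recent row, i.e. the latest price for each ticker
```

With a date index in place, the rows are ordered in time, which is what makes position-based slicing of "the last N days" meaningful later on.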

In [2]:
# read in the stock data ('Stock_Data.csv') using pandas and save as a dataframe called df
# in your read statement include parse_dates=['Date'] and set the 'Date' column to the index using index_col
# print the first 5 rows of the dataframe

### Your Code Here ###

In [3]:
# use pandas .describe() function on df to generate summary statistics on the data

### Your Code Here ###

In [4]:
# print out the latest prices for each ticker

### Your Code Here ###

In [5]:
# trim df to only include data from the last 5 years and name this new dataframe trim_df
# use the pandas .iloc indexer and the variable num_days to do this
# print the last 5 rows
# Hint: put a negative sign in front of num_days followed by a colon to get the right timeframe

num_days = 252 * 5  # there are roughly 252 trading days in a year, so 252 * 5 covers about the last 5 years of prices

### Your Code Here ###
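The negative-slice idea from the hint can be seen on a small scale in the sketch below. It uses a made-up 10-row frame and a hypothetical window length `n_rows` in place of `num_days`; the names `demo_df` and `demo_trim` are illustrative only.

```python
import numpy as np
import pandas as pd

# Made-up frame of 10 business days with a single hypothetical price column.
dates = pd.bdate_range("2021-01-04", periods=10)
demo_df = pd.DataFrame({"AAA": np.arange(10, dtype=float)}, index=dates)

n_rows = 3  # hypothetical window length; the notebook uses num_days = 252 * 5

# A negative start in .iloc counts from the end, so this keeps the last n_rows rows.
# .copy() gives an independent frame, so adding columns later does not touch demo_df.
demo_trim = demo_df.iloc[-n_rows:].copy()

print(demo_trim)
```

Taking a copy is a design choice rather than a requirement: it avoids pandas warnings about modifying a slice when new columns are added to the trimmed frame.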

In [6]:
# create variables to save your ticker names - this will allow you to use the same code on a different 
# file with different stock data and still produce the same results without having to rewrite any code
# name the variables t1, t2, and m1

### Your Code Here ###

## 2. Visualize the Data
Now that you have taken a look at your data and have a bit of an understanding of what it contains, the next step is to visualize the information with plots. Try plotting the prices of each ticker over the past 5 years on 3 separate plots.

In [7]:
# plot the 5 year stock prices for each ticker

### Your Code Here ###

If done correctly you should see some interesting shapes in the data. It looks like AMZN and SPY have somewhat similar shapes, though SPY took more of a dip than AMZN from COVID-19. TSLA on the other hand has seen monstrous growth over the past 12 months and even over the past 2 months!

## 3. Calculating Daily Returns
These charts are pretty interesting, but we can probably get more insight from our data by augmenting it with a few additional columns. While looking at the price of a given security can be informative, it can be more valuable to analyze a security's returns. This can be done in 2 main ways: analyzing daily returns and analyzing total return for a given time period. We will focus on daily returns now and then total returns later.
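To make the idea of a daily return concrete, here is a small sketch on made-up prices. The ticker name `AAA`, the prices, and the frame `demo_df` are hypothetical; the column name is built from the ticker variable so the same line would work for any ticker.

```python
import pandas as pd

# Made-up closing prices for one hypothetical ticker.
ticker = "AAA"
demo_df = pd.DataFrame(
    {ticker: [100.0, 110.0, 99.0]},
    index=pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06"]),
)

# pct_change() computes price[t] / price[t-1] - 1. The first row has no previous
# price, so its value is NaN.
demo_df[f"{ticker} Daily Returns"] = demo_df[ticker].pct_change()

print(demo_df)
```

Here the price rises from 100 to 110 (a return of about +10%) and then falls from 110 to 99 (about -10%), even though the two dollar moves differ in size.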

In [8]:
# create 3 new columns in trim_df called TSLA Daily Returns, AMZN Daily Returns, and SPY Daily Returns
# that contain the daily % change in price for each ticker. Use the pct_change() function to do this
# print the first 5 rows. if done correctly the first row should have NaN for these new columns
# (remember to use the t1, t2, and m1 variables for reproducibility)

### Your Code Here ###

## 4. Plot Daily Returns
Now that we have the daily returns for each of our securities, let's plot them on a single chart to compare them to one another.
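One possible way to put several return series on a single labeled chart is sketched below, using synthetic random returns rather than the notebook's data. The column names, the seed, and the styling choices are assumptions; `PercentFormatter` comes from `matplotlib.ticker`, which the notebook already imports as `mtick`.

```python
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import numpy as np
import pandas as pd

# Made-up daily returns for two hypothetical series (seeded so the figure is repeatable).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2021-01-04", periods=60)
demo_returns = pd.DataFrame(
    {
        "AAA Daily Returns": rng.normal(0.0, 0.03, len(dates)),
        "MKT Daily Returns": rng.normal(0.0, 0.01, len(dates)),
    },
    index=dates,
)

fig, ax = plt.subplots(figsize=(10, 4))
for col in demo_returns.columns:
    # label= is what the legend displays for each line
    ax.plot(demo_returns.index, demo_returns[col], label=col, alpha=0.8)

# Returns are stored as fractions (0.01 == 1%), so xmax=1.0 formats them as percentages.
ax.yaxis.set_major_formatter(mtick.PercentFormatter(xmax=1.0))
ax.set_xlabel("Date")
ax.set_ylabel("Daily return")
ax.set_title("Daily returns (synthetic data)")
ax.legend()
```

Looping over the columns keeps the plotting code independent of how many tickers there are or what they are called.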

In [9]:
# plot the daily returns for each ticker on one plot
# make sure to add labels and a legend for each security so that you can interpret the results
# (remember to use the t1, t2, and m1 variables for reproducibility)

### Your Code Here ###

Wow! The TSLA daily returns are all over the place! This plot suggests that TSLA is more volatile than the broader market (SPY). We will check this assumption by calculating TSLA's Beta and Sharpe ratio, which are 2 measures of risk/reward for a given security. AMZN also appears to be more volatile than SPY, and we can check that assumption using Beta as well.

## 5. Calculating Beta
Beta is a statistical measure of a stock's volatility in relation to the overall market, and it is included in almost any analysis of a stock's returns. The formula for beta is below:


\begin{align}
\beta_s = \frac{Cov(r_s,r_m)}{Var(r_m)}
\end{align}

Where:
- $\beta_s$ = beta of a stock
- $r_s$ = return on the stock
- $r_m$ = return on the market
- $Cov(r_s,r_m)$ = covariance of the stock and the market
- $Var(r_m)$ = variance of the market

By definition, the beta of the market (the S&P 500 in our case) is 1.0. A stock with a beta above 1.0 is expected to be more volatile than the market, and a stock with a beta below 1.0 is expected to be less volatile. For example, a beta of 1.1 means that for every 1% **move up or down** in the market, the stock is expected to move about 1.1% in the same direction.

Below we will calculate the betas for TSLA and AMZN and compare them.

Learn more about beta here: https://www.investopedia.com/terms/b/beta.asp
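The formula can be checked on a toy example where the answer is known in advance. In the sketch below, the market returns are made up and the hypothetical stock is constructed to move exactly 1.5 times the market, so its beta should come out as 1.5.

```python
import numpy as np

# Made-up daily market returns, and a stock that moves exactly 1.5x the market.
market = np.array([0.010, -0.020, 0.005, 0.015, -0.010])
stock = 1.5 * market

# np.cov treats each argument as one variable and returns a 2 x 2 matrix:
# [[Var(stock),         Cov(stock, market)],
#  [Cov(market, stock), Var(market)       ]]
cov_matrix = np.cov(stock, market)

# beta = Cov(r_s, r_m) / Var(r_m)
beta = cov_matrix[0, 1] / cov_matrix[1, 1]
print(round(beta, 6))  # 1.5, because the stock was built as 1.5 x the market
```

Real stocks never track the market this cleanly, so a real beta summarizes the average relationship rather than an exact multiple.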

In [10]:
# create 3 variables rs1, rs2, and rm1 that contain TSLA, AMZN, and SPY Daily Returns from our trim_df dataframe
# use .iloc[1:] to avoid the NaN row in the first position of each column
# (remember to use the t1, t2, and m1 variables for reproducibility)

### Your Code Here ###

In [11]:
# the beta formula above can seem a little daunting. thankfully numpy makes the math very easy for us
# use the np.cov() function to calculate the covariance matrices of rs1, rm1 and of rs2, rm1 (always have rm1 in the second position)
# save these matrices as mat_rs1 and mat_rs2. print out one of the matrices to view it

### Your Code Here ###

In [12]:
# this matrix is a 2 x 2 numpy array and it contains the covariance of the stock and the market as well as the variance
# of the individual stock and of the market. the covariance is in position [0, 1] and the variance of the market is in
# position [1, 1]. using these positions save 3 new variables cov_rs1, cov_rs2, and var_rm

### Your Code Here ###

In [13]:
# execute this cell to see what the betas for each stock are 
# (this cell's code comes with notebook - students don't need to write)

### Your Code Here ###

Interesting. It looks like our hypothesis for TSLA holds true. TSLA has a higher beta than both AMZN and SPY (whose beta is 1.0 by definition), which we expected from our daily returns chart. This indicates that the stock is more volatile and potentially riskier than AMZN and SPY. AMZN, on the other hand, appears to be less volatile than the market based on its 5-year beta, even though it looked more volatile in our daily returns plot. Why would this be? Look back at our original price plots for AMZN and SPY. Notice that SPY had a huge dip in early 2020, while AMZN had only a small one (the dip was the market's reaction to COVID-19). This dip drastically increased SPY's variance, which is the denominator of the beta formula, thus driving down beta for both TSLA and AMZN. You can check this yourself by doing the same calculation for the 5 years leading up to February 2020. The betas for TSLA and AMZN should both be well over 1.0.

What a beta < 1.0 for AMZN could indicate is that in normal market conditions AMZN's price moved around more than the market, but in the face of the COVID-19 volatility, AMZN was actually less volatile than the rest of the market.
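Because beta depends on the window of dates it is measured over, it can help to wrap the calculation in a small function that takes a start and end date. The sketch below is one way to do that under stated assumptions: `beta_over_window` is a hypothetical helper (not part of the notebook), and the returns are synthetic, built so that the stock moves 2 times the market in the first 100 days and 0.5 times afterwards.

```python
import numpy as np
import pandas as pd


def beta_over_window(returns, stock_col, market_col, start=None, end=None):
    """Beta of stock_col versus market_col using only rows dated start..end (inclusive)."""
    window = returns.loc[start:end, [stock_col, market_col]].dropna()
    cov_matrix = np.cov(window[stock_col], window[market_col])
    return cov_matrix[0, 1] / cov_matrix[1, 1]


# Synthetic returns: the stock moves 2x the market for 100 days, then 0.5x afterwards.
rng = np.random.default_rng(1)
dates = pd.bdate_range("2020-01-01", periods=200)
market = rng.normal(0.0, 0.01, len(dates))
multiplier = np.where(np.arange(len(dates)) < 100, 2.0, 0.5)
demo_returns = pd.DataFrame({"AAA": multiplier * market, "MKT": market}, index=dates)

beta_early = beta_over_window(demo_returns, "AAA", "MKT", end=dates[99])
beta_late = beta_over_window(demo_returns, "AAA", "MKT", start=dates[100])
beta_full = beta_over_window(demo_returns, "AAA", "MKT")

# By construction, the early and late betas are approximately 2.0 and 0.5;
# the full-window beta falls somewhere in between.
print(round(beta_early, 3), round(beta_late, 3), round(beta_full, 3))
```

The same stock therefore shows a very different beta depending on which period is included, which is the effect described above for the early 2020 dip.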

## 6. Calculating Total Return
The next thing we might be interested in is the overall return for each security over our timeframe (the past 5 years). When we used the pct_change() function to calculate daily returns, we were essentially dividing each day's price by the previous day's price and subtracting 1. To calculate total return, we instead divide each price by the very first price in our timeframe and subtract 1. Let's do it below:

In [14]:
# create 3 new columns in trim_df called TSLA Total Return, AMZN Total Return, and SPY Total Return
# calculate these by taking the price column and dividing it by the first element of the price column and subtracting 1
# for each stock - remember the first element is in position 0
# use pandas .iloc to grab the first element of a column
# print the first 5 rows
# (remember to use the t1, t2, and m1 variables for reproducibility)

### Your Code Here ###
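For comparison with the daily return calculation, here is a short sketch of a total return column on made-up prices. The ticker `AAA`, the prices, and `demo_df` are hypothetical.

```python
import pandas as pd

ticker = "AAA"  # hypothetical ticker name
demo_df = pd.DataFrame(
    {ticker: [50.0, 55.0, 60.0, 75.0]},
    index=pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06", "2021-01-07"]),
)

# Divide every price by the first price in the window, then subtract 1.
demo_df[f"{ticker} Total Return"] = demo_df[ticker] / demo_df[ticker].iloc[0] - 1

print(demo_df)
print(demo_df[f"{ticker} Total Return"].iloc[-1])  # 0.5, i.e. +50% from 50.0 to 75.0
```

The first row is always 0 because the first price is divided by itself, and the last row is the return over the whole window.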

## 7. Plotting Total Return
Now that we have calculated total return for each stock, we can plot all of them on the same graph to do an apples-to-apples comparison of their performance.

In [15]:
# reuse the code from plotting the daily returns but use it for Total Return this time

### Your Code Here ###

In [16]:
# print the overall return for each security over the past 5 years (last element in each Total Return column)

### Your Code Here ###

Wow, this looks pretty interesting. If you had invested \\$1,000 in the S&P 500 5 years ago, you would now have a total value of \\$1,670, which isn't bad. If, on the other hand, you had invested that \\$1,000 in TSLA, you would have \\$5,460! And if instead you had invested the \\$1,000 in AMZN, you would have \\$6,130!
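These dollar amounts follow from final value = initial investment × (1 + total return). As a quick arithmetic check, the sketch below uses total returns that are back-computed from the dollar figures quoted above (67%, 446%, and 513%); they are rounded, implied values, not numbers read from the data file.

```python
initial_investment = 1_000

# Total returns implied by the dollar figures quoted in the text (assumed, rounded).
implied_total_returns = {"SPY": 0.67, "TSLA": 4.46, "AMZN": 5.13}

# final value = initial investment * (1 + total return), rounded to whole dollars
final_values = {
    name: round(initial_investment * (1 + total_return))
    for name, total_return in implied_total_returns.items()
}
print(final_values)  # {'SPY': 1670, 'TSLA': 5460, 'AMZN': 6130}
```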

## 8. Calculating the Sharpe Ratio
The Sharpe ratio, developed by Nobel prize winner William F. Sharpe, is a measure of return for an investment compared to its risk. The ratio is the average return earned in excess of the risk-free rate per unit of volatility or total risk. The formula for the Sharpe Ratio is below:

\begin{align}
\text{Sharpe Ratio} = \frac{R_s - R_f}{\sigma_s}
\end{align}

Where:
- $R_s$ = return of a stock
- $R_f$ = the risk-free rate or the returns of another benchmark
- $\sigma_s$ = standard deviation of the stock's excess returns

Typically, the higher the Sharpe Ratio, the more attractive the risk-adjusted return of the stock is.
For this project we will be using the return of the S&P 500 (SPY) instead of the risk-free rate. Substituting a benchmark like this for the risk-free rate allows us to see how much more return we get per unit of risk by investing in something other than the benchmark.

Learn more about the Sharpe Ratio here: https://www.investopedia.com/terms/s/sharperatio.asp
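The steps in the formula can be sketched on a handful of made-up daily returns. The series below are hypothetical, and the choice of the sample standard deviation (the pandas default, `ddof=1`) is an assumption; the annualization by the square root of 252 follows the trading-day convention used in this notebook.

```python
import numpy as np
import pandas as pd

# Made-up daily returns for a hypothetical stock and benchmark.
stock_returns = pd.Series([0.020, -0.010, 0.015, 0.005, -0.005])
benchmark_returns = pd.Series([0.010, -0.005, 0.005, 0.000, -0.002])

# Excess return: how much the stock beat (or trailed) the benchmark each day.
excess = stock_returns - benchmark_returns

# Average excess return per unit of excess-return volatility.
# pandas .std() is the sample standard deviation (ddof=1).
daily_sharpe = excess.mean() / excess.std()

# Scale to a yearly figure assuming 252 trading days.
annual_sharpe = daily_sharpe * np.sqrt(252)

print(daily_sharpe, annual_sharpe)
```

With only five observations the result is purely illustrative; a ratio estimated from so few days would not be meaningful in practice.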

In [17]:
# calculate the excess return for each stock by subtracting the DAILY return of SPY from the DAILY return of the stock
# save down these as t1_excess_return and t2_excess_return

### Your Code Here ###

In [18]:
# calculate the average and standard deviation of the excess returns
# save these as t1_avg_excess_return, t2_avg_excess_return, t1_sd_excess_return, and t2_sd_excess_return

### Your Code Here ###

In [19]:
# calculate the daily sharpe ratio of each stock by t1_avg_excess_return / t1_sd_excess_return  
# save down as t1_daily_sharpe_ratio and t2_daily_sharpe_ratio

### Your Code Here ###

In [20]:
# annualize the daily sharpe ratio by multiplying t1_daily_sharpe_ratio * np.sqrt(252)
# 252 is the number of trading days in a year (week days minus holidays)
# save the results as t1_annual_sharpe_ratio and t2_annual_sharpe_ratio

### Your Code Here ###

In [21]:
# print the annualized sharpe ratio for each ticker

### Your Code Here ###

Consistent with what we found with beta, AMZN has a higher risk-adjusted return (Sharpe ratio) than TSLA, which was more volatile. Try running these calculations for different time periods and see whether this relationship holds or changes.

# Conclusion
Congratulations! You have just performed some pretty sophisticated financial analysis using just Python. If you were able to get through the entire notebook using the variables created at the very beginning (t1, t2, and m1), then you can see the power of Python by running this notebook again on different stocks and seeing the results! Try rerunning the whole notebook, but this time import Stock_Data2.csv when you read in the data.

*This project was created by Sam DeTrempe for exclusive use by the Greenwood Project.*