In [43]:
from IPython.display import display, display_html, Math, Latex

import pandas as pd
import numpy as np
import numpy_financial as npf
import yfinance as yf
import matplotlib.pyplot as plt
from datetime import datetime
from threading import Thread

## Group Assignment
### Team Number: XX
### Team Member Names: Rehan, Rui, Anton
### Team Strategy Chosen: RISKY

*Delete before submission*

IMPORTANT SPECIFICATIONS DURING CODING:

- Stocks must be in US Market

- Stocks must have an average monthly volume of at least 200 000 shares between January 01, 2022 and October 31, 2022.

- Months must have at least 20 trading days

- Must have a minimum of 12 stocks and a maximum of 25 stocks

- If we choose n stocks for our portfolio, each stock must be minimum of (100/(2n))% of the portfolio when weighted by value (i.e., the overall value of the shares purchased in that particular stock) as of closing prices on November 25, 2022

- No individual stock may make up more than 25% of the portfolio when weighted by value (i.e., the overall value of the shares purchased in that particular stock) as of closing prices on November 25, 2022.

- Must spend all 500 000 USD on portfolio

- Teams will purchase their stocks at the closing prices on November 25, 2022

There are more specifications for how our code should be presented before submission, look into the assignment doc for info.
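The weighting rules above can be checked mechanically once the dollar value of each position is known. The following is a minimal sketch, not part of the assignment code: `check_weights` and the `STOCK…` tickers are hypothetical names, and the one-cent tolerance on the total is an assumption.

```python
def check_weights(position_values, budget=500_000, min_stocks=12, max_stocks=25):
    """Return a list of rule violations for a dict mapping ticker -> dollar value held."""
    problems = []
    n = len(position_values)
    if n == 0:
        return ["portfolio is empty"]
    if not min_stocks <= n <= max_stocks:
        problems.append(f"{n} stocks held; must be between {min_stocks} and {max_stocks}")

    # The whole budget must be spent (assumed tolerance: one cent)
    total = sum(position_values.values())
    if abs(total - budget) > 0.01:
        problems.append(f"portfolio is worth {total:.2f}; must spend all {budget}")

    min_weight = 1 / (2 * n)  # (100 / (2n))% written as a fraction
    max_weight = 0.25         # no stock may exceed 25% of the portfolio
    for ticker, value in position_values.items():
        weight = value / total
        if weight < min_weight:
            problems.append(f"{ticker} is below the minimum weight of {min_weight:.2%}")
        elif weight > max_weight:
            problems.append(f"{ticker} is above the maximum weight of 25%")
    return problems


# Twelve equally weighted positions satisfy every rule
equal = {f"STOCK{i}": 500_000 / 12 for i in range(12)}
print(check_weights(equal))

# Putting 30% into one stock breaks the 25% cap
skewed = {"STOCK0": 150_000, **{f"STOCK{i}": 350_000 / 11 for i in range(1, 12)}}
print(check_weights(skewed))
```

With 12 stocks the minimum weight is 100/(2·12)% ≈ 4.17%, so the remaining eleven positions in the second example (about 6.4% each) are still valid.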

### Extracting tickers and adding them to a list

In [44]:
# Initializing a start and end date for our portfolio
start_date = '2022-01-01'
end_date = '2022-11-01'

# Initializing a dataframe for 'raw' data extracted from the .csv file
tickers_raw = pd.read_csv('Tickers_Example.csv', header=None)[0].tolist()

# Empty data structures to store ticker data in
tickers = []
tickers_hist = {}

# Function which consumes a ticker and determines the validation based on prerequisites
def validate_ticker(ticker):

    # Trying every stock and excepting those that throw an error
    try:
        # Extracting ticker info from yFinance; kept inside the try so that a failed lookup is caught
        ticker_info = yf.Ticker(ticker).info
        # If the stock is valid, we check for each prerequisite:
        # Checking for USD currency and ensuring it's on the US market
        if ticker_info['currency'] == 'USD' and ticker_info['market'] == 'us_market':
            ticker_hist = yf.Ticker(ticker).history(start=start_date, end=end_date, interval='1d').dropna()

            # Checking monthly volume
            ticker_monthly_trading_days = ticker_hist['Volume'].groupby(pd.Grouper(freq='MS')).count()
            ticker_monthly_volume = ticker_hist['Volume'].groupby(pd.Grouper(freq='MS')).sum()

            # Checking if the month has at least 20 trading days
            for month in ticker_monthly_trading_days.index:
                if ticker_monthly_trading_days.loc[month] < 20:
                    ticker_monthly_volume.drop(month, inplace=True)

            # Checking if the average monthly volume is greater than or equal to 200,000 shares
            if ticker_monthly_volume.mean() >= 200000:
                tickers.append(ticker)
                tickers_hist[ticker] = ticker_hist
            else:
                print(f'{ticker} Ticker does not meet average monthly volume requirements')
        else:
            print(f'{ticker} Ticker does not reference stock denominated in USD')
    except Exception:
        print(f'Error: {ticker} Ticker does not reference a valid stock')

# Empty data structure for threading
threads = []

# Starting one thread per ticker so that all tickers are validated concurrently
for ticker in tickers_raw:
    thread = Thread(target=validate_ticker, args=[ticker])
    thread.start()
    threads.append(thread)

# Waiting for every validation thread to finish
for thread in threads:
    thread.join()

Error: TWX Ticker does not reference a valid stock
Error: PCLN Ticker does not reference a valid stock
Error: CELG Ticker does not reference a valid stock
Error: RTN Ticker does not reference a valid stock
Error: AGN Ticker does not reference a valid stock
RY.TO Ticker does not reference stock denominated in USD
TD.TO Ticker does not reference stock denominated in USD
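
The monthly volume rule can also be tested offline, without calling yFinance. The sketch below applies the same idea as `validate_ticker` (drop months with fewer than 20 trading days, then average the remaining monthly totals) to synthetic data. The helper name `average_monthly_volume` and the constant daily volume of 10,000 shares are assumptions made for illustration.

```python
import pandas as pd


def average_monthly_volume(volume, min_trading_days=20):
    """Mean monthly volume over the months with at least min_trading_days trading days."""
    monthly = volume.groupby(pd.Grouper(freq="MS")).agg(["count", "sum"])
    full_months = monthly[monthly["count"] >= min_trading_days]
    return full_months["sum"].mean()


# Synthetic data: 10,000 shares traded on every business day of Q1 2022
days = pd.bdate_range("2022-01-01", "2022-03-31")
volume = pd.Series(10_000, index=days)

# Remove the first four trading days of February so that month falls below 20 days
volume = volume[~((volume.index.month == 2) & (volume.index.day <= 4))]

# Only January (21 days) and March (23 days) count towards the average
print(average_monthly_volume(volume))
```

Aggregating the count and the sum in one pass avoids looping over months and dropping them one by one.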


### List of Valid Tickers

In [45]:
print(tickers)

['PFE', 'ORCL', 'SLB', 'SO', 'TGT', 'NEE', 'QCOM', 'PM', 'MS', 'MRK', 'NKE', 'UNP', 'UNH', 'T', 'KO', 'SPG', 'CVS', 'MO', 'SBUX', 'MSFT', 'GOOG', 'JPM', 'UPS', 'AMZN', 'BAC', 'ABT', 'MON', 'BA', 'C', 'BK', 'BIIB', 'ACN', 'LMT', 'COP', 'CAT', 'COF', 'CL', 'ABBV', 'LLY', 'BMY', 'GM', 'TXN', 'BLK', 'PG', 'USB', 'AXP', 'AIG', 'COST', 'CMCSA', 'PYPL', 'KMI', 'OXY', 'AAPL', 'PEP', 'CSCO']


## Strategy and Data Analytics

The strategy our team has chosen is to go risky, meaning we must build a portfolio whose final value ends up as far as possible, in either direction, from the initial value of $500,000. Because we are pursuing a risky strategy, a few factors determine the optimal portfolio.

Since we want the most risk and the most reward, we want to minimize diversification. Of the allowed range of 12 to 25 stocks, we therefore pick the fewest, 12, to keep diversification as low as possible. Furthermore, our stocks may trade at heavy volume, but volume is of little use if these assets move in unrelated directions. Therefore, we need all of our stocks to move in the same direction.

Overall, we will move forward with the minimum number of stocks, 12, and keep track of factors such as the risk-to-reward ratio, positive risk, negative risk, betas and options markets.
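
One possible way to quantify whether a set of stocks "moves in the same direction" is the average pairwise correlation of their daily returns. This is only a sketch of that idea on synthetic data, not a method the notebook uses elsewhere; `average_pairwise_correlation` is a hypothetical helper.

```python
import numpy as np
import pandas as pd


def average_pairwise_correlation(returns):
    """Mean of the off-diagonal entries of the correlation matrix of a returns dataframe."""
    corr = returns.corr().to_numpy()
    n = corr.shape[0]
    off_diagonal = corr[~np.eye(n, dtype=bool)]
    return off_diagonal.mean()


rng = np.random.default_rng(1)
market = rng.normal(0, 0.01, 250)

# Three stocks that follow a common factor, and three that move independently
together = pd.DataFrame({f"S{i}": market + rng.normal(0, 0.002, 250) for i in range(3)})
independent = pd.DataFrame({f"S{i}": rng.normal(0, 0.01, 250) for i in range(3)})

print(average_pairwise_correlation(together))     # close to 1
print(average_pairwise_correlation(independent))  # close to 0
```

A value near 1 means the stocks tend to rise and fall together, which is what a low-diversification portfolio looks for.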

### Measuring Beta

Beta measures the volatility of a stock relative to the broader stock market: it is the covariance between the stock's returns and the market's returns, divided by the variance of the market's returns. We can use this to our advantage and find the higher-volatility stocks so that we can create a riskier portfolio. The formula for Beta is as follows:

$$
\beta_i = \frac {\mathrm{Cov} (r_i,r_m)}{\mathrm{Var} (r_m)}
$$


Where $ \beta_i $ is the market beta of a stock, $ {\mathrm{Cov} (r_i,r_m)} $ is the covariance between the returns of the stock and the returns of the market index (in our case, the S&P 500), and $ {\mathrm{Var} (r_m)} $ is the variance of the returns of the market index. Since a volatile stock can net us a gain or a loss, we want the betas with the highest magnitude. Thus, we will rank the stocks by beta in both ascending and descending order and treat the stocks at whichever end has the larger magnitude as contenders for our final portfolio.
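
As a quick sanity check of the formula, the sketch below computes beta with NumPy on synthetic returns where the true beta is known by construction, then ranks the results by magnitude. The function name `beta` and the labels `HIGH`, `INVERSE` and `LOW` are made up for illustration.

```python
import numpy as np


def beta(stock_returns, market_returns):
    """Beta = Cov(r_i, r_m) / Var(r_m), using sample (ddof=1) statistics."""
    cov_matrix = np.cov(stock_returns, market_returns, ddof=1)
    return cov_matrix[0, 1] / cov_matrix[1, 1]


# Synthetic daily market returns
rng = np.random.default_rng(0)
market = rng.normal(0.0, 0.01, size=250)

# Each synthetic stock is an exact multiple of the market, so its beta equals the multiplier
betas = {
    "HIGH": beta(1.5 * market, market),
    "INVERSE": beta(-2.0 * market, market),
    "LOW": beta(0.3 * market, market),
}

# Ranking by magnitude, since a large negative beta is as "risky" as a large positive one
ranked = sorted(betas, key=lambda name: abs(betas[name]), reverse=True)
print(ranked)
```

Here `INVERSE` ranks first even though its beta is negative, which illustrates why magnitude rather than sign is compared.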

In [46]:
# We will be using S&P 500 as our measure of how the overall stock market is performing
sp_index = yf.Ticker("^GSPC") # ticker for S&P 500

# Extracting close prices of the S&P 500, percentage returns and its corresponding variance
sp_hist = sp_index.history(start=start_date, end=end_date).filter(like="Close")
sp_hist["Returns"] = sp_hist['Close'].pct_change()
sp_var = sp_hist["Returns"].var()

In [95]:
min_stocks = 12
max_stocks = 25

# calculate_betas consumes a list of tickers and returns a list of their respective betas
def calculate_betas(tickers):
    # Local list so that repeated calls do not accumulate betas from earlier calls
    beta_values = []

    for ticker in tickers:
        # Getting the ticker's history and calculating its daily returns
        ticker_returns = tickers_hist[ticker]["Close"].pct_change()

        # Initializing temporary dataframe with 2 columns containing the ticker's returns and S&P 500's returns
        temp_dataframe = pd.DataFrame({ticker: ticker_returns,
                                       "S&P500": sp_hist['Returns']})

        # Calculating Beta (covariance with the market divided by market variance) and appending onto a list of betas
        beta = temp_dataframe.cov() / sp_var
        beta_values.append(beta.iat[0, 1])

    return beta_values

# Storing the tickers with their respective betas in a dataframe 
beta_frame = {"Ticker": tickers,
              "Beta": calculate_betas(tickers)}
beta_dataframe = pd.DataFrame(beta_frame)

# Sorting the betas in both ascending and descending order
beta_ascending = beta_dataframe.sort_values("Beta", ascending = True)
beta_descending = beta_dataframe.sort_values("Beta", ascending = False)

---

### The stocks with the lowest betas:

In [88]:
beta_ascending.iloc[:min_stocks]

| | Ticker | Beta |
|---|---|---|
| 26 | MON | 0.010761 |
| 32 | LMT | 0.307426 |
| 9 | MRK | 0.312476 |
| 39 | BMY | 0.314697 |
| 37 | ABBV | 0.345826 |
| 17 | MO | 0.365149 |
| 36 | CL | 0.377901 |
| 7 | PM | 0.401662 |
| 3 | SO | 0.422764 |
| 43 | PG | 0.465361 |


### The stocks with the highest betas:

In [94]:
beta_descending.iloc[:min_stocks]

| | Ticker | Beta |
|---|---|---|
| 49 | PYPL | 1.651318 |
| 23 | AMZN | 1.604552 |
| 6 | QCOM | 1.454741 |
| 40 | GM | 1.39308 |
| 20 | GOOG | 1.310727 |
| 10 | NKE | 1.299055 |
| 27 | BA | 1.298421 |
| 42 | BLK | 1.266132 |
| 52 | AAPL | 1.265959 |
| 35 | COF | 1.250358 |


## Sharpe Ratio vs Sortino Ratio

Besides the Sharpe Ratio, we also consider the Sortino Ratio. The two ratios differ in how they measure risk.

The Sharpe Ratio measures risk-adjusted return by comparing the return of an investment with its risk. The formula for the Sharpe Ratio is: <br>

$\begin{align}\large S_a = \frac{ R_p - R_f }{\sigma_o} \end{align}$

where $\normalsize R_p$ is expected return, $\normalsize R_f$ is the risk-free return rate and $\normalsize \sigma_o$ is the standard deviation of the excess returns.

On the other hand, the Sortino Ratio also measures risk-adjusted return. However, the formula for the Sortino Ratio is: <br>

$\begin{align}\large S_a = \frac{ R_p - R_f }{\sigma_d} \end{align}$

where $\normalsize R_p$ is expected return, $\normalsize R_f$ is the risk-free return rate and $\normalsize \sigma_d$ is the standard deviation of the negative or downside returns.

Although both equations are **almost** identical, they differ in the denominator: the Sharpe Ratio penalizes all volatility, upside as well as downside, because its denominator is the standard deviation of every excess return, whereas the Sortino Ratio penalizes only the negative or downside returns. Consequently, a stock whose large moves are mostly gains will typically score a higher Sortino Ratio than Sharpe Ratio.

The Sharpe ratio is more useful when evaluating low-volatility investment portfolios, whereas the Sortino ratio is more useful when evaluating high-volatility investment portfolios. As we are looking to maximize the risk of our portfolio, it is worth assessing both the Sortino ratio and the Sharpe ratio.
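
To make the difference concrete, here is a minimal sketch of both ratios on a small, made-up return series. It follows the definitions given above, taking $\sigma_d$ as the sample standard deviation of the returns below the risk-free rate; other texts define downside deviation slightly differently. The function names and the sample numbers are assumptions, not results from our data.

```python
import numpy as np


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the standard deviation of all excess returns."""
    excess = np.asarray(returns) - risk_free
    return excess.mean() / excess.std(ddof=1)


def sortino_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the standard deviation of the negative excess returns.

    Needs at least two negative excess returns for the sample standard deviation to be defined.
    """
    excess = np.asarray(returns) - risk_free
    downside = excess[excess < 0]
    return excess.mean() / downside.std(ddof=1)


# Made-up returns: large gains, small losses
returns = np.array([0.10, -0.01, 0.08, -0.02, 0.12, -0.01])

sharpe = sharpe_ratio(returns)
sortino = sortino_ratio(returns)
print(sharpe, sortino)
```

Because the large swings in this series are all gains, the Sharpe ratio is dragged down by upside volatility while the Sortino ratio is not, so the Sortino ratio comes out higher.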

### Sharpe Ratio

In [None]:
# Collecting the daily closing prices of every valid ticker into one dataframe
data = pd.DataFrame({ticker: tickers_hist[ticker]['Close'] for ticker in tickers})

# Daily percentage returns of each ticker
daily_returns = data.pct_change()

# Sharpe ratio of each ticker from its daily returns
# (the risk-free rate is assumed to be 0 and the ratio is not annualized)
sharpe_ratio = daily_returns.mean() / daily_returns.std()
print(sharpe_ratio)


## Risk Coefficient

To further maximize the risk of our selection of tickers, we can rearrange the ratios above to solve for a risk coefficient.


## Contribution Declaration

The following team members made a meaningful contribution to this assignment:

Insert Names Here.