In [1]:
%load_ext autoreload
%load_ext autotime
%autoreload 2

time: 891 µs (started: 2021-11-24 11:47:39 -08:00)


# Options Apps Project

## Goal
Building a suite of equity option web apps that expands on, and was inspired by, [Harshit Tyagi’s](https://www.linkedin.com/in/tyagiharshit/) [article](https://www.kdnuggets.com/2021/09/-structured-financial-newsfeed-using-python-spacy-and-streamlit.html) on building a financial newsfeed app.

## Skills Involved
Web scraping, NLP, options price modeling, predictive modeling of volatility, and building apps with Streamlit.

## Imports

In [2]:
import sys, os, math
import requests
from bs4 import BeautifulSoup
import spacy
from spacy import displacy
import yfinance as yf
import pandas as pd
import numpy as np
import seaborn as sns
from datetime import datetime
import time

#  setting path
gparent = os.path.join(os.pardir)
sys.path.append(gparent)

from src import helper_functions as h

time: 2.93 s (started: 2021-11-24 11:47:39 -08:00)


## requests: Grabbing Headlines

In [3]:
# grabbing headlines
r = requests.get("http://feeds.marketwatch.com/marketwatch/topstories/")
r

<Response [200]>

time: 215 ms (started: 2021-11-24 11:47:42 -08:00)


## bs4: Saving Headlines

In [4]:
# saving headlines to a list
soup = BeautifulSoup(r.content, features='lxml')
headlines = soup.find_all('title')
print(f'{type(headlines)}\n')
print(headlines)

<class 'bs4.element.ResultSet'>

[<title>MarketWatch.com - Top Stories</title>, <title>MarketWatch.com - Top Stories</title>, <title>The Tell: There are no municipal-market bond vigilantes when it comes to climate risk, this study confirms</title>, <title>The Ratings Game: Alibaba stock just suffered the biggest 5-day selloff in its history, but Susquehanna analyst stays ‘positive’</title>, <title>Market Snapshot: U.S. stocks dip after Federal Reserve minutes show some officials in November favored a faster pace of tapering</title>, <title>The Fed: ‘Some’ on Fed thought faster pace of tapering bond buys was warranted, meeting minutes show</title>, <title>: What is the ‘metaverse’ and how much will it be worth? Depends on whom you ask</title>, <title>Metals Stocks: Gold halts 4-session price slide as U.S. investors turn to Thanksgiving</title>, <title>: Thanksgiving food for thought: Nearly 20 million Americans don’t have enough to eat</title>, <title>Futures Movers: Oil holds ground as

### Grabbing Headline String
Grabbing one headline and printing both the full tag and its isolated text.

In [5]:
# grabbing test headline
test_headline = headlines[2]
print(f'{test_headline}\n')
print(test_headline.text)

<title>The Tell: There are no municipal-market bond vigilantes when it comes to climate risk, this study confirms</title>

The Tell: There are no municipal-market bond vigilantes when it comes to climate risk, this study confirms
time: 880 µs (started: 2021-11-24 11:47:42 -08:00)
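
The first two entries in `headlines` are the feed's own title, not stories, since every `<title>` tag in the document is collected. A minimal sketch, assuming a standard RSS 2.0 layout (stories inside `<item>` elements) and using the standard library's `xml.etree.ElementTree` in place of BeautifulSoup, that keeps only story titles. The sample feed text and the `item_titles` name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down RSS 2.0 document standing in for the live feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>MarketWatch.com - Top Stories</title>
    <image><title>MarketWatch.com - Top Stories</title></image>
    <item><title>The Tell: Example headline one</title></item>
    <item><title>: Example headline two</title></item>
  </channel>
</rss>"""


def item_titles(feed_xml: str) -> list[str]:
    """Return the text of each <item>'s <title>, skipping feed-level titles."""
    root = ET.fromstring(feed_xml)
    titles = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        # Some headlines arrive with an empty column label, e.g. ": What is ...",
        # so a leading colon is dropped along with surrounding whitespace.
        titles.append(title.strip().lstrip(":").strip())
    return titles


story_titles = item_titles(SAMPLE_FEED)
print(story_titles)
```

With the live feed, `r.text` would take the place of `SAMPLE_FEED`.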


## spaCy: Tests
Testing tokenization and name extraction on a single string.

### Loading Model

In [6]:
# loading spacy model
nlp = spacy.load('en_core_web_sm')

time: 501 ms (started: 2021-11-24 11:47:42 -08:00)


### Tokenizing String
Tokenizing the test headline.

In [7]:
# checking the test case
processed_hline = nlp(test_headline.text)
print(f'{test_headline}\n')
for token in processed_hline:
  print(token)

<title>The Tell: There are no municipal-market bond vigilantes when it comes to climate risk, this study confirms</title>

The
Tell
:
There
are
no
municipal
-
market
bond
vigilantes
when
it
comes
to
climate
risk
,
this
study
confirms
time: 12.4 ms (started: 2021-11-24 11:47:42 -08:00)


## Saving List of Tokenized Headlines

In [8]:
# tokenizing every headline
processed_hlines = [nlp(headline.text) for headline in headlines]
for line in processed_hlines:
    print(line)

MarketWatch.com - Top Stories
MarketWatch.com - Top Stories
The Tell: There are no municipal-market bond vigilantes when it comes to climate risk, this study confirms
The Ratings Game: Alibaba stock just suffered the biggest 5-day selloff in its history, but Susquehanna analyst stays ‘positive’
Market Snapshot: U.S. stocks dip after Federal Reserve minutes show some officials in November favored a faster pace of tapering
The Fed: ‘Some’ on Fed thought faster pace of tapering bond buys was warranted, meeting minutes show
: What is the ‘metaverse’ and how much will it be worth? Depends on whom you ask
Metals Stocks: Gold halts 4-session price slide as U.S. investors turn to Thanksgiving
: Thanksgiving food for thought: Nearly 20 million Americans don’t have enough to eat
Futures Movers: Oil holds ground as inventories rise, traders await OPEC+ response to reserve release
The Fed: Fed Daly says open to supporting a faster tapering of bond purchases, if economic trends continue
Market Extr

## Getting Org Names From Headlines
Visualizing named entities (real-world objects) in the headlines and creating a set of organizations.

In [9]:
# pulling organization entities from the headlines
companies = set()
for doc in processed_hlines:
    # the headlines are already parsed, so nlp does not need to run again
    if doc.ents:
        displacy.render(doc, style='ent')
    for ent in doc.ents:
        if ent.label_ == 'ORG':
            companies.add(ent.text)
print(companies)

{'Fed Daly', 'Fed', 'Federal Reserve'}
time: 109 ms (started: 2021-11-24 11:47:43 -08:00)


## Scraping S&P 500 Stock Table w/Requests & BeautifulSoup

In [10]:
# scraping S&P wikipedia page
r = requests.get('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')

# parsing the html
soup = BeautifulSoup(r.text, 'lxml')

# extracting the table
table = soup.find('table', {'class': 'wikitable sortable'})

# printing row with first stock
print(table.find_all('tr')[1:2])

[<tr>
<td><a class="external text" href="https://www.nyse.com/quote/XNYS:MMM" rel="nofollow">MMM</a>
</td>
<td><a href="/wiki/3M" title="3M">3M</a></td>
<td><a class="external text" href="https://www.sec.gov/edgar/browse/?CIK=66740" rel="nofollow">reports</a></td>
<td>Industrials</td>
<td>Industrial Conglomerates</td>
<td><a href="/wiki/Saint_Paul,_Minnesota" title="Saint Paul, Minnesota">Saint Paul, Minnesota</a></td>
<td>1976-08-09</td>
<td>0000066740</td>
<td>1902
</td></tr>]
time: 578 ms (started: 2021-11-24 11:47:43 -08:00)


### Ticker Symbols
Grabbing the ticker symbol from the first cell of each row.

In [11]:
# making list of symbols
symbols = [row.find_all('td')[0].text for row in table.find_all('tr')[1:]]

# checking length
print(f'List Length: {len(symbols)} \n')

# checking first 5 symbols
print(f'First five symbols: {symbols[:5]}')

List Length: 505 

First five symbols: ['MMM\n', 'ABT\n', 'ABBV\n', 'ABMD\n', 'ACN\n']
time: 14.9 ms (started: 2021-11-24 11:47:43 -08:00)


The length of the list looks good, but we have extra characters we need to strip from the symbols.

In [12]:
# stripping the newline character from the strings
symbols = [s.strip() for s in symbols]

# checking first 5 symbols
print(symbols[:5])

['MMM', 'ABT', 'ABBV', 'ABMD', 'ACN']
time: 1e+03 µs (started: 2021-11-24 11:47:44 -08:00)


Looks good. The same technique applies to the second cell of each row for the company name, and to the fifth cell for the company's industry.
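
As a sketch of that idea, the snippet below pulls several cells from each row in one pass. It works on plain lists of cell text rather than BeautifulSoup rows so it can run on its own. The `COLUMNS` mapping and the `extract_fields` name are illustrative; the cell positions are taken from the 3M row printed earlier.

```python
# Cell text for the 3M row shown earlier, in table order.
rows = [
    ["MMM\n", "3M", "reports", "Industrials", "Industrial Conglomerates",
     "Saint Paul, Minnesota", "1976-08-09", "0000066740", "1902\n"],
]

# Hypothetical mapping of output field -> zero-based cell position.
COLUMNS = {"Symbol": 0, "Company Name": 1, "Industry": 4}


def extract_fields(rows, columns):
    """Build one dict per row, stripping whitespace from each selected cell."""
    return [
        {field: cells[idx].strip() for field, idx in columns.items()}
        for cells in rows
    ]


records = extract_fields(rows, COLUMNS)
print(records)
```

With the scraped table, each `cells` list would be `[td.text for td in row.find_all('td')]`.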

### Names
Grabbing the company name from the second cell of each row.

In [13]:
# making list of names
names = [row.find_all('td')[1].text for row in table.find_all('tr')[1:]]

# checking first five names
print(names[:5])

['3M', 'Abbott Laboratories', 'AbbVie', 'Abiomed', 'Accenture']
time: 15.4 ms (started: 2021-11-24 11:47:44 -08:00)


## S&P 500 Dataframe
Creating a data frame of stocks in the S&P 500 index.

In [14]:
# making a data dictionary
data = {'Company Name': names, 'Symbol': symbols}

# creating data frame from the data
stocks_df = pd.DataFrame.from_dict(data)

# checking shape and first five rows
print(stocks_df.shape)
stocks_df.head()

(505, 2)


| | Company Name | Symbol |
|---|---|---|
| 0 | 3M | MMM |
| 1 | Abbott Laboratories | ABT |
| 2 | AbbVie | ABBV |
| 3 | Abiomed | ABMD |
| 4 | Accenture | ACN |


time: 10.2 ms (started: 2021-11-24 11:47:44 -08:00)


### Checking yf Stock Info Dictionary Keys
Checking the dictionary keys available for pulling stock information.

In [15]:
# instantiating a ticker object
ACN = yf.Ticker('ACN')

time: 2.81 ms (started: 2021-11-24 11:47:44 -08:00)


In [16]:
# checking keys
ACN.info.keys()

dict_keys(['zip', 'sector', 'fullTimeEmployees', 'longBusinessSummary', 'city', 'phone', 'country', 'companyOfficers', 'website', 'maxAge', 'address1', 'fax', 'industry', 'address2', 'ebitdaMargins', 'profitMargins', 'grossMargins', 'operatingCashflow', 'revenueGrowth', 'operatingMargins', 'ebitda', 'targetLowPrice', 'recommendationKey', 'grossProfits', 'freeCashflow', 'targetMedianPrice', 'currentPrice', 'earningsGrowth', 'currentRatio', 'returnOnAssets', 'numberOfAnalystOpinions', 'targetMeanPrice', 'debtToEquity', 'returnOnEquity', 'targetHighPrice', 'totalCash', 'totalDebt', 'totalRevenue', 'totalCashPerShare', 'financialCurrency', 'revenuePerShare', 'quickRatio', 'recommendationMean', 'exchange', 'shortName', 'longName', 'exchangeTimezoneName', 'exchangeTimezoneShortName', 'isEsgPopulated', 'gmtOffSetMilliseconds', 'quoteType', 'symbol', 'messageBoardId', 'market', 'annualHoldingsTurnover', 'enterpriseToRevenue', 'beta3Year', 'enterpriseToEbitda', '52WeekChange', 'morningStarRiskR

time: 3.16 s (started: 2021-11-24 11:47:44 -08:00)


### Data for Stocks in the News
Creating a data frame of price and dividend information for S&P 500 stocks in the news, in two steps:

- Create a data dictionary from the list of companies in the headlines.

- Create a data frame from the data dictionary.

In [17]:
# creating empty stock info dictionary
stock_data = {
    'Company': [],
    'Symbol': [],
    'currentPrice': [],
    'dayHigh': [],
    'dayLow': [],
    '52wkHigh': [],
    '52wkLow': [],
    'dividendRate': []
}

# loading stocks from s&p dataframe and appending data from yf
for company in companies:
    try:
        # rows whose company name contains the entity text
        matches = stocks_df[stocks_df['Company Name'].str.contains(company)]
        if matches.empty:
            continue
        info = yf.Ticker(matches['Symbol'].values[0]).info
        dividend = info['dividendRate']
        # reading every field before appending keeps the lists the same length
        row = {
            'Company': matches['Company Name'].values[0],
            'Symbol': matches['Symbol'].values[0],
            'currentPrice': info['currentPrice'],
            'dayHigh': info['dayHigh'],
            'dayLow': info['dayLow'],
            '52wkHigh': info['fiftyTwoWeekHigh'],
            '52wkLow': info['fiftyTwoWeekLow'],
            # converting a missing dividend (None) to 0
            'dividendRate': 0 if dividend is None else dividend,
        }
        for key, value in row.items():
            stock_data[key].append(value)
    except Exception:
        # skipping companies whose lookup fails
        continue

time: 2.69 s (started: 2021-11-24 11:47:47 -08:00)


In [18]:
# checking dict
stock_data

{'Company': ['Federal Realty Investment Trust'],
 'Symbol': ['FRT'],
 'currentPrice': [131.64],
 'dayHigh': [131.64],
 'dayLow': [130.02],
 '52wkHigh': [135.56],
 '52wkLow': [81.85],
 'dividendRate': [4.28]}

time: 2.04 ms (started: 2021-11-24 11:47:50 -08:00)
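
The single match above comes from the entity `'Fed'`, which is a substring of 'Federal Realty Investment Trust'; `str.contains` does not require a whole-word match. One possible tightening, sketched here with the standard library's `re` module on plain strings, is to require word boundaries around the entity. The helper name `matches_company` is hypothetical.

```python
import re


def matches_company(entity: str, company_name: str) -> bool:
    """True if `entity` appears in `company_name` as a whole word or phrase."""
    # re.escape keeps characters such as '.' or '&' from acting as regex syntax.
    pattern = r"\b" + re.escape(entity) + r"\b"
    return re.search(pattern, company_name) is not None


print(matches_company("Fed", "Federal Realty Investment Trust"))
print(matches_company("Abbott", "Abbott Laboratories"))
```

The first call prints `False` and the second `True`. Entities that end in punctuation would need different handling, since `\b` only matches next to a word character.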


### Data Frame of S&P 500 Stocks in the News

In [19]:
in_the_news = pd.DataFrame(stock_data)
in_the_news.head()

| | Company | Symbol | currentPrice | dayHigh | dayLow | 52wkHigh | 52wkLow | dividendRate |
|---|---|---|---|---|---|---|---|---|
| 0 | Federal Realty Investment Trust | FRT | 131.64 | 131.64 | 130.02 | 135.56 | 81.85 | 4.28 |


time: 6.79 ms (started: 2021-11-24 11:47:50 -08:00)


In [20]:
# checking Dtypes
in_the_news.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1 entries, 0 to 0
Data columns (total 8 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   Company       1 non-null      object 
 1   Symbol        1 non-null      object 
 2   currentPrice  1 non-null      float64
 3   dayHigh       1 non-null      float64
 4   dayLow        1 non-null      float64
 5   52wkHigh      1 non-null      float64
 6   52wkLow       1 non-null      float64
 7   dividendRate  1 non-null      float64
dtypes: float64(6), object(2)
memory usage: 192.0+ bytes
time: 6.2 ms (started: 2021-11-24 11:47:50 -08:00)


## Individual Stock Price Helper Function
Checking the `get_prices` helper function from the `helper_functions` module on a few tickers.

In [21]:
h.get_prices('MMM')

| | Company | Symbol | Current Price | Intraday High | Intraday Low | 52wkHigh | 52wkLow | Dividend |
|---|---|---|---|---|---|---|---|---|
| 0 | 3M Company | MMM | 178.05 | 179.45 | 177.71 | 208.95 | 163.38 | 5.92 |


time: 2.84 s (started: 2021-11-24 11:47:50 -08:00)


In [22]:
h.get_prices('AAPL')

| | Company | Symbol | Current Price | Intraday High | Intraday Low | 52wkHigh | 52wkLow | Dividend |
|---|---|---|---|---|---|---|---|---|
| 0 | Apple Inc. | AAPL | 161.05 | 161.98 | 159.64 | 165.7 | 115.17 | 0.88 |


time: 2.98 s (started: 2021-11-24 11:47:53 -08:00)


In [23]:
h.get_prices('TSLA')

| | Company | Symbol | Current Price | Intraday High | Intraday Low | 52wkHigh | 52wkLow | Dividend |
|---|---|---|---|---|---|---|---|---|
| 0 | Tesla, Inc. | TSLA | 1129.32 | 1130.6 | 1062 | 1243.49 | 539.49 | 0 |


time: 2.92 s (started: 2021-11-24 11:47:56 -08:00)


## Options Markets Tests

In [24]:
# creating a function to display available expirations
def get_expirations(symbol: str):
    stock = yf.Ticker(symbol)
    return stock.options

time: 582 µs (started: 2021-11-24 11:47:59 -08:00)


In [25]:
get_expirations('AAPL')

('2021-11-26',
 '2021-12-03',
 '2021-12-10',
 '2021-12-17',
 '2021-12-23',
 '2021-12-31',
 '2022-01-21',
 '2022-02-18',
 '2022-03-18',
 '2022-04-14',
 '2022-05-20',
 '2022-06-17',
 '2022-07-15',
 '2022-09-16',
 '2023-01-20',
 '2023-03-17',
 '2023-06-16',
 '2023-09-15',
 '2024-01-19')

time: 235 ms (started: 2021-11-24 11:47:59 -08:00)


In [26]:
get_expirations('MMM')

('2021-11-26',
 '2021-12-03',
 '2021-12-10',
 '2021-12-17',
 '2021-12-23',
 '2021-12-31',
 '2022-01-21',
 '2022-04-14',
 '2022-06-17',
 '2022-07-15',
 '2023-01-20',
 '2024-01-19')

time: 146 ms (started: 2021-11-24 11:47:59 -08:00)
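
Expiration strings like these are typically converted into a time to expiration before any pricing work. A minimal sketch, assuming ISO `YYYY-MM-DD` strings as returned above and a simple calendar-day count; the function name and the 365-day year are assumptions, not part of `yfinance`.

```python
from datetime import date


def days_to_expiration(expiration: str, today: date) -> int:
    """Calendar days from `today` until the expiration date string."""
    return (date.fromisoformat(expiration) - today).days


# First three expirations from the output above, measured from the run date.
run_date = date(2021, 11, 24)
for exp in ("2021-11-26", "2021-12-03", "2021-12-10"):
    days = days_to_expiration(exp, run_date)
    # Second value is the day count, third is the same span in years.
    print(exp, days, round(days / 365, 4))
```

A trading-day count would be another reasonable convention; the calendar-day version is used here only because it needs no holiday calendar.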


In [27]:
def options_mrkt_test(symbol: str, expiration: str, option: str):
    """Print price, dividend, and T-bill rate, then return the option chain sorted by strike."""
    stock = yf.Ticker(symbol)
    # reading the info dictionary once
    info = stock.info
    current_price = info['currentPrice']
    dividend = info['dividendRate']
    # Treasury yield curve page; the '3 mo' column holds the 3-month rate
    url = ('https://www.treasury.gov/resource-center/'
    'data-chart-center/interest-rates/Pages/TextView.aspx?data=yield')
    rates = pd.read_html(url)
    risk_free_rate = rates[1]['3 mo']
    print(f"Current Price: ${current_price}")
    print(f"Dividend: ${dividend}")
    print(f"3 Month TBill Rate: {risk_free_rate.to_string().split()[1]}%")
    opt = stock.option_chain(expiration)
    if option == 'calls':
        return opt.calls.sort_values(by='strike')
    elif option == 'puts':
        return opt.puts.sort_values(by='strike')
    # failing loudly instead of silently returning None
    raise ValueError("option must be 'calls' or 'puts'")

time: 1.21 ms (started: 2021-11-24 11:47:59 -08:00)


In [28]:
options_mrkt_test('MMM', '2021-11-26', 'calls')

Current Price: $178.05
Dividend: $5.92
3 Month TBill Rate: 0.05%


| | contractSymbol | lastTradeDate | strike | lastPrice | bid | ask | change | percentChange | volume | openInterest | impliedVolatility | inTheMoney | contractSize | currency |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | MMM211126C00170000 | 2021-11-24 16:24:54 | 170.0 | 9.0 | 7.8 | 8.2 | -1.15 | -11.330046 | 1.0 | 2 | 0.336921 | True | REGULAR | USD |
| 1 | MMM211126C00175000 | 2021-11-24 15:41:17 | 175.0 | 3.15 | 2.94 | 3.2 | -3.1 | -49.6 | 3.0 | 1 | 0.161141 | True | REGULAR | USD |
| 2 | MMM211126C00177500 | 2021-11-24 18:51:16 | 177.5 | 1.33 | 0.92 | 1.11 | -1.17 | -46.799995 | 30.0 | 33 | 0.125253 | True | REGULAR | USD |
| 3 | MMM211126C00180000 | 2021-11-24 19:24:54 | 180.0 | 0.2 | 0.16 | 0.25 | -0.7 | -77.77778 | 297.0 | 415 | 0.140634 | False | REGULAR | USD |
| 4 | MMM211126C00182500 | 2021-11-24 19:27:45 | 182.5 | 0.03 | 0.02 | 0.11 | -0.21 | -87.5 | 76.0 | 405 | 0.191414 | False | REGULAR | USD |
| 5 | MMM211126C00185000 | 2021-11-24 19:03:25 | 185.0 | 0.03 | 0.02 | 0.04 | -0.08 | -72.72727 | 31.0 | 745 | 0.222664 | False | REGULAR | USD |
| 6 | MMM211126C00187500 | 2021-11-24 19:23:08 | 187.5 | 0.02 | 0.0 | 0.02 | -0.01 | -33.333336 | 12.0 | 406 | 0.25782 | False | REGULAR | USD |
| 7 | MMM211126C00190000 | 2021-11-24 15:20:28 | 190.0 | 0.03 | 0.0 | 0.07 | 0.0 | 0.0 | 2.0 | 368 | 0.376959 | False | REGULAR | USD |
| 8 | MMM211126C00192500 | 2021-11-24 15:36:50 | 192.5 | 0.05 | 0.0 | 0.4 | 0.02 | 66.66667 | 2.0 | 24 | 0.532231 | False | REGULAR | USD |
| 9 | MMM211126C00195000 | 2021-11-24 16:15:37 | 195.0 | 0.03 | 0.01 | 0.25 | 0.01 | 50.0 | 3.0 | 54 | 0.554692 | False | REGULAR | USD |


time: 3.36 s (started: 2021-11-24 11:47:59 -08:00)


In [29]:
h.options_mrkt('MMM', '2021-11-26', 'calls')

Current Price: $178.05
Dividend: $5.92
3 Month TBill Rate: 0.05%


| | contractSymbol | lastTradeDate | strike | lastPrice | bid | ask | change | percentChange | volume | openInterest | impliedVolatility | inTheMoney | contractSize | currency |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | MMM211126C00170000 | 2021-11-24 16:24:54 | 170.0 | 9.0 | 7.8 | 8.2 | -1.15 | -11.330046 | 1.0 | 2 | 0.336921 | True | REGULAR | USD |
| 1 | MMM211126C00175000 | 2021-11-24 15:41:17 | 175.0 | 3.15 | 2.94 | 3.2 | -3.1 | -49.6 | 3.0 | 1 | 0.161141 | True | REGULAR | USD |
| 2 | MMM211126C00177500 | 2021-11-24 18:51:16 | 177.5 | 1.33 | 0.92 | 1.11 | -1.17 | -46.799995 | 30.0 | 33 | 0.125253 | True | REGULAR | USD |
| 3 | MMM211126C00180000 | 2021-11-24 19:24:54 | 180.0 | 0.2 | 0.16 | 0.25 | -0.7 | -77.77778 | 297.0 | 415 | 0.140634 | False | REGULAR | USD |
| 4 | MMM211126C00182500 | 2021-11-24 19:27:45 | 182.5 | 0.03 | 0.02 | 0.11 | -0.21 | -87.5 | 76.0 | 405 | 0.191414 | False | REGULAR | USD |
| 5 | MMM211126C00185000 | 2021-11-24 19:03:25 | 185.0 | 0.03 | 0.02 | 0.04 | -0.08 | -72.72727 | 31.0 | 745 | 0.222664 | False | REGULAR | USD |
| 6 | MMM211126C00187500 | 2021-11-24 19:23:08 | 187.5 | 0.02 | 0.0 | 0.02 | -0.01 | -33.333336 | 12.0 | 406 | 0.25782 | False | REGULAR | USD |
| 7 | MMM211126C00190000 | 2021-11-24 15:20:28 | 190.0 | 0.03 | 0.0 | 0.07 | 0.0 | 0.0 | 2.0 | 368 | 0.376959 | False | REGULAR | USD |
| 8 | MMM211126C00192500 | 2021-11-24 15:36:50 | 192.5 | 0.05 | 0.0 | 0.4 | 0.02 | 66.66667 | 2.0 | 24 | 0.532231 | False | REGULAR | USD |
| 9 | MMM211126C00195000 | 2021-11-24 16:15:37 | 195.0 | 0.03 | 0.01 | 0.25 | 0.01 | 50.0 | 3.0 | 54 | 0.554692 | False | REGULAR | USD |


time: 3.45 s (started: 2021-11-24 11:48:03 -08:00)
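
The three printed values (price, dividend, and T-bill rate) are the kind of inputs an option pricing model takes. As a rough illustration only, and not necessarily the model this project uses, here is a Black-Scholes sketch for a European call with a continuous dividend yield. Converting the dollar dividend to a yield and using 2 calendar days to expiration are assumptions, and the function names are made up.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call(spot, strike, rate, div_yield, vol, years):
    """Black-Scholes price of a European call with a continuous dividend yield."""
    d1 = (log(spot / strike) + (rate - div_yield + 0.5 * vol**2) * years) / (
        vol * sqrt(years)
    )
    d2 = d1 - vol * sqrt(years)
    discounted_spot = spot * exp(-div_yield * years)
    discounted_strike = strike * exp(-rate * years)
    return discounted_spot * norm_cdf(d1) - discounted_strike * norm_cdf(d2)


# Inputs taken from the MMM output above.
spot = 178.05
price = bs_call(
    spot=spot,
    strike=180.0,
    rate=0.0005,            # 0.05% three-month T-bill rate
    div_yield=5.92 / spot,  # assumption: annual dividend / current price
    vol=0.140634,           # impliedVolatility listed for the 180 strike
    years=2 / 365,          # assumption: 2 calendar days to expiration
)
print(round(price, 2))
```

The model assumes European exercise, while listed single-stock options are American-style, so the result is only a rough check against the quoted bid and ask.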


## Dividend Tests

In [30]:
msft = yf.Ticker('MSFT')

time: 3.28 ms (started: 2021-11-24 11:48:06 -08:00)


In [31]:
exdiv = msft.info['exDividendDate']
exdiv

1637107200

time: 3.15 s (started: 2021-11-24 11:48:06 -08:00)


In [32]:
exdate = datetime.fromtimestamp(exdiv).ctime()
exdate

'Tue Nov 16 16:00:00 2021'

time: 1.86 ms (started: 2021-11-24 11:48:10 -08:00)


In [33]:
mmm = yf.Ticker('MMM')

time: 2.7 ms (started: 2021-11-24 11:48:10 -08:00)


In [34]:
exdiv2 = mmm.info['exDividendDate']
exdiv2

1637193600

time: 2.72 s (started: 2021-11-24 11:48:10 -08:00)


In [35]:
exdate2 = datetime.fromtimestamp(exdiv2).ctime()
exdate2

'Wed Nov 17 16:00:00 2021'

time: 2.6 ms (started: 2021-11-24 11:48:12 -08:00)


In [36]:
bmy = yf.Ticker('BMY')

time: 2.51 ms (started: 2021-11-24 11:48:12 -08:00)


In [37]:
exdiv3 = bmy.info['exDividendDate']
exdiv3

1632960000

time: 3.02 s (started: 2021-11-24 11:48:13 -08:00)


In [38]:
exdate3 = datetime.fromtimestamp(exdiv3).ctime()
exdate3

'Wed Sep 29 17:00:00 2021'

time: 2.19 ms (started: 2021-11-24 11:48:16 -08:00)
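
`datetime.fromtimestamp` without a `tz` argument converts to the machine's local time zone, which is why the results above show 16:00 or 17:00 in a US Pacific session. A minimal sketch, assuming the `exDividendDate` values are Unix timestamps at midnight UTC (the three values above are consistent with that); the `ex_dividend_date` name is hypothetical.

```python
from datetime import datetime, timezone


def ex_dividend_date(timestamp: int) -> str:
    """Convert a Unix timestamp to an ISO date string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).date().isoformat()


# Timestamps copied from the three cells above.
for symbol, ts in [("MSFT", 1637107200), ("MMM", 1637193600), ("BMY", 1632960000)]:
    print(symbol, ex_dividend_date(ts))
```

Read in UTC, the three dates come out as 2021-11-17, 2021-11-18, and 2021-09-30, each one day later than the local-time strings above.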
