<a href="https://colab.research.google.com/github/kevinhhl/options-pricing-tools-and-trading-strategies/blob/main/Black_Scholes_Merton_Model_Part1_Screening_YF_for_theoretical_edges.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

---
### **Overview**


**Black-Scholes-Merton Model ("BSM") - Part 1**
<br>
This script will allow us to obtain options data provided by Yahoo Finance, and then for each quoted contract, to compute the theoretical value of it by applying the BSM. The goal is to find the difference between each contract’s quoted price and its theoretical value to allow us to ask  further questions with regards to whether there will possibly be theoretical edges for trading those contracts with abnormal discrepancies.
<br>

**Disadvantages to BSM** are, such as, but not limited to:
* Was modeled for European-style options; early-exercise is not allowed
* Assumes price to follow a random walk according to a brownian motion with a constant drift. (I have created a separate notebook to explore this topic. [Link here.](https://github.com/kevinhhl/portfolio-management-tools/blob/main/Monte_Carlo_Simulation_Random_Walk.ipynb))
* Ignores dividends
* Inputs can be subjective, especially for expected volatility
* Taxes and transaction costs are ignored

<br>

**Black-Scholes-Merton Model (“BSM”) - Part 2**
<br>
There will be another part to this project in "Part 2". Next time, position analysis will be the focus. We will find ways to present multi-legged spreads in table formats to summarize the common Greeks.

### **Imports:**

In [None]:
!pip install yahoo_fin

In [None]:
import math
from scipy.stats import norm
import pandas as pd
from pandas import DataFrame
from yahoo_fin import options
from datetime import date


### **Understanding the Model:**

The derivation of the BSM is rather complex. Through research [1], I use my own words to summarize how I understand it from practical perspective.

The premise of the Black-Scholes Model contains two components. It can be described as taking the difference of: <br>
* (a) the expected value of a stock in the event that it reaches above/below exercise price on the date of expiration for call/put options, respectively, and 
* (b) the expected payout at the exercise price

>* These components are expressed as:
<br>\begin{equation}
TV_{call}=se^{rt}N(d_{1})-xN(d_{2})
\end{equation}
<br>\begin{equation}
TV_{put}=xe^{-rt}N(-d_{2})-sN(-d_{1})
\end{equation}
> 
> **Where**:
>* TV is the theoretical value
* s is the spot price at the current moment
* x is the exercise price of the option
* t is the time till maturity in no. of years
* r is the risk free rate per annum
* σ is the annualized volatility
* N(d) is the CDF of a normal distribution.
<br>
<br>\begin{equation}
d_{1} =\frac{ln(\frac{S}{X})+(r+\frac{\sigma}{2}^2)t}{\sigma\sqrt{t}}
\end{equation}
<br>\begin{equation}
d_{2}=d_{1}-\sigma\sqrt{t}
\end{equation}


*Contemplative observations:*


* The probability of price to be in-the-money for call/put options can be pictured as if seeing the data points '*d*' landing in the right/left sides of the CDF (for call/put options, respectively). 
* Probability of the put option to be in-the-money would be N(-d), which represents the left tail of the CDF; speaking of the area from negative-infinity to 0-d. The probability of a call option to be in-the-money would be 1-N(-d); being the right-tail of the CDF.
* Price is not normally distributed. Instead, a lognormal distribution would better describe it. We still want to associate probability of price reaching X with the normal distribution, so we need to make the adjustments to *d*. Also, we will want to risk-adjust *d* according to *r*

<br>

---
*References:*

[1] Natenberg, Sheldon. <i>Chapter 18, Option Volatility and Pricing, Second Edition</i>. McGraw-Hill Edu., 2015.



### **Implementation:**

In [None]:
class BSM:
  
  def __init__(self, s,t,r,sigma):
    # Input(s) that are not set by constructor: 
    # self.x - changes as we loop through the options chain provided by Yahoo Finance

    # inputs do not change:
    self.s = s
    self.t = t
    self.r = r
    self.sigma = sigma

  def set_x(self, x):
    self.x = x
    self._recalc()

  def _recalc(self):
    # When input variable 'x' changes, will need to recalc
    self.d1 = self._d1() 
    self.d2 = self.d1 - self.sigma * math.sqrt(self.t)
    self.tv_call = self.s * norm.cdf(self.d1) - self.x*math.exp(-self.r*self.t)*norm.cdf(self.d2)
    self.tv_put = self.x * math.exp(-self.r*self.t)-self.s+self.tv_call

  def _d1(self):
    a = math.log(self.s/ self.x)
    b = (self.r+self.sigma**2/2)*self.t
    return(a+b)/self.sigma*math.sqrt(self.t)


---
*Validations:*

* Checking calculations with @YuChenAmberLu's version in [<link\>](https://github.com/YuChenAmberLu/Options-Calculator)

In [None]:
# Validations with YuChenAmberLu's version:
model = BSM(s=100,t=0.128767,r=0.2,sigma=0.2)
model.set_x(100)
assert (model.tv_call-4.112199).round(5)==0
assert (model.tv_put-1.569736).round(5)==0

---
### **Application:**
> **Comparing with Yahoo Finance**
>
> The idea of this project is to loop through certain option chains in Yahoo and see if we can scan for options with theoretical edges. This notebook is a mere illustration of how it can be done, we will only calculate TSLA's calls that are near at-the-money. 

In [None]:
ticker = "TSLA"
exp_dates = options.get_expiration_dates(ticker)
exp_dates

['February 17, 2023',
 'February 24, 2023',
 'March 3, 2023',
 'March 10, 2023',
 'March 17, 2023',
 'March 24, 2023',
 'March 31, 2023',
 'April 21, 2023',
 'May 19, 2023',
 'June 16, 2023',
 'July 21, 2023',
 'September 15, 2023',
 'December 15, 2023',
 'January 19, 2024',
 'March 15, 2024',
 'June 21, 2024',
 'September 20, 2024',
 'January 17, 2025',
 'June 20, 2025']

In [None]:
expirdate = 'February 24, 2023'           # by confirming with options.get_expiration_dates(str)
option_type = "calls"
chain = options.get_options_chain(ticker, expirdate)[option_type]
chain

Unnamed: 0,Contract Name,Last Trade Date,Strike,Last Price,Bid,Ask,Change,% Change,Volume,Open Interest,Implied Volatility
0,TSLA230224C00015000,2023-02-15 3:40PM EST,15.0,197.85,0.0,0.0,0.0,-,50,24,0.00%
1,TSLA230224C00020000,2023-01-17 11:09AM EST,20.0,108.18,192.4,192.9,0.0,-,3,3,"1,784.96%"
2,TSLA230224C00025000,2023-01-18 9:34AM EST,25.0,111.18,0.0,0.0,0.0,-,3,0,0.00%
3,TSLA230224C00030000,2023-01-18 9:42AM EST,30.0,104.87,0.0,0.0,0.0,-,-,3,0.00%
4,TSLA230224C00055000,2023-02-06 9:36AM EST,55.0,141.75,0.0,0.0,0.0,-,1,3,0.00%
...,...,...,...,...,...,...,...,...,...,...,...
114,TSLA230224C00320000,2023-02-16 3:56PM EST,320.0,0.02,0.0,0.0,0.0,-,155,1616,50.00%
115,TSLA230224C00330000,2023-02-16 3:36PM EST,330.0,0.02,0.0,0.0,0.0,-,299,1145,50.00%
116,TSLA230224C00340000,2023-02-16 3:07PM EST,340.0,0.01,0.0,0.0,0.0,-,90,1481,50.00%
117,TSLA230224C00350000,2023-02-16 2:04PM EST,350.0,0.01,0.0,0.0,0.0,-,47,3018,50.00%


In [None]:
# Our fixed model inputs: 
s     = 202       
t     = 7/365
r     = .0386
sigma = 0.71      # 71% IV, pretty insane

# Exercise price x will be obtained by iterating through the chain

# And let's only show exercise prices of ATM +/- $20 
range_filter = range(s-20,s+20)

print("{}; {}; expiring on {}".format(ticker, option_type, expirdate))
print("Holding all variables constant, quoted price vs. theoretical values (TV) as follows:")

model = BSM(s=s,t=t,r=r,sigma=sigma)

for i in range(len(chain)):
  x = chain["Strike"][i]
  model.set_x(x)
  q = chain["Last Price"][i]
  tv = None
  if option_type == "calls":
    tv = model.tv_call
  elif option_type == "puts":
    tv = model.tv_put
  if x in range_filter:
    print("strike:{},\tquote={}\t\tTV={};\t\tdifference= {}".format(x, q, tv.round(2), (q-tv).round(2)))

TSLA; calls; expiring on February 24, 2023
Holding all variables constant, quoted price vs. theoretical values (TV) as follows:
strike:185.0,	quote=19.9		TV=15.94;		difference= 3.96
strike:190.0,	quote=15.84		TV=13.57;		difference= 2.27
strike:195.0,	quote=12.37		TV=11.23;		difference= 1.14
strike:200.0,	quote=9.35		TV=8.9;		difference= 0.45
strike:205.0,	quote=6.95		TV=6.6;		difference= 0.35
strike:210.0,	quote=5.0		TV=4.31;		difference= 0.69
strike:215.0,	quote=3.49		TV=2.04;		difference= 1.45
strike:220.0,	quote=2.44		TV=-0.21;		difference= 2.65


**Observations:** 

According to our BSM, holding all model-inputs constant at each iteration, TSLA calls with strike prices ranging from 185 to 220 expiring on Feb. 24, 2023 are all overpriced. On top of this observation, the model doesn't account for transaction costs. There is no theoretical edge to buying these options as single-legs.