## CHEME 5660: Estimating the Share Price of a Single Risky Asset using a Binomial Lattice Model

<img src="./figs/Fig-Binomial-Lattice-Schematic.png" style="margin:auto; width:30%"/>

### Binomial lattice model
A binomial lattice model assumes that at each discrete time increment, the state of the system, e.g., the share price of equity, the spot rate, etc., can either increase by a factor $u$ with probability $p$ or decrease by a factor $d$ with probability $(1-p)$. Different models can be developed for the specific values of the tuple $(u,d,p)$. One particular model is the Cox, Ross, and Rubinstein (CRR) model:

* [Cox, J. C.; Ross, S. A.; Rubinstein, M. (1979). "Option pricing: A simplified approach". Journal of Financial Economics. 7 (3): 229. CiteSeerX 10.1.1.379.7582. doi:10.1016/0304-405X(79)90015-1](https://www.sciencedirect.com/science/article/pii/0304405X79900151?via%3Dihub)

#### Cox, Ross and Rubinstein (CRR) model
The [CRR binomial lattice model](https://en.wikipedia.org/wiki/Binomial_options_pricing_model) was initially developed for options pricing in 1979. However, one of the critical aspects of estimating an option’s price is calculating the underlying asset’s share price. Thus, let's use the [CRR model](https://en.wikipedia.org/wiki/Binomial_options_pricing_model) to compute the share price of a stock, Advanced Micro Devices, Inc., with the ticker symbol [AMD](https://finance.yahoo.com/quote/AMD?.tsrc=applewf). In the [CRR model](https://en.wikipedia.org/wiki/Binomial_options_pricing_model), the `up` and `down` moves are symmetric:

$$ud = 1$$

where the magnitude of an `up` move $u$ is given by:

$$u = \exp(\sigma\sqrt{\Delta{T}})$$

The quantity $\sigma$ denotes a _volatility parameter_, and $\Delta{T}$ represents the time step. The probability $p$ of an `up` move in a [CRR model](https://en.wikipedia.org/wiki/Binomial_options_pricing_model) is given by:

$$p = \frac{\exp(\mu\Delta{T}) - d}{u - d}$$

where $\mu$ denotes a _return parameter_. In the [CRR model](https://en.wikipedia.org/wiki/Binomial_options_pricing_model) paradigm, the return parameter $\mu$ and the volatility parameter $\sigma$ take on common values:
* The return parameter $\mu$ is a _risk-free_ rate of return; the _risk-free_ rate $\bar{r}$ can be approximated by the [yield on a T = 10-year United States Treasury debt security](https://ycharts.com/indicators/10_year_treasury_rate). 
* The volatility parameter $\sigma$ is the [implied volatility](https://www.investopedia.com/terms/i/iv.asp); the implied volatility is the market's view of the likelihood of changes in a given security's price.
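
The three CRR relations above can be collected into one small function. The sketch below is only an illustration: the name `crr_parameters` is hypothetical (it is not part of the lab library), and the numerical inputs are made-up values chosen to show the calculation, not the AMD parameters.

```julia
# Hypothetical helper (not part of the lab library): CRR factors and up-move probability
function crr_parameters(σ::Float64, μ::Float64, ΔT::Float64)
    u = exp(σ * sqrt(ΔT))              # magnitude of an up move
    d = 1.0 / u                        # symmetric moves: u*d = 1
    p = (exp(μ * ΔT) - d) / (u - d)    # probability of an up move
    return (u = u, d = d, p = p)
end

# Illustrative values only: σ = 0.20, μ = 0.04, ΔT = one trading day of a 252-day year
params = crr_parameters(0.20, 0.04, 1.0 / 252.0)
println("u = $(params.u), d = $(params.d), p = $(params.p)")
```

For small time steps, $p$ should land close to 0.5, with a slight tilt toward the `up` move when $\mu > 0$.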

### Lab setup
The code block below installs (and loads) any [Julia](https://julialang.org) packages that we need to complete the calculations. 

In [1]:
import Pkg; Pkg.activate("."); Pkg.resolve(); Pkg.instantiate();

[32m[1m  Activating[22m[39m project at `~/Desktop/julia_work/CHEME-5660-Markets-Mayhem-Example-Notebooks/labs/lab-2-Binomial-Pricing-Single-Assets`
[32m[1m  No Changes[22m[39m to `~/Desktop/julia_work/CHEME-5660-Markets-Mayhem-Example-Notebooks/labs/lab-2-Binomial-Pricing-Single-Assets/Project.toml`
[32m[1m  No Changes[22m[39m to `~/Desktop/julia_work/CHEME-5660-Markets-Mayhem-Example-Notebooks/labs/lab-2-Binomial-Pricing-Single-Assets/Manifest.toml`


In [2]:
using PQEcolaPoint
using DataFrames
using CSV
using Statistics
using Dates

### Load the lab 2 code library
The call to the `include` function loads the `CHEME-5660-Lab-2-Library.jl` library into the notebook; the library contains types and functions we use during the lab:

* The `E(X::Array{Float64,1},p::Array{Float64,1}) -> Float64` and `Var(X::Array{Float64,1}, p::Array{Float64,1}) -> Float64` functions compute the expectation and variance of the binomial price estimates given an array `X::Array{Float64,1}` of price values and associated probabilities `p::Array{Float64,1}`.
* The `build_probability_dictionary(model::CRRLatticeModel, levels::Int64) -> Dict{Int64, Array{Float64,1}}` function constructs a dictionary of probabilities for each level of the tree; keys are tree levels.
* The `build_nodes_dictionary(levels::Int64) -> Dict{Int64,Array{Int64,1}}` function constructs a dictionary of node indices for each level of the tree; keys are the tree levels.
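
To make the idea of a node dictionary concrete, here is a minimal sketch of one possible layout. It assumes that nodes are numbered consecutively from 1, level by level, so that level $t$ holds $t+1$ nodes. The function name `sketch_nodes_dictionary` is hypothetical, and the library's own implementation may differ.

```julia
# Hypothetical sketch (not the library code): node indices for each level of a binomial tree.
# Assumes 1-based node numbering that runs consecutively across levels; level t holds t+1 nodes.
function sketch_nodes_dictionary(levels::Int64)::Dict{Int64,Array{Int64,1}}
    nodes = Dict{Int64,Array{Int64,1}}()
    first_index = 1
    for t in 0:levels
        nodes[t] = collect(first_index:(first_index + t))  # indices of the t+1 nodes at level t
        first_index += t + 1                               # next level starts after these nodes
    end
    return nodes
end

nodes = sketch_nodes_dictionary(5)
println(nodes[5])
```

Under this numbering, level $t$ starts at index $t(t+1)/2 + 1$, so the tree levels can be used as keys to slice a flat array of node values.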

In [3]:
# load the code library -
include("CHEME-5660-Lab-2-Library.jl");

#### a) Load experimental data for AMD
Load the historical OHLC data set for Advanced Micro Devices, Inc. with ticker symbol [AMD](https://finance.yahoo.com/quote/AMD?.tsrc=applewf) into a [DataFrame](https://dataframes.juliadata.org/stable/). The OHLC data is stored in a comma-separated value (CSV) file format; use the [CSV](https://csv.juliadata.org/stable/) package to read the data and load it into a [DataFrame](https://dataframes.juliadata.org/stable/).

In [4]:
df = CSV.read("./data/AMD-OHLC-2020-8-25-to-2022-09-27.csv", DataFrame)

Row,volume,volume_weighted_average_price,open,close,high,low,timestamp
,Float64,Float64,Float64,Float64,Float64,Float64,DateTime
1,4.92344e7,84.8235,83.36,86.35,86.62,82.35,2020-08-25T04:00:00
2,4.71573e7,86.1775,86.9694,86.02,87.72,85.2,2020-08-26T04:00:00
3,4.21942e7,84.3753,86.35,83.8,86.58,82.94,2020-08-27T04:00:00
4,4.07233e7,85.1628,84.3,85.55,86.04,84.19,2020-08-28T04:00:00
5,9.06559e7,90.0989,85.05,90.82,92.64,85.05,2020-08-31T04:00:00
6,5.61026e7,91.4198,91.92,92.18,92.51,90.1899,2020-09-01T04:00:00
7,5.03659e7,90.7119,94.01,90.22,94.28,88.74,2020-09-02T04:00:00
8,8.74623e7,83.9462,87.84,82.54,88.47,81.59,2020-09-03T04:00:00
9,8.22678e7,80.442,81.45,82.01,84.39,76.33,2020-09-04T04:00:00
10,5.49545e7,79.9129,78.05,78.69,81.88,78.0,2020-09-08T04:00:00


#### b) Estimate CRR model parameters for AMD
In the [CRR model](https://en.wikipedia.org/wiki/Binomial_options_pricing_model) paradigm, the return parameter $\mu$ and the volatility parameter $\sigma$ take on common values:
* The return parameter $\mu$ is a _risk-free_ rate of return; the _risk-free_ rate $\bar{r}$ can be approximated by the [yield on a T = 10-year United States Treasury debt security](https://ycharts.com/indicators/10_year_treasury_rate). 
* The volatility parameter $\sigma$ is related to the [implied volatility](https://www.investopedia.com/terms/i/iv.asp); the implied volatility is the market's view of the likelihood of changes in a given security's price in the next year.

In [5]:
# number of days per year -
B = 252.0;

In [6]:
# How many days do we want to simulate?
L = 30; # units: days; number of tree levels (note: the tree data model is 1 based)

In [7]:
# Set the risk free rate -
r̄ₘ = 0.0397; # μ parameter value

In [8]:
# What is the Implied Volatility (IV) -
IV = 56.4 # implied volatility (30-day average value from 09/27/22, https://marketchameleon.com/Overview/AMD/IV/)
σₘ = (IV/100.0)*sqrt(L/B)

0.19459848773454388

In [9]:
# What is the initial share price?
𝒜 = 10
Sₒ = df[end-𝒜,:close]
D = df[end-𝒜, :timestamp];

# where are we starting from -
println("Simulation starts from $(D) where Sₒ = $(Sₒ) USD/share")

Simulation starts from 2022-09-13T04:00:00 where Sₒ = 77.03 USD/share


In [10]:
# build a CRR lattice model -
model = build(CRRLatticeModel; number_of_levels=(L+1), Sₒ = Sₒ, σ = σₘ, μ = r̄ₘ, T = (L/B));

In [11]:
# Get the estimated prices at all the nodes -
P = model.data[:,1];

In [12]:
# What else is stored in the model?
p = model.p
u = model.u
d = model.d

# show -
println("The probability of an UP move: $(p) where the UP factor u = $(u)")

The probability of an UP move: 0.5033067768893008 where the UP factor u = 1.0121322186145414


#### c) Analysis of the binomial lattice price values

#### Build: node dictionary `id`

In [13]:
id = build_nodes_dictionary(L) # zero based

Dict{Int64, Vector{Int64}} with 31 entries:
  5  => [16, 17, 18, 19, 20, 21]
  16 => [137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, …
  20 => [211, 212, 213, 214, 215, 216, 217, 218, 219, 220  …  222, 223, 224, 22…
  12 => [79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91]
  24 => [301, 302, 303, 304, 305, 306, 307, 308, 309, 310  …  316, 317, 318, 31…
  28 => [407, 408, 409, 410, 411, 412, 413, 414, 415, 416  …  426, 427, 428, 42…
  8  => [37, 38, 39, 40, 41, 42, 43, 44, 45]
  17 => [154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, …
  30 => [466, 467, 468, 469, 470, 471, 472, 473, 474, 475  …  487, 488, 489, 49…
  1  => [2, 3]
  19 => [191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, …
  0  => [1]
  22 => [254, 255, 256, 257, 258, 259, 260, 261, 262, 263  …  267, 268, 269, 27…
  6  => [22, 23, 24, 25, 26, 27, 28]
  23 => [277, 278, 279, 280, 281, 282, 283, 284, 285, 286  …  291, 292, 293, 29…
  11 => [67, 68, 69, 70,

#### Build: probability dictionary `pd`
The probability dictionary holds the probability values for each node at a particular time level:

$$P(S_{t} = S_{\circ}u^{k}d^{t-k}) = \binom{t}{k}p^{k}\left(1-p\right)^{t-k}$$

where $t$ denotes the time index and $k=0,1,\dots,t$.
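
The binomial formula above can be evaluated directly for a single level. The sketch below is an illustration under stated assumptions: the name `sketch_level_probabilities` is hypothetical, the entries are ordered from $k=t$ `up` moves down to $k=0$ (an assumed ordering), and the library's `build_probability_dictionary` may be implemented differently.

```julia
# Hypothetical sketch (not the library code): probabilities of the t+1 nodes at time index t.
# Entries are ordered from k = t up moves down to k = 0 up moves (an assumed ordering).
function sketch_level_probabilities(p::Float64, t::Int64)::Array{Float64,1}
    return [binomial(t, k) * p^k * (1 - p)^(t - k) for k in t:-1:0]
end

# Illustrative value p = 0.5 at time index t = 2
probabilities = sketch_level_probabilities(0.5, 2)
println(probabilities)
println(sum(probabilities))   # the probabilities at any level should sum to one
```

Checking that each level sums to one is a quick sanity test for any probability dictionary built this way.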

In [14]:
pd = build_probability_dictionary(model, L) # zero based

Dict{Int64, Vector{Float64}} with 31 entries:
  5  => [0.0322971, 0.159364, 0.314539, 0.310406, 0.153164, 0.0302302]
  16 => [1.6956e-5, 0.000267732, 0.0019816, 0.00912597, 0.0292697, 0.0693241, 0…
  20 => [1.08807e-6, 2.14754e-5, 0.000201335, 0.00119214, 0.00500001, 0.0157898…
  12 => [0.000264237, 0.00312918, 0.0169843, 0.0558705, 0.124057, 0.195883, 0.2…
  24 => [6.98211e-8, 1.65369e-6, 1.87675e-5, 0.00013582, 0.000703685, 0.0027777…
  28 => [4.48041e-9, 1.23803e-7, 1.64938e-6, 1.41068e-5, 8.70089e-5, 0.00041215…
  8  => [0.00411777, 0.0325093, 0.112287, 0.221624, 0.27339, 0.215838, 0.106501…
  17 => [8.53409e-6, 0.000143173, 0.00113033, 0.00557741, 0.0192644, 0.0494293,…
  30 => [1.13497e-9, 3.36016e-8, 4.80821e-7, 4.42869e-6, 2.95009e-5, 0.00015138…
  1  => [0.503307, 0.496693]
  19 => [2.16184e-6, 4.05352e-5, 0.000360023, 0.00201332, 0.00794746, 0.0235291…
  0  => [1.0]
  22 => [2.75627e-7, 5.98411e-6, 6.20075e-5, 0.000407951, 0.00191231, 0.0067938…
  6  => [0.0162554, 0.0962506,

In [15]:
pd[2]

3-element Vector{Float64}:
 0.2533177116626964
 0.49997813045320877
 0.2467041578840948

#### Extract: prices at $T=\star$

In [47]:
# get the prices and probability for some T
T = 6
p = pd[T];
X = P[id[T]]

7-element Vector{Float64}:
 82.81011660780494
 80.83675970459886
 78.91042770157345
 77.03
 75.1943827049069
 73.40250798873167
 71.65333347013728

#### Compute the expectation and the variance of the estimated price values
The expectation of a discrete random variable $X$, denoted by $\mathbb{E}(X)$, is defined as:

$$\mathbb{E}(X) = \sum_{x\in{X\left(\Omega\right)}}xp_{X}(x)$$

while the variance of $X$, denoted by $\text{Var}(X)$, is defined as:

$$\text{Var}(X) = \mathbb{E}(X^2) - \mathbb{E}(X)^2$$
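
Both definitions reduce to weighted sums over the possible values. The sketch below uses the hypothetical names `sketch_E` and `sketch_Var` and a made-up two-point distribution; the library's `E` and `Var` functions may be implemented differently.

```julia
# Hypothetical sketches (the lab library's E and Var may be implemented differently)
sketch_E(X::Array{Float64,1}, p::Array{Float64,1})::Float64 = sum(X .* p)
sketch_Var(X::Array{Float64,1}, p::Array{Float64,1})::Float64 = sketch_E(X .^ 2, p) - sketch_E(X, p)^2

# Illustrative two-point distribution (made-up values)
X_demo = [110.0, 90.0]
p_demo = [0.5, 0.5]
println("E = $(sketch_E(X_demo, p_demo)), Var = $(sketch_Var(X_demo, p_demo))")
```

Here the two outcomes sit 10 units on either side of the mean with equal probability, so the standard deviation (the square root of the variance) is 10.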

In [48]:
# Compute the expected value
S̄ = E(X,p)

77.10049516395543

In [49]:
# Compute the variance and stdev
σ̄ = sqrt(Var(X,p))

2.277537327961592

In [50]:
println("In T = $(T) days from $(D) we expect the share price of AMD to be $(S̄) ± $(σ̄) USD/share")

In T = 6 days from 2022-09-13T04:00:00 we expect the share price of AMD to be 77.10049516395543 ± 2.277537327961592 USD/share


#### d) Did we hit or did we miss?
Suppose we define a `hit` as being within $\pm$1.96$\sigma$ of the expected value; otherwise, we `miss`.

In [51]:
# get the actual value of the AMD share price -
actual_price = df[end - (𝒜 - T), :close]
actual_date = df[end - (𝒜 - T), :timestamp]

# what actually happened?
println("Actual close price on $(actual_date) was $(actual_price) USD/share")

Actual close price on 2022-09-21T04:00:00 was 74.48 USD/share


In [52]:
# hit or miss logic?
ℒ = S̄ - 1.96*σ̄; # lower bound
𝒰 = S̄ + 1.96*σ̄; # upper bound

hit_flag = false
if (actual_price >= ℒ && actual_price <= 𝒰)
    hit_flag = true
end

# print -
print("Hit: $(hit_flag). Values L = $(ℒ), S = $(actual_price), U = $(𝒰)")

Hit: true. Values L = 72.63652200115071, S = 74.48, U = 81.56446832676015