In [3]:
import numpy as np
import pandas as pd
import yfinance as yf
import itertools

---
## User Inputs and Preliminaries

We begin by letting users define the list of assets ("stocks") they would like to include in this basket option.

"time_count" refers to time steps this model takes, with time being modelled as a finite set $\mathbb{T} = \{0, 1, ..., \textnormal{T}\}$.

"risk_free_rate" simply refers to the return of risk-free assets, where a unit of money invested in it (i.e. bonds, savings) at time $t = 0$ will yield $\textnormal{R} = (1 + r)$ units of money after 1 time-step. There is the assumption that the risk-free rate will remain unchanged at all times.

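To make the constant-rate assumption concrete, here is a minimal sketch of how one unit of money grows when the proceeds are reinvested at every step. The value `r = 0.05` is purely illustrative and is not the rate used in this notebook.

```python
# Illustrative only: r = 0.05 is an assumed per-step rate, not the notebook's input.
r = 0.05
R = 1 + r  # gross return over one time step

T = 5  # number of time steps, matching time_count in this notebook

# Value of one unit invested at t = 0, at each time t in {0, 1, ..., T},
# assuming the rate never changes and proceeds are reinvested each step
bank_account = [R ** t for t in range(T + 1)]
print(bank_account)
```
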
"option_type" is a choice between "call" or "put".

"selected_time" is an optional parameter (otherwise zero). ** Ignore for now **

In [4]:
stocks = ["HSBA.L", "RR.L", "BP.L", "BLND.L", "AV.L"]  # list of stocks
asset_count = len(stocks)

time_count = 5
risk_free_rate = 1
strike_price = 350
option_type = "call"
selected_time = 0

scenarios = []
if selected_time > 0:
    for i in range(asset_count):
        while True:
            try:
                scenario = list(map(int, input(f"Enter scenario for asset {i+1} up to time {selected_time} (use 1 for up and 0 for down): ").strip().split()))
                if all(val in (0, 1) for val in scenario) and len(scenario) == selected_time:
                    scenarios.append(scenario)
                    break
                else:
                    print(f"Invalid input. Please enter exactly {selected_time} integers consisting of 0s and 1s only.")
            except ValueError:
                print("Invalid input. Please enter integers (0 or 1) separated by spaces.")

if selected_time > 0:
    print(f"Scenarios for each asset up to time {selected_time}:", scenarios)
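The interactive loop above mixes parsing with prompting, which makes it awkward to test. As a minimal sketch, the validation could be pulled into a standalone function; `parse_scenario` is a hypothetical helper and not part of the original notebook.

```python
def parse_scenario(text, length):
    """Parse a space-separated string of 0s and 1s into a list of ints.

    Hypothetical helper (not part of the original notebook). Raises
    ValueError if a token is not an integer, is not 0 or 1, or if the
    number of tokens differs from `length`.
    """
    values = [int(token) for token in text.split()]  # int() raises ValueError on bad tokens
    if len(values) != length:
        raise ValueError(f"expected {length} values, got {len(values)}")
    if any(v not in (0, 1) for v in values):
        raise ValueError("values must be 0 (down) or 1 (up)")
    return values


print(parse_scenario("1 0 1", 3))  # [1, 0, 1]
```

The prompt loop would then only need to call this function inside its `try` block and retry on `ValueError`.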

In [5]:
stock_data = yf.download(" ".join(stocks), start="2023-01-01", end="2023-06-30", rounding=True)["Close"]
stock_data.to_csv("yf_stock_data.csv", index=True, header=True)



In [6]:
raw_data = pd.DataFrame(stock_data)
raw_data.head()

Date,AV.L,BLND.L,BP.L,HSBA.L,RR.L
2023-01-03,448.9,408.2,483.35,529.9,98.91
2023-01-04,456.8,412.8,465.85,543.5,101.4
2023-01-05,448.7,409.5,471.75,565.3,102.66
2023-01-06,456.0,410.2,477.05,568.6,102.9
2023-01-09,455.0,410.0,479.3,563.2,103.76


In [7]:
# Forward-fill missing prices; DataFrame.ffill() replaces the deprecated fillna(method='ffill')
clean_data = raw_data.ffill()
clean_data.head()


Date,AV.L,BLND.L,BP.L,HSBA.L,RR.L
2023-01-03,448.9,408.2,483.35,529.9,98.91
2023-01-04,456.8,412.8,465.85,543.5,101.4
2023-01-05,448.7,409.5,471.75,565.3,102.66
2023-01-06,456.0,410.2,477.05,568.6,102.9
2023-01-09,455.0,410.0,479.3,563.2,103.76


---
## Finding Initial Prices, and Factors

In this cell, we need to define initial prices $S_i(0)$, up and down factors $U$ and $D$, all crucial for option pricing. Initial prices can simply be found by using the most recent price for each asset in our dataset.

Factors $U$ and $D$ are less straightforward and must be derived from the data. First, how much data is appropriate? More data points usually improve accuracy, but with financial data it is important not to reach so far back that the sample reflects conditions irrelevant to the present day, e.g. past economic shocks. Hull (2015) suggests using closing prices from the last 90 to 180 days.

The up and down factors $U$ and $D$ follow the widely used parameterisation of Cox, Ross, and Rubinstein (1979):

$$
U = e^{\sigma\sqrt{\Delta t}}, D = e^{-\sigma\sqrt{\Delta t}}
$$

$\sigma$ is the volatility (standard deviation of log-returns) of an asset. To estimate it, we follow Hull (2015). We begin by calculating the log-return for each observation:

$$
u_i = \textnormal{ln}\left(\frac{S(i)}{S(i-1)}\right)\ \ \textnormal{for}\ i = 1, 2, ..., n
$$

Then compute the sample standard deviation $s$ of the $u_i$, which the code in this section uses as its estimate of $\sigma$ with $\Delta t = 1$:

$$
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^n (u_i - \bar{u})^2}
$$
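As a quick sanity check, the sketch below writes the formula out by hand and compares it with pandas' built-in `.std()`, which also divides by $n-1$ by default. It uses only the first five HSBA.L closing prices from the table shown earlier, so the resulting number is illustrative and not the volatility estimate used for pricing.

```python
import numpy as np
import pandas as pd

# First five HSBA.L closing prices from the table shown earlier in this notebook
prices = pd.Series([529.9, 543.5, 565.3, 568.6, 563.2])

# Log-returns u_i = ln(S(i) / S(i-1)); the first entry is NaN and is dropped
u = np.log(prices / prices.shift(1)).dropna()
n = len(u)

# Sample standard deviation written out as in the formula (divisor n - 1)
s_manual = np.sqrt(((u - u.mean()) ** 2).sum() / (n - 1))

# pandas uses ddof=1 by default, so it should agree with the formula
s_pandas = u.std()

print(n, s_manual, s_pandas)
```

Note that `np.std` defaults to a divisor of $n$ instead, so it would need `ddof=1` to match.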

Note that $U$ and $D$ are reciprocals of each other, so they are symmetric around 1 on a multiplicative (log) scale: $\ln U = -\ln D$. Consequently, the product of $U$ and $D$ is 1.

$$
U \times D = e^{\sigma\sqrt{\Delta t}} \times e^{-\sigma\sqrt{\Delta t}} = e^0 = 1
$$
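The identity holds for the exact factors, but rounding can disturb it slightly. The sketch below illustrates this with an assumed volatility; `sigma = 0.0146` is a made-up value chosen for illustration, not an estimate from the data.

```python
import numpy as np

sigma = 0.0146  # assumed per-step volatility, for illustration only
delta_t = 1

U = np.exp(sigma * np.sqrt(delta_t))
D = np.exp(-sigma * np.sqrt(delta_t))

# Exact factors are reciprocal up to floating-point error
print(np.isclose(U * D, 1.0))

# After rounding to three decimals the product is only approximately 1
U_r, D_r = round(U, 3), round(D, 3)
print(U_r, D_r, U_r * D_r)
```

This is worth keeping in mind when factors are rounded before use, since a rounded pair need not multiply to exactly 1.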

In [8]:
initial_prices = []
up_factors = []
down_factors = []

delta_t = 1

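for stock in stocks:
    # Daily log-returns u_i = ln(S(i) / S(i-1)); the first row is NaN
    clean_data[f'{stock}_Log_Return'] = np.log(clean_data[stock] / clean_data[stock].shift(1))

    # Sample standard deviation (pandas uses ddof=1 and skips NaN by default)
    st_dev = clean_data[f'{stock}_Log_Return'].std()

    # CRR up and down factors, rounded to three decimals
    U = round(np.exp(st_dev * np.sqrt(delta_t)), 3)
    D = round(np.exp(-st_dev * np.sqrt(delta_t)), 3)

    # Initial price S_i(0) is the most recent closing price
    initial_prices.append(clean_data[stock].iloc[-1])
    up_factors.append(U)
    down_factors.append(D)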

print("Initial Prices:", initial_prices)
print("Up:", up_factors)
print("Down:", down_factors)

Initial Prices: [618.8, 148.7, 454.8, 300.6, 388.3]
Up: [1.015, 1.03, 1.02, 1.018, 1.015]
Down: [0.985, 0.971, 0.98, 0.982, 0.986]


---
## Formula: no-arbitrage interval upper-bound $C_{\textnormal{max}}$

Within the binomial model, a single-asset option has a unique "no-arbitrage" price, that is, a price at which it is not possible to make a sure profit without putting capital at risk. Basket options generally do not: instead of a single price there is a no-arbitrage price interval, bounded by $C_{\textnormal{min}}$ and $C_{\textnormal{max}}$. The reasons are lengthy and complex, and are not covered here.

For simplicity, we take the explicit formula of Kedra, Libman and Steblovskaya (2022) for the upper bound $C_{\textnormal{max}}$ as our pricing model.

$$
C_{\textnormal{max}}(v) = \sum_{k_0 + \dots + k_m = n-k} \frac{(n-k)!}{k_0!\dots k_m!}\textnormal{P}(\mu_0)^{k_0}\dots \textnormal{P}(\mu_m)^{k_m}\textnormal{X}(\omega_1 \dots\omega_k\mu_0\dots\mu_0\dots\mu_m\dots\mu_m)
$$

Rather than tackling the formula directly, we will do so in a more procedural fashion to simplify its use.
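One way to read the formula procedurally, offered here as an interpretation and not as the authors' stated method, is that the multinomial coefficient counts how many ordered sequences of indices share the same occurrence counts $(k_0, \dots, k_m)$. Under that reading, enumerating every sequence with `itertools.product` and grouping by counts should reproduce the coefficients. The sketch below checks this on a deliberately small case; `m = 2` and `steps = 3` are assumed values for illustration.

```python
import itertools
from collections import Counter
from math import factorial

m = 2      # assumed small example: indices 0, 1, 2
steps = 3  # plays the role of n - k in the formula

# Enumerate every ordered sequence of indices of length `steps`
sequences = list(itertools.product(range(m + 1), repeat=steps))

# Group sequences by how often each index appears: (k_0, ..., k_m)
groups = Counter(
    tuple(seq.count(j) for j in range(m + 1)) for seq in sequences
)

# Each group size should equal the multinomial coefficient (n-k)! / (k_0! ... k_m!)
for counts, size in sorted(groups.items()):
    coeff = factorial(steps)
    for k_j in counts:
        coeff //= factorial(k_j)
    assert size == coeff
    print(counts, size)
```

Grouping first means the payoff term only has to be evaluated once per count vector, which matters because the number of raw sequences grows as $(m+1)^{n-k}$.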

In [9]:
set_i = list(range(asset_count + 1))
I_factor = time_count - selected_time

set_I = list(itertools.product(set_i, repeat=I_factor))
print(set_I)

[(0, 0, 0, 0, 0), (0, 0, 0, 0, 1), (0, 0, 0, 0, 2), (0, 0, 0, 0, 3), (0, 0, 0, 0, 4), (0, 0, 0, 0, 5), (0, 0, 0, 1, 0), (0, 0, 0, 1, 1), (0, 0, 0, 1, 2), (0, 0, 0, 1, 3), (0, 0, 0, 1, 4), (0, 0, 0, 1, 5), (0, 0, 0, 2, 0), (0, 0, 0, 2, 1), (0, 0, 0, 2, 2), (0, 0, 0, 2, 3), (0, 0, 0, 2, 4), (0, 0, 0, 2, 5), (0, 0, 0, 3, 0), (0, 0, 0, 3, 1), (0, 0, 0, 3, 2), (0, 0, 0, 3, 3), (0, 0, 0, 3, 4), (0, 0, 0, 3, 5), (0, 0, 0, 4, 0), (0, 0, 0, 4, 1), (0, 0, 0, 4, 2), (0, 0, 0, 4, 3), (0, 0, 0, 4, 4), (0, 0, 0, 4, 5), (0, 0, 0, 5, 0), (0, 0, 0, 5, 1), (0, 0, 0, 5, 2), (0, 0, 0, 5, 3), (0, 0, 0, 5, 4), (0, 0, 0, 5, 5), (0, 0, 1, 0, 0), (0, 0, 1, 0, 1), (0, 0, 1, 0, 2), (0, 0, 1, 0, 3), (0, 0, 1, 0, 4), (0, 0, 1, 0, 5), (0, 0, 1, 1, 0), (0, 0, 1, 1, 1), (0, 0, 1, 1, 2), (0, 0, 1, 1, 3), (0, 0, 1, 1, 4), (0, 0, 1, 1, 5), (0, 0, 1, 2, 0), (0, 0, 1, 2, 1), (0, 0, 1, 2, 2), (0, 0, 1, 2, 3), (0, 0, 1, 2, 4), (0, 0, 1, 2, 5), (0, 0, 1, 3, 0), (0, 0, 1, 3, 1), (0, 0, 1, 3, 2), (0, 0, 1, 3, 3), (0, 0, 1, 3, 