# Optimization Data Analysis

### Tasks:
1. ~~Generate empirical price and add as column~~
2. ~~Calculate relative normalized 2-norm for each data point and add as column~~
3. Plots
    - expiry
    - risk-free rate
    - volatility
    - strike price
    - bases
    - call vs. put

## 0: Setup

### 0.1: Importing Packages and Data

In [1]:
import numpy as np
import pandas as pd
from scipy.stats import norm

In [2]:
option_prices = pd.read_csv("option_pricing.csv")

### 0.2: Function Definitions

#### 0.2.1: Empirical Option Price

The "empirical" price used as the reference throughout this notebook is the closed-form Black-Scholes price. The Black-Scholes partial differential equation for the option value $V(S, t)$ is:

$$\frac{\partial V}{\partial t} + \frac{\sigma^2 S^2}{2} \frac{\partial^2 V}{\partial S^2} + r S \frac{\partial V}{\partial S} - r V = 0$$

For European options, it has the following closed-form solutions for the call and put prices, respectively:

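$$C(S, T) = S\Phi(d_1) - Ke^{-rT}\Phi(d_2)$$
$$P(S, T) = Ke^{-rT}\Phi(-d_2) - S\Phi(-d_1)$$

where $S$ is the spot price, $K$ is the strike price, $r$ is the risk-free rate, $\sigma$ is the volatility, $T$ is the time to expiry, $\Phi$ is the standard normal CDF, and $d_1 = \frac{\ln\left(\frac{S}{K}\right) + \left(r + \frac{\sigma^2}{2}\right)T}{\sigma\sqrt{T}}$, $d_2 = d_1 - \sigma\sqrt{T}$.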

In [3]:
def empirical_option(row) -> float:
    # Unpack the option parameters from the DataFrame row.
    # Passing the parameters as separate arguments would be cleaner, but a single
    # row argument is simpler to use with DataFrame.apply.
    S_0 = 100  # spot price, hard-coded to 100 for every row
    K = row["Strike Price"]
    r = row["Risk-Free Rate"]
    sigma = row["Volatility"]
    T = row["Expiry Time"]
    call_flag = row["Call Flag"]
    
    d_1 = (np.log(S_0 / K) + (r + sigma ** 2 / 2) * T) / (sigma * np.sqrt(T))
    d_2 = d_1 - sigma * np.sqrt(T)
    
    # call price
    if call_flag == 1:
        return S_0 * norm.cdf(d_1) - K * np.exp(-r * T) * norm.cdf(d_2)
    # put price
    return K * np.exp(-r * T) * norm.cdf(-d_2) - S_0 * norm.cdf(-d_1)
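
As a sanity check on the closed-form prices, the sketch below recomputes a call and a put with the standard library only (no SciPy) and verifies put-call parity, $C - P = S_0 - Ke^{-rT}$. The helper names `std_normal_cdf` and `black_scholes_price` are hypothetical and not part of the notebook; the parameters are assumed to match the first row shown in section 1.2.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_price(S_0, K, r, sigma, T, is_call):
    """Closed-form European option price (hypothetical helper, not from the notebook)."""
    d_1 = (math.log(S_0 / K) + (r + sigma ** 2 / 2) * T) / (sigma * math.sqrt(T))
    d_2 = d_1 - sigma * math.sqrt(T)
    discounted_strike = K * math.exp(-r * T)
    if is_call:
        return S_0 * std_normal_cdf(d_1) - discounted_strike * std_normal_cdf(d_2)
    return discounted_strike * std_normal_cdf(-d_2) - S_0 * std_normal_cdf(-d_1)


# Assumed parameters: the first row shown by option_prices.head() in section 1.2.
S_0, K, r, sigma, T = 100, 102, 0.02, 0.1, 1
call = black_scholes_price(S_0, K, r, sigma, T, is_call=True)
put = black_scholes_price(S_0, K, r, sigma, T, is_call=False)

# Put-call parity: C - P should equal S_0 - K * exp(-r * T).
parity_gap = (call - put) - (S_0 - K * math.exp(-r * T))
print(f"call = {call:.6f}, put = {put:.6f}, parity gap = {parity_gap:.2e}")
```

The call price should agree with the `Empirical Price` of 3.997243 reported for that row, and the parity gap should be zero up to floating-point rounding.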

#### 0.2.2: Normalized $L^2$ Error

The pricing errors were calculated using the normalized $L^2$ error, which is defined by the following equation:

$$\frac{\left\lVert y - \hat{y}\right\rVert_2}{\lVert y\rVert_2}$$

where $\hat{y}$ is the value predicted by either the original or the optimized pricing method and $y$ is the empirical price.
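
For a single option, $y$ and $\hat{y}$ are scalars, so the norms reduce to absolute values and the metric is simply the relative absolute error. As a worked check, assuming the values printed for the row with index 3 in section 1.2 (empirical price 2.751949, optimized method price 2.74178):

$$\frac{\lvert y - \hat{y}\rvert}{\lvert y\rvert} = \frac{\lvert 2.751949 - 2.74178\rvert}{2.751949} = \frac{0.010169}{2.751949} \approx 0.003695$$

which agrees with the `Optimized Error` reported for that row.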

In [4]:
# Using two functions is uglier and less generalizable, but much easier in practice
def normalized_l2_error_original(row) -> float:
    return np.linalg.norm(row["Original Method Price"] - row["Empirical Price"]) / np.linalg.norm(row["Empirical Price"])
def normalized_l2_error_optimized(row) -> float:
    return np.linalg.norm(row["Optimized Method Price"] - row["Empirical Price"]) / np.linalg.norm(row["Empirical Price"])
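
If a single, more general function is preferred, one possible sketch takes the method's column name as a parameter. The function name `normalized_l2_error` and the `method_column` parameter are hypothetical, and a plain dict stands in for a DataFrame row so the snippet runs without the CSV; a pandas row supports the same `row[...]` indexing.

```python
def normalized_l2_error(row, method_column: str) -> float:
    """Relative error of row[method_column] against row["Empirical Price"].

    For scalar prices the 2-norm is the absolute value, so abs() is used here.
    """
    reference = row["Empirical Price"]
    return abs(row[method_column] - reference) / abs(reference)


# Illustrative row: values assumed from the first row shown in section 1.2.
example_row = {
    "Original Method Price": 3.70643,
    "Optimized Method Price": 4.0281,
    "Empirical Price": 3.997243,
}

original_error = normalized_l2_error(example_row, "Original Method Price")
optimized_error = normalized_l2_error(example_row, "Optimized Method Price")
print(round(original_error, 6), round(optimized_error, 6))
```

With pandas, the extra argument can be forwarded through `apply`, for example `option_prices.apply(normalized_l2_error, axis=1, method_column="Original Method Price")`, since `DataFrame.apply` passes additional keyword arguments on to the function.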

## 1: Data Manipulation (adding and modifying columns)

### 1.1: Adding Empirical Price

In [5]:
option_prices["Empirical Price"] = option_prices.apply(empirical_option, axis = 1)

### 1.2: Adding Original and Optimized Method Price Errors

In [6]:
option_prices["Original Error"] = option_prices.apply(normalized_l2_error_original, axis=1)
option_prices["Optimized Error"] = option_prices.apply(normalized_l2_error_optimized, axis=1)

In [7]:
option_prices.head()

| Unnamed: 0 | Expiry Time | Risk-Free Rate | Volatility | Strike Price | Original Method Price | Optimized Method Price | Basis | Call Flag | Empirical Price | Original Error | Optimized Error |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.02 | 0.1 | 102 | 3.70643 | 4.0281 | 0 | 1 | 3.997243 | 0.072753 | 0.00772 |
| 1 | 1 | 0.02 | 0.1 | 102 | 3.70643 | 4.10252 | 2 | 1 | 3.997243 | 0.072753 | 0.026337 |
| 2 | 1 | 0.02 | 0.1 | 102 | 3.70643 | 4.03161 | 1 | 1 | 3.997243 | 0.072753 | 0.008598 |
| 3 | 1 | 0.02 | 0.1 | 105 | 2.53435 | 2.74178 | 0 | 1 | 2.751949 | 0.079071 | 0.003695 |
| 4 | 1 | 0.02 | 0.1 | 105 | 2.53435 | 2.68088 | 1 | 1 | 2.751949 | 0.079071 | 0.025825 |
