Homework 2:
    
Consider the closing prices of Facebook stock for the period March 4, 2019 - March 3, 2020.

Consider the Black-Scholes model with the non-constant trend

mu(t) = theta_1 + theta_2 * log(S(t)/S(t − 4 * dt))

where dt is the length of one trading day in years, 

and constant volatility sigma(s, t) = theta_0

Find maximum likelihood estimates for the parameters theta_0, theta_1 and theta_2 
by using the approach discussed in the lecture notes (which is based on the Euler approximation of the equation for dS(t)). 

Can we assume that the observed stock prices follow the market model with the function mu and constant sigma? 

Can we assume that the market model with the non-constant trend is better than the one with constant trend? 

Explain!

All comments and explanations should be included (together with copies of obtained numerical results) as Python comments in your solution file, or a Jupyter notebook file with explanations and comments should be submitted.
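
For reference, the likelihood that the task refers to can be sketched as follows. This is a derivation under stated assumptions, not a copy of the lecture notes: the symbols B (Brownian motion), t_i (trading days), Y_i (one-day simple returns) and Z_i (standard normal noise) are notation chosen here, and additive constants that do not depend on theta are dropped.

```latex
\begin{align*}
dS(t) &= S(t)\bigl(\mu(t)\,dt + \theta_0\,dB(t)\bigr), \\
Y_i := \frac{S(t_{i+1}) - S(t_i)}{S(t_i)} &\approx \mu(t_i)\,dt + \theta_0\sqrt{dt}\,Z_i,
  \qquad Z_i \sim N(0,1) \text{ independent}, \\
\mu(t_i) &= \theta_1 + \theta_2 \log\frac{S(t_i)}{S(t_{i-4})}, \\
-\log L(\theta_0,\theta_1,\theta_2) &= \sum_i \left[
  \frac{\bigl(Y_i - \mu(t_i)\,dt\bigr)^2}{2\,\theta_0^2\,dt} + \log\theta_0 \right] + \text{const}.
\end{align*}
```

Minimizing the last expression over theta gives the maximum likelihood estimates; the sum starts at the first index for which S(t_{i-4}) is available.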

In [1]:
import numpy as np

# read in the data
FB_data = np.loadtxt("FB.csv", delimiter = ',', skiprows = 1, usecols = (5, ))

n = len(FB_data)
# length of one trading day in years, approximated by 1/n (about one year of daily data)
dt = 1/n

# a shorter name for the closing price data
S = FB_data 

i = np.arange(0, n-1)

x = np.log(S[i + 1]/S[i])

sigma = np.std(x)/np.sqrt(dt)
mu = np.mean(x)/dt + sigma**2/2

print(mu, sigma)

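def f(theta):
    # negative log-likelihood (up to an additive constant) under the Euler approximation
    # theta = [theta_0, theta_1, theta_2]
    # assume that S, n and dt are defined outside the function
    i = np.arange(4, n-1)        # start at 4 so that S[i - 4] exists
    Y = (S[i + 1] - S[i])/S[i]   # one-day simple returns
    sigma_vals = theta[0]
    mu_vals = theta[1] + theta[2]*np.log(S[i]/S[i - 4])
    # Y is approximately normal with mean mu_vals*dt and variance sigma_vals**2*dt
    return np.sum((Y - mu_vals*dt)**2/(2*sigma_vals**2*dt) + np.log(sigma_vals))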

# good starting values are known constants from simpler (constant coefficients) model
from scipy import optimize
theta_opt1 = optimize.fmin(f, [sigma, mu, 0])
print(theta_opt1)

theta_opt2 = optimize.fmin(f, theta_opt1)
print(theta_opt2)

theta_opt3 = optimize.fmin(f, theta_opt2)
print(theta_opt3)

theta_opt4 = optimize.fmin(f, theta_opt3)
print(theta_opt4)

print("The optimal parameters are: theta_0: " + str(round(theta_opt4[0], 4)) + ", theta_1: " +  str(round(theta_opt4[1], 4)) + 
      " and theta_2: " +  str(round(theta_opt4[2], 4)))

0.14167586383919634 0.2694854647674284
Optimization terminated successfully.
         Current function value: -202.622223
         Iterations: 139
         Function evaluations: 247
[ 0.26793118  0.13827731 -3.98933859]
Optimization terminated successfully.
         Current function value: -202.622223
         Iterations: 58
         Function evaluations: 100
[ 0.26793118  0.13827731 -3.98933859]
Optimization terminated successfully.
         Current function value: -202.622223
         Iterations: 58
         Function evaluations: 100
[ 0.26793118  0.13827731 -3.98933859]
Optimization terminated successfully.
         Current function value: -202.622223
         Iterations: 58
         Function evaluations: 100
[ 0.26793118  0.13827731 -3.98933859]
The optimal parameters are: theta_0: 0.2679, theta_1: 0.1383 and theta_2: -3.9893


In [2]:
from scipy import stats

stats.shapiro(x)

(0.9541699886322021, 3.8017884662622237e-07)

In [3]:
stats.anderson(x, dist = 'norm')

AndersonResult(statistic=2.279897564396123, critical_values=array([0.567, 0.646, 0.775, 0.904, 1.075]), significance_level=array([15. , 10. ,  5. ,  2.5,  1. ]))

The Shapiro-Wilk test gives a p-value of 3.8018e-07. This is not the probability that the data are normal; it means that if the log-returns were normally distributed, a test statistic this extreme would occur with probability of about 3.8e-07. We therefore reject normality at any usual significance level.

The Anderson-Darling test statistic is 2.2799, which is much greater than 1.075, the critical value at the 1% significance level, so this test also rejects normality of the log-returns.

Therefore, it is highly unlikely that the observed prices come from the Black-Scholes model with constant coefficients, under which the daily log-returns would be independent and normally distributed.
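
The tests above are applied to the raw log-returns, which addresses the constant-coefficient model. For the model with the non-constant trend, one possible check is to test the standardized Euler residuals instead. The following is a minimal sketch on synthetic prices; the helper `euler_residuals`, the simulated series `S_demo` and the values in `theta_demo` are illustrative assumptions, not results from FB.csv.

```python
import numpy as np
from scipy import stats

def euler_residuals(S, dt, theta):
    """Standardized residuals of the Euler scheme for
    mu(t) = theta_1 + theta_2*log(S(t)/S(t - 4*dt)) and sigma = theta_0."""
    S = np.asarray(S, dtype=float)
    i = np.arange(4, len(S) - 1)          # indices where S[i - 4] exists
    Y = (S[i + 1] - S[i]) / S[i]          # one-day simple returns
    mu_vals = theta[1] + theta[2] * np.log(S[i] / S[i - 4])
    return (Y - mu_vals * dt) / (theta[0] * np.sqrt(dt))

# Synthetic price path (assumption: stands in for the real closing prices)
rng = np.random.default_rng(0)
dt_demo = 1 / 252
S_demo = 100 * np.exp(np.cumsum(rng.normal(0.0, 0.27 * np.sqrt(dt_demo), size=252)))

# Illustrative parameter values, roughly the size of the estimates above
theta_demo = [0.27, 0.14, -4.0]

z = euler_residuals(S_demo, dt_demo, theta_demo)
w_stat, p_value = stats.shapiro(z)        # H0: residuals are normally distributed
print(w_stat, p_value)
```

With the real data one would call `euler_residuals(S, dt, theta_opt4)`; a small p-value would then speak against the model with the function mu and constant sigma as well.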

In [4]:
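# two parameters in constant trend and constant volatility model;
# f is evaluated at the log-return estimates (sigma, mu), so this value is an upper bound
# for the AIC obtained by minimizing f with theta_2 = 0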
AIC_constant_model = 2*2 + 2*f([sigma, mu, 0]) 

# three parameters in the model with non-constant trend
AIC_non_constant_trend = 2*3 + 2*f(theta_opt4) 

print(AIC_constant_model, AIC_non_constant_trend)

-400.9990278516483 -399.2444468960589


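Because the model with constant coefficients has a lower AIC value (-400.999) than the model with non-constant trend (-399.244), the model with constant trend is preferred. The extra parameter theta_2 lowers the negative log-likelihood only from about -202.50 to -202.62, which is less than the AIC penalty for one additional parameter. Since f([sigma, mu, 0]) is not smaller than the minimum of f over models with theta_2 = 0, the AIC of the fully optimized constant-trend model can only be lower still, so the conclusion does not change: we cannot assume that the market model with the non-constant trend is better than the one with constant trend.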
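
The AIC comparison can be cross-checked without a numerical optimizer, because under the Euler approximation both models are Gaussian regressions with closed-form maximum likelihood estimates. The sketch below swaps `optimize.fmin` for ordinary least squares via `np.linalg.lstsq` and runs on synthetic prices; the function names and the simulated series `S_sim` are assumptions made for illustration.

```python
import numpy as np

def neg_log_lik(Y, mu_vals, sigma, dt):
    # Same objective as f above (additive constants dropped)
    return np.sum((Y - mu_vals * dt) ** 2 / (2 * sigma**2 * dt) + np.log(sigma))

def compare_aic(S, dt):
    """Return (AIC of constant-trend model, AIC of non-constant-trend model),
    both fitted by maximum likelihood on the same observations."""
    S = np.asarray(S, dtype=float)
    i = np.arange(4, len(S) - 1)
    Y = (S[i + 1] - S[i]) / S[i]          # one-day simple returns
    lag = np.log(S[i] / S[i - 4])         # regressor log(S(t)/S(t - 4*dt))

    # Constant trend: MLE is the sample mean and the (biased) sample variance
    mu_c = np.mean(Y) / dt
    sigma_c = np.sqrt(np.mean((Y - mu_c * dt) ** 2) / dt)
    aic_c = 2 * 2 + 2 * neg_log_lik(Y, mu_c, sigma_c, dt)

    # Non-constant trend: least squares gives theta_1 and theta_2,
    # the mean squared residual gives theta_0
    X = np.column_stack([np.ones_like(lag), lag]) * dt
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    mu_vals = coef[0] + coef[1] * lag
    sigma_n = np.sqrt(np.mean((Y - mu_vals * dt) ** 2) / dt)
    aic_n = 2 * 3 + 2 * neg_log_lik(Y, mu_vals, sigma_n, dt)
    return aic_c, aic_n

# Synthetic price path (assumption: stands in for the real closing prices)
rng = np.random.default_rng(1)
dt_sim = 1 / 252
S_sim = 100 * np.exp(np.cumsum(rng.normal(0.0, 0.27 * np.sqrt(dt_sim), size=252)))

aic_const, aic_trend = compare_aic(S_sim, dt_sim)
print(aic_const, aic_trend)
```

Applied to the real data as `compare_aic(S, dt)`, the second value should agree closely with AIC_non_constant_trend above if `fmin` has converged, and the first gives the AIC of the constant-trend model at its own optimum.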