In [1]:
import numpy as np
import statsmodels.api as sm
import pandas as pd
from scipy.linalg import toeplitz

In [2]:
df = pd.read_csv('market_price_autoregression_df.csv')

## Overview

This notebook fits a linear model of the market price parameters by Ordinary Least Squares (OLS). The OLS fit provides a basis for assessing the model's goodness of fit, testing hypotheses about the estimated coefficient of each parameter, and testing the residuals for autocorrelation.
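
For reference, the regression can be written as a single equation. The notation is an assumption that mirrors the column names used in the code (`p`, `e_hat`, `cumsum_e_hat`, and so on), and the row index $t$ is assumed to denote time. No intercept term appears because the inputs are passed to `sm.OLS` without calling `sm.add_constant`.

$$
y_t = \beta_1\, p_t + \beta_2\, \hat{e}_t + \beta_3\, e^{*}_t
    + \beta_4\, \mathrm{cumsum}(\hat{e})_t + \beta_5\, \mathrm{cumsum}(e^{*})_t
    + \beta_6\, \Delta\hat{e}_t + \beta_7\, \Delta e^{*}_t + \varepsilon_t
$$

OLS chooses the coefficients $\beta_1, \dots, \beta_7$ that minimise the sum of squared residuals $\sum_t \hat{\varepsilon}_t^{\,2}$.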

In [3]:
# Explanatory variables (model inputs)
input_vars = [
    'p',
    'e_hat', 'e_star',
    'cumsum_e_hat', 'cumsum_e_star',
    'delta_e_hat', 'delta_e_star',
]
# Dependent variable
output_var = ['y']

input_data = df[input_vars]
output_data = df[output_var]

## OLS Model Results

In [5]:
# Fit y on the inputs as given; sm.OLS does not add an intercept column itself
ols_model = sm.OLS(output_data, input_data)
results = ols_model.fit()
results.summary()

| | | | |
|---|---|---|---|
| Dep. Variable: | y | R-squared: | 0.551 |
| Model: | OLS | Adj. R-squared: | 0.547 |
| Method: | Least Squares | F-statistic: | 131.6 |
| Date: | Mon, 21 Sep 2020 | Prob (F-statistic): | 2e-108 |
| Time: | 09:52:40 | Log-Likelihood: | 2139.1 |
| No. Observations: | 651 | AIC: | -4264.0 |
| Df Residuals: | 644 | BIC: | -4233.0 |
| Df Model: | 6 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| p | 0.9977 | 0.001 | 1005.002 | 0.000 | 0.996 | 1.000 |
| e_hat | 0.6923 | 0.103 | 6.693 | 0.000 | 0.489 | 0.895 |
| e_star | -0.3811 | 0.124 | -3.067 | 0.002 | -0.625 | -0.137 |
| cumsum_e_hat | 0.0841 | 0.021 | 3.973 | 0.000 | 0.043 | 0.126 |
| cumsum_e_star | 0.0032 | 0.001 | 3.879 | 0.000 | 0.002 | 0.005 |
| delta_e_hat | -0.3155 | 0.234 | -1.349 | 0.178 | -0.775 | 0.144 |
| delta_e_star | 0.4858 | 0.245 | 1.981 | 0.048 | 0.004 | 0.967 |

| | | | |
|---|---|---|---|
| Omnibus: | 81.285 | Durbin-Watson: | 2.029 |
| Prob(Omnibus): | 0.0 | Jarque-Bera (JB): | 459.796 |
| Skew: | -0.373 | Prob(JB): | 1.43e-100 |
| Kurtosis: | 7.049 | Cond. No. | 1120.0 |
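
To make the headline numbers in the summary less of a black box, here is a minimal NumPy sketch of how least-squares coefficients and R-squared are computed. It is not the notebook's fit: the data are synthetic, the helper `ols_fit` and the values in `true_beta` are hypothetical, and the design matrix includes an explicit column of ones so that the centered R-squared used here is well defined.

```python
import numpy as np


def ols_fit(X, y):
    """Return least-squares coefficients and centered R-squared for y ~ X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    # Solve min_beta ||y - X beta||^2
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    ss_res = float(residuals @ residuals)          # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())    # total variation about the mean
    return beta, 1.0 - ss_res / ss_tot


# Synthetic data for illustration only (NOT the notebook's CSV)
rng = np.random.default_rng(0)
n = 651
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
true_beta = np.array([1.0, 0.7, -0.4])  # hypothetical coefficients
y = X @ true_beta + rng.normal(scale=0.5, size=n)

beta_hat, r_squared = ols_fit(X, y)
print("coefficients:", beta_hat)
print("R-squared:", r_squared)
```

On the fitted model in this notebook, the corresponding quantities are available as `results.params` and `results.rsquared`.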


## Summary

- The OLS fit has an R-squared of 0.551 (adjusted R-squared 0.547), so the model inputs explain about 55% of the variance in `y`.

- The estimated coefficients are statistically significant at the 5% level for every parameter except `delta_e_hat` (p = 0.178), so for the other six we can reject the null hypothesis that the coefficient is zero. The result for `delta_e_star` is borderline (p = 0.048).

- Finally, the Durbin-Watson statistic, which tests for first-order autocorrelation in the residuals, is 2.029. A value close to 2 indicates little autocorrelation, so no correction appears necessary.
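
As a rough illustration of why a Durbin-Watson value near 2 is read as "little autocorrelation", the sketch below computes the statistic directly from its definition, the sum of squared successive differences of the residuals divided by their sum of squares. The helper name `durbin_watson_stat` and the synthetic series are assumptions made for this example; they are not part of the notebook.

```python
import numpy as np


def durbin_watson_stat(residuals):
    """Durbin-Watson statistic: sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    e = np.asarray(residuals, dtype=float).ravel()
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))


# Synthetic series for illustration only
rng = np.random.default_rng(0)
white_noise = rng.normal(size=2000)

# AR(1) series with strong positive autocorrelation (rho = 0.8)
ar1 = np.empty_like(white_noise)
ar1[0] = white_noise[0]
for t in range(1, len(ar1)):
    ar1[t] = 0.8 * ar1[t - 1] + white_noise[t]

dw_white = durbin_watson_stat(white_noise)  # uncorrelated: expected near 2
dw_ar1 = durbin_watson_stat(ar1)            # positively correlated: well below 2
print(dw_white, dw_ar1)
```

Applied to this notebook's model, the input would be `results.resid`. statsmodels also provides the same statistic as `durbin_watson` in `statsmodels.stats.stattools`.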