# Normalizing

In [1]:
import sys
sys.path.append('..')
import utils

In [2]:
df, df_test = utils.get_index_2018_market_value_splits(market_name='ftse')
df['rw'] = utils.get_random_walk_data()
df.head()



date,market_value,wn,rw
1994-01-07,3445.98,4281.452265,1122.139662
1994-01-10,3440.58,6330.626269,1080.34786
1994-01-11,3413.77,3796.365352,1082.095245
1994-01-12,3372.02,6560.80361,1083.639265
1994-01-13,3360.01,5847.603454,1067.146255


In [3]:
import statsmodels.tsa.stattools as sts

In [4]:
benchmark = df.market_value.iloc[0]
df['norm'] = df.market_value.div(benchmark).mul(100)
sts.adfuller(df.norm)

(-1.9041551418836287,
 0.3301089327703105,
 6,
 5014,
 {'1%': -3.4316548765428174,
  '5%': -2.8621166146845334,
  '10%': -2.5670769326348926},
 19541.17381480549)
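
As a minimal sketch of the normalization in cell 4, the snippet below rebases a toy price series so that its first observation equals 100. The three prices are the first `market_value` rows from the `df.head()` output above; the variable names `prices` and `norm` are illustrative only.

```python
import pandas as pd

# First three market_value observations from the df.head() output above
prices = pd.Series(
    [3445.98, 3440.58, 3413.77],
    index=pd.to_datetime(["1994-01-07", "1994-01-10", "1994-01-11"]),
)

# Divide by the first value and scale so that the series starts at 100
benchmark = prices.iloc[0]
norm = prices.div(benchmark).mul(100)

print(norm.round(2).tolist())
```

Because this is a pure rescaling by a constant, it should not change whether the series looks stationary: the ADF p-value of about 0.33 above still gives no grounds to reject a unit root in the normalized prices.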

## Normalizing Returns

In [5]:
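# Daily percentage returns; the first row has no previous value, so drop it
df["returns"] = df.market_value.pct_change(1).mul(100)
df = df.iloc[1:].copy()  # copy so later column assignments modify this frame directly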

- We often rely on normalized returns.
- Unlike prices, they account for the absolute profitability of the investment.
- Unlike non-normalized returns, they allow us to compare profitability relative to a benchmark.

In [6]:
benchmark_returns = df.returns.iloc[0]
df['norm_returns'] = df.returns.div(benchmark_returns).mul(100)
sts.adfuller(df.norm_returns)

(-12.770265719497212,
 7.798058336039225e-24,
 32,
 4987,
 {'1%': -3.431661944885779,
  '5%': -2.8621197374408225,
  '10%': -2.5670785949998973},
 80114.49116124898)

In [7]:
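import numpy as np

models = [None]  # placeholder so that models[i] is the AR(i) model
llrs = []
model_ar_x_1 = None  # previous (lower-order) model; None before the first iteration
for i in np.arange(1, 10):
    # Build the AR(i) model and test it against the previous, lower-order model
    (model_ar_x_1, llr_x) = utils.ARMA_LLR_test(df.norm_returns, model_ar_x_1, (i, 0))
    models.append(model_ar_x_1)
    llrs.append(llr_x)
    print(f'LLR test, Lags: {i}, p-value: {llr_x}')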

statsmodels.tsa.arima_model.ARMA and statsmodels.tsa.arima_model.ARIMA have
been deprecated in favor of statsmodels.tsa.arima.model.ARIMA (note the .
between arima and model) and
statsmodels.tsa.SARIMAX. These will be removed after the 0.12 release.

statsmodels.tsa.arima.model.ARIMA makes use of the statespace framework and
is both well tested and maintained.

LLR test, Lags: 1, p-value: None
LLR test, Lags: 2, p-value: 0.0
LLR test, Lags: 3, p-value: 0.0
LLR test, Lags: 4, p-value: 0.001
LLR test, Lags: 5, p-value: 0.0
LLR test, Lags: 6, p-value: 0.001
LLR test, Lags: 7, p-value: 0.44
LLR test, Lags: 8, p-value: 0.148
LLR test, Lags: 9, p-value: 0.885
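
The source does not show `utils.ARMA_LLR_test`, so the following is only a sketch of what a log-likelihood ratio (LLR) test between two nested models typically computes. It assumes consecutive AR orders differ by exactly one parameter, which gives one degree of freedom. The function name `llr_test_p_value` and the log-likelihood values are hypothetical and are not taken from the fits above.

```python
import math


def llr_test_p_value(llf_restricted, llf_full):
    """p-value of a likelihood-ratio test with one degree of freedom."""
    # LR statistic: twice the gain in log-likelihood from the extra parameter
    lr = 2.0 * (llf_full - llf_restricted)
    lr = max(lr, 0.0)  # guard against tiny negative values from optimizer noise
    # For a chi-squared variable with 1 degree of freedom,
    # P(X > lr) = erfc(sqrt(lr / 2))
    return math.erfc(math.sqrt(lr / 2.0))


# Illustrative log-likelihoods only (hypothetical values)
p_small_gain = llr_test_p_value(-1000.0, -999.9)
p_large_gain = llr_test_p_value(-1000.0, -990.0)
print(round(p_small_gain, 3), round(p_large_gain, 6))
```

A small p-value suggests that the extra lag improves the fit significantly. In the output above this holds for lags 2 through 6, while lags 7 to 9 add nothing significant. The `None` at lag 1 presumably reflects that there is no lower-order model to compare against.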


In [8]:
model_ret_ar_1 = models[1].fit()
model_ret_ar_1.summary()

Dep. Variable:,norm_returns,No. Observations:,5020
Model:,"ARMA(1, 0)",Log Likelihood,-40351.743
Method:,css-mle,S.D. of innovations,749.388
Date:,"Thu, 17 Dec 2020",AIC,80709.487
Time:,14:58:35,BIC,80729.05
Sample:,01-10-1994 - 04-05-2013,HQIC,80716.342

,coef,std err,z,P>|z|,[0.025,0.975]
const,-11.9731,10.339,-1.158,0.247,-32.237,8.291
ar.L1.norm_returns,-0.0230,0.014,-1.631,0.103,-0.051,0.005

,Real,Imaginary,Modulus,Frequency
AR.1,-43.4387,+0.0000j,43.4387,0.5000


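- Normalizing has no effect on model selection: dividing the returns by a constant benchmark only rescales the series, so the AR coefficients and the LLR test results should match those for non-normalized returns. Only the constant and the scale of the innovations change.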
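
To illustrate why rescaling leaves the autoregressive structure untouched, here is a minimal sketch that fits an AR(1) by ordinary least squares on synthetic data. This is a simplified stand-in, not the `css-mle` ARMA fit used in the notebook. The helper `fit_ar1_ols` and the simulated `returns` are assumptions made for the example.

```python
import numpy as np


def fit_ar1_ols(series):
    """Return (const, ar_coef) from regressing x[t] on x[t-1] with an intercept."""
    x = np.asarray(series, dtype=float)
    design = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    const, ar_coef = np.linalg.lstsq(design, x[1:], rcond=None)[0]
    return const, ar_coef


rng = np.random.default_rng(0)
returns = rng.normal(loc=0.02, scale=1.0, size=500)  # synthetic stand-in for returns

# Same construction as norm_returns: divide by the first value, multiply by 100
scale = 100.0 / returns[0]
norm_returns = returns * scale

const_raw, ar_raw = fit_ar1_ols(returns)
const_norm, ar_norm = fit_ar1_ols(norm_returns)

print(np.isclose(ar_raw, ar_norm))                # compares the two AR coefficients
print(np.isclose(const_norm, const_raw * scale))  # compares the constant with its rescaled value
```

The lag coefficient is the same for both series, while the constant is multiplied by the scaling factor. This is consistent with the large `const` and `S.D. of innovations` values in the summary above.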