## Target Testing

This notebook studies the distribution of the target and assesses its relevance and its ability to describe the financial trend accurately. We vary the target parameters (the period, the goalreturn, and the logreturn option) to find the combination that best describes the trend.

Lastly, the best parameters will be set as the default parameters of the add_target function.
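To make the role of the three parameters concrete, here is a minimal sketch of a parameter sweep. The implementation of add_target is not shown in this notebook, so `sketch_target` below is a hypothetical stand-in: it assumes a day is labelled "Bullish" when the forward return of Close over `period` rows exceeds `goal_return`, with `log_return` switching between simple and log returns. The synthetic price series and the grid values are also assumptions made only so the example runs on its own.

```python
import numpy as np
import pandas as pd


def sketch_target(df, period=20, goal_return=0.02, log_return=False):
    """Hypothetical stand-in for add_target (assumed labelling rule):
    'Bullish' when the forward return over `period` rows exceeds `goal_return`."""
    out = df.copy()
    future = out["Close"].shift(-period)  # Close `period` rows ahead
    if log_return:
        fwd = np.log(future / out["Close"])
    else:
        fwd = future / out["Close"] - 1.0
    # Rows without a full look-ahead window yield NaN, which compares as False
    out["Trend"] = np.where(fwd > goal_return, "Bullish", "Non-Bullish")
    return out


# Synthetic prices so the sketch runs without the project's data loader
rng = np.random.default_rng(0)
close = 100 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, size=1000)))
prices = pd.DataFrame({"Close": close})

# Share of Bullish labels for a few (period, goal_return) pairs
rows = []
for period in (5, 20, 60):
    for goal in (0.0, 0.02, 0.05):
        labelled = sketch_target(prices, period=period, goal_return=goal)
        share = (labelled["Trend"] == "Bullish").mean()
        rows.append({"period": period, "goal_return": goal, "bullish_share": share})

grid = pd.DataFrame(rows)
print(grid)
```

With the real data, the same loop could call add_target instead, provided its keyword names match; they are not confirmed here.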





In [16]:
import sys
from pathlib import Path
import pandas as pd

PROJECT_ROOT = Path.cwd().parent
sys.path.insert(0, str(PROJECT_ROOT))

In [6]:
from src.data_loader import load_data

spy = load_data()
spy.head()

Unnamed: 0,Date,Open,High,Low,Close,Volume,Dividends,Stock Splits,Capital Gains
0,2000-01-03 00:00:00-05:00,93.388752,93.388752,90.63276,91.617043,8164300,0.0,0.0,0.0
1,2000-01-04 00:00:00-05:00,90.416213,90.750869,87.965349,88.034248,8089800,0.0,0.0,0.0
2,2000-01-05 00:00:00-05:00,88.152371,89.156339,86.459404,88.191742,12177900,0.0,0.0,0.0
3,2000-01-06 00:00:00-05:00,87.955515,89.136655,86.774376,86.774376,6227200,0.0,0.0,0.0
4,2000-01-07 00:00:00-05:00,88.388636,91.813942,88.231151,91.813942,8066500,0.0,0.0,0.0


In [7]:
from src.features import add_MA, add_EMA, add_returns, add_volatility, add_distances, add_cumulated_returns, add_rsi, add_target, add_all_features

df1 = spy.copy()
df1 = add_all_features(df1)
df1 = add_target(df1)

In [8]:
df1 = df1.drop(['Dividends', 'Stock Splits', 'Capital Gains'], axis=1)
df1.head()


Unnamed: 0,Date,Open,High,Low,Close,Volume,MA10,MA50,EMA20,Return,Log Return,Volatility,Distance_MA50,Distance_EMA20,Cumulated_Return_5d,RSI14,Trend
0,2000-01-03 00:00:00-05:00,93.388752,93.388752,90.63276,91.617043,8164300,,,91.617043,,,,,0.0,,,Non-Bullish
1,2000-01-04 00:00:00-05:00,90.416213,90.750869,87.965349,88.034248,8089800,,,89.736076,-0.039106,-0.039891,,,-0.018965,,,Bullish
2,2000-01-05 00:00:00-05:00,88.152371,89.156339,86.459404,88.191742,12177900,,,89.169006,0.001789,0.001787,,,-0.01096,,,Bullish
3,2000-01-06 00:00:00-05:00,87.955515,89.136655,86.774376,86.774376,6227200,,,88.477713,-0.016071,-0.016202,,,-0.019252,,,Bullish
4,2000-01-07 00:00:00-05:00,88.388636,91.813942,88.231151,91.813942,8066500,,,89.284718,0.058077,0.056453,,,0.028328,,,Non-Bullish


In [9]:
df1['Trend'].value_counts(normalize=True)

Trend
Non-Bullish    0.636609
Bullish        0.363391
Name: proportion, dtype: float64

In [37]:
# Extract the year from each date string (first four characters)
dates = df1['Date'].astype(str).str[:4]

# Encode the target as 1 (Bullish) / 0 (Non-Bullish)
date_trend_df = pd.DataFrame({'Date': dates, 'Trend': (df1['Trend'] == 'Bullish').astype(int)})

# Share of Bullish days per year
infos = date_trend_df.groupby('Date').mean()
infos

Date,Trend
2000,0.119048
2001,0.193548
2002,0.115079
2003,0.678571
2004,0.222222
2005,0.186508
2006,0.390438
2007,0.183267
2008,0.059289
2009,0.678571


In [46]:
# Check how the target definition behaves on other tickers

qqq = load_data(ticker='QQQ')
dow = load_data(ticker='DIA')

qqq = add_target(qqq)
dow = add_target(dow)
print(qqq['Trend'].value_counts(normalize=True))
print(dow['Trend'].value_counts(normalize=True))

Trend
Non-Bullish    0.557252
Bullish        0.442748
Name: proportion, dtype: float64
Trend
Non-Bullish    0.654739
Bullish        0.345261
Name: proportion, dtype: float64


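The share of Bullish days for SPY (36.3%) is close to that of the Dow Jones (DIA, 34.5%) and lower than that of the more aggressive Nasdaq (QQQ, 44.3%), so the target seems coherent across tickers. Next, we will test whether it is sensitive to a small amount of Gaussian noise added to the price.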
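One possible way to run such a noise test is sketched below. It is not the notebook's actual procedure: `sketch_target` is a hypothetical stand-in for add_target (assumed rule: forward return over `period` rows above `goal_return`), the noise is assumed to be multiplicative with relative standard deviation `noise_std`, and the price series is synthetic. The score is the share of days whose label is unchanged after perturbing the prices, averaged over several random draws.

```python
import numpy as np
import pandas as pd


def sketch_target(close, period=20, goal_return=0.02):
    """Hypothetical stand-in for add_target (assumed rule: forward return > goal)."""
    fwd = close.shift(-period) / close - 1.0
    labels = np.where(fwd > goal_return, "Bullish", "Non-Bullish")
    return pd.Series(labels, index=close.index)


def label_agreement(close, noise_std, n_trials=20, seed=0):
    """Mean share of days whose label is unchanged after multiplicative
    Gaussian noise with relative standard deviation `noise_std`."""
    rng = np.random.default_rng(seed)
    base = sketch_target(close)
    scores = []
    for _ in range(n_trials):
        noisy = close * (1.0 + rng.normal(0.0, noise_std, size=len(close)))
        scores.append((sketch_target(noisy) == base).mean())
    return float(np.mean(scores))


# Synthetic prices; with the real data this would be the Close column
rng = np.random.default_rng(1)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, size=1000))))

for noise_std in (0.0, 0.001, 0.005):
    print(noise_std, label_agreement(close, noise_std))
```

An agreement close to 1 for small `noise_std` would suggest the labels are stable; a sharp drop would indicate that many days sit near the decision threshold.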