# Factor data folder

Generate the data to compute the factor used in the Neural Network input.

Signals and factors:
 - trading signal: intersection (crossover) of the 20-day and 50-day moving averages,
 - trading signal: turtle trading signal, buying when the stock reaches a new price low and selling when it reaches a new price high,
 - momentum factors: price return over 9 months, earnings momentum, and dividend momentum,
 - value factor: EBITDA/EV = (Earnings before Interest, Tax, Depreciation and Amortization) / (Enterprise Value),
   with Enterprise Value = Total Debt + Market Value - Cash.
 
A trading signal outputs +1 or -1 to indicate whether a rule (such as a moving-average crossing) says to buy or sell a stock; the turtle signal below also outputs 0 when no rule is triggered. A factor is a numerical value whose rank or z-score $\frac{x - \mu}{\sigma}$ may indicate the strength of a signal and inform the investment decision. In our context, the investment decision based on those factor values will be generated by the Neural Network.
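
As a small illustration of ranking and z-scoring, the sketch below standardises a factor across stocks on a single date. The tickers and values are made up for the example, and the population standard deviation (`ddof=0`) is an arbitrary choice, not something the notebook prescribes.

```python
import pandas as pd

# Hypothetical factor values for four stocks on a single date.
factor = pd.Series({"A": 0.10, "B": 0.30, "C": 0.20, "D": 0.40})

# Z-score: distance from the cross-sectional mean in units of standard deviation.
zscore = (factor - factor.mean()) / factor.std(ddof=0)

# Rank: 1 = weakest factor value, 4 = strongest.
rank = factor.rank()

print(pd.DataFrame({"factor": factor, "zscore": zscore, "rank": rank}))
```

Either transformation puts factors with different units on a comparable scale before they are fed to a model.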

TODO: once the DQN is done, finish the remaining factor signals, such as the risk-reversal signal (volume of put contracts bought versus volume of call contracts bought).

## 1. Load the libraries and constants

In [39]:
# To support both python 2 and python 3
from __future__ import division, print_function, unicode_literals

import numpy as np
import pandas as pd
import os

SOURCE_FOLDER = 'Data-processed'
TARGET_FOLDER = 'Data-factor'

DESCRIPTION_FILE = "data_list.csv"
DATA_FILE = "data_content.csv"

## 2. Load the data

In [40]:
# Use only the time-series and the static dataset. 
df_desc = pd.read_csv(os.path.join(os.getcwd(),SOURCE_FOLDER, DESCRIPTION_FILE),encoding='utf-8',index_col=0)
df_data = pd.read_csv(os.path.join(os.getcwd(),SOURCE_FOLDER, DATA_FILE),encoding='utf-8', index_col=0)

In [41]:
df_desc.head()

MNEM,NAME,SICUR,TYPE
PA1436583006,CARNIVAL,,Equity
US22160K1051,COSTCO WHOLESALE,,Equity
US4581401001,INTEL,,Equity
AN8068571086,SCHLUMBERGER,,Equity
NASCOMP,NASDAQ COMPOSITE,,Index


In [42]:
df_data.head()

Date,PA1436583006-MVC,PA1436583006-P,PA1436583006-PH,PA1436583006-PL,PA1436583006-PO,US22160K1051-MVC,US22160K1051-P,US22160K1051-PH,US22160K1051-PL,US22160K1051-PO,...,USBSINV.B-ES,USCAPUS.R-ES,USCAPUTLQ-ES,USCNFBUSQ-ES,USCNFCONQ-ES,USCRDCONB-ES,USCSHPM%E-ES,USPENONFO-ES,USUMINM1R-ES,ECSWF1Y-IR
2004-01-02,33610.74,39.82,40.25,39.75,39.91,16637.42,36.32,37.42,36.18,37.19,...,,,,,,,,,,0.5781
2004-01-05,34121.62,40.43,40.61,40.185,40.28,16559.55,36.15,36.61,35.86,36.44,...,,,,,,,,,,0.5938
2004-01-06,34297.64,40.75,40.8,40.3,40.43,16747.36,36.56,36.93,36.06,36.1,...,,,,,,,,,,0.5469
2004-01-07,34380.14,40.83,41.0,40.55,40.7,16985.55,37.08,37.175,36.52,36.58,...,,,,,,,,,,0.5781
2004-01-08,34754.78,41.24,41.3,40.85,40.87,17068.01,37.26,37.53,37.02,37.4,...,,,,,,,,,,0.5156


## 3. Generate factors time-series

### 3.1 Moving-average intersection signal

First, let's start with a classic signal, i.e. the intersection (crossover) of the 20-day and 50-day moving averages.

In [43]:
# Compute the moving averages from a single source, i.e. the closing price ("P")
SHORT_MA = 20
LONG_MA = 50
def moving_average(df, length, isin_name):
    # Rolling mean of a price series; a value is produced once 5 observations are available
    return df.rolling(window=length, min_periods=5, center=False).mean().rename("-".join([isin_name,"MA"]))

def intersection_moving_average(df, length_short, length_long, isin_name):
    df_short = moving_average(df, length_short, isin_name)
    df_long = moving_average(df, length_long, isin_name)
    # +1 when the short MA is above the long MA, -1 otherwise; the first length_short rows are skipped
    return pd.Series(np.where(
        df_short[length_short:] > df_long[length_short:],
        1.0, -1.0), index=df_short[length_short:].index).rename("-".join([isin_name,"FMA"]))

# 20-day (short) moving average for every equity
df_short_ma = pd.concat([
    moving_average(df_data["-".join([isin,"P"])],SHORT_MA,isin) for isin in df_desc[
        df_desc['TYPE'] == 'Equity'].index],axis=1)
# 50-day (long) moving average for every equity
df_long_ma = pd.concat([
    moving_average(df_data["-".join([isin,"P"])],LONG_MA,isin) for isin in df_desc[
        df_desc['TYPE'] == 'Equity'].index],axis=1)
# Crossover signal for every equity
df_signal_ma = pd.concat([
    intersection_moving_average(df_data["-".join([isin,"P"])],SHORT_MA, LONG_MA,isin) for isin in df_desc[
        df_desc['TYPE'] == 'Equity'].index],axis=1)
df_signal_ma

Date,PA1436583006-FMA,US22160K1051-FMA,US4581401001-FMA,AN8068571086-FMA
2004-01-30,1.0,1.0,1.0,1.0
2004-02-02,1.0,1.0,-1.0,1.0
2004-02-03,1.0,1.0,-1.0,1.0
2004-02-04,1.0,1.0,-1.0,1.0
2004-02-05,1.0,1.0,-1.0,1.0
...,...,...,...,...
2013-12-25,1.0,-1.0,1.0,-1.0
2013-12-26,1.0,-1.0,1.0,-1.0
2013-12-27,1.0,-1.0,1.0,-1.0
2013-12-30,1.0,-1.0,1.0,-1.0
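
To make the crossover logic easier to follow, here is a minimal sketch on a made-up ten-point price series with 2- and 4-period windows (hypothetical values, chosen only to keep the arithmetic small). Unlike the cell above, it waits until the long window is full before emitting a signal; that is a design choice of this sketch, not the notebook's behaviour.

```python
import numpy as np
import pandas as pd

# Hypothetical prices: a rise followed by a fall.
prices = pd.Series([10, 11, 12, 13, 14, 13, 12, 11, 10, 9], dtype=float)

short_ma = prices.rolling(window=2).mean()
long_ma = prices.rolling(window=4).mean()

# Only compare once the long moving average is defined.
valid = long_ma.notna()
signal = pd.Series(
    np.where(short_ma[valid] > long_ma[valid], 1.0, -1.0),
    index=prices.index[valid],
)

# Positions where the signal changes sign, i.e. the actual crossings.
flips = signal[signal.diff().fillna(0) != 0]
print(signal.tolist())
print(flips)
```

The signal stays at +1 while the short average is above the long one and switches to -1 once prices have fallen far enough for the short average to drop below it.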


### 3.2 Turtle trading signal

Second, a buy-low, sell-high variant of turtle trading: buy when a new low is established and sell when a new high is established.

Comment: this trading strategy would likely produce very poor risk metrics.

In [44]:
# Compute the turtle trading using one source, i.e. the closing price. The high low is over 20 days 
WINDOW_HIGH_LOW =  20
def turtle_signal(df, length, isin_name):
    df_max = df.shift(1).rolling(length, min_periods=10, center=False).max()
    df_min = df.shift(1).rolling(length, min_periods=10, center=False).min()
    df_out = pd.Series(0, index=df.index).rename("-".join([isin_name,"FTURTLE"]))
    df_out[df_max < df ]= -1  # Sell when new high
    df_out[df_min > df ]= 1  # Buy when new low                                        
    return df_out
        
df_turtle = pd.concat([
    turtle_signal(df_data["-".join([isin,"P"])],WINDOW_HIGH_LOW,isin) for isin in df_desc[
        df_desc['TYPE'] == 'Equity'].index],axis=1)
df_turtle.head()

Date,PA1436583006-FTURTLE,US22160K1051-FTURTLE,US4581401001-FTURTLE,AN8068571086-FTURTLE
2004-01-02,0,0,0,0
2004-01-05,0,0,0,0
2004-01-06,0,0,0,0
2004-01-07,0,0,0,0
2004-01-08,0,0,0,0
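
The first rows are all 0 because the rolling high and low are not yet defined. The sketch below replays the same rule on a made-up price series with a 3-period window (hypothetical values and window length) to show why `shift(1)` matters: today's price is compared with the high and low of the previous days only, so the signal does not use information from the current bar.

```python
import pandas as pd

# Hypothetical closing prices.
prices = pd.Series([5, 6, 7, 8, 7, 6, 5, 4, 9], dtype=float)
WINDOW = 3  # hypothetical look-back length

# High and low of the previous WINDOW days, excluding today.
prev_max = prices.shift(1).rolling(WINDOW).max()
prev_min = prices.shift(1).rolling(WINDOW).min()

signal = pd.Series(0, index=prices.index)
signal[prices > prev_max] = -1  # sell on a new high
signal[prices < prev_min] = 1   # buy on a new low

print(signal.tolist())
```

Comparisons against the undefined (NaN) high and low evaluate to False, which is why the warm-up period stays at 0.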


### 3.3 Dividend momentum factor.

Dividend rate momentum: the dividend rate is the dividend paid ("WC05376") divided by the total asset value ("WC02999"), and the factor is the 120-day rolling mean of that rate.

In [45]:
for isin in df_desc[df_desc['TYPE'] == 'Equity'].index:
    df_data["-".join([isin,"Div_rate"])] = df_data["-".join([isin,"WC05376"])] / df_data["-".join([isin,"WC02999"])]
df_dividend = pd.concat([
    df_data["-".join([isin,"Div_rate"])].rolling(
        window=120, min_periods=5, center=False).mean().rename("-".join([isin,"DIVMOM"])) for isin in df_desc[
        df_desc['TYPE'] == 'Equity'].index],axis=1)
df_dividend

Date,PA1436583006-DIVMOM,US22160K1051-DIVMOM,US4581401001-DIVMOM,AN8068571086-DIVMOM
2004-01-02,,,,
2004-01-05,,,,
2004-01-06,,,,
2004-01-07,,,,
2004-01-08,,,,
...,...,...,...,...
2013-12-25,0.027459,0.046814,0.048725,0.023964
2013-12-26,0.027391,0.043738,0.048725,0.023964
2013-12-27,0.027391,0.043738,0.048725,0.023964
2013-12-30,0.027067,0.042653,0.048706,0.023964


### 3.4 EBITDA to EV factor.

Divide the earnings before interest, tax, depreciation and amortisation (EBITDA) by the enterprise value (EV), with
 - "WC18191": EBIT
 - "WC01151": Depreciation and Amortisation
 - "MVC": Market Value of the company
 - "WC02001": Cash
 - "WC03255": Total Debt

thus:
 ("WC18191" + "WC01151") / ("MVC" * 100 + "WC03255" - "WC02001")

In [46]:
print(df_data["-".join(['US22160K1051',"WC02001"])].head())
print(df_data["-".join(['US22160K1051',"MVC"])].head())  # multiply by 100 to have the same unit. USD
print(df_data["-".join(['US22160K1051',"WC03255"])].head())

Date
2004-01-02          NaN
2004-01-05    2688638.0
2004-01-06          NaN
2004-01-07          NaN
2004-01-08          NaN
Name: US22160K1051-WC02001, dtype: float64
Date
2004-01-02    16637.42
2004-01-05    16559.55
2004-01-06    16747.36
2004-01-07    16985.55
2004-01-08    17068.01
Name: US22160K1051-MVC, dtype: float64
Date
2004-01-02          NaN
2004-01-05    1320935.0
2004-01-06          NaN
2004-01-07          NaN
2004-01-08          NaN
Name: US22160K1051-WC03255, dtype: float64
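
The units of "MVC" and of the "WC" items are not stated in this notebook, so the scaling factor is worth a quick sanity check. The sketch below plugs the values printed above for 2004-01-05 into the enterprise value formula with the factor of 100 used here and, for comparison, with a hypothetical factor of 1000 (which would apply if "MVC" were quoted in millions and the "WC" items in thousands; that is an assumption to verify against the data provider's documentation).

```python
# Values printed above for US22160K1051 on 2004-01-05.
cash = 2688638.0         # WC02001
market_value = 16559.55  # MVC
total_debt = 1320935.0   # WC03255

ev_by_scale = {}
for scale in (100, 1000):  # 100 is used in the notebook; 1000 is a hypothetical alternative
    ev_by_scale[scale] = market_value * scale + total_debt - cash
    print("scale=%d  EV=%.0f  scaled market value / EV=%.2f" % (
        scale, ev_by_scale[scale], market_value * scale / ev_by_scale[scale]))
```

With a factor of 100 the enterprise value comes out smaller than either the cash or the debt figure on its own, which is a useful signal that the units deserve a second look before the ratio is used as a factor.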


In [47]:
for isin in df_desc[df_desc['TYPE'] == 'Equity'].index:
    df_data["-".join([isin,"EBITDA2EV"])] = (df_data["-".join([isin,"WC18191"])] + df_data["-".join([isin,"WC01151"])]) / (
    df_data["-".join([isin,"MVC"])] * 100 + df_data["-".join([isin,"WC03255"])] - df_data["-".join([isin,"WC02001"])])
df_ebibtda2ev = pd.concat([
    df_data["-".join([isin,"EBITDA2EV"])] for isin in df_desc[
        df_desc['TYPE'] == 'Equity'].index],axis=1)
df_ebibtda2ev

Date,PA1436583006-EBITDA2EV,US22160K1051-EBITDA2EV,US4581401001-EBITDA2EV,AN8068571086-EBITDA2EV
2004-01-02,,,,
2004-01-05,0.279958,6.438075,2.936817,0.559438
2004-01-06,,,,
2004-01-07,,,,
2004-01-08,,,,
...,...,...,...,...
2013-12-25,,,,
2013-12-26,,,,
2013-12-27,,,,
2013-12-30,0.266737,1.442688,2.060182,0.775365
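
The ratio is NaN on most dates because the accounting items are only populated on a few rows. One possible way to obtain a daily series, not done in this notebook, is to carry the last known value forward for a limited number of days. The sketch below uses made-up values, and the `MAX_STALE_DAYS` limit is a hypothetical parameter.

```python
import numpy as np
import pandas as pd

# Hypothetical sparse ratio on six business days.
dates = pd.date_range("2004-01-02", periods=6, freq="B")
ratio = pd.Series([np.nan, 0.28, np.nan, np.nan, 0.30, np.nan], index=dates)

MAX_STALE_DAYS = 2  # hypothetical cap on how long a value may be carried forward

# Forward-fill: each gap takes the most recent known value, up to the cap.
filled = ratio.ffill(limit=MAX_STALE_DAYS)
print(filled)
```

Forward-filling only uses past information, so it does not introduce look-ahead; whether stale fundamentals are acceptable depends on how the Neural Network consumes the factor.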


### 3.5 Trailing return factor.

Fast 1-month and slow 9-month trailing returns.
Assuming roughly 20 trading days in a month, 1 month = 20 days and 9 months = 180 days.

In [48]:
FAST_TRAIL = 20
SLOW_TRAIL = 180
for isin in df_desc[df_desc['TYPE'] == 'Equity'].index:
    df_data["-".join([isin,"FST"])] = df_data["-".join([isin,"P"])].pct_change(periods=FAST_TRAIL)
    df_data["-".join([isin,"SLW"])] = df_data["-".join([isin,"P"])].pct_change(periods=SLOW_TRAIL)
df_slow_trailing_rtn = pd.concat([
    df_data["-".join([isin,"SLW"])] for isin in df_desc[
        df_desc['TYPE'] == 'Equity'].index],axis=1)
df_fast_trailing_rtn = pd.concat([
    df_data["-".join([isin,"FST"])] for isin in df_desc[
        df_desc['TYPE'] == 'Equity'].index],axis=1)                                    
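
For reference, `pct_change(periods=n)` is the simple return over `n` rows, i.e. `p_t / p_{t-n} - 1`. The sketch below checks this on a made-up price series with a 2-period look-back (hypothetical values).

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices.
prices = pd.Series([100.0, 102.0, 101.0, 105.0, 110.0])
PERIODS = 2  # hypothetical look-back, standing in for FAST_TRAIL / SLOW_TRAIL

trailing = prices.pct_change(periods=PERIODS)

# Same quantity written out explicitly.
manual = prices / prices.shift(PERIODS) - 1

print(pd.DataFrame({"price": prices, "pct_change": trailing, "manual": manual}))
```

The first `PERIODS` rows are NaN because there is no earlier price to compare with, which is why the first 180 rows of the slow factor are empty.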

Other factors to be implemented.
As with the previous ones, backward-looking factors for single stocks:
 - Total Assets to Price = "WC02999" / "MVC" * 100,
 - Moving average of Current Ratio = MA("WC08106"): "CURRENT RATIO" (ability to pay debt),
 - Moving average of Profitability = MA("WC08316"): "OPERATING PROFIT MARGIN".

Also for indices and market conditions:
 - Trailing returns of major indices,
 - Moving average of Profitability = MA("WC08316"): "OPERATING PROFIT MARGIN".

Forward-looking indicators:
 - difference between Call and Put volume for a stock, weighted by open interest: "OI": "Open Interest", "O3": "Implied Volatility - 3 Month Constant Maturity (Cont Series)",
 - difference between Call and Put volume for an index, weighted by open interest: "OI": "Open Interest", "O3": "Implied Volatility - 3 Month Constant Maturity (Cont Series)",
 - slope of out-of-the-money vs at-the-money (i.e. a risk metric of deep out-of-the-money puts).
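
As a starting point for the backward-looking stock factors listed above, here is a minimal sketch on a toy DataFrame that follows the notebook's `ISIN-CODE` column naming. The identifier, the values, the 3-day window, and the output suffixes `TA2P` and `CRMA` are all hypothetical, and the Total Assets to Price formula is read here as "WC02999" / ("MVC" * 100), which is an assumption about the intended unit adjustment.

```python
import pandas as pd

ISIN = "XX0000000000"  # hypothetical identifier
dates = pd.date_range("2004-01-02", periods=5, freq="B")
df_toy = pd.DataFrame({
    ISIN + "-WC02999": [5000.0, 5000.0, 5200.0, 5200.0, 5200.0],  # total assets
    ISIN + "-MVC": [40.0, 41.0, 40.0, 42.0, 43.0],                # market value
    ISIN + "-WC08106": [1.2, 1.2, 1.3, 1.3, 1.4],                 # current ratio
}, index=dates)

MVC_SCALE = 100  # same adjustment the notebook applies to MVC (assumed to carry over)
MA_WINDOW = 3    # hypothetical smoothing window

# Total assets relative to the scaled market value.
assets_to_price = (
    df_toy[ISIN + "-WC02999"] / (df_toy[ISIN + "-MVC"] * MVC_SCALE)
).rename(ISIN + "-TA2P")

# Moving average of the current ratio.
current_ratio_ma = df_toy[ISIN + "-WC08106"].rolling(
    window=MA_WINDOW, min_periods=1).mean().rename(ISIN + "-CRMA")

print(pd.concat([assets_to_price, current_ratio_ma], axis=1))
```

The same pattern (one renamed Series per ISIN, concatenated along `axis=1`) matches how the other factors in this notebook are assembled.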

## 4. Output results

In [49]:
# Save the trading signal
df_signal_ma.to_csv(os.path.join(os.getcwd(),TARGET_FOLDER, 'moving_average_signal.csv'), encoding='utf-8')
df_turtle.to_csv(os.path.join(os.getcwd(),TARGET_FOLDER, 'turtle_signal.csv'), encoding='utf-8')

# Save the trading factors.
df_factor_ts = pd.concat([df_dividend, df_ebibtda2ev, df_slow_trailing_rtn,
           df_fast_trailing_rtn], axis=1,sort=True)
df_factor_ts.to_csv(os.path.join(os.getcwd(),TARGET_FOLDER, 'data_factor.csv'), encoding='utf-8')
df_factor_ts.tail()

Date,PA1436583006-DIVMOM,US22160K1051-DIVMOM,US4581401001-DIVMOM,AN8068571086-DIVMOM,PA1436583006-EBITDA2EV,US22160K1051-EBITDA2EV,US4581401001-EBITDA2EV,AN8068571086-EBITDA2EV,PA1436583006-SLW,US22160K1051-SLW,US4581401001-SLW,AN8068571086-SLW,PA1436583006-FST,US22160K1051-FST,US4581401001-FST,AN8068571086-FST
2013-12-25,0.027459,0.046814,0.048725,0.023964,,,,,0.189385,0.128661,0.111451,0.239787,0.089804,-0.053358,0.064017,0.004093
2013-12-26,0.027391,0.043738,0.048725,0.023964,,,,,0.179242,0.106107,0.099465,0.242563,0.094988,-0.054214,0.07802,0.01097
2013-12-27,0.027391,0.043738,0.048725,0.023964,,,,,0.197179,0.09923,0.081995,0.228646,0.105599,-0.041717,0.080169,0.026256
2013-12-30,0.027067,0.042653,0.048706,0.023964,0.266737,1.442688,2.060182,0.775365,0.15978,0.088335,0.105646,0.214188,0.119665,-0.040785,0.097665,0.017458
2013-12-31,0.027067,0.042653,0.048706,0.023964,,,,,0.165699,0.089229,0.109188,0.228661,0.129958,-0.032122,0.093302,0.032424
