# Introduction

At this point, the dataframes for the three timeframes have been merged. Since trading takes place on the lowest timeframe, we need to select the rows where that timeframe's FLIP column is active (True), i.e. the moments at which a new state flip is detected.

The steps are:

* Load the 'merged' file of each instrument
* Select the rows with FLIP==True on the lowest timeframe
* Select the useful variables among the various FUZ_... columns
* Save the dataframes in ./filtered
* Build the input-output vectors from the lookback window and the prediction horizon
* Split into training and validation sets
* Save each set as a dataframe so it can be reloaded later.


In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
# import required packages
# Append relative path to FuzzyMarketState.py
import os
import sys
sys.path.append('../../common/')

from FuzzyMarketState import FuzzyMarketState
from FuzzyLib import Fuzzifier, FuzzyVar
import MyUtils

import plotly
import plotly.plotly as py
import plotly.graph_objs as go
from plotly.graph_objs import *
from plotly.tools import FigureFactory as FF
import plotly.tools as tls
plotly.offline.init_notebook_mode(connected=True)

import logging
logging.basicConfig(level=logging.DEBUG, stream=sys.stdout)

import pandas as pd
import random
import datetime

print('Packages loaded!!')

Using TensorFlow backend.


Packages loaded!!
DEBUG:matplotlib.pyplot:Loaded backend module://ipykernel.pylab.backend_inline version unknown.


#### Load the 'merged' files

In [19]:
# for each forex-pair load each timeframe
path = '../csv_data/indicators/fuzzified/merged'
items = dict()
fms = 'fms'

# r=root, d=directories, f = files
for r, d, f in os.walk(path):
  for file in f: 
    # file has format: SYMBOL_h1h4d1.csv
    tokens = file.split('_')
    print('tokens={}'.format(tokens))
    if len(tokens) == 2:
      symbol = tokens[0]
      if symbol not in items.keys():
        print('New item: {}'.format(symbol))
        item_file = '{}/{}'.format(path, file)
        print('loading csv file: {}'.format(item_file))
        items[symbol] = {'file' : item_file, 'df' : pd.read_csv(item_file, sep=';')}
        print('loaded {} rows'.format(items[symbol]['df'].shape[0]))
        print('+++++++++++++++++++++++++++++++++++++')



tokens=['AUDUSD', 'h1h4d1.csv']
New item: AUDUSD
loading csv file: ../csv_data/indicators/fuzzified/merged/AUDUSD_h1h4d1.csv
loaded 2245 rows
+++++++++++++++++++++++++++++++++++++
tokens=['EURAUD', 'h1h4d1.csv']
New item: EURAUD
loading csv file: ../csv_data/indicators/fuzzified/merged/EURAUD_h1h4d1.csv
loaded 2245 rows
+++++++++++++++++++++++++++++++++++++
tokens=['EURCAD', 'h1h4d1.csv']
New item: EURCAD
loading csv file: ../csv_data/indicators/fuzzified/merged/EURCAD_h1h4d1.csv
loaded 2245 rows
+++++++++++++++++++++++++++++++++++++
tokens=['EURCHF', 'h1h4d1.csv']
New item: EURCHF
loading csv file: ../csv_data/indicators/fuzzified/merged/EURCHF_h1h4d1.csv
loaded 2245 rows
+++++++++++++++++++++++++++++++++++++
tokens=['EURGBP', 'h1h4d1.csv']
New item: EURGBP
loading csv file: ../csv_data/indicators/fuzzified/merged/EURGBP_h1h4d1.csv
loaded 2245 rows
+++++++++++++++++++++++++++++++++++++
tokens=['EURJPY', 'h1h4d1.csv']
New item: EURJPY
loading csv file: ../csv_data/ind
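
The loading loop walks the whole `merged` tree, including any subfolder created later (such as `filtered`). A minimal alternative sketch, assuming the `SYMBOL_h1h4d1.csv` naming shown in the cell comment, uses `glob` to read only the files sitting directly in the folder. The helper name `load_merged` and the toy files are hypothetical and exist only for illustration.

```python
import glob
import os
import tempfile

import pandas as pd


def load_merged(path, suffix='h1h4d1'):
    """Load every SYMBOL_<suffix>.csv found directly in `path` (no recursion)."""
    items = {}
    pattern = os.path.join(path, '*_{}.csv'.format(suffix))
    for item_file in sorted(glob.glob(pattern)):
        # The symbol is the part of the file name before the first underscore
        symbol = os.path.basename(item_file).split('_')[0]
        items[symbol] = {'file': item_file,
                         'df': pd.read_csv(item_file, sep=';')}
    return items


# Toy usage: two fake merged files plus a subfolder that must be ignored
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, 'filtered'))
toy = pd.DataFrame({'h1_FLIP': [True, False], 'h1_CLOSE': [1.10, 1.11]})
toy.to_csv(os.path.join(demo_dir, 'EURUSD_h1h4d1.csv'), sep=';', index=False)
toy.to_csv(os.path.join(demo_dir, 'GBPUSD_h1h4d1.csv'), sep=';', index=False)
toy.to_csv(os.path.join(demo_dir, 'filtered', 'EURUSD_flips.csv'),
           sep=';', index=False)

demo_items = load_merged(demo_dir)
print(sorted(demo_items.keys()))
```

Matching on the file-name pattern keeps the loader independent of whatever other CSV files end up under the same folder.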

#### Select FLIP == True events on the lowest timeframe

In [20]:
# check columns with FLIP text
for symbol in items.keys():
  df = items[symbol]['df']
  log_columns = ''
  for c in df.columns:
    if 'FLIP' in c:
      log_columns += ' {}'.format(c)
  print('FLIP columns in {} = {}'.format(symbol, log_columns))
  print('+++++++++++++++++++++++++++++++++++++')
    

FLIP columns in AUDUSD =  h1_FLIP h4_FLIP d1_FLIP
+++++++++++++++++++++++++++++++++++++
FLIP columns in EURAUD =  h1_FLIP h4_FLIP d1_FLIP
+++++++++++++++++++++++++++++++++++++
FLIP columns in EURCAD =  h1_FLIP h4_FLIP d1_FLIP
+++++++++++++++++++++++++++++++++++++
FLIP columns in EURCHF =  h1_FLIP h4_FLIP d1_FLIP
+++++++++++++++++++++++++++++++++++++
FLIP columns in EURGBP =  h1_FLIP h4_FLIP d1_FLIP
+++++++++++++++++++++++++++++++++++++
FLIP columns in EURJPY =  h1_FLIP h4_FLIP d1_FLIP
+++++++++++++++++++++++++++++++++++++
FLIP columns in EURNZD =  h1_FLIP h4_FLIP d1_FLIP
+++++++++++++++++++++++++++++++++++++
FLIP columns in EURUSD =  h1_FLIP h4_FLIP d1_FLIP
+++++++++++++++++++++++++++++++++++++
FLIP columns in GBPCAD =  h1_FLIP h4_FLIP d1_FLIP
+++++++++++++++++++++++++++++++++++++
FLIP columns in GBPJPY =  h1_FLIP h4_FLIP d1_FLIP
+++++++++++++++++++++++++++++++++++++
FLIP columns in GBPUSD =  h1_FLIP h4_FLIP d1_FLIP
+++++++++++++++++++++++++++++++++++++
FLIP columns in NZDUSD =  h1_FLI

In [21]:
# filter rows where h1_FLIP == True
for symbol in items.keys():
  df = items[symbol]['df']
  items[symbol]['df_filtered'] = df[(df.h1_FLIP == True)].copy()
  print('Selected {} rows in {}'.format(items[symbol]['df_filtered'].shape[0], symbol))
  print('+++++++++++++++++++++++++++++++++++++')
    

Selected 376 rows in AUDUSD
+++++++++++++++++++++++++++++++++++++
Selected 361 rows in EURAUD
+++++++++++++++++++++++++++++++++++++
Selected 320 rows in EURCAD
+++++++++++++++++++++++++++++++++++++
Selected 341 rows in EURCHF
+++++++++++++++++++++++++++++++++++++
Selected 340 rows in EURGBP
+++++++++++++++++++++++++++++++++++++
Selected 362 rows in EURJPY
+++++++++++++++++++++++++++++++++++++
Selected 328 rows in EURNZD
+++++++++++++++++++++++++++++++++++++
Selected 326 rows in EURUSD
+++++++++++++++++++++++++++++++++++++
Selected 295 rows in GBPCAD
+++++++++++++++++++++++++++++++++++++
Selected 376 rows in GBPJPY
+++++++++++++++++++++++++++++++++++++
Selected 344 rows in GBPUSD
+++++++++++++++++++++++++++++++++++++
Selected 376 rows in NZDUSD
+++++++++++++++++++++++++++++++++++++
Selected 310 rows in USDCAD
+++++++++++++++++++++++++++++++++++++
Selected 343 rows in USDCHF
+++++++++++++++++++++++++++++++++++++
Selected 380 rows in USDJPY
+++++++++++++++++++++++++++++++++++++
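
The filter above is a plain boolean mask on the lowest timeframe. The sketch below reproduces it on a toy dataframe; the values and the `h1_CLOSE` column are made up for illustration, and only the `h1_FLIP`/`h4_FLIP` column names come from the notebook.

```python
import pandas as pd

toy = pd.DataFrame({
    'h1_FLIP': [False, True, False, True, True],
    'h4_FLIP': [False, False, True, False, True],
    'h1_CLOSE': [1.10, 1.11, 1.12, 1.13, 1.14],
})

# Keep rows where the lowest timeframe flips; h4_FLIP plays no role here.
# .copy() gives an independent frame, so adding columns later is safe.
flips = toy.loc[toy['h1_FLIP']].copy()

print(flips.index.tolist())
print('kept {} of {} rows'.format(len(flips), len(toy)))
```

The filtered frame keeps the original row labels, so consecutive rows are consecutive flips rather than consecutive bars. Any lag built on it later therefore counts flips, not hours.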


#### Selecting the FUZ_... variables

We need to select the variables that describe the state of the market and the position of the price relative to some reference (resistance, Fibonacci level, etc.).

The variables that define the market state are:

* FUZ_BOLLINGER_WIDTH_Gx: relative volatility, given by the width of the bands relative to the last 50 periods.
* FUZ_BOLLINGER_b_Gx: position of the price within the bands; signals overbought/oversold states.
* FUZ_BULLISH_TREND_Gx/FUZ_BEARISH_TREND_Gx: strength of the bullish or bearish trend.
* FUZ_TREND_STRENGTH_Gx: strength of the current trend, whichever its direction.
* FUZ_BULL_DIV_STRENGTH_Gx/FUZ_BEAR_DIV_STRENGTH_Gx: strength of the current divergences.
* FUZ_TKLND_WDOW_Gx: current time relative to the Tokyo-London window.
* FUZ_LNDNY_WDOW_Gx: current time relative to the London-NY window.
* FUZ_MKTOPEN_WDOW_Gx: current time relative to the market open.
* FUZ_MKTCLOSE_WDOW_Gx: current time relative to the market close.

The variables that define the price position are:

* FUZ_SMA_SLOW_DISTANCE_Gx: proximity of the price to the slow SMA
* FUZ_SMA_MID_DISTANCE_Gx: proximity of the price to the mid SMA
* FUZ_SMA_FAST_DISTANCE_Gx: proximity of the price to the fast SMA
* FUZ_FIBO_xxx_Gx: proximity of the price to the different FIBO levels
* FUZ_FIBO_RETR_Gx: proximity of the price to the nearest fibo retracement
* FUZ_FIBO_EXTN_Gx: proximity of the price to the nearest fibo extension
* FUZ_SR_DISTANCE_Gx: proximity of the price to a relevant support-resistance level
* FUZ_CHHI_DISTANCE_Gx: proximity of the price to the upper side of a channel
* FUZ_CHLO_DISTANCE_Gx: proximity of the price to the lower side of a channel
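
The selection rule behind these two lists can be sketched as substring matching against the column names, keeping only columns that end in a `_G<n>` group suffix. The helper `select_columns` and the toy column names are hypothetical. Unlike a nested loop over masks, `any()` adds each column at most once even if it matches several masks.

```python
def select_columns(columns, masks):
    """Return columns that contain any mask and carry a _G<n> group suffix."""
    selected = []
    for c in columns:
        if '_G' in c[-5:] and any(m in c for m in masks):
            selected.append(c)
    return selected


toy_columns = [
    'h1_FLIP',
    'h1_FUZ_BOLLINGER_b_G0', 'h1_FUZ_BOLLINGER_b_G1',
    'h4_FUZ_BOLLINGER_b_G0',
    'h1_FUZ_SR_DISTANCE_G0', 'h4_FUZ_SR_DISTANCE_G0',
    'h1_FUZ_SR_DISTANCE',  # hypothetical column without group suffix: skipped
]
mkt_masks = ['FUZ_BOLLINGER_b']
price_masks = ['FUZ_SR_DISTANCE']

# Inputs: market-state and price-position columns from every timeframe
inp = select_columns(toy_columns, mkt_masks + price_masks)
# Outputs: price-position columns of the lowest timeframe only
out = [c for c in select_columns(toy_columns, price_masks)
       if c.startswith('h1_')]

print(inp)
print(out)
```

Because matching is by substring, a misspelled mask fails silently and simply selects nothing, so it is worth printing how many columns each mask matched.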

In [22]:
# list FUZ_.. variables
# market state selection mask
sel_mkt_mask = ['FUZ_BOLLINGER_WIDTH',
                'FUZ_BOLLINGER_b',
                'FUZ_BULLISH_TREND',
                'FUZ_BEARISH_TREND',
                'FUZ_BULL_DIV_STRENGTH',
                'FUZ_BEAR_DIV_STRENGTH',
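                # NOTE: the next four masks are spelled 'FUX_', not 'FUZ_' as in the
                # variable list above, so they only match columns containing 'FUX_'.
                'FUX_TKLND_WDOW',
                'FUX_LNDNY_WDOW',
                'FUX_MKTOPEN_WDOW',
                'FUX_MKTCLOSE_WDOW']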

# price position selection mask
sel_price_mask = ['FUZ_SMA_SLOW_DISTANCE',
                  'FUZ_SMA_MID_DISTANCE',
                  'FUZ_SMA_FAST_DISTANCE',
                  'FUZ_FIBO_',
                  'FUZ_SR_DISTANCE',
                  'FUZ_CHHI_DISTANCE',
                  'FUZ_CHLO_DISTANCE']

inp_vars = []
out_vars = []
for c in items['EURUSD']['df_filtered'].columns:
  for mm in sel_mkt_mask:
    if mm in c and '_G' in c[-5:]:
      inp_vars.append(c)      
  for pp in sel_price_mask:
    if pp in c and '_G' in c[-5:]:
      inp_vars.append(c)
      if 'h1_' in c[0:4]:
        out_vars.append(c)
      
print('inp_vars = {}'.format(inp_vars))
print('+++++++++++++++++++++++++++++++++++++')
print('out_vars = {}'.format(out_vars))

inp_vars = ['h1_FUZ_BOLLINGER_WIDTH_G0', 'h1_FUZ_BOLLINGER_WIDTH_G1', 'h1_FUZ_BOLLINGER_WIDTH_G2', 'h1_FUZ_BOLLINGER_WIDTH_G3', 'h1_FUZ_BOLLINGER_WIDTH_G4', 'h1_FUZ_BOLLINGER_WIDTH_G5', 'h1_FUZ_BOLLINGER_WIDTH_G6', 'h1_FUZ_BOLLINGER_b_G0', 'h1_FUZ_BOLLINGER_b_G1', 'h1_FUZ_BOLLINGER_b_G2', 'h1_FUZ_BOLLINGER_b_G3', 'h1_FUZ_BOLLINGER_b_G4', 'h1_FUZ_BOLLINGER_b_G5', 'h1_FUZ_BOLLINGER_b_G6', 'h1_FUZ_SMA_SLOW_DISTANCE_G0', 'h1_FUZ_SMA_SLOW_DISTANCE_G1', 'h1_FUZ_SMA_SLOW_DISTANCE_G2', 'h1_FUZ_SMA_SLOW_DISTANCE_G3', 'h1_FUZ_SMA_SLOW_DISTANCE_G4', 'h1_FUZ_SMA_MID_DISTANCE_G0', 'h1_FUZ_SMA_MID_DISTANCE_G1', 'h1_FUZ_SMA_MID_DISTANCE_G2', 'h1_FUZ_SMA_MID_DISTANCE_G3', 'h1_FUZ_SMA_MID_DISTANCE_G4', 'h1_FUZ_SMA_FAST_DISTANCE_G0', 'h1_FUZ_SMA_FAST_DISTANCE_G1', 'h1_FUZ_SMA_FAST_DISTANCE_G2', 'h1_FUZ_SMA_FAST_DISTANCE_G3', 'h1_FUZ_SMA_FAST_DISTANCE_G4', 'h1_FUZ_FIBO_023_G0', 'h1_FUZ_FIBO_023_G1', 'h1_FUZ_FIBO_023_G2', 'h1_FUZ_FIBO_023_G3', 'h1_FUZ_FIBO_023_G4', 'h1_FUZ_FIBO_038_G0', 'h1_FUZ_FIBO_038_G

In [27]:
# copy the selected columns as inp_<name> / out_<name> and keep only those
for symbol in items.keys():
  df = items[symbol]['df_filtered']
  for i in inp_vars:
    df['inp_{}'.format(i)] = df[i]
  for o in out_vars:
    df['out_{}'.format(o)] = df[o]  

  # drop every column that doesn't start with inp_ or out_
  drop_cols = []
  for c in df.columns:
    if c[0:4] != 'inp_' and c[0:4] != 'out_':
      drop_cols.append(c)

  df.drop(columns=drop_cols, inplace=True)
  print('New Cols in {} = {}'.format(symbol, df.columns))

New Cols in AUDUSD = Index(['inp_h1_FUZ_BOLLINGER_WIDTH_G0', 'inp_h1_FUZ_BOLLINGER_WIDTH_G1',
       'inp_h1_FUZ_BOLLINGER_WIDTH_G2', 'inp_h1_FUZ_BOLLINGER_WIDTH_G3',
       'inp_h1_FUZ_BOLLINGER_WIDTH_G4', 'inp_h1_FUZ_BOLLINGER_WIDTH_G5',
       'inp_h1_FUZ_BOLLINGER_WIDTH_G6', 'inp_h1_FUZ_BOLLINGER_b_G0',
       'inp_h1_FUZ_BOLLINGER_b_G1', 'inp_h1_FUZ_BOLLINGER_b_G2',
       ...
       'out_h1_FUZ_CHHI_DISTANCE_G0', 'out_h1_FUZ_CHHI_DISTANCE_G1',
       'out_h1_FUZ_CHHI_DISTANCE_G2', 'out_h1_FUZ_CHHI_DISTANCE_G3',
       'out_h1_FUZ_CHHI_DISTANCE_G4', 'out_h1_FUZ_CHLO_DISTANCE_G0',
       'out_h1_FUZ_CHLO_DISTANCE_G1', 'out_h1_FUZ_CHLO_DISTANCE_G2',
       'out_h1_FUZ_CHLO_DISTANCE_G3', 'out_h1_FUZ_CHLO_DISTANCE_G4'],
      dtype='object', length=380)
New Cols in EURAUD = Index(['inp_h1_FUZ_BOLLINGER_WIDTH_G0', 'inp_h1_FUZ_BOLLINGER_WIDTH_G1',
       'inp_h1_FUZ_BOLLINGER_WIDTH_G2', 'inp_h1_FUZ_BOLLINGER_WIDTH_G3',
       'inp_h1_FUZ_BOLLINGER_WIDTH_G4', 'inp_h1_FUZ_BOLLINGER_WIDTH_

New Cols in USDCAD = Index(['inp_h1_FUZ_BOLLINGER_WIDTH_G0', 'inp_h1_FUZ_BOLLINGER_WIDTH_G1',
       'inp_h1_FUZ_BOLLINGER_WIDTH_G2', 'inp_h1_FUZ_BOLLINGER_WIDTH_G3',
       'inp_h1_FUZ_BOLLINGER_WIDTH_G4', 'inp_h1_FUZ_BOLLINGER_WIDTH_G5',
       'inp_h1_FUZ_BOLLINGER_WIDTH_G6', 'inp_h1_FUZ_BOLLINGER_b_G0',
       'inp_h1_FUZ_BOLLINGER_b_G1', 'inp_h1_FUZ_BOLLINGER_b_G2',
       ...
       'out_h1_FUZ_CHHI_DISTANCE_G0', 'out_h1_FUZ_CHHI_DISTANCE_G1',
       'out_h1_FUZ_CHHI_DISTANCE_G2', 'out_h1_FUZ_CHHI_DISTANCE_G3',
       'out_h1_FUZ_CHHI_DISTANCE_G4', 'out_h1_FUZ_CHLO_DISTANCE_G0',
       'out_h1_FUZ_CHLO_DISTANCE_G1', 'out_h1_FUZ_CHLO_DISTANCE_G2',
       'out_h1_FUZ_CHLO_DISTANCE_G3', 'out_h1_FUZ_CHLO_DISTANCE_G4'],
      dtype='object', length=380)
New Cols in USDCHF = Index(['inp_h1_FUZ_BOLLINGER_WIDTH_G0', 'inp_h1_FUZ_BOLLINGER_WIDTH_G1',
       'inp_h1_FUZ_BOLLINGER_WIDTH_G2', 'inp_h1_FUZ_BOLLINGER_WIDTH_G3',
       'inp_h1_FUZ_BOLLINGER_WIDTH_G4', 'inp_h1_FUZ_BOLLINGER_WIDTH_
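
The same input/output layout can be built without adding columns one at a time. The sketch below is an alternative, not the notebook's code: it uses `DataFrame.add_prefix` and `pd.concat` on a toy frame whose values and `h1_CLOSE` column are invented for illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    'h1_FUZ_BOLLINGER_b_G0': [0.0, 1.0],
    'h1_FUZ_SR_DISTANCE_G0': [0.3, 0.7],
    'h1_CLOSE': [1.10, 1.11],
})
inp_vars = ['h1_FUZ_BOLLINGER_b_G0', 'h1_FUZ_SR_DISTANCE_G0']
out_vars = ['h1_FUZ_SR_DISTANCE_G0']

# Inputs first, then outputs; unselected columns are simply never copied
renamed = pd.concat(
    [toy[inp_vars].add_prefix('inp_'), toy[out_vars].add_prefix('out_')],
    axis=1,
)
print(renamed.columns.tolist())
```

A column that serves as both input and output appears twice, once under each prefix, which matches the 300 + 80 = 380 columns reported above.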

#### Save the filtered dataframes

In [30]:
# save dataframes in filtered folder
path = '../csv_data/indicators/fuzzified/merged/filtered'
for symbol in items.keys():
  items[symbol]['df_filtered'].to_csv('{}/{}_flips.csv'.format(path, symbol), sep=';')
  print('Saved file {}'.format('{}/{}_flips.csv'.format(path, symbol)))
  print('+++++++++++++++++++++++++++++++++++++')
   

Saved file ../csv_data/indicators/fuzzified/merged/filtered/AUDUSD_flips.csv
+++++++++++++++++++++++++++++++++++++
Saved file ../csv_data/indicators/fuzzified/merged/filtered/EURAUD_flips.csv
+++++++++++++++++++++++++++++++++++++
Saved file ../csv_data/indicators/fuzzified/merged/filtered/EURCAD_flips.csv
+++++++++++++++++++++++++++++++++++++
Saved file ../csv_data/indicators/fuzzified/merged/filtered/EURCHF_flips.csv
+++++++++++++++++++++++++++++++++++++
Saved file ../csv_data/indicators/fuzzified/merged/filtered/EURGBP_flips.csv
+++++++++++++++++++++++++++++++++++++
Saved file ../csv_data/indicators/fuzzified/merged/filtered/EURJPY_flips.csv
+++++++++++++++++++++++++++++++++++++
Saved file ../csv_data/indicators/fuzzified/merged/filtered/EURNZD_flips.csv
+++++++++++++++++++++++++++++++++++++
Saved file ../csv_data/indicators/fuzzified/merged/filtered/EURUSD_flips.csv
+++++++++++++++++++++++++++++++++++++
Saved file ../csv_data/indicators/fuzzified/merged/filtered/GBPCAD_flips.csv
+++
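
Since these files are meant to be reloaded later, a quick round trip is a useful check. The sketch below writes a toy frame to a temporary folder and reads it back; the temporary path and the values are hypothetical. Because `to_csv` writes the row labels as the first column, the reload passes `index_col=0`.

```python
import os
import tempfile

import pandas as pd

flips = pd.DataFrame(
    {'inp_h1_FUZ_BOLLINGER_b_G0': [0.25, 0.75],
     'out_h1_FUZ_SR_DISTANCE_G0': [0.5, 1.0]},
    index=[3, 17],  # row labels that survived the FLIP filter
)

out_dir = os.path.join(tempfile.mkdtemp(), 'filtered')
os.makedirs(out_dir, exist_ok=True)  # to_csv does not create folders
out_file = os.path.join(out_dir, 'EURUSD_flips.csv')

# Same separator on both sides, and the first column restored as the index
flips.to_csv(out_file, sep=';')
reloaded = pd.read_csv(out_file, sep=';', index_col=0)

print(reloaded)
```

Without `index_col=0` the saved row labels would come back as an extra unnamed data column.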

#### Conversion to supervised training pairs with lookback=6 and prediction=2

In [43]:
# build supervised dataframes
for symbol in items.keys():
  num_inputs = 0
  num_outputs = 0
  for c in items[symbol]['df_filtered']:
    if 'inp_' in c:
      num_inputs += 1
    if 'out_' in c:
      num_outputs += 1      
  print('Processing {} with shape {}, num_inputs={}, num_outputs={}'.format(symbol, items[symbol]['df_filtered'].shape, num_inputs, num_outputs))
  items[symbol]['df_supervised'] = MyUtils.series_to_supervised(items[symbol]['df_filtered'], num_inputs=num_inputs, num_outputs=num_outputs, n_in=6, n_out=2, dropnan=False)
  print('Built supervised dataframe with shape {}'.format(items[symbol]['df_supervised'].shape))
  print('+++++++++++++++++++++++++++++++++++++')
  


Processing AUDUSD with shape (376, 380), num_inputs=300, num_outputs=80
Built supervised dataframe with shape (376, 1960)
+++++++++++++++++++++++++++++++++++++
Processing EURAUD with shape (361, 380), num_inputs=300, num_outputs=80
Built supervised dataframe with shape (361, 1960)
+++++++++++++++++++++++++++++++++++++
Processing EURCAD with shape (320, 380), num_inputs=300, num_outputs=80
Built supervised dataframe with shape (320, 1960)
+++++++++++++++++++++++++++++++++++++
Processing EURCHF with shape (341, 380), num_inputs=300, num_outputs=80
Built supervised dataframe with shape (341, 1960)
+++++++++++++++++++++++++++++++++++++
Processing EURGBP with shape (340, 380), num_inputs=300, num_outputs=80
Built supervised dataframe with shape (340, 1960)
+++++++++++++++++++++++++++++++++++++
Processing EURJPY with shape (362, 380), num_inputs=300, num_outputs=80
Built supervised dataframe with shape (362, 1960)
+++++++++++++++++++++++++++++++++++++
Processing EURNZD with shape (328, 380),
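
The reported width follows from the window sizes: 300 inputs × 6 lags + 80 outputs × 2 steps = 1960 columns. `MyUtils.series_to_supervised` is not shown in this notebook, so the sketch below is only a stand-in built on `DataFrame.shift`. The function name `to_supervised`, the `(t)` and `(t+k)` labels, and the toy data are assumptions; only the `(t-5)` style for input lags appears in the notebook's output.

```python
import pandas as pd


def to_supervised(df, n_in=6, n_out=2, dropnan=False):
    """Stand-in sketch: lag inp_ columns n_in times, lead out_ columns n_out times."""
    inp_cols = [c for c in df.columns if c.startswith('inp_')]
    out_cols = [c for c in df.columns if c.startswith('out_')]
    parts = []
    # Inputs at t-(n_in-1), ..., t-1, t
    for lag in range(n_in - 1, -1, -1):
        shifted = df[inp_cols].shift(lag)
        tag = '(t-{})'.format(lag) if lag > 0 else '(t)'
        shifted.columns = ['{}{}'.format(c, tag) for c in inp_cols]
        parts.append(shifted)
    # Outputs at t+1, ..., t+n_out
    for lead in range(1, n_out + 1):
        shifted = df[out_cols].shift(-lead)
        shifted.columns = ['{}(t+{})'.format(c, lead) for c in out_cols]
        parts.append(shifted)
    result = pd.concat(parts, axis=1)
    return result.dropna() if dropnan else result


toy = pd.DataFrame({
    'inp_a': [float(i) for i in range(8)],
    'inp_b': [float(10 + i) for i in range(8)],
    'inp_c': [float(20 + i) for i in range(8)],
    'out_y': [float(100 + i) for i in range(8)],
})

sup = to_supervised(toy)                     # keeps all 8 rows, NaN at the edges
sup_clean = to_supervised(toy, dropnan=True)
print(sup.shape, sup_clean.shape)
```

With `dropnan=False`, as in the cell above, the first `n_in - 1` rows and the last `n_out` rows contain NaN values and have to be dropped or masked before training.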

In [47]:
# inspect every column of one supervised row (row 40 of EURUSD)
for c in items['EURUSD']['df_supervised'].columns:
  print('col={}, value={}'.format(c, items['EURUSD']['df_supervised'][c].iloc[40]))


col=inp_h1_FUZ_BOLLINGER_WIDTH_G0(t-5), value=0.0
col=inp_h1_FUZ_BOLLINGER_WIDTH_G1(t-5), value=0.0
col=inp_h1_FUZ_BOLLINGER_WIDTH_G2(t-5), value=0.7220984992784714
col=inp_h1_FUZ_BOLLINGER_WIDTH_G3(t-5), value=0.2779015007215286
col=inp_h1_FUZ_BOLLINGER_WIDTH_G4(t-5), value=0.0
col=inp_h1_FUZ_BOLLINGER_WIDTH_G5(t-5), value=0.0
col=inp_h1_FUZ_BOLLINGER_WIDTH_G6(t-5), value=0.0
col=inp_h1_FUZ_BOLLINGER_b_G0(t-5), value=0.0
col=inp_h1_FUZ_BOLLINGER_b_G1(t-5), value=0.0
col=inp_h1_FUZ_BOLLINGER_b_G2(t-5), value=0.9610497328076071
col=inp_h1_FUZ_BOLLINGER_b_G3(t-5), value=0.0
col=inp_h1_FUZ_BOLLINGER_b_G4(t-5), value=0.0
col=inp_h1_FUZ_BOLLINGER_b_G5(t-5), value=0.0
col=inp_h1_FUZ_BOLLINGER_b_G6(t-5), value=0.0
col=inp_h1_FUZ_SMA_SLOW_DISTANCE_G0(t-5), value=0.0
col=inp_h1_FUZ_SMA_SLOW_DISTANCE_G1(t-5), value=0.0
col=inp_h1_FUZ_SMA_SLOW_DISTANCE_G2(t-5), value=0.0
col=inp_h1_FUZ_SMA_SLOW_DISTANCE_G3(t-5), value=0.7125019545667819
col=inp_h1_FUZ_SMA_SLOW_DISTANCE_G4(t-5), value=0.2874980454