<a href="https://colab.research.google.com/github/rrrudolph/trade/blob/master/Swing_Finder.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Format OHLC Data

In [15]:

import pandas as pd
import datetime as dt
import numpy as np
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots
from google.colab import drive
drive.mount('/content/drive')

# pd.set_option('display.max_rows', 100)

df = pd.read_csv('drive/My Drive/Colab Notebooks/EURUSD15.csv')

# this stuff is to format an original file but is no longer needed for eurusd

# df = df.set_axis(['Date', 'Time','O','H','L','C','V'], axis=1, inplace=False)
# df['DT'] = df['Date'] + ' '+df['Time']
df['DT'] = pd.to_datetime(df['DT'])
# df.drop(columns=['Date','Time'],inplace=True)
# df = df.reindex(columns=['DT','O','H','L','C','V','D_Range','ADR','Frac_H','Frac_L'])
# df['DT'] -= pd.Timedelta(hours=7) # minus 7 hours to match CST

# if you want to re-iterate over the highs and lows un-comment these lines:
# df['Locked_H'] = 0
# df['Locked_L'] = 0

df.reset_index(drop=True,inplace=True)  # reset the index (numbers are missing)

df["DT"].describe()

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


count                   65233
unique                  65233
top       2019-08-01 18:30:00
freq                        1
first     2017-07-28 02:15:00
last      2020-03-13 22:45:00
Name: DT, dtype: object

# Get Daily Range 

Create a rolling mean of "average daily range" 


In [0]:

# daily range = the day's highest high minus its lowest low,
# broadcast back onto every 15m bar of that day
day = df['DT'].dt.date
df['D_Range'] = df.groupby(day)['H'].transform('max') - df.groupby(day)['L'].transform('min')

# ADR
#  There are 96 15m candles per day and I want a 10 day average.
#  The original label slice i-960:i is inclusive, so each window covers
#  up to 961 rows; min_periods=1 lets the first rows average whatever
#  history exists, so no try block is needed.
df['ADR'] = df['D_Range'].rolling(window=961, min_periods=1).mean().round(5)


In [0]:
df.tail()

# Identify Fractals

For each bar, walk forwards and backwards to see how far you get while the high remains the highest high. Do this separately for the highs and the lows. While walking forward, also record how far price moved away from that fractal (Sw_Size). The smaller of the two counts (forward vs. backward) is the Frac_H/Frac_L score.

There are 4 try blocks total for the highs, lows, forwards, and backwards. Exceptions are hit when the loop reaches either end of the df. I need these to know whether a row can be locked from further iterations.

While a market is unfolding, the fractals need to be recalculated for each bar until the forward walk value has been determined with no exceptions raised. Once that happens the value is locked in place, and even though prior fractals may still need to be re-iterated over, the locked ones will not. This pic shows a locked fractal while a previous fractal still needs to be re-iterated into the future. http://prntscr.com/rhc9zt

Unfortunately, to account for edge cases I need to create a column specific to highs and lows. 

There is also a time limit for a fractal to be locked. Once a bar has been the highest high/lowest low for 120 bars it is considered a major swing, and there's no need to know the exact bar count beyond that. (The code below currently caps the highs at 96 bars and the lows at 120.)
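
The scoring and locking rules described above can be sketched on a plain list of highs. This is only an illustration under assumptions: the function name `fractal_high_score` and the toy prices are hypothetical, and the 120-bar cap is taken from the text. A row counts as locked when the forward walk stopped on a real bar (or the cap) rather than running out of data.

```python
from typing import Sequence, Tuple

def fractal_high_score(highs: Sequence[float], i: int, limit: int = 120) -> Tuple[int, bool]:
    """Return (score, locked) for the high at position i (hypothetical helper)."""
    n = len(highs)

    # walk forwards while this high stays strictly above the later highs
    count_next = 0
    j = i + 1
    while j < n and count_next < limit and highs[i] > highs[j]:
        j += 1
        count_next += 1
    # if the walk stopped before the end of the data, later bars cannot change it
    locked = j < n

    # walk backwards while this high is at least as high as the earlier highs
    count_prior = 0
    k = i - 1
    while k >= 0 and count_prior < limit and highs[i] >= highs[k]:
        k -= 1
        count_prior += 1

    # the fractal score is the smaller of the two counts
    return min(count_prior, count_next), locked

toy_highs = [1, 2, 3, 5, 4, 3, 2, 6]
peak = fractal_high_score(toy_highs, 3)   # bar 3 dominates 3 bars on each side
last = fractal_high_score(toy_highs, 7)   # last bar has no future data yet
print(peak, last)
```

The last bar comes back unlocked because its forward walk hit the end of the data, which is the case where the value has to be recalculated once new bars arrive. The same function applied to negated lows would give the Frac_L side.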

In [0]:
# %%timeit
highs = df.loc[:,'H'].values
lows = df.loc[:,'L'].values 

for i in range(0, len(df)): 
  low = lows[i]
  high = highs[i]
 
  # the order below is highs forwards, highs backwards

  if not df.loc[i,'Locked_H']:
    try:   # iter forwards 
      count_prior = 0   # initialize counters and such
      count_next = 0
      next_ = i + 1
      prior_ = i - 1

      while high > highs[next_] and count_next < 96: # iter limit
        next_ +=1
        count_next +=1
      df.loc[i,'Locked_H'] = 1 # if it hasn't error'd lock the row 
    except:
      pass

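    try: # iter backwards
      # stop at the first row: a negative index would silently wrap
      # around to the end of the array instead of raising
      while prior_ >= 0 and high >= highs[prior_] and count_prior < 96: 
        prior_ -=1
        count_prior +=1
    except IndexError:
      pass 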
    df.loc[i,'Frac_H'] = min([count_prior, count_next])
    try: # this will return the swing size if frac > 1
      frac_h = int(df.loc[i,'Frac_H'])  # cast: a float can't be used as a slice bound
      if frac_h > 1:  # min() raises on an empty slice, hence the try block
        # written to Sw_Size, the column the later cells read
        df.loc[i,'Sw_Size'] = min(lows[i+1:i+frac_h]) - high
    except Exception:
      df.loc[i,'Sw_Size'] = 0

  # now the same thing but for the lows
  if not df.loc[i,'Locked_L']:
    try: # iter forwards 
      count_prior = 0
      count_next = 0
      next_ = i + 1
      prior_ = i - 1

      while low < lows[next_] and count_next < 120: 
        next_ +=1
        count_next +=1
      df.loc[i,'Locked_L'] = 1 # if it hasn't error'd set the lock value 
    except IndexError:
      pass

    try: # iter backwards
      # stop at the first row so the index can't wrap around to the end
      while prior_ >= 0 and low <= lows[prior_] and count_prior < 120: 
        prior_ -=1
        count_prior +=1
    except IndexError:
      pass
    df.loc[i,'Frac_L'] = min([count_prior, count_next])

    # find swing distance
    # only overwrite if the new value is greater than the last
    try:  
      frac_l = int(df.loc[i,'Frac_L'])
      if (frac_l > 1
          and max(highs[i:i+count_next]) - low > abs(df.loc[i,'Sw_Size'])):
        # the slice end is relative to the current row, so it needs the i + offset
        df.loc[i,'Sw_Size'] = max(highs[i+1:i+frac_l]) - low
    except Exception:
        pass


# Swings

I need to know the swing size (Sw_Size) in relation to the average daily range in order for it to mean anything.

Also, a fractal size needs to correlate to its swing size. A large fractal with a tiny swing is mostly irrelevant, as that would signify a sideways market. Likewise, a large swing on a tiny fractal would signify it was just a slight pullback within a larger swing. Here is an example: http://prntscr.com/rhc2qh

In that situation it would be best to only look at the fractals from further up. 

So, to work with these swings quantitatively I need a swing rating, which will be a combination of Frac_H/Frac_L and Sw_Size. However, for the largest swings, where price makes a massive move, I don't want those swing ratings getting excessively large. In that situation I want to curb the swing size weighting so it can't overly boost the rating. I'll make a function that decreases the weight as the size increases. It looks like this: http://prntscr.com/sgbra7
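
To make the tapering concrete, here is a small sketch of that kind of weight curve on a handful of made-up swing percentages. The helper name `swing_weights` and the sample values are assumptions for illustration; the curve itself, an exponential in the swing-to-ADR percentage that is then min-max normalised, mirrors the description above.

```python
import numpy as np

def swing_weights(sw_pct):
    """Min-max normalised weights that shrink as |swing / ADR| grows (hypothetical helper)."""
    pct = np.abs(np.asarray(sw_pct, dtype=float))
    # negative exponential curve: bigger swings give more negative raw values
    raw = -(1.02 ** (pct * 100))
    # rescale so the smallest swing gets weight 1 and the largest gets weight 0
    return (raw - raw.min()) / (raw.max() - raw.min())

sample_pct = [0.0, 0.12, 0.5, 1.0, 2.0]   # swing size as a fraction of ADR
w = swing_weights(sample_pct)
print(np.round(w, 4))
```

Two things worth noticing in this sketch: the weights depend on the largest swing in the data set (it always ends up at 0), and an input where every swing is the same size would divide by zero, so real data needs at least two distinct values.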

In [16]:
# get the swing size in relation to ADR
df['Sw_Pct'] = round(df['Sw_Size']/df['ADR'], 2)

# create a function of a curved line that will act as a weight
# to use on the swing size. this will taper off the extreme
# swing sizes to balance out the swing ratings
# function for curved line; the parentheses spell out what Python already
# does here, since ** binds tighter than the unary minus
df['Sw_Weight'] = -(1.02 ** abs(df['Sw_Pct']*100))
weight = df['Sw_Weight'][abs(df['Sw_Weight'].values) > 0] # series of non-NaN weights

# first fill NaNs with mean of group (assigned back rather than inplace,
# because weight is a slice of df)
weight = weight.fillna(value=weight.mean())
# normalize so it's a weight between 0 and 1
df['Sw_Weight'] = (weight - weight.min()) / (weight.max() - weight.min())
# now that the weights are normalized, multiply the swing by its weight
final = df['Sw_Size'] * df['Sw_Weight']

temp = df[abs(df['Sw_Size']) > 0]
# add the fractal rating to its swing and divide by 2
df['Sw_Rating'] = round((abs(final) * 20 + temp['Frac_H'] + temp['Frac_L']) / 2)
# technically that's not perfect but it will work for 99% of cases


# uncomment to visualize the weight coefficient for the data set
# fig = px.scatter(x=abs(df['Sw_Pct']), y=df["Sw_Weight"])
# fig.show()

In [10]:

# overwrite the file

# df.to_csv('EURUSD15.csv', index=False)
# !cp EURUSD15.csv "drive/My Drive/Colab Notebooks/"  
df.head()

Unnamed: 0,DT,O,H,L,C,V,D_Range,ADR,Frac_H,Frac_L,Locked_H,Locked_L,Sw_Rating,Sw_Size,Sw_Pct,Sw_Weight
0,2017-07-28 02:15:00,1.16814,1.16826,1.1681,1.1682,195,0.00928,0.00928,0.0,2.0,1,1,1.0,0.00111,0.12,0.999846
1,2017-07-28 02:30:00,1.16821,1.16846,1.16814,1.16824,297,0.00928,0.00928,0.0,0.0,1,1,,0.0,0.0,1.0
2,2017-07-28 02:45:00,1.16824,1.16912,1.16824,1.16896,619,0.00928,0.00928,0.0,0.0,1,1,,0.0,0.0,1.0
3,2017-07-28 03:00:00,1.16896,1.16921,1.16864,1.16879,954,0.00928,0.00928,10.0,0.0,1,1,5.0,-0.00211,-0.23,0.99967
4,2017-07-28 03:15:00,1.1688,1.16895,1.16855,1.1688,709,0.00928,0.00928,0.0,0.0,1,1,,0.0,0.0,1.0


# Charting

In [0]:

# import pandas as pd
# import datetime as dt
# import numpy as np
# import plotly.graph_objects as go
# from plotly.subplots import make_subplots
# import plotly.express as px
# from google.colab import drive
# drive.mount('/content/drive')

df = pd.read_csv('drive/My Drive/Colab Notebooks/EURUSD15.csv')
df = df[62906:63140]


In [12]:
fig = go.Figure(data=[go.Candlestick(x=df['DT'],
                open=df['O'], high=df['H'],
                low=df['L'], close=df['C'])])
# swings = df['Sw_Rating'][df['Sw_Size'] < 0].append(df['Sw_Rating'][df['Sw_Size'] < 0])
hix = df['DT'][df['Sw_Size'] < 0]
lox = df['DT'][df['Sw_Size'] > 0]
hi = df['H'][df['Sw_Size'] < 0]
low = df['L'][df['Sw_Size'] > 0]
# ratings = round(df['Frac_H'][df['Sw_Size'] < 0],0)
# ratingsl = round(df['Frac_L'][df['Sw_Size'] > 0],0)

ratings = round(df['Sw_Rating'][df['Sw_Size'] < 0],0)
ratingsl = round(df['Sw_Rating'][df['Sw_Size'] > 0],0) 


fig.add_trace(go.Scatter(
    x=hix,
    y=hi,
    mode="markers+text",
    name="Swing Ratings",
    text=ratings,
    textposition="top center"
    # yaxis= 'y2'
))

fig.add_trace(go.Scatter(
    x=lox,
    y=low,
    mode="markers+text",
    name="Swing Ratings",
    text=ratingsl.values,
    textposition="bottom center"
    # yaxis= 'y2'
))

    

fig.update_layout(
    autosize=False,
    width=1100,
    height=800,
    margin=dict(
        l=50,
        r=50,
        b=100,
        t=100,
        pad=4
    ),
    paper_bgcolor="LightSteelBlue",
    
)

fig.update(layout_xaxis_rangeslider_visible=False)

# len(df['DT'][df['Sw_Rating'] > 0])


In [0]:
# redoing the fractal finder loop
# # pandas attempt 2  - this comes out twice as slow! 

# %%timeit

# # i have to supply row numbers incrementally so it seems I cant
# # avoid at least the inital loop

# for i in range(0, len(df)):

#   # it looks like having the same length arrays to vectorize
#   # over is unavoidable so I wonder if I could just create a tempory
#   # column filled with copies of df.loc[i]
#   df['i'] = df.index[i]
#   df['h'] = df.loc[i,'H']
#   df['l'] = df.loc[i,'L']
  
#   # looking forwards, find the nearest row with a higher high than current
#   try:  # find min of index > than index[i], with a high > high[i]
#     nextH = min(df.index[(df.index > df['i']) & (df['H'] > df.loc[i,'H'])])
#     nextH = nextH - df.index[i] # the row count between
#   except: # if min() arg is an empty sequence the forwards count is 0
#     nextH = 0
#   try:  # look backwards at the highs
#     prev = max(df.index[(df.index < df['i']) & (df['H'] > df.loc[i,'H'])])
#     prev = df.index[i] - prev
#   except: 
#     prev = 0
#   df.loc[i,'pandas_H'] = min([nextH, prev])

#   try: # forwards for lows
#     nextL = min(df.index[(df.index > df['i']) & (df['L'] < df.loc[i,'L'])])
#     nextL = nextL - df.index[i]
#   except: 
#     nextL = 0
#   try:  # backwards for lows
#     prev = max(df.index[(df.index < df['i']) & (df['L'] < df.loc[i,'L'])])
#     prev = df.index[i] - prev
#   except: 
#     prev = 0
#   df.loc[i,'pandas_L'] = min([nextL, prev])
  
#   # these are the rows which contain the next highest high / lowest low
#   # I'll find the lowest low in between current rows high and next higher high
#   next_h = df.loc[count_nextH,'H']
#   next_l = df.loc[count_nextL,'L']

#   if nextH > 1: # if not 0  # i'll wanna cap these at a certain limit later
#     df.loc[i,'pandas_swing'] = min(df.loc[i:i+nextH,'L']) - df.loc[i,'H']
#   if (nextL > 1 and # and will not overwrite existing value with new smaller value
#     max(df.loc[i+1:i+nextL,'H']) - df.loc[i,'L'] > df.loc[i,'pandas_swing']):
#     df.loc[i,'pandas_swing'] = max(df.loc[i+1:i+nextL,'H']) - df.loc[i,'L']



  

# # last_highest = max(df['H']([df.index < df.iloc[i].index).all() &
# #                       (df['H'] < df.loc[i,'H']).all()])

# # for i in range(0, len(df)): 
# #   low = lows[i]
# #   high = highs[i]
# # df['H'][df.loc[:i-1].index] < df.iloc[i].index

Potential trade triggers:
1. Low-volume pin bar: http://prntscr.com/rg3go1
2. Scenario B: consecutive bars in the same direction with decreasing volume. http://prntscr.com/rg3kf1
3. If I identify lots of similar swing prices in a short span of time, I can place trades to catch stop hunts (http://prntscr.com/rgz9js). The same goes for when the ratio of swing size to fractal size is high and the daily range is small, which would mean there's lots of volatility going nowhere.

Zone creation:

2. Scenario A (http://prntscr.com/rg3kf1): high-volume bars that are followed by a lower-volume bar and an impulse. Entry could be a low-volume opposing candle in the zone. All candles from Scenario B could also be used as a zone. Low-volume rejections. Prior swings at the same price area could be a rating factor.
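
As a rough starting point for Scenario B, the "consecutive bars of the same direction with decreasing volume" condition could be flagged like this. Everything here is an assumption for illustration: the function name `decreasing_volume_run`, the run length `n=3`, and the toy bars. It only reuses the O/C/V column names from the data set.

```python
import pandas as pd

def decreasing_volume_run(bars: pd.DataFrame, n: int = 3) -> pd.Series:
    """Flag bars that end a run of n same-direction bars with falling volume (hypothetical helper)."""
    # +1 for an up bar, -1 for a down bar, 0 when open equals close
    direction = (bars['C'] > bars['O']).astype(int) - (bars['C'] < bars['O']).astype(int)
    same_dir = pd.Series(True, index=bars.index)
    vol_down = pd.Series(True, index=bars.index)
    for k in range(1, n):
        # compare each bar in the run with the bar before it
        same_dir &= direction.shift(k - 1) == direction.shift(k)
        vol_down &= bars['V'].shift(k - 1) < bars['V'].shift(k)
    # rows without enough history compare against NaN and come out False
    return same_dir & vol_down & (direction != 0)

toy = pd.DataFrame({
    'O': [1, 2, 3, 4, 5],
    'C': [2, 3, 4, 5, 4],
    'V': [100, 90, 80, 85, 70],
})
flags = decreasing_volume_run(toy, n=3)
print(flags.tolist())
```

In the toy data only the third bar is flagged: it closes a run of three up bars with volume falling from 100 to 90 to 80. The fourth bar breaks the volume condition and the fifth changes direction.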
