## Crossover Reinforcement using ML
* We set the shorter SMA to 6 periods.
* We set the longer SMA to 20 periods.
* We set the directional SMA, which determines the general trend, to 100 periods.
* We only trade in the direction of the market.
* We set the stop loss and take profit dynamically using the ATR. We test stop loss multipliers of 1 and 1.5.
* The take profit is hard-coded to 2x the stop loss distance, giving a risk-to-reward ratio of 1:2.
* 1. When a crossover occurs while price is above the sma_100, we go long.
* 2. When a crossunder occurs while price is below the sma_100, we go short. <br>
* <i> These two rules serve as our baseline model for forecasting.</i>
## What is the next step?
1. The next step is to build the lower stop loss, upper stop loss, lower take profit, and upper take profit for each candle.
2. For crossovers that meet our defined conditions, we check whether the market hits the stop loss or the take profit first and return the corresponding value.
  * Returns 1 if we went long and the market hit the upper take profit
  * Returns 2 if we went short and the market hit the lower take profit
  * Returns 3 if the crossover conditions were met but the stop loss was hit first (we return 3 rather than 0, which is left for candles with no trade)

## Next step
- We extract these features after checking whether the take profits were hit, to save processing power during the <i>long iteration process</i>.
* We add other technical indicators to aid the `rfClassifier` in making decisions.
  * 1. The Bollinger Band width
  * 2. The ROC over 3 and 9 candles (2 columns)
  * 3. The ADX with a lookback period of 14
  * 4. Whether the candle is a momentum candle, judged by its body size (a boolean stored as an int)
  * 5. The body size of each candle
  <br>
  <i> Save this data into a copy, the `preserved dataset`, to keep the time series consistent when new rows are appended to the bottom and feature extraction needs the previous rows. </i> <br>
  <br>
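
The candle-based features in the list above could be sketched roughly as follows, using plain pandas. The function name `add_candle_features`, the column names, and the momentum rule (a body larger than 1.5x its 20-candle rolling mean) are assumptions made for illustration; the notebook does not define what counts as a momentum candle.

```python
import pandas as pd

def add_candle_features(df, momentum_factor=1.5, momentum_window=20):
    """Add ROC, body-size and momentum-candle columns to an OHLC frame."""
    out = df.copy()
    close = out["Close"]

    # Rate of change of the close over 3 and 9 candles, in percent.
    out["roc_3"] = (close / close.shift(3) - 1) * 100
    out["roc_9"] = (close / close.shift(9) - 1) * 100

    # Absolute size of the candle body.
    out["body_size"] = (out["Close"] - out["Open"]).abs()

    # ASSUMED rule: a momentum candle has a body larger than
    # momentum_factor times the rolling mean body size.
    avg_body = out["body_size"].rolling(momentum_window).mean()
    out["is_momentum"] = (out["body_size"] > momentum_factor * avg_body).astype(int)
    return out
```

The ADX column is left out to keep the sketch dependency-free; with the `ta` package already used in this notebook it should be available through `ta.trend.ADXIndicator(df['High'], df['Low'], df['Close'], window=14).adx()`.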
#### `Depending on how the model reacts, we can shift earlier row values into columns to show how price and features transition. This can sometimes result in overfitting.`
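
A minimal sketch of that shifting idea, assuming it means copying values from earlier candles into extra columns of the current row (lag features). The helper name `add_lag_features` and the `_lagN` column suffix are hypothetical.

```python
import pandas as pd

def add_lag_features(df, columns, lags=(1, 2, 3)):
    """Copy values from earlier rows into new columns of the current row."""
    out = df.copy()
    for col in columns:
        for lag in lags:
            # shift(lag) with a positive lag looks backwards in time, so the
            # current row never sees values from future candles.
            out[f"{col}_lag{lag}"] = out[col].shift(lag)
    return out
```

Each lag adds one column per feature, so the feature count grows quickly, which is one way the overfitting mentioned above can creep in. The first `max(lags)` rows contain NaN and would be removed by the null-row drop.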

## STEPS TO FOLLOW
<i> FIRST, we drop all null rows </i>
1. Extract the rows with a crossover, along with their features and targets.
2. Divide the data 75/25.
  * Save the 25% portion into another file to serve as external data.
  * Use the 75% portion for training and testing (further splitting it 75/25).
3. Use a random forest classifier to predict whether the trade results in 1, 2, or 3.
  * 1 ==> the model predicted going long, and our upper tp was hit
  * 2 ==> the model predicted going short, and our lower tp was hit
  * 3 ==> the model predicted going either long or short, and neither take profit was hit

4. Determine the accuracy of the model. Then load the external dataset and determine the accuracy on it.
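
The steps above can be sketched as below. This is only an outline under stated assumptions: the function name `split_and_fit` is hypothetical, the target column is assumed to be `trade_result`, and `shuffle=False` is an assumption that keeps the rows in time order so the external 25% is the most recent data (the notebook does not say whether the split is chronological).

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def split_and_fit(signals, feature_cols, target_col="trade_result", seed=42):
    """Hold out 25% as external data, then split the rest 75/25 and fit."""
    feature_cols = list(feature_cols)
    signals = signals.dropna(subset=feature_cols + [target_col])

    # ASSUMPTION: shuffle=False preserves time order in both splits.
    internal, external = train_test_split(signals, test_size=0.25, shuffle=False)
    train, test = train_test_split(internal, test_size=0.25, shuffle=False)

    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(train[feature_cols], train[target_col])

    scores = {
        "test_accuracy": accuracy_score(
            test[target_col], model.predict(test[feature_cols])),
        "external_accuracy": accuracy_score(
            external[target_col], model.predict(external[feature_cols])),
    }
    return model, scores, external
```

The returned `external` frame can be written to its own file with `external.to_csv(...)` so it stays untouched until the final check.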

## Final Step!
  - Trade implementation through the broker's API (OANDA, DERIV) via data streaming.
  - Data is loaded from the broker using the provided broker API, passed through the data pipeline function `data_cleaning(df)`, and appended to the preserved dataset.
  - We then check whether any crossover conditions were met. If so, we take the transformed row and pass it to our rf model to validate the signal.
  <br> <br>
<i><b> The model only confirms whether a signal from our base prediction model (based on crossover and trend) is valid. If valid, the trade is executed at the open of the next candle. Hard-coding a risk-reward ratio of 1:2 puts the break-even win rate at about 33%, so an accuracy of 50% or more is profitable before trading costs. </b></i>
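
The arithmetic behind that profitability claim can be checked with a few lines. The helper names are made up for this sketch, results are expressed in units of risk (R, the distance to the stop loss), and spreads and commissions are ignored.

```python
def break_even_win_rate(reward_to_risk):
    """Win rate at which average profit per trade is zero."""
    return 1.0 / (1.0 + reward_to_risk)

def expectancy_in_r(win_rate, reward_to_risk):
    """Average profit per trade in R: wins earn reward_to_risk, losses cost 1."""
    return win_rate * reward_to_risk - (1.0 - win_rate)

print(f"Break-even win rate at 1:2: {break_even_win_rate(2.0):.2%}")
print(f"Expectancy at 50% wins: {expectancy_in_r(0.50, 2.0):+.2f} R")
print(f"Expectancy at 30% wins: {expectancy_in_r(0.30, 2.0):+.2f} R")
```

At a 1:2 ratio a 50% win rate earns half a unit of risk per trade on average, while anything below one win in three loses money.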

In [42]:
# install the necessary libraries
!pip install ta



In [57]:
# import the necessary libraries/modules
import ta
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

In [58]:
## load the dataset!
df = pd.read_csv('EURUSD_M15.csv')
df.head()

,Date,Open,High,Low,Close,Vol
0,31/08/2020 10:00,1.1917,1.19171,1.19091,1.19099,2176
1,31/08/2020 10:15,1.19098,1.19218,1.19098,1.19184,2286
2,31/08/2020 10:30,1.19183,1.19217,1.1916,1.19187,2754
3,31/08/2020 10:45,1.1919,1.19264,1.19176,1.19191,3435
4,31/08/2020 11:00,1.19192,1.19341,1.19184,1.19324,3899


In [59]:
len(df)

100000

In [60]:
# Convert date to datetime and set as index
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y %H:%M')
df.set_index('Date', inplace=True)

In [61]:
# sma_6, sma_20 and sma_100 calculations
df['sma_6'] = ta.trend.SMAIndicator(df['Close'], window=6).sma_indicator()
df['sma_20'] = ta.trend.SMAIndicator(df['Close'], window=20).sma_indicator()
df['sma_100'] = ta.trend.SMAIndicator(df['Close'], window=100).sma_indicator()
df.tail(10)

Date,Open,High,Low,Close,Vol,sma_6,sma_20,sma_100
2024-09-02 07:30:00,1.10563,1.10609,1.1056,1.10609,2905,1.106133,1.105406,1.106048
2024-09-02 07:45:00,1.10611,1.10686,1.1059,1.10666,3322,1.106267,1.105482,1.10603
2024-09-02 08:00:00,1.10666,1.10773,1.10662,1.10762,3650,1.106462,1.105603,1.106027
2024-09-02 08:15:00,1.10762,1.10773,1.10709,1.10721,3876,1.106568,1.105692,1.106014
2024-09-02 08:30:00,1.10723,1.10748,1.10703,1.10719,2798,1.106735,1.105781,1.105999
2024-09-02 08:45:00,1.1072,1.1072,1.1064,1.10674,4179,1.106918,1.105876,1.105982
2024-09-02 09:00:00,1.10674,1.10676,1.10635,1.1065,2371,1.106987,1.105971,1.10596
2024-09-02 09:15:00,1.1065,1.10702,1.1064,1.10697,2157,1.107038,1.106081,1.105937
2024-09-02 09:30:00,1.10695,1.10727,1.1066,1.10704,2439,1.106942,1.106199,1.105923
2024-09-02 09:45:00,1.10704,1.10708,1.1067,1.1069,3222,1.10689,1.106302,1.105917


In [62]:
# Next we calculate the Bollinger Band width using a period of 20 and a standard deviation multiplier of 2.
# Besides the width, we also calculate %B, the position of the close within the bands (0 = lower band, 1 = upper band)
bb = ta.volatility.BollingerBands(df['Close'], window=20, window_dev=2)
df['bb_width'] = bb.bollinger_wband()
df['bb_pband'] = bb.bollinger_pband()
df.tail(10)

Date,Open,High,Low,Close,Vol,sma_6,sma_20,sma_100,bb_width,bb_pband
2024-09-02 07:30:00,1.10563,1.10609,1.1056,1.10609,2905,1.106133,1.105406,1.106048,0.209848,0.79487
2024-09-02 07:45:00,1.10611,1.10686,1.1059,1.10666,3322,1.106267,1.105482,1.10603,0.230444,0.962411
2024-09-02 08:00:00,1.10666,1.10773,1.10662,1.10762,3650,1.106462,1.105603,1.106027,0.283758,1.142762
2024-09-02 08:15:00,1.10762,1.10773,1.10709,1.10721,3876,1.106568,1.105692,1.106014,0.310152,0.942653
2024-09-02 08:30:00,1.10723,1.10748,1.10703,1.10719,2798,1.106735,1.105781,1.105999,0.330536,0.885362
2024-09-02 08:45:00,1.1072,1.1072,1.1064,1.10674,4179,1.106918,1.105876,1.105982,0.329038,0.737307
2024-09-02 09:00:00,1.10674,1.10676,1.10635,1.1065,2371,1.106987,1.105971,1.10596,0.314567,0.651911
2024-09-02 09:15:00,1.1065,1.10702,1.1064,1.10697,2157,1.107038,1.106081,1.105937,0.307294,0.761406
2024-09-02 09:30:00,1.10695,1.10727,1.1066,1.10704,2439,1.106942,1.106199,1.105923,0.293176,0.759319
2024-09-02 09:45:00,1.10704,1.10708,1.1067,1.1069,3222,1.10689,1.106302,1.105917,0.275116,0.696477


In [63]:
# Determine crossovers and crossunders and whether they happen in the direction of the trend
# Determine the trend (rows where sma_100 is still NaN compare False and are labelled 'downtrend')
df['trend'] = np.where(df['Close'] > df['sma_100'], 'uptrend', 'downtrend')
# Check for crossover (SMA-6 crosses above SMA-20)
df['crossover'] = (df['sma_6'] > df['sma_20']) & (df['sma_6'].shift(1) <= df['sma_20'].shift(1))

# Check for crossunder (SMA-6 crosses below SMA-20)
df['crossunder'] = (df['sma_6'] < df['sma_20']) & (df['sma_6'].shift(1) >= df['sma_20'].shift(1))

# Only consider crossovers/crossunders in the direction of the trend
# 1 = long signal, 2 = short signal, 3 = no valid signal
df['valid_cross'] = np.where(df['crossover'] & (df['trend'] == 'uptrend'), 1,
                             np.where(df['crossunder'] & (df['trend'] == 'downtrend'), 2, 3))
# df.tail(10)

In [64]:
(df['valid_cross']==1).sum(), (df['valid_cross']==2).sum()

(1607, 1660)

In [65]:
trades = (df['valid_cross']==1).sum() + (df['valid_cross']==2).sum()
trades

3267

<i>The approach below is quite solid for setting dynamic stop loss and take profit levels. It adapts to market volatility (through the use of ATR) and maintains a consistent risk-reward ratio.</i>

In [66]:
# determine atr values, stop losses and take profits thresholds
stop_loss_multiplier = 1.0 # change between 1.0 and 1.5 and see which fits the need best!
risk_reward_ratio = 2.0
# Calculate the ATR (used to determine stop loss distance)
atr_period = 14
df['ATR'] = ta.volatility.AverageTrueRange(df['High'], df['Low'], df['Close'], window=atr_period).average_true_range()
# Calculate initial stop loss and take profit levels
df['LowerStopLoss'] = df['Low'] - (df['ATR'] * stop_loss_multiplier)
df['UpperStopLoss'] = df['High'] + (df['ATR'] * stop_loss_multiplier)

# Calculate the take profit based on risk-reward ratio (2x the distance from stop loss)
df['LowerTakeProfit'] = df['Close'] - ((df['UpperStopLoss'] - df['Close']) * risk_reward_ratio)
df['UpperTakeProfit'] = df['Close'] + ((df['Close'] - df['LowerStopLoss']) * risk_reward_ratio)
# df.tail(10)

In [67]:
# Now determine whether an executed trade turns into a profit or not.
# Note that we use the stop loss and take profit values of the signal candle
# and check which of them the following candles hit first.
# For crossovers (valid_cross == 1), return 1 if the upper take profit is hit.
# For crossunders (valid_cross == 2), return 2 if the lower take profit is hit.
# Return 3 if the stop loss is hit first and -1 if the trade never concludes.
# The new values are returned in a new column, trade_result.

In [71]:

def evaluate_trades_precisely(df):
    """Label every valid cross with the first level price reaches afterwards.

    trade_result: 1 = long take profit hit, 2 = short take profit hit,
    3 = stop loss hit, -1 = still open when the data ends, 0 = no signal.
    trade_duration: number of candles between entry and exit.
    """
    # Plain NumPy arrays are much faster to index in a loop than .iloc
    high = df['High'].to_numpy()
    low = df['Low'].to_numpy()
    signal = df['valid_cross'].to_numpy()
    lower_sl = df['LowerStopLoss'].to_numpy()
    upper_sl = df['UpperStopLoss'].to_numpy()
    lower_tp = df['LowerTakeProfit'].to_numpy()
    upper_tp = df['UpperTakeProfit'].to_numpy()

    n = len(df)
    trade_result = np.zeros(n, dtype=int)
    trade_duration = np.zeros(n, dtype=int)

    for i in range(n):
        if signal[i] not in (1, 2):
            continue  # No trade on this candle
        is_long = signal[i] == 1

        # Default: trade didn't conclude within available data
        trade_result[i] = -1
        trade_duration[i] = n - i - 1

        # The scan starts at the signal candle itself. The stop loss is checked
        # first, so a candle that touches both levels counts as a loss.
        for j in range(i, n):
            if is_long:
                stop_hit = low[j] <= lower_sl[i]
                target_hit = high[j] >= upper_tp[i]
            else:
                stop_hit = high[j] >= upper_sl[i]
                target_hit = low[j] <= lower_tp[i]

            if stop_hit:
                trade_result[i] = 3  # Stop loss hit
                trade_duration[i] = j - i
                break
            if target_hit:
                trade_result[i] = 1 if is_long else 2  # Take profit hit
                trade_duration[i] = j - i
                break

    df['trade_result'] = trade_result
    df['trade_duration'] = trade_duration
    return df

# Assuming df is your dataframe with all the previous calculations
df = evaluate_trades_precisely(df)

# Display summary of trade results
print("Long trades won:", (df['trade_result'] == 1).sum())
print("Short trades won:", (df['trade_result'] == 2).sum())
print("Total Signals Given: ", trades)
print("Trades lost:", (df['trade_result'] == 3).sum())
print("Unconcluded trades:", (df['trade_result'] == -1).sum())

# Calculate win rate
total_concluded_trades = ((df['trade_result'] == 1) | (df['trade_result'] == 2) | (df['trade_result'] == 3)).sum()
winning_trades = (df['trade_result'] == 1).sum() + (df['trade_result'] == 2).sum()
win_rate = winning_trades / total_concluded_trades if total_concluded_trades > 0 else 0
print("Total Signals Taken: ", total_concluded_trades)
print(f"Win rate: {win_rate:.2%}")

# Average trade duration
avg_duration = df[df['trade_duration'] > 0]['trade_duration'].mean()
print(f"Average trade duration: {avg_duration:.2f} candles")

Long trades won: 522
Short trades won: 550
Total Signals Given:  3267
Trades lost: 2193
Unconcluded trades: 2
Total Signals Taken:  3265
Win rate: 32.83%
Average trade duration: 22.94 candles


In [41]:
trades

3267