Multiple Model-based Binary Classification for Forex Trading

...

Author -  Leung Cheuk Hei, Victor

...

Copyright© 2020. myworldbox.


Menu

Use asset_report(, True) to see the detailed report

The current ranking is just based on accuracy to minimize testing time

<a name="nav-meun"></a>

This data inherits the previous date's price impact.

[(Report 1) Swing Trading Batch Classification Report](#swing-trading-data-report)

[(Report 2) Day Trading Batch Classification Report](#day-trading-data-report)

Overview

A winning strategy may or may not show a directly proportional relationship between accuracy and profit.
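
A small worked example may help show why accuracy and profit can diverge. The sketch below uses made-up trade outcomes (the `trades` list is purely hypothetical, not taken from the reports): three small correct calls and one large wrong call.

```python
# Hypothetical per-trade outcomes: (prediction_was_correct, profit_or_loss)
trades = [
    (True, 10.0),
    (True, 10.0),
    (True, 10.0),
    (False, -50.0),
]

# Fraction of trades where the predicted direction was right
accuracy = sum(1 for correct, _ in trades if correct) / len(trades)

# Sum of all gains and losses
net_profit = sum(pnl for _, pnl in trades)

print(f"accuracy = {accuracy:.2f}, net profit = {net_profit:.2f}")
```

Here the accuracy is 75% while the net result is a loss, because the single wrong call is larger than the three correct ones combined.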

This trading report presents two trading styles - swing trading and day trading.

My strategy is to use all of the models from [ sklearn ] to predict the decision columns.

Next, swing trading and day trading strategies are derived from the same decision columns.

I have modularized assets, models, and metrics into arrays to study their collective impacts on the dataset. I have included an additional dataset from Yahoo Finance and applied the same concept to data with a different shape.

For Yahoo Finance data, I frame trading as a binary classification problem.

My models analyze the closing price of the historical market.

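If the previous closing price is lower than the current closing price, the day is classified as a buying opportunity (1); otherwise it is classified as a selling opportunity (0). In the code, this comparison is implemented with each day's Open and Close columns.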
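
The labelling rule can be sketched in a few lines of plain Python. This is a minimal illustration rather than the notebook's own code: the helper name `close_to_decision` and the sample prices are made up, and the first day is assigned 0 on the assumption that it has no previous close to compare against.

```python
def close_to_decision(closes):
    """Return 1 where the close rose versus the previous close, else 0."""
    if not closes:
        return []
    decisions = [0]  # assumption: the first day has no previous close, so label it 0
    for prev, curr in zip(closes, closes[1:]):
        decisions.append(1 if prev < curr else 0)
    return decisions


closes = [1.10, 1.12, 1.11, 1.11, 1.15]  # made-up prices
print(close_to_decision(closes))  # [0, 1, 0, 0, 1]
```

An unchanged close is labelled 0 here, which matches the strict "lower than" wording of the rule.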

To minimize spreads, I only try assets with the top 10 market capitalization.

Only high liquidity markets are being tested.

1. Description of the dataset and any preprocessing

`After the data frame is obtained from Yahoo Finance, it is processed to yield two new single-column data frames (Volatility and Decision). To associate trading with classification, trading behavior and profit columns (Action and Profit) are derived from the classification results. The trading rules for swing trading and day trading are derived from the same classification results (the Decision column); they differ only in how those results are interpreted. No scaling is used, as the future prices of assets are unpredictable. To reduce complexity, no technical indicators are employed in this study. All positions are closed on the last day.`
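
As a rough illustration of the Volatility column, the sketch below computes the day-over-day relative change of the closing price, with the first row set to 0. It is written in plain Python so that it runs on its own; the helper name `daily_change` and the sample prices are hypothetical. In pandas, a similar column could presumably be obtained with `Series.pct_change()` followed by `fillna(0)`.

```python
def daily_change(closes):
    """Relative change of each close versus the previous close; first row is 0."""
    if not closes:
        return []
    changes = [0.0]  # no previous close on the first day
    for prev, curr in zip(closes, closes[1:]):
        changes.append((curr - prev) / prev)
    return changes


closes = [100.0, 110.0, 99.0]  # made-up prices
print(daily_change(closes))  # roughly [0.0, 0.1, -0.1]
```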

2. Description of the machine learning task(s) performed on the dataset

`I have used models from Probability Calibration, Dummy estimators, Ensemble Methods, Gaussian Processes, Linear classifiers, Classical linear regressors, Regressors with variable selection, Bayesian regressors, Outlier-robust regressors, Generalized linear models (GLM) for regression, Miscellaneous, Naive Bayes, Nearest Neighbors, Neural network models, Semi-Supervised Learning, Support Vector Machines, and Decision Trees to compare their effectiveness.`

3. Description of the hardware and software computing environment, machine learning methods, and parameter settings

`I use the TPU/GPU runtime from Colab as both the hardware and software environment. All model parameters are left at their defaults to avoid extra complexity.`

4. Description of the experiments

`I feed assets, models, and metrics into arrays to evaluate their independent impacts. You can observe that simple classifiers already perform well on the datasets. It may be overkill to use complex models like CNN with customized layers. Generally, my script can be applied to both regression and classification problems with slight modification, as most models are modularized into arrays.`

5. Visualization and discussion of the results obtained

`The results are listed in the Menu section. My implementation also allows regressors to be used on classification targets: I round the regression results up or down to obtain either 1 or 0. The regressors perform similarly to the classifiers. Moreover, the strong performance of DummyRegressor and DummyClassifier also illustrates the concept of "successful bad trades".`
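
The rounding step can be sketched as follows. This is a simplified plain-Python version of the idea: it rescales the raw outputs to the range 0 to 1 and rounds them. The helper name `to_binary` is hypothetical, and the handling of a constant output (thresholding at 0.5) is an assumption added here so the sketch is well defined for dummy-style estimators.

```python
def to_binary(predictions):
    """Map continuous model outputs to 0/1 labels via min-max rescaling and rounding."""
    lo, hi = min(predictions), max(predictions)
    if hi == lo:
        # Assumption: a constant output cannot be rescaled, so threshold it at 0.5
        return [1 if p >= 0.5 else 0 for p in predictions]
    # Rescale to [0, 1], then round to the nearest label
    return [round((p - lo) / (hi - lo)) for p in predictions]


raw = [0.2, 0.9, 0.4, 1.3]  # made-up regressor outputs
print(to_binary(raw))  # [0, 1, 0, 1]
```

Note that Python's `round` rounds exact halves to the nearest even integer, so a rescaled value of exactly 0.5 becomes 0.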

In [None]:
!pip install yfinance

Python libraries

In [None]:
import math
import numpy as np
import pandas as pd
import yfinance as yf
import warnings
import statistics
import matplotlib.pyplot as plt

In [None]:
from IPython.display import display, HTML
from scipy.stats import skew, kurtosis
from google.colab import drive

Sklearn libraries

In [None]:
from sklearn.base import is_classifier, is_regressor
from sklearn.model_selection import train_test_split
from sklearn import feature_selection

Sklearn libraries - scalers

In [None]:
from sklearn.preprocessing import MinMaxScaler, StandardScaler

Sklearn libraries - machine-learning models

In [None]:
from sklearn import calibration
from sklearn import dummy
from sklearn import cluster
from sklearn import ensemble
from sklearn import gaussian_process
from sklearn import isotonic
from sklearn import kernel_approximation
from sklearn import kernel_ridge
from sklearn import linear_model
from sklearn import manifold
from sklearn import naive_bayes
from sklearn import neighbors
from sklearn import neural_network
from sklearn import semi_supervised
from sklearn import svm
from sklearn import tree

Sklearn libraries - classification metrics

In [None]:
from sklearn import metrics

In [None]:
warnings.filterwarnings('ignore')

# drive.mount('/content/drive')

In [None]:
pd.set_option('max_colwidth', 500)

In [None]:
color = "white"

In [None]:
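# 'seaborn-darkgrid' was renamed to 'seaborn-v0_8-darkgrid' in matplotlib 3.6
plt.style.use('seaborn-v0_8-darkgrid' if 'seaborn-v0_8-darkgrid' in plt.style.available else 'seaborn-darkgrid')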
%matplotlib inline

Color definition

In [None]:
def zeroColor(score):
  if (score == 0):
    return f"background-color: #140326;"

def negativeColor(score):
  if (score < 0):
    return f"background-color: #d822ff;"

def reportColor(score):
  return f"background-color: #274e13;"

def swingColor(score):
  return f"background-color: #3e03a3;"

def dayColor(score):
  return f"background-color: #ff7300;"

def actionColor(action):
  if (action == "No Action"):
    return f"background-color: #000000;"
  if (action == "Sell"):
    return f"background-color: #ff0000;"
  if (action == "Buy"):
    return f"background-color: #0011ff;"
  if (action == "Buy & Sell"):
    return f"background-color: #12014d;"

In [None]:
scaler = [MinMaxScaler(feature_range=(0, 100)), StandardScaler()]

In [None]:
model = [
    # Probability Calibration
    calibration.CalibratedClassifierCV(),
    # calibration.calibration_curve(),

    # Dummy estimators
    dummy.DummyClassifier(),
    dummy.DummyRegressor(),

    # Clustering
    # cluster.AffinityPropagation(),
    # cluster.AgglomerativeClustering(),
    # cluster.Birch(),
    # cluster.DBSCAN(),
    # cluster.FeatureAgglomeration(),
    # cluster.KMeans(),
    # cluster.BisectingKMeans(),
    # cluster.MiniBatchKMeans(),
    # cluster.MeanShift(),
    # cluster.OPTICS(),
    # cluster.SpectralClustering(),
    # cluster.SpectralBiclustering(),
    # cluster.SpectralCoclustering(),

    # Ensemble Methods
    ensemble.AdaBoostClassifier(),
    ensemble.AdaBoostRegressor(),
    ensemble.BaggingClassifier(),
    ensemble.BaggingRegressor(),
    ensemble.ExtraTreesClassifier(),
    ensemble.ExtraTreesRegressor(),
    ensemble.GradientBoostingClassifier(),
    ensemble.GradientBoostingRegressor(),
    ensemble.IsolationForest(),
    ensemble.RandomForestClassifier(),
    ensemble.RandomForestRegressor(),
    # ensemble.RandomTreesEmbedding(),
    # ensemble.StackingClassifier(),
    # ensemble.StackingRegressor(),
    # ensemble.VotingClassifier(),
    # ensemble.VotingRegressor(),
    ensemble.HistGradientBoostingRegressor(),
    ensemble.HistGradientBoostingClassifier(),

    # Gaussian Processes
    gaussian_process.GaussianProcessClassifier(),
    gaussian_process.GaussianProcessRegressor(),

    # Isotonic regression
    # isotonic.IsotonicRegression(),
    # isotonic.check_increasing(),
    # isotonic.isotonic_regression(),

    # Kernel Approximation
    # kernel_approximation.AdditiveChi2Sampler(),
    # kernel_approximation.Nystroem(),
    # kernel_approximation.PolynomialCountSketch(),
    # kernel_approximation.RBFSampler(),
    # kernel_approximation.SkewedChi2Sampler(),

    # Linear classifiers
    linear_model.LogisticRegression(),
    linear_model.LogisticRegressionCV(),
    linear_model.PassiveAggressiveClassifier(),
    linear_model.Perceptron(),
    linear_model.RidgeClassifier(),
    linear_model.RidgeClassifierCV(),
    linear_model.SGDClassifier(),
    linear_model.SGDOneClassSVM(),

    # Classical linear regressors
    linear_model.LinearRegression(),
    linear_model.Ridge(),
    linear_model.RidgeCV(),
    linear_model.SGDRegressor(),

    # Regressors with variable selection
    linear_model.ElasticNet(),
    linear_model.ElasticNetCV(),
    linear_model.Lars(),
    linear_model.LarsCV(),
    linear_model.Lasso(),
    linear_model.LassoCV(),
    linear_model.LassoLars(),
    linear_model.LassoLarsCV(),
    linear_model.LassoLarsIC(),
    linear_model.OrthogonalMatchingPursuit(),
    linear_model.OrthogonalMatchingPursuitCV(),

    # Bayesian regressors
    linear_model.ARDRegression(),
    linear_model.BayesianRidge(),

    # Multi-task linear regressors with variable selection
    # linear_model.MultiTaskElasticNet(),
    # linear_model.MultiTaskElasticNetCV(),
    # linear_model.MultiTaskLasso(),
    # linear_model.MultiTaskLassoCV(),

    # Outlier-robust regressors
    linear_model.HuberRegressor(),
    # linear_model.QuantileRegressor(),
    # linear_model.RANSACRegressor(),
    linear_model.TheilSenRegressor(),

    # Generalized linear models (GLM) for regression
    linear_model.PoissonRegressor(),
    linear_model.TweedieRegressor(),
    # linear_model.GammaRegressor(),

    # Miscellaneous
    linear_model.PassiveAggressiveRegressor(),
    # linear_model.enet_path(),
    # linear_model.lars_path(),
    # linear_model.lars_path_gram(),
    # linear_model.lasso_path(),
    # linear_model.orthogonal_mp(),
    # linear_model.orthogonal_mp_gram(),
    # linear_model.ridge_regression(),

    # Manifold Learning
    # manifold.Isomap(),
    # manifold.LocallyLinearEmbedding(),
    # manifold.MDS(),
    # manifold.SpectralEmbedding(),
    # manifold.TSNE(),
    # manifold.locally_linear_embedding(),
    # manifold.smacof(),
    # manifold.spectral_embedding(),
    # manifold.trustworthiness(),

    # Naive Bayes
    naive_bayes.BernoulliNB(),
    # naive_bayes.CategoricalNB(),
    # naive_bayes.ComplementNB(),
    naive_bayes.GaussianNB(),
    # naive_bayes.MultinomialNB(),

    # Nearest Neighbors
    # neighbors.BallTree(),
    # neighbors.KDTree(),
    # neighbors.KernelDensity(),
    neighbors.KNeighborsClassifier(),
    neighbors.KNeighborsRegressor(),
    # neighbors.KNeighborsTransformer(),
    # neighbors.LocalOutlierFactor(),
    # neighbors.RadiusNeighborsClassifier(),
    neighbors.RadiusNeighborsRegressor(),
    # neighbors.RadiusNeighborsTransformer(),
    neighbors.NearestCentroid(),
    # neighbors.NearestNeighbors(),
    # neighbors.NeighborhoodComponentsAnalysis(),
    # neighbors.kneighbors_graph(),
    # neighbors.radius_neighbors_graph(),
    # neighbors.sort_graph_by_row_values(),

    # Neural network models
    # neural_network.BernoulliRBM(),
    neural_network.MLPClassifier(),
    neural_network.MLPRegressor(),

    # Semi-Supervised Learning
    semi_supervised.LabelPropagation(),
    semi_supervised.LabelSpreading(),
    # semi_supervised.SelfTrainingClassifier(),

    # Support Vector Machines
    svm.LinearSVC(),
    svm.LinearSVR(),
    # svm.NuSVC(),
    svm.NuSVR(),
    svm.OneClassSVM(),
    svm.SVC(),
    svm.SVR(),

    # Decision Trees
    tree.DecisionTreeClassifier(),
    tree.DecisionTreeRegressor(),
    tree.ExtraTreeClassifier(),
    tree.ExtraTreeRegressor(),
    # tree.export_graphviz(),
    # tree.export_text(),
    ]

In [None]:
classification_metric = [
    # Classification metrics
    metrics.accuracy_score,
    metrics.auc,
    metrics.average_precision_score,
    metrics.balanced_accuracy_score,
    metrics.brier_score_loss,
    metrics.classification_report,
    metrics.cohen_kappa_score,
    metrics.confusion_matrix,
    metrics.dcg_score,
    metrics.det_curve,
    metrics.f1_score,
    metrics.fbeta_score,
    metrics.hamming_loss,
    metrics.hinge_loss,
    metrics.jaccard_score,
    metrics.log_loss,
    metrics.matthews_corrcoef,
    metrics.multilabel_confusion_matrix,
    metrics.ndcg_score,
    metrics.precision_recall_curve,
    metrics.precision_recall_fscore_support,
    metrics.precision_score,
    metrics.recall_score,
    metrics.roc_auc_score,
    metrics.roc_curve,
    metrics.top_k_accuracy_score,
    metrics.zero_one_loss,
]

In [None]:
regression_metric = [
    # Regression metrics
    metrics.explained_variance_score,
    metrics.max_error,
    metrics.mean_absolute_error,
    metrics.mean_squared_error,
    metrics.mean_squared_log_error,
    metrics.median_absolute_error,
    metrics.r2_score,
    metrics.mean_poisson_deviance,
    metrics.mean_gamma_deviance,
    metrics.mean_tweedie_deviance,
    metrics.d2_tweedie_score,
    metrics.mean_pinball_loss,
    # metrics.d2_pinball_score,
    # metrics.d2_absolute_error_score
]



Account Setting

In [None]:
money = 10000

In [None]:
invest_ratio = 0.5

In [None]:
train_size = 0.5

In [None]:
trading_day = 252
risk_free_rate = 0.02

Frames obtained from yfinance

In [None]:
start_year = 2000
trade_year = 20

In [None]:
start_period = str(start_year) + '-01-01'
end_period = str(start_year + trade_year) + '-01-01'

Stocks

In [None]:
major_stock = [
    'BAC', # Bank of America
    'C', # Citigroup
    'F', # Ford Motor Company
    'GE', # General Electric
    'CSCO', # Cisco Systems

    'PFE', # Pfizer Inc.
    'MSFT', # Microsoft Corporation
    'AA', # Alcoa Inc.
    'T', # AT&T Inc.
    'INTC' # Intel Corporation
]

Commodities

In [None]:
major_commodity = [
    'CL=F', # Crude oil
    'GC=F', # Gold
    'NG=F', # Natural gas
    'ZC=F', # Corn
    'HO=F', # Heating Oil

    'BZ=F', # Brent Oil
    'SI=F', # Silver
    'ZS=F', # Soybeans
    'HG=F', # Copper
    'ZW=F', # Wheat
]

Forexes

In [None]:
major_forex = [
'USDJPY=X', # US Dollar/Japanese Yen
'EURUSD=X', # Euro/US Dollar
'GBPUSD=X', # British Pound/US Dollar
'AUDUSD=X', # Australian Dollar/US Dollar
'USDCHF=X', # US Dollar/Swiss Franc

'USDCAD=X', # US Dollar/Canadian Dollar
'EURJPY=X', # Euro/Japanese Yen
'GBPJPY=X', # British Pound/Japanese Yen
'EURCHF=X', # Euro/Swiss Franc
'NZDUSD=X', # New Zealand Dollar/US Dollar
]

Asset classes

In [None]:
asset = [
    # major_stock,
    # major_commodity,
    major_forex
]

In [None]:
asset_data = []

Data download with NA data removal

In [None]:
for x, i in enumerate(asset):

  each_data = []
  for y, j in enumerate(i):

    # interval='1d' requests daily bars; period would conflict with explicit start/end dates
    data = yf.download(tickers=j, start=start_period, end=end_period, interval='1d', group_by='ticker')
    data.index = pd.to_datetime(data.index)
    no_na = data.dropna()

    r, c = no_na.shape

    print("\n[", x, ",", y, "] ---> ", asset[x][y], " ---> ", r/365, " years")
    display(pd.DataFrame(no_na))

    each_data.append(no_na)

  asset_data.append(each_data)

Data Processing

1. The Decision column captures the relationship between the previous day and today

2. It is shifted up by one row (`shift(-1)`), so each row holds the next day's decision
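
The notebook applies `shift(-1)` to the Decision column and then sets the first and last rows to 0. A minimal plain-Python sketch of that step is shown here; the helper name `shift_up` is hypothetical, and a list stands in for the pandas Series.

```python
def shift_up(decisions, fill=0):
    """Mimic Series.shift(-1): row i receives the value of row i + 1."""
    if not decisions:
        return []
    shifted = decisions[1:] + [fill]  # last row has no successor, so it gets the fill value
    shifted[0] = fill                 # no action on day 1, as in the notebook
    return shifted


print(shift_up([1, 1, 0, 1, 0]))  # [0, 0, 1, 0, 0]
```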

In [None]:
def swing_sell_action(swing_current_lot, main, swing_max, swing_gain, swing_profit, i, open, swing_sell_price):

  swing_profit += swing_current_lot * main['Open'][i]
  open = False

  # print(i, " - Sell [", swing_current_lot, "] at [", main['Open'][i], "] - money = [", swing_profit, "]")

  if(swing_profit - swing_sell_price[-1] > 0 and swing_profit - swing_sell_price[-1] > swing_max[0]): # max profit
    # print("\n>>> max profit [ sell:(", swing_profit, ") - buy:(", swing_sell_price[-1] ,") ] ---> ", swing_profit - swing_sell_price[-1], "\n")
    swing_max[0] = swing_profit - swing_sell_price[-1]
  if(swing_profit - swing_sell_price[-1] < 0 and swing_profit - swing_sell_price[-1] < swing_max[1]): # max loss
    # print("\n>>> max lose [ sell:(", swing_profit, ") - buy:(", swing_sell_price[-1] ,") ] ---> ", swing_profit - swing_sell_price[-1], "\n")
    swing_max[1] = swing_profit - swing_sell_price[-1]
  if not math.isnan(swing_profit):
    swing_sell_price.append(swing_profit)

  return swing_profit, swing_max, open


Decision to Action for two strategies
1. Swing Trading
2. Day Trading
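
The swing rule can be pictured as a small state machine driven by the previous day's decision. The sketch below is a simplified reading of that idea, not the notebook's exact branch logic: the function name `swing_actions` is hypothetical, it ignores lot sizes and capital, and it assumes a position is opened when the previous decision is 1, held while it stays 1, and closed when it turns 0 or on the last day.

```python
def swing_actions(decisions):
    """Translate a 0/1 decision sequence into swing-trading actions."""
    actions = []
    holding = False
    for i in range(len(decisions)):
        prev = decisions[i - 1] if i >= 1 else 0  # no signal before the first day
        last_day = i == len(decisions) - 1
        if holding and (prev == 0 or last_day):
            actions.append("Sell")       # signal turned off, or close all at the end
            holding = False
        elif not holding and prev == 1 and not last_day:
            actions.append("Buy")        # signal turned on while flat
            holding = True
        elif holding:
            actions.append("Retain")     # signal still on, keep the position
        else:
            actions.append("No Action")
    return actions


print(swing_actions([0, 1, 1, 0, 1, 1]))
# ['No Action', 'No Action', 'Buy', 'Retain', 'Sell', 'No Action']
```

Day trading uses the same signal but would open and close a position within each day on which the previous decision is 1, so it has no "Retain" state.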

In [None]:
def decision_to_action(main, target, volatility, swing_step, day_step, swing_gain, day_gain):

  print(len(main), len(target))

  swing_profit = money
  day_profit = money

  swing_action = ""
  day_action = ""

  swing_current_lot = 0
  day_current_lot = 0

  swing_max = [
      0, # 0 - profit
      0, # 1 - loss
      0, # 2 - holding period
  ]

  day_max = [
      0, # 0 - profit
      0, # 1 - loss
      0, # 2 - holding period
  ]

  swing_sell_price = [
      money
      ]


  day_sell_price = [
      money
      ]

  open = False

  for i, d in enumerate(target):

    try:

      if(i == len(target) - 1 and (swing_step[i - 1] == 'Buy' or swing_step[i - 1] == 'Retain')): # close all

        swing_action = "Sell"

        if(open == True and swing_current_lot >= 0): swing_profit, swing_max, open = swing_sell_action(swing_current_lot, main, swing_max, swing_gain, swing_profit, i, open, swing_sell_price)

      elif(i == 0 and target[i - 1] == 1 or target[i - 2] == 0 and target[i - 1] == 1 and i != len(target) - 1):

        swing_action = "Buy"

        if(swing_current_lot >= 0 and open == False and swing_profit > 0):
          open = True
          swing_current_lot = int((swing_profit*invest_ratio)/main['Open'][i])
          if(swing_current_lot > 0):
            swing_profit -= swing_current_lot * main['Open'][i]
            # print(i, " - Buy [", swing_current_lot, "] at [", main['Open'][i], "] - money = [", swing_profit, "]")

      elif(target[i - 1] == 1 and target[i - 2] == 1):

        swing_action = "Retain"

      elif(target[i - 1] == 0 and target[i - 2] == 1 and swing_step[i - 1] != 'Sell'):

        swing_action = "Sell"

        if(open == True and swing_current_lot >= 0): swing_profit, swing_max, open = swing_sell_action(swing_current_lot, main, swing_max, swing_gain, swing_profit, i, open, swing_sell_price)

      else:
        swing_action = 'No Action'

    except:
      swing_action = 'No Action'

    try:

      if(target[i - 1] == 1):
        day_current_lot = int((day_profit * invest_ratio) / main['Open'][i]) # invest the same fraction of capital as the swing strategy
        day_profit += day_current_lot * (main['Close'][i] - main['Open'][i])

        # print(i, " - Buy & Sell [", day_current_lot, "] at [", main['Open'][i], "] - money = [", day_profit, "]")

        day_action = "Buy & Sell"

        if(day_profit - day_sell_price[-1] > 0 and day_profit - day_sell_price[-1] > day_max[0]): # max profit
          day_max[0] = day_profit - day_sell_price[-1]
        if(day_profit - day_sell_price[-1] < 0 and day_profit - day_sell_price[-1] < day_max[1]): # max loss
          day_max[1] = day_profit - day_sell_price[-1]

        if not math.isnan(day_profit):
          day_sell_price.append(day_profit)

      else:
        day_action = "No Action"

    except:
      day_action = "No Action"

    swing_step.append(swing_action)
    day_step.append(day_action)

    swing_gain.append(swing_profit)
    day_gain.append(day_profit)

    try:
      volatility.append((main['Close'][i] - main['Close'][i - 1])/main['Close'][i - 1])

    except:
      volatility.append(0)

  volatility[0] = 0
  fluctuation = pd.DataFrame(volatility, columns = ['Volatility'])

  swing_trade = pd.DataFrame(swing_step, columns = ['Swing Action'])
  swing_earn = pd.DataFrame(swing_gain, columns = ['Swing Profit'])

  day_trade = pd.DataFrame(day_step, columns = ['Day Action'])
  day_earn = pd.DataFrame(day_gain, columns = ['Day Profit'])

  return fluctuation, swing_trade, day_trade, swing_earn, day_earn, swing_max, day_max, swing_sell_price, day_sell_price

1. Scaling data
2. Price to Decision and Action

In [None]:
loop_1 = []
for x, i in enumerate(asset_data):

  loop_2 = []
  for y, j in enumerate(i):

    loop_3 = []
    for z, k in enumerate(scaler):

      try:

        print("[",x,",",y,",",z,"]")
        r, c = j.shape

        decision = []

        for row in range(0, r):

          if(j['Open'][row] < j['Close'][row]):
            decision.append(1) # buy
          else:
            decision.append(0) # sell

        idea = pd.DataFrame(decision, columns = ['Decision'])
        idea['Decision'] = idea['Decision'].shift(-1) # shifting Decision column one row up
        idea.at[idea.index[0], 'Decision'] = 0.0 # no action on day 1
        idea.at[idea.index[-1], 'Decision'] = 0.0 # close all

        # Unpack in the same order as decision_to_action returns: swing_sell_price before day_sell_price
        fluctuation, swing_action, day_action, swing_profit, day_profit, swing_snapshot, day_snapshot, swing_sell_price, day_sell_price = decision_to_action(j, idea['Decision'],[], [], [], [], [])

        current_frame = pd.concat([j.reset_index(), fluctuation['Volatility'], idea['Decision'], swing_action['Swing Action'], swing_profit['Swing Profit'], day_action['Day Action'], day_profit['Day Profit']], axis=1)

        color_frame = current_frame.style.applymap(actionColor)
        display(color_frame.applymap(reportColor, subset=['Date', 'Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume', 'Volatility']))

        result = idea['Decision'].value_counts()

        loop_3.append(current_frame)
      except:
        pass

    loop_2.append(loop_3)
  loop_1.append(loop_2)

In [None]:
processed_data = loop_1

In [None]:
for x, i in enumerate(processed_data): # 2

  for y, j in enumerate(i): # 10

    fig, axes = plt.subplots(2, 5, figsize=(30, 10))
    for z, k in enumerate(j): # 2

      print("[",x,",",y,",",z,"]")
      for a, ax in enumerate(axes.flat):

        if(x == 0):
          ax.set_facecolor('xkcd:black')
        ax.plot(k['Close'])

        # Set the title and axis labels
        ax.set_title(' VS ', fontsize=10)

        ax.set_xlabel('Year-Month', fontsize=15)
        ax.set_ylabel('Close Prices', fontsize=15)

        ax.tick_params(axis='both', labelsize=15)

        h1, l1  = ax.get_legend_handles_labels()

    plt.show()

In [None]:
def slicing(seq, begin, to):
  # Return the part of seq between the fractional positions begin and to (both in 0..1)
  return seq[int(len(seq) * begin) : int(len(seq) * to)]

Splitting training and testing datasets
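
For time-series data such as prices, the split is kept in chronological order so that the model is trained on the earlier part and evaluated on the later part. A minimal sketch of such a split in plain Python follows; the helper name `chronological_split` and the sample rows are hypothetical, and it is similar in spirit to calling `train_test_split` with `shuffle=False`.

```python
def chronological_split(rows, train_fraction):
    """Split rows into an earlier training part and a later validation part."""
    cut = int(len(rows) * train_fraction)  # index of the first validation row
    return rows[:cut], rows[cut:]


rows = list(range(10))  # stand-in for ten consecutive trading days
train, validation = chronological_split(rows, 0.5)
print(train)       # [0, 1, 2, 3, 4]
print(validation)  # [5, 6, 7, 8, 9]
```

Because no shuffling takes place, every validation row lies strictly after every training row.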

In [None]:
loop_1 = []
for x, i in enumerate(processed_data): # 2

  loop_2 = []
  for y, j in enumerate(i): # 10

    loop_3 = []
    for z, k in enumerate(j): # 2

      print("[",x,",",y,",",z,"]")

      # Features: everything except the date, the label, and the derived trading columns
      X = k.drop(columns=['Date', 'Decision', 'Swing Action', 'Swing Profit', 'Day Action', 'Day Profit'])
      Y = k.Decision

      x_train, x_validation, y_train, y_validation = train_test_split(X, Y, test_size=train_size, shuffle=False )

      print(len(x_train), len(x_validation) , len(y_train), len(y_validation))

      x_train = slicing(X, 0, 1 - train_size)
      x_validation = slicing(X, 1 - train_size, 1)

      y_train = slicing(Y, 0, 1 - train_size)
      y_validation = slicing(Y, 1 - train_size, 1)

      print(len(x_train), len(x_validation) , len(y_train), len(y_validation))

      loop_3.append([x_train, x_validation, y_train, y_validation])

    loop_2.append(loop_3)

  loop_1.append(loop_2)

In [None]:
train_test_data = loop_1

Trading metrics
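
As a worked example of two of these metrics, the sketch below computes the maximum drawdown and the compound annual growth rate (CAGR) of a small, made-up equity curve. The function names `max_drawdown_fraction` and `cagr_from_curve` are hypothetical and written in plain Python for illustration; the sketch assumes a non-empty curve with positive values.

```python
from itertools import accumulate


def max_drawdown_fraction(equity):
    """Largest relative drop from a running peak, as a fraction of that peak."""
    peaks = accumulate(equity, max)  # running maximum of the equity curve
    return max(1 - value / peak for value, peak in zip(equity, peaks))


def cagr_from_curve(equity, years):
    """Constant yearly growth rate that turns the first value into the last."""
    return (equity[-1] / equity[0]) ** (1 / years) - 1


equity = [10000, 12000, 9000, 11000, 12100]  # made-up account values
print(max_drawdown_fraction(equity))  # 0.25 (drop from 12000 to 9000)
print(cagr_from_curve(equity, 2))     # about 0.10 over two years
```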

In [None]:
def max_drawdown(prices):
    max_so_far = prices[0]
    max_drawdown = 0
    for price in prices:
        if price > max_so_far:
            max_so_far = price
        else:
            drawdown = 1 - price / max_so_far
            if drawdown > max_drawdown:
                max_drawdown = drawdown
    return max_drawdown

In [None]:
def recovery_rate(prices, max_drawdown):
    max_so_far = prices[0]
    for price in prices:
        if price > max_so_far:
            max_so_far = price
        else:
            drawdown = 1 - price / max_so_far
            if drawdown > max_drawdown:
                max_drawdown = drawdown
    recovery_rate = 1 - max_drawdown
    return recovery_rate

In [None]:
def sharpe_ratio(money_change, num_years, risk_free_rate):
    """
    Calculates the Sharpe Ratio for a given array of money changes,
    assuming that the changes represent the results of a trading algorithm
    over a certain period of time.
    """
    # Calculate the trade returns
    trade_returns = []
    for i in range(len(money_change) - 1):
        trade_returns.append((money_change[i+1] - money_change[i]) / money_change[i])

    # Calculate the total return over the trading period
    total_return = np.prod([1 + r for r in trade_returns]) - 1

    # Calculate the annualized mean return and standard deviation of returns
    num_trades = len(trade_returns)
    annualized_return = (1 + total_return) ** (1 / num_years) - 1
    annualized_std_dev = np.std(trade_returns) * np.sqrt(num_trades / num_years)

    # Calculate the Sharpe Ratio
    sharpe = (annualized_return - risk_free_rate) / annualized_std_dev

    # Ensure Sharpe Ratio is not infinity or negative infinity
    if sharpe == np.inf or sharpe == -np.inf:
        sharpe = np.nan

    # Return Sharpe Ratio
    return sharpe

In [None]:
def sortino_ratio(money_change, num_years, risk_free_rate):
    """
    Calculates the Sortino Ratio for a given array of money changes,
    assuming that the changes represent the results of a trading algorithm
    over a certain period of time.
    """
    # Calculate the trade returns
    trade_returns = []
    for i in range(len(money_change) - 1):
        trade_returns.append((money_change[i+1] - money_change[i]) / money_change[i])

    # Calculate the total return over the trading period
    total_return = np.prod([1 + r for r in trade_returns]) - 1

    # Calculate the downside deviation
    downside_returns = [r for r in trade_returns if r < 0]
    downside_deviation = np.std(downside_returns) * np.sqrt(len(trade_returns) / num_years)

    # Calculate the annualized mean return and Sortino Ratio
    num_trades = len(trade_returns)
    annualized_return = (1 + total_return) ** (1 / num_years) - 1
    sortino = (annualized_return - risk_free_rate) / downside_deviation

    # Ensure Sortino Ratio is not infinity or negative infinity
    if sortino == np.inf or sortino == -np.inf:
        sortino = np.nan

    # Return Sortino Ratio
    return sortino

In [None]:
def cagr(account_money, years):

  initial_value = account_money[0]
  final_value = account_money[-1]

  cagr = (final_value / initial_value) ** (1 / years) - 1

  return cagr

In [None]:
def sterling_ratio(money_change, num_years, risk_free_rate):
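    """
    Calculates a skewness- and kurtosis-adjusted Sharpe-style ratio for a given
    array of money changes, assuming that the changes represent the results of
    a trading algorithm over a certain period of time.

    Note: the Sterling Ratio is conventionally defined as return over average
    drawdown; the formula below instead scales the standard deviation by the
    skewness and kurtosis of the trade returns.
    """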
    # Calculate the trade returns
    trade_returns = []
    for i in range(len(money_change) - 1):
        trade_returns.append((money_change[i+1] - money_change[i]) / money_change[i])

    # Calculate the total return over the trading period
    total_return = np.prod([1 + r for r in trade_returns]) - 1

    # Calculate the skewness and kurtosis of the returns
    returns_skewness = skew(trade_returns)
    returns_kurtosis = kurtosis(trade_returns)

    # Calculate the annualized mean return and standard deviation of returns
    num_trades = len(trade_returns)
    annualized_return = (1 + total_return) ** (1 / num_years) - 1
    annualized_std_dev = np.std(trade_returns) * np.sqrt(num_trades / num_years)

    # Calculate the Sterling Ratio
    sterling = (annualized_return - risk_free_rate) / (annualized_std_dev * (1 + 0.25 * returns_skewness + 0.025 * returns_kurtosis))

    # Ensure Sterling Ratio is not infinity or negative infinity
    if sterling == np.inf or sterling == -np.inf:
        sterling = np.nan

    # Return Sterling Ratio
    return sterling

In [None]:
def modified_sharpe_ratio(money_change, num_years, risk_free_rate):
    """
    Calculates the Modified Sharpe Ratio for a given array of money changes,
    assuming that the changes represent the results of a trading algorithm
    over a certain period of time.
    """
    # Calculate the trade returns
    trade_returns = []
    for i in range(len(money_change) - 1):
        trade_returns.append((money_change[i+1] - money_change[i]) / money_change[i])

    # Calculate the total return over the trading period
    total_return = np.prod([1 + r for r in trade_returns]) - 1

    # Calculate the annualized mean return and standard deviation of returns
    num_trades = len(trade_returns)
    annualized_return = (1 + total_return) ** (1 / num_years) - 1
    annualized_std_dev = np.std(trade_returns) * np.sqrt(num_trades / num_years)

    # Calculate the Modified Sharpe Ratio
    modified_sharpe = (annualized_return - risk_free_rate) / (annualized_std_dev + np.abs(annualized_return - risk_free_rate))

    # Ensure Modified Sharpe Ratio is not infinity or negative infinity
    if modified_sharpe == np.inf or modified_sharpe == -np.inf:
        modified_sharpe = np.nan

    # Return Modified Sharpe Ratio
    return modified_sharpe

Report generator

In [None]:
detail = False

ranking = [
  # common parameters
  [], # 0 - asset
  [], # 1 - model
  [], # 2 - scaler

  [], # 3 - accuracy

  # strategy - 0 - buy and hold
  [], # 4 - profit threshold

  # strategy - 1 - swing trading
  [], # 5 - capital history
  [], # 6 - profit
  [], # 7 - cagr
  [], # 8 - sharpe ratio
  [], # 9 - max drawdown
  [], # 10 - recovery rate
  [], # 11 - max profit
  [], # 12 - max loss
  [], # 13 - total trade

  # strategy - 2 - day trading
  [], # 14 - capital history
  [], # 15 - profit
  [], # 16 - cagr
  [], # 17 - sharpe ratio
  [], # 18 - max drawdown
  [], # 19 - recovery rate
  [], # 20 - max profit
  [], # 21 - max loss
  [], # 22 - total trade
]

loop_1 = []
for x, i in enumerate(train_test_data): # 2

  loop_2 = []
  for y, j in enumerate(i): # 10

    loop_3 = []
    for z, k in enumerate(j): # 2

      loop_4 = []
      for a, l in enumerate(model):

        print("[",x,",",y,",",z, ",",a,"]")

        try:

          l.fit(k[0], [round(_) for _ in k[2]])

          # m.save(path + "result_" + id * x * y * z + ".h5", save_format="h5")

          prediction = l.predict(k[1])

          trade_year = len(prediction) / trading_day # rows are trading days, so divide by trading days per year

          rounded_prediction = np.interp(prediction, [min(prediction), max(prediction)], [0, 1])
          rounded_prediction = [round(_) for _ in rounded_prediction]

          rounded_truth = np.array([round(_) for _ in k[3]])

          # inverted_prediction = [1-x for x in rounded_prediction]

          score = classification_metric[0](rounded_truth, rounded_prediction)
          # m.score(rounded_truth.reshape(-1, 1), rounded_prediction)

          if(z % 2 == 0):
            color = "yellow"

          else:
            color = "lightblue"

          display(HTML('''<span style="color: ''' + color + '''"><<---------------(start ['''+ str(a) + '''])--------------->></span>'''))

          if(detail):
            for metric in classification_metric: # avoid reusing the outer loop variable i
              display(HTML('''<span style="color:LightGoldenRodYellow"><br>[ '''+  metric.__name__ + ''' ]<br></span>'''))

              print(metric(rounded_truth, rounded_prediction))

          idea = pd.DataFrame(rounded_prediction, columns = ['Decision'])
          idea['Decision'] = idea['Decision'].shift(-1) # shifting Decision column one row up (position is opened at Close)
          idea.at[idea.index[0], 'Decision'] = 0.0 # no action on day 1
          idea.at[idea.index[-1], 'Decision'] = 0.0 # close all

          sliced_data = slicing(asset_data[x][y], 1 - train_size, 1)
          # Unpack in the same order as decision_to_action returns: swing_sell_price before day_sell_price
          fluctuation, swing_action, day_action, swing_profit, day_profit, swing_snapshot, day_snapshot, swing_sell_price, day_sell_price = decision_to_action(sliced_data, idea['Decision'], [], [], [], [], [])

          swing_cagr = cagr(swing_sell_price, trade_year) # (swing_profit['Swing Profit'].iloc[-1]/money) ** (1/trade_year) - 1
          day_cagr = cagr(day_sell_price, trade_year) # (day_profit['Day Profit'].iloc[-1]/money) ** (1/trade_year) - 1
          # cagr = (sliced_data['Close'][len(sliced_data) - 1]/sliced_data['Close'][0]) ** (1/trade_year) - 1
          # std = statistics.stdev(sliced_data['Close']) * (trading_day**(1/2))

          swing_sharpe_ratio = sharpe_ratio(swing_sell_price, trade_year, risk_free_rate) # (swing_cagr - risk_free_rate)/std
          day_sharpe_ratio = sharpe_ratio(day_sell_price, trade_year, risk_free_rate) # (day_cagr - risk_free_rate)/std

          # swing_sortino_ratio = sortino_ratio(swing_sell_price, trade_year, risk_free_rate)
          # day_sortino_ratio = sortino_ratio(day_sell_price, trade_year, risk_free_rate)

          # swing_sterling_ratio = sterling_ratio(swing_sell_price, trade_year, risk_free_rate)
          # day_sterling_ratio = sterling_ratio(day_sell_price, trade_year, risk_free_rate)

          # swing_modified_sharpe_ratio = modified_sharpe_ratio(swing_sell_price, trade_year, risk_free_rate) # (swing_cagr - risk_free_rate)/std
          # day_modified_sharpe_ratio = modified_sharpe_ratio(day_sell_price, trade_year, risk_free_rate) # (day_cagr - risk_free_rate)/std

          # print(swing_sharpe_ratio, day_sharpe_ratio)
          # print(swing_sortino_ratio, day_sortino_ratio)
          # print(swing_sterling_ratio, day_sterling_ratio)
          # print(swing_modified_sharpe_ratio, day_modified_sharpe_ratio)

          swing_mdd = max_drawdown(swing_sell_price)
          day_mdd = max_drawdown(day_sell_price)

          swing_rr = recovery_rate(swing_sell_price, swing_mdd) # (swing_sell_price[-1]-min(swing_sell_price))/(max(swing_sell_price)-min(swing_sell_price))
          day_rr = recovery_rate(day_sell_price, day_mdd) # (day_sell_price[-1]-min(day_sell_price))/(max(day_sell_price)-min(day_sell_price))
          # current_frame = pd.concat([(slicing(asset_data[x][asset[y][z]], 1 - train_size, 1)).reset_index(), fluctuation['Volatility'], idea['Decision'], action['Action'], profit['Profit']], axis=1)
          # display(current_frame.style.applymap(actionColor))

          print("\n(", a, ") ", l.__class__.__name__ ,"\n")
          print("(", z, ") ", scaler[z], "\n")

          ranking[0].append(asset[x][y]) # asset
          ranking[1].append(l.__class__.__name__) # model
          ranking[2].append(scaler[z]) # scaler
          ranking[3].append(score) # accuracy

          # strategy - 0 - buy and hold
          ranking[4].append(money+(sliced_data['Open'].iloc[-1]-sliced_data['Open'].iloc[0])*int(money/sliced_data['Open'].iloc[0]) - money) # profit threshold

          # strategy - 1 - swing trading
          ranking[5].append(swing_sell_price) # capital history
          ranking[6].append(swing_sell_price[-1] - money) # net profit
          ranking[7].append(swing_cagr) # compound annual growth rate
          ranking[8].append(swing_sharpe_ratio) # sharpe ratio
          ranking[9].append(swing_mdd) # max drawdown
          ranking[10].append(swing_rr) # recovery rate
          ranking[11].append(swing_snapshot[0]) # max profit
          ranking[12].append(swing_snapshot[1]) # max loss
          ranking[13].append(len(swing_sell_price) - 1) # total trade

          # strategy - 2 - day trading
          ranking[14].append(day_sell_price) # capital history
          ranking[15].append(day_sell_price[-1] - money) # net profit
          ranking[16].append(day_cagr) # compound annual growth rate
          ranking[17].append(day_sharpe_ratio) # sharpe ratio
          ranking[18].append(day_mdd) # max drawdown
          ranking[19].append(day_rr) # recovery rate
          ranking[20].append(day_snapshot[0]) # max profit
          ranking[21].append(day_snapshot[1]) # max loss
          ranking[22].append(len(day_sell_price) - 1) # total trade

          # print([last for *_, last in ranking])

          display(HTML('''<br><span style="color: ''' + color + '''"><<---------------(end ['''+ str(a) + '''])--------------->></span>'''))

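        except Exception as error:

          # report the failure instead of silently skipping this combination
          print("skipped:", error)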

        loop_4.append(ranking)

      loop_3.append(loop_4)

    loop_2.append(loop_3)

  loop_1.append(loop_2)
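
The helpers `cagr` and `max_drawdown` called in the loop above are defined elsewhere in the notebook. As a rough illustration only, the sketch below computes both from a capital history, following the formula hinted at in the inline comments (`(final / initial) ** (1 / trade_year) - 1`). The names `cagr_sketch` and `max_drawdown_sketch`, the use of the first entry as the starting capital, and the drawdown definition (largest fall from a running peak, as a fraction of that peak) are assumptions and may differ from the notebook's own helpers.

```python
import numpy as np


def cagr_sketch(capital_history, trade_year):
    """Compound annual growth rate of a capital history (hypothetical helper)."""
    initial, final = capital_history[0], capital_history[-1]
    return (final / initial) ** (1 / trade_year) - 1


def max_drawdown_sketch(capital_history):
    """Largest peak-to-trough fall, as a fraction of the running peak (assumed definition)."""
    capital = np.asarray(capital_history, dtype=float)
    running_peak = np.maximum.accumulate(capital)
    drawdown = (running_peak - capital) / running_peak
    return drawdown.max()


# Made-up capital history covering two years of trading
history = [10000.0, 12000.0, 9000.0, 14400.0]
growth = cagr_sketch(history, trade_year=2)
worst_fall = max_drawdown_sketch(history)
print(growth)      # growth per year
print(worst_fall)  # worst fall from a peak
```

With these made-up figures the capital grows by about 20% per year, while the worst fall (12000 to 9000) is 25% of the peak.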

Scaling for Sharpe Ratio - Swing Trading

In [None]:
ranking[8] = np.interp(ranking[8], [min(ranking[8]), max(ranking[8])], [-3, 3])

Scaling for Sharpe Ratio - Day Trading

In [None]:
ranking[17] = np.interp(ranking[17], [min(ranking[17]), max(ranking[17])], [-3, 3])
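
Both scaling cells map the raw Sharpe ratios linearly onto the interval [-3, 3], so the smallest value in the batch becomes -3 and the largest becomes 3. A minimal sketch with made-up ratios (the variable names are illustrative, not taken from the notebook):

```python
import numpy as np

raw_sharpe = [0.5, 1.0, 2.5]  # made-up Sharpe ratios

# Linear rescale: min(raw_sharpe) -> -3, max(raw_sharpe) -> 3
scaled = np.interp(raw_sharpe, [min(raw_sharpe), max(raw_sharpe)], [-3, 3])
print(scaled)
```

Because the mapping depends on the minimum and maximum of the batch, scaled values are only comparable within the same run.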

In [None]:
ranking_x = loop_1

In [None]:
def drop_duplicate_rows(arr):
    unique_rows = []

    for row in arr:
        if not any(np.array_equal(row, unique_row) for unique_row in unique_rows):
            unique_rows.append(row)

    unique_arr = np.array(unique_rows, dtype=object)

    return unique_arr

In [None]:
def drop_duplicate_columns(arr):
    # Each entry of arr holds one report column; numeric input is assumed
    arr = np.asarray(arr, dtype=float)

    unique_rows = []

    # Keep the first copy of each entry, treating NaN values as equal
    for row in arr:
        if not any(np.array_equal(row, kept, equal_nan=True) for kept in unique_rows):
            unique_rows.append(row)

    unique_arr = np.array(unique_rows)

    return unique_arr

In [None]:
# ranking_only = drop_duplicate_columns(ranking)

In [None]:
def drop_element(arr):
    n = min(len(row) for row in arr)  # Number of columns

    for i in range(len(arr)):
        if len(arr[i]) > n:
            arr[i] = arr[i][:n]

    return arr

In [None]:
ranking_only = drop_element(ranking)
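
`drop_element` trims every list to the length of the shortest one, so that all columns can later be combined into a single data frame. The sketch below repeats the function on a made-up ragged list (the asset names and numbers are placeholders). Note that the function modifies its argument in place, so `ranking_only` and `ranking` refer to the same object.

```python
def drop_element(arr):
    n = min(len(row) for row in arr)  # length of the shortest row

    for i in range(len(arr)):
        if len(arr[i]) > n:
            arr[i] = arr[i][:n]  # cut the extra trailing entries

    return arr


# Made-up ragged input: the second list is one entry short
ragged = [["EURUSD", "GBPUSD", "USDJPY"], [0.61, 0.58], [120.0, -35.0, 40.0]]
trimmed = drop_element(ragged)

print(trimmed)
print(trimmed is ragged)  # the same list object is returned
```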

In [None]:
len(ranking)

In [None]:
len(ranking_only)

In [None]:
len(ranking[0])

In [None]:
len(ranking_only[0])

In [None]:
pd.DataFrame(ranking_x[0][0][0][0])

Last frame for the multi-dimensional array

Report Section

In [None]:
parameter = [
    'accuracy',
    'profit threshold',
    'net profit',
    'cagr',
    'sharpe ratio',
    'max drawdown',
    'recovery rate',
    'max profit',
    'max loss',
    'total trade'
    ]

In [None]:
true_false = [True, False]

In [None]:
show_result = 10
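
The report cells below loop over every entry of `parameter` and both values of `true_false`, sorting the results by that column and keeping the first `show_result` rows. A small sketch of one such iteration on a made-up results table (the model names and accuracies are placeholders):

```python
import pandas as pd

results = pd.DataFrame({
    "model": ["A", "B", "C", "D"],          # made-up model names
    "accuracy": [0.52, 0.61, 0.48, 0.57],   # made-up scores
})
show_result = 2

# ascending=True lists the lowest values first, ascending=False the highest
lowest = results.sort_values(by="accuracy", ascending=True).head(show_result)
highest = results.sort_values(by="accuracy", ascending=False).head(show_result)

print(lowest["model"].tolist())
print(highest["model"].tolist())
```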

(Report 1) Swing Trading Batch Classification Report
<a name="swing-trading-data-report"></a>

[Menu](#nav-meun)

In [None]:
for state in true_false:

  for rank in parameter:

    swing_chart = pd.DataFrame(data = {
                                    'asset': ranking_only[0],
                                    'model': ranking_only[1],
                                    'accuracy': ranking_only[3],

                                    'profit threshold': ranking_only[4],

                                    'profit history': ranking_only[5],
                                    'net profit': ranking_only[6],
                                    'cagr': ranking_only[7],
                                    'sharpe ratio': ranking_only[8],
                                    'max drawdown': ranking_only[9],
                                    'recovery rate': ranking_only[10],
                                    'max profit': ranking_only[11],
                                    'max loss': ranking_only[12],
                                    'total trade': ranking_only[13]
                                    }).sort_values(by=rank, ascending=state, inplace=False)

    swing_report = swing_chart.drop(columns=['profit history'], axis=1)

    word = ''

    if(state):
      word = 'Fewest'
    else:
      word = 'Greatest'

    chart = swing_chart.head(show_result)
    swing_chart = swing_report.head(show_result)

    print("\n(" + word +") " + str(show_result) +" swing trading models ["+ rank + "]:")
    color_frame = swing_chart.style.applymap(reportColor, subset=['asset', 'model'])
    color_frame = color_frame.applymap(swingColor, subset=[rank])
    display(color_frame)

    fig, axes = plt.subplots(int(show_result/5), 5, figsize=(30, show_result))
    fig.subplots_adjust(hspace=0.5)
    ax = axes.flat

    for i, flow in enumerate(chart['profit history']):

      color = ''
      if(flow[0] < flow[-1]):
        ax[i].set_facecolor('xkcd:light green')
        color = 'black'
      else:
        ax[i].set_facecolor('xkcd:deep red')
        color = 'white'

      ax[i].plot(flow, color=color)

      ax[i].set_title(word + ' - [' + str(i + 1)+ ']\n\n' + str(rank) + ':' + str(swing_chart[rank][swing_chart.index.values[i]]) + '\n\n[' +  swing_chart['asset'][swing_chart.index.values[i]] + ' ' + swing_chart['model'][swing_chart.index.values[i]] + ']', fontsize=10)

      ax[i].set_xlabel('Trade', fontsize=15)
      ax[i].set_ylabel('Profit History', fontsize=15)

      ax[i].tick_params(axis='both', labelsize=15)

      h1, l1  = ax[i].get_legend_handles_labels()

(Swing Trading) - Dummy investment in all models

In [None]:
# net profit already excludes the starting capital, so the total gain is simply its sum
total_capital = money * len(swing_report['net profit'])
diversify_investiment = swing_report['net profit'].sum()

print("Gain of diversified investment in swing trading:\n\nnet profit --->", diversify_investiment, "(", diversify_investiment / total_capital * 100, "% )")

(Report 2) Day Trading Batch Classification Report
<a name="day-trading-data-report"></a>

[Menu](#nav-meun)

In [None]:
for state in true_false:

  for rank in parameter:

    day_chart = pd.DataFrame(data = {
                                    'asset': ranking_only[0],
                                    'model': ranking_only[1],
                                    'accuracy': ranking_only[3],

                                    'profit threshold': ranking_only[4],

                                    'profit history': ranking_only[14],
                                    'net profit': ranking_only[15],
                                    'cagr': ranking_only[16],
                                    'sharpe ratio': ranking_only[17],
                                    'max drawdown': ranking_only[18],
                                    'recovery rate': ranking_only[19],
                                    'max profit': ranking_only[20],
                                    'max loss': ranking_only[21],
                                    'total trade': ranking_only[22]
                                    }).sort_values(by=rank, ascending=state, inplace=False)

    day_report = day_chart.drop(columns=['profit history'], axis=1)

    word = ''

    if(state):
      word = 'Fewest'
    else:
      word = 'Greatest'

    chart = day_chart.head(show_result)
    day_chart = day_report.head(show_result)

    print("\n(" + word +") " + str(show_result) +" day trading models ["+ rank + "]:")
    color_frame = day_chart.style.applymap(reportColor, subset=['asset', 'model'])
    color_frame = color_frame.applymap(dayColor, subset=[rank])
    display(color_frame)

    fig, axes = plt.subplots(int(show_result/5), 5, figsize=(30, show_result))
    fig.subplots_adjust(hspace=0.5)
    ax = axes.flat

    for i, flow in enumerate(chart['profit history']):

      color = ''
      if(flow[0] < flow[-1]):
        ax[i].set_facecolor('xkcd:light green')
        color = 'black'
      else:
        ax[i].set_facecolor('xkcd:deep red')
        color = 'white'

      ax[i].plot(flow, color=color)

      ax[i].set_title(word + ' - [' + str(i + 1)+ ']\n\n' + str(rank) + ':' + str(day_chart[rank][day_chart.index.values[i]]) + '\n\n[' +  day_chart['asset'][day_chart.index.values[i]] + ' ' + day_chart['model'][day_chart.index.values[i]] + ']', fontsize=10)

      ax[i].set_xlabel('Trade', fontsize=15)
      ax[i].set_ylabel('Profit History', fontsize=15)

      ax[i].tick_params(axis='both', labelsize=15)

      h1, l1  = ax[i].get_legend_handles_labels()

(Day Trading) - Dummy investment in all models

In [None]:
# net profit already excludes the starting capital, so the total gain is simply its sum
total_capital = money * len(day_report['net profit'])
diversify_investiment = day_report['net profit'].sum()

print("Gain of diversified investment in day trading:\n\nnet profit --->", diversify_investiment, "(", diversify_investiment / total_capital * 100, "% )")