# Masoscience (advanced LSTM model)

**Author:** Mir Yasin Zeinaliyan

**Email:** yasinprodebian@gmail.com  

**Github:** https://github.com/yasin-pro/masoscience

**Description:** This project implements an advanced LSTM model, which is provided for free. The demo contains no other models, but this part is implemented completely and is sufficient as a basis for implementing the project's other processes. Although this is only a demo, we have tried to present a strong model.


### Install the necessary tools

To run the code in this project, you must first install the required tools.

In [None]:
!pip install keras-tuner

### Import libraries

This section imports all the libraries required to run the code that follows.

In [4]:
import pandas as pd
import numpy as np
import datetime as dt
import tensorflow as tf
# The package installed by `pip install keras-tuner` is imported as `keras_tuner`
from keras_tuner import HyperModel
from keras_tuner.tuners import RandomSearch
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Input, LSTM, Dense, Dropout, TimeDistributed, RepeatVector, BatchNormalization, LeakyReLU, Attention, Add
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint
from tensorflow.keras.regularizers import l2
from sklearn.metrics import mean_absolute_error, mean_squared_error
import matplotlib.pyplot as plt
from sklearn.preprocessing import RobustScaler
from sklearn.model_selection import train_test_split

### Read data

In this section, we read the prepared data and inspect it.

My data is stored in my Google Drive. If your data is at a different path, change the path in the code below.

In [None]:
from google.colab import drive

drive.mount('/content/drive')
df = pd.read_csv("/content/drive/My Drive/Masoscience/processed_eurusd.csv")

# df = pd.read_csv("processed_eurusd.csv")

df.head(20)

### Checking and reviewing data

In this section, we get an overview of the data and check that nothing was missed during preparation and that the data is ready for training.

In [None]:
df.info()

### Data augmentation

Data augmentation involves creating new data points from existing data by applying various transformations. By introducing slight modifications to the original data, we generate multiple versions, enabling the machine learning model to train on a more extensive and varied dataset.

In [6]:
def data_augmentation(data, augmentation_factor=2):
    augmented_data = data.copy()
    original_data_length = len(data)

    for _ in range(augmentation_factor):
        new_data = data.copy()

        non_cyclical_columns = [col for col in data.columns if not any(x in col for x in ['sin', 'cos'])]
        for column in non_cyclical_columns:
            noise = np.random.normal(0, 0.01, len(data))
            new_data[column] = new_data[column] + noise

        augmented_data = pd.concat([augmented_data, new_data])

    return augmented_data.reset_index(drop=True)

n_augmentations = 2

augmented_data = data_augmentation(df, n_augmentations)
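
To make the effect of the augmentation concrete, here is a minimal sketch on a toy DataFrame. The helper `augment_with_noise`, the toy column values, and the fixed seed are assumptions made for illustration; the helper is a compact re-implementation of the idea above, not the function used in this notebook.

```python
import numpy as np
import pandas as pd

def augment_with_noise(data, augmentation_factor=2, noise_std=0.01, seed=0):
    """Append `augmentation_factor` noisy copies of `data` (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Columns whose name contains 'sin' or 'cos' are treated as cyclical and left untouched
    noisy_columns = [c for c in data.columns if "sin" not in c and "cos" not in c]
    copies = [data]
    for _ in range(augmentation_factor):
        noisy = data.copy()
        noise = rng.normal(0, noise_std, (len(data), len(noisy_columns)))
        noisy[noisy_columns] = noisy[noisy_columns] + noise
        copies.append(noisy)
    return pd.concat(copies).reset_index(drop=True)

# Toy data: one price-like column and one cyclical pair (illustrative values only)
toy = pd.DataFrame({
    "close": [1.10, 1.11, 1.12],
    "hour_sin": [0.0, 0.5, 1.0],
    "hour_cos": [1.0, 0.866, 0.0],
})

augmented_toy = augment_with_noise(toy, augmentation_factor=2)

n_rows = len(augmented_toy)  # original rows plus two noisy copies
cyclical_unchanged = augmented_toy["hour_sin"].iloc[3:6].tolist() == toy["hour_sin"].tolist()
max_price_shift = float(np.abs(augmented_toy["close"].iloc[3:6].to_numpy() - toy["close"].to_numpy()).max())
print(n_rows, cyclical_unchanged, max_price_shift)
```

The noise is added in each column's original units, so how strong it is relative to the data depends on the scale of that column.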

### Cyclical perturbation
This technique involves making small, random changes to data that has inherently cyclical features, such as sine and cosine components. The objective of cyclical perturbation is to preserve the periodic structure while introducing slight variations that increase the diversity of the training dataset.

In [7]:
def cyclical_perturbation(data, sin_columns, cos_columns, perturb_factor=0.005):
    perturbed_data = data.copy()

    for sin_col, cos_col in zip(sin_columns, cos_columns):
        angle = np.random.normal(0, perturb_factor, len(data))
        # Keep the original values so both outputs are rotated from the same (sin, cos) pair
        sin_orig = perturbed_data[sin_col].copy()
        cos_orig = perturbed_data[cos_col].copy()
        perturbed_data[sin_col] = sin_orig * np.cos(angle) - cos_orig * np.sin(angle)
        perturbed_data[cos_col] = sin_orig * np.sin(angle) + cos_orig * np.cos(angle)

    return perturbed_data


cyclical_columns = ['hour_sin', 'hour_cos', 'day_of_week_sin', 'day_of_week_cos', 'week_of_month_sin', 'week_of_month_cos']
sin_columns = [col for col in cyclical_columns if 'sin' in col]
cos_columns = [col for col in cyclical_columns if 'cos' in col]

perturb_factor = 0.005

perturbed_data = cyclical_perturbation(augmented_data, sin_columns, cos_columns, perturb_factor)
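
The perturbation is a small rotation of each (sin, cos) pair, so it should keep every point on the unit circle. The sketch below checks this on a synthetic 24-hour encoding; the variable names and the fixed seed are illustrative assumptions. Both outputs are computed from the unrotated pair, which is what makes the result an exact rotation by `-angle`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic cyclical encoding of the hour of day
hours = np.arange(24)
theta = 2 * np.pi * hours / 24
sin_h, cos_h = np.sin(theta), np.cos(theta)

# One small random angle per row, as in the perturbation above
angle = rng.normal(0, 0.005, len(hours))

# Rotate using the ORIGINAL sin/cos values for both outputs
new_sin = sin_h * np.cos(angle) - cos_h * np.sin(angle)
new_cos = sin_h * np.sin(angle) + cos_h * np.cos(angle)

# The rotated pair still satisfies sin^2 + cos^2 = 1
norms = new_sin ** 2 + new_cos ** 2
print(np.allclose(norms, 1.0))
```

If the cosine were instead computed from the already updated sine column, the pair would drift slightly off the unit circle.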

### Normalize the data

`RobustScaler` is one of the scalers available in the scikit-learn library, used for scaling features of the data. It is particularly robust in the presence of outliers.

#### How It Works

Unlike `StandardScaler`, which uses the mean and standard deviation, `RobustScaler` uses the median and interquartile range (IQR) for scaling, reducing the influence of outliers on the data.

The formula used by `RobustScaler` to scale each feature is as follows:

---
$$
\hat{x}_i = \frac{x_i - \text{Median}(X)}{\text{IQR}(X)}
$$
---
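
As a worked example of the formula, the sketch below applies it by hand with NumPy to a small array containing one outlier. It assumes the 25th and 75th percentiles define the IQR, which is the default quantile range of `RobustScaler`; the sample values are made up for illustration.

```python
import numpy as np

# Four ordinary values and one outlier
x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

median = np.median(x)                 # 3.0
q1, q3 = np.percentile(x, [25, 75])   # 2.0 and 4.0
iqr = q3 - q1                         # 2.0

x_scaled = (x - median) / iqr
print(x_scaled)  # [-1.  -0.5  0.   0.5 48.5]
```

The outlier stays far from the rest after scaling, but it does not distort the centre and spread used to scale the ordinary values, as it would with a mean and standard deviation.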


In [12]:
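# Note: features and targets are taken from the original `df`;
# `augmented_data` and `perturbed_data` from the previous sections are not used here.
X = df.drop(["open", "high", "low", "close", "change_percent"], axis=1)
y = df[["open", "high", "low", "close", "change_percent"]]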

scaler_X = RobustScaler()
X_scaled = scaler_X.fit_transform(X)

scaler_y = RobustScaler()
y_scaled = scaler_y.fit_transform(y)

### Data preprocessing

In this section, we split the data into training and test sets and reshape it to match the input format expected by the LSTM model.

In [13]:
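# shuffle=False keeps the rows in chronological order, so the target windows
# built in the next section contain consecutive candles
X_train, X_test, y_train_base, y_test_base = train_test_split(X_scaled, y_scaled, test_size=0.2, shuffle=False)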

X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)
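
Keras LSTM layers expect 3D input of shape `(samples, timesteps, features)`. The sketch below shows what the reshape above does on a toy array; the array contents are arbitrary and only the shapes matter.

```python
import numpy as np

# Toy 2D feature matrix: 4 samples, 3 feature columns
X_toy = np.arange(12, dtype=float).reshape(4, 3)

# Same reshape as above: each feature column becomes one timestep with a single feature
X_toy_3d = X_toy.reshape(X_toy.shape[0], X_toy.shape[1], 1)

print(X_toy.shape, "->", X_toy_3d.shape)  # (4, 3) -> (4, 3, 1)
```

With this layout the LSTM steps across the feature columns of a single row, rather than across several consecutive rows in time.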

### Model prediction settings

To predict several future steps (that is, several future candles), we set the number of output steps, build the target windows accordingly, and trim the inputs so that the number of input samples matches the number of target windows.

In [14]:
prediction_steps = 10

y_train = np.array([y_train_base[i:i + prediction_steps] for i in range(0, len(y_train_base) - prediction_steps)])
y_test = np.array([y_test_base[i:i + prediction_steps] for i in range(0, len(y_test_base) - prediction_steps)])

X_train = X_train[:len(y_train)]
X_test = X_test[:len(y_test)]
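
The windowing step can be checked on a toy target array. In this sketch the sizes (6 rows, 2 target columns, 3 steps) and the zero-filled input array are assumptions chosen to keep the output small.

```python
import numpy as np

steps = 3

# Toy targets: 6 rows, 2 target columns
y_base = np.arange(12).reshape(6, 2)

# Same slicing pattern as above: window i holds rows i .. i + steps - 1
windows = np.array([y_base[i:i + steps] for i in range(0, len(y_base) - steps)])

# Trim a toy input array so inputs and target windows line up one to one
X_toy = np.zeros((6, 4, 1))
X_aligned = X_toy[:len(windows)]

print(windows.shape, X_aligned.shape)  # (3, 3, 2) (3, 4, 1)
```

Note that `range(0, len(y_base) - steps)` produces windows starting at rows 0, 1, and 2 here; the last complete window (rows 3 to 5) is not included.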

### LSTM model creation

Long Short-Term Memory (LSTM) networks are a type of Recurrent Neural Network (RNN) capable of learning long-term dependencies. They were introduced to mitigate the vanishing gradient problem in traditional RNNs, making them more effective for time series prediction, natural language processing, and other sequential tasks.

This code section defines an advanced LSTM model for time series forecasting. It has an encoder-decoder structure with an attention block, and it uses batch normalization, dropout, and L2 regularization to avoid overfitting and improve model performance.

In [15]:
def attention_3d_block(inputs):
    input_dim = int(inputs.shape[2])
    a = Dense(input_dim, activation='softmax')(inputs)
    output_attention_mul = tf.keras.layers.multiply([inputs, a])
    return output_attention_mul

class LSTMHyperModel(HyperModel):

    def build(self, hp):
        inputs = Input(shape=(X_train.shape[1], X_train.shape[2]))

        x = LSTM(units=hp.Int('units_1', min_value=32, max_value=128, step=32),
                 return_sequences=True, kernel_regularizer=tf.keras.regularizers.L2(0.01))(inputs)
        x = BatchNormalization()(x)
        x = LeakyReLU()(x)
        x = Dropout(hp.Float('dropout_1', min_value=0.2, max_value=0.5, step=0.1))(x)

        x = LSTM(units=hp.Int('units_2', min_value=32, max_value=128, step=32),
                 return_sequences=True, kernel_regularizer=tf.keras.regularizers.L2(0.01))(x)
        x = BatchNormalization()(x)
        x = LeakyReLU()(x)
        x = Dropout(hp.Float('dropout_2', min_value=0.2, max_value=0.5, step=0.1))(x)

        x = attention_3d_block(x)

        x = LSTM(units=hp.Int('units_3', min_value=32, max_value=128, step=32),
                 return_sequences=False, kernel_regularizer=tf.keras.regularizers.L2(0.01))(x)
        x = BatchNormalization()(x)
        x = LeakyReLU()(x)
        x = Dropout(hp.Float('dropout_3', min_value=0.2, max_value=0.5, step=0.1))(x)

        x = RepeatVector(prediction_steps)(x)

        x = LSTM(units=hp.Int('units_4', min_value=32, max_value=128, step=32),
                 return_sequences=True, kernel_regularizer=tf.keras.regularizers.L2(0.01))(x)
        x = BatchNormalization()(x)
        x = LeakyReLU()(x)
        x = Dropout(hp.Float('dropout_4', min_value=0.2, max_value=0.5, step=0.1))(x)

        x = LSTM(units=hp.Int('units_5', min_value=32, max_value=128, step=32),
                 return_sequences=True, kernel_regularizer=tf.keras.regularizers.L2(0.01))(x)
        x = BatchNormalization()(x)
        x = LeakyReLU()(x)
        x = Dropout(hp.Float('dropout_5', min_value=0.2, max_value=0.5, step=0.1))(x)

        x = TimeDistributed(Dense(5, kernel_regularizer=tf.keras.regularizers.L2(0.01)))(x)

        model = Model(inputs=inputs, outputs=x)
        model.compile(optimizer=tf.keras.optimizers.Adam(
            learning_rate=hp.Float('learning_rate', min_value=1e-4, max_value=1e-2, sampling='LOG')),
            loss='mean_squared_error')

        return model
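
To clarify what `attention_3d_block` computes, here is a NumPy sketch of the same operation: a dense layer with softmax over the last axis, followed by an element-wise multiplication with the input. The random matrix `W` is only a stand-in for the trained Dense kernel, and the tensor sizes are arbitrary.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along `axis`."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
batch, timesteps, units = 2, 4, 3

x = rng.normal(size=(batch, timesteps, units))  # stand-in for the LSTM output
W = rng.normal(size=(units, units))             # stand-in for the Dense kernel
b = np.zeros(units)                             # stand-in for the Dense bias

# The Dense layer acts on the last axis, so the softmax weights are per timestep
weights = softmax(x @ W + b, axis=-1)
attended = x * weights

print(attended.shape)        # same shape as the input
print(weights.sum(axis=-1))  # each timestep's weights sum to 1
```

The weights are therefore normalized across the units of each timestep, not across timesteps, so the block re-weights features within a step rather than choosing between steps.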

### Model callbacks

A few callbacks are added to improve training: early stopping, learning-rate reduction when the validation loss stops improving, and checkpointing of the best model.

In [16]:
early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.0001)
checkpoint = ModelCheckpoint('lstm_model_checkpoint.keras', monitor='val_loss', save_best_only=True, verbose=1)
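
The arithmetic behind `ReduceLROnPlateau` with `factor=0.2` and `min_lr=0.0001` can be sketched in plain Python. The helper `reduce_on_plateau` and the starting rate of 0.01 are assumptions for illustration; the sketch models only the update rule applied each time the callback fires, not the patience logic.

```python
def reduce_on_plateau(lr, factor=0.2, min_lr=0.0001):
    """One learning-rate reduction: multiply by `factor`, never go below `min_lr`."""
    return max(lr * factor, min_lr)

lr = 0.01  # assumed starting learning rate
schedule = [lr]
for _ in range(4):
    lr = reduce_on_plateau(lr)
    schedule.append(lr)

# Roughly 0.01 -> 0.002 -> 0.0004 -> 0.0001 -> 0.0001 (clamped at min_lr)
print(schedule)
```

Because `min_lr` equals the lower bound of the learning-rate search range used in the model, a trial that starts at that lower bound cannot be reduced any further by this callback.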

### Train model

This is the most important part of the program: the model is trained so that it can later be used for prediction.

The model is implemented as a `HyperModel`, so we can search for its best hyperparameter settings with the tuner.

In [None]:
tuner = RandomSearch(
    LSTMHyperModel(),
    objective='val_loss',
    max_trials=10,
    executions_per_trial=1,
    # `directory` is the parent folder and `project_name` the subfolder created inside it
    directory='/content/drive/My Drive/Masoscience/lstm',
    project_name='lstm_stock_prediction')

tuner.search_space_summary()

tuner.search(X_train, y_train,
             epochs=50,
             validation_split=0.2,
             callbacks=[early_stopping, reduce_lr, checkpoint],
             verbose=1)
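
To give a feel for the search space, the sketch below illustrates log-scale sampling of the learning rate and counts the discrete combinations of the other hyperparameters. It is only an illustration of what log sampling means; KerasTuner's internal sampling may differ in detail. The count assumes that both endpoints of each `hp.Int` and `hp.Float` range are included, which gives 4 values per hyperparameter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Log-uniform sampling between the learning-rate bounds used in the model
low, high = 1e-4, 1e-2
log_samples = 10 ** rng.uniform(np.log10(low), np.log10(high), size=10_000)

# On a log scale, about half of the samples fall below the geometric midpoint 1e-3
share_below_1e3 = float(np.mean(log_samples < 1e-3))

# 5 `units_*` and 5 `dropout_*` hyperparameters with 4 possible values each
n_discrete_combinations = 4 ** 10

print(round(share_below_1e3, 2), n_discrete_combinations)
```

With `max_trials=10`, the random search evaluates only a very small sample of this space, so increasing `max_trials` is the main lever if more thorough tuning is needed.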

### Get best model

We retrieve the best model found by the tuner with the following line of code.

In [None]:
best_model = tuner.get_best_models(num_models=1)[0]

### Save model

We save the trained model at the end so that we do not have to repeat the training every time we want to use it for prediction. The training process is very time-consuming.

I saved it to my Google Drive. To save it to a different location, change the path in the code below.


In [None]:
# best_model.save("models/lstm_model.h5")
best_model.save("/content/drive/My Drive/Masoscience/lstm/lstm_model.h5")