**DEEPLOB Inference**

**Reference:** DEEPLOB-XBTUSD.ipynb in the exocharts GitHub repository

**Description:** This is the inference notebook for the DeepLOB model. Because training is slow and compute-intensive, I outsourced it to a separate machine in the cloud. I trained several models, varying the number of training rows to check whether the full dataset is actually needed.
The test data is the last 25% of the full dataset, so no model has seen it during training: each model saw only n rows from the beginning of the dataset.
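The split described above can be sketched as follows. This is a minimal illustration on toy arrays, not the code used to produce `xtest.npy`; the helper `chronological_split` and the value of `n_rows` are hypothetical, and it assumes the n training rows are taken from the start of the non-test portion.

```python
import numpy as np

def chronological_split(x, y, test_fraction=0.25):
    """Split time-ordered arrays so the last `test_fraction` of rows is the test set."""
    split = int(len(x) * (1 - test_fraction))
    return (x[:split], y[:split]), (x[split:], y[split:])

# Toy stand-in data: 1000 time-ordered samples with 3 one-hot classes.
x = np.arange(1000, dtype=np.float32).reshape(1000, 1)
y = np.eye(3, dtype=np.float32)[np.arange(1000) % 3]

(x_train_full, y_train_full), (x_test, y_test) = chronological_split(x, y)

# Each model is trained only on the first n rows of the training portion.
n_rows = 100  # hypothetical value
x_train, y_train = x_train_full[:n_rows], y_train_full[:n_rows]

print(x_train.shape, x_test.shape)  # (100, 1) (250, 1)
```

Because the rows are never shuffled before the cut, every training row precedes every test row in time.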

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [2]:
xtest = np.load("xtest.npy")
ytest = np.load("ytest.npy")

In [3]:
xtest.shape, ytest.shape

((876569, 100, 40, 1), (876569, 3))
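These shapes read as 876569 samples, each a window of 100 time steps by 40 features with a single channel, paired with a 3-column label (presumably one-hot, given the softmax output used below). As a rough check of what this costs in memory, assuming the array is held as float32 (the dtype stored on disk is not shown here):

```python
import numpy as np

shape = (876569, 100, 40, 1)

# Number of elements times 4 bytes per float32 value.
n_bytes = int(np.prod(shape, dtype=np.int64)) * np.dtype(np.float32).itemsize

print(n_bytes)        # 14025104000
print(n_bytes / 1e9)  # roughly 14 GB
```

This is the same figure that appears in the TensorFlow allocation warnings in the evaluation cells.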

In [7]:
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Input, Conv2D, LeakyReLU, MaxPooling2D, concatenate, LSTM, Reshape, Dense
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

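def initiate_DeepLOB_model(lookback_timestep, feature_num, conv_filter_num, inception_num, LSTM_num, leaky_relu_alpha,
                          loss, optimizer, metrics):
    """Build and compile DeepLOB: three conv blocks, a three-branch inception module, an LSTM and a 3-way softmax."""
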
    input_tensor = Input(shape=(lookback_timestep, feature_num, 1))
    
    conv_layer1 = Conv2D(conv_filter_num, (1,2), strides=(1, 2))(input_tensor)
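    conv_layer1 = LeakyReLU(alpha=leaky_relu_alpha)(conv_layer1)
    conv_layer1 = Conv2D(conv_filter_num, (4,1), padding='same')(conv_layer1)
    # NOTE: the result is bound to conv_first1 and never used, so this activation is skipped.
    # Kept as-is on the assumption that the training code had the same line;
    # rename to conv_layer1 only if the activation was applied during training.
    conv_first1 = LeakyReLU(alpha=leaky_relu_alpha)(conv_layer1)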
    conv_layer1 = Conv2D(conv_filter_num, (4,1), padding='same')(conv_layer1)
    conv_layer1 = LeakyReLU(alpha=leaky_relu_alpha)(conv_layer1)

    conv_layer2 = Conv2D(conv_filter_num, (1,2), strides=(1, 2))(conv_layer1)
    conv_layer2 = LeakyReLU(alpha=leaky_relu_alpha)(conv_layer2)
    conv_layer2 = Conv2D(conv_filter_num, (4,1), padding='same')(conv_layer2)
    conv_layer2 = LeakyReLU(alpha=leaky_relu_alpha)(conv_layer2)
    conv_layer2 = Conv2D(conv_filter_num, (4,1), padding='same')(conv_layer2)
    conv_layer2 = LeakyReLU(alpha=leaky_relu_alpha)(conv_layer2)

    conv_layer3 = Conv2D(conv_filter_num, (1,10))(conv_layer2)
    conv_layer3 = LeakyReLU(alpha=leaky_relu_alpha)(conv_layer3)
    conv_layer3 = Conv2D(conv_filter_num, (4,1), padding='same')(conv_layer3)
    conv_layer3 = LeakyReLU(alpha=leaky_relu_alpha)(conv_layer3)
    conv_layer3 = Conv2D(conv_filter_num, (4,1), padding='same')(conv_layer3)
    conv_layer3 = LeakyReLU(alpha=leaky_relu_alpha)(conv_layer3)
    
    inception_module1 = Conv2D(inception_num, (1,1), padding='same')(conv_layer3)
    inception_module1 = LeakyReLU(alpha=leaky_relu_alpha)(inception_module1)
    inception_module1 = Conv2D(inception_num, (3,1), padding='same')(inception_module1)
    inception_module1 = LeakyReLU(alpha=leaky_relu_alpha)(inception_module1)

    inception_module2 = Conv2D(inception_num, (1,1), padding='same')(conv_layer3)
    inception_module2 = LeakyReLU(alpha=leaky_relu_alpha)(inception_module2)
    inception_module2 = Conv2D(inception_num, (5,1), padding='same')(inception_module2)
    inception_module2 = LeakyReLU(alpha=leaky_relu_alpha)(inception_module2)

    inception_module3 = MaxPooling2D((3,1), strides=(1,1), padding='same')(conv_layer3)
    inception_module3 = Conv2D(inception_num, (1,1), padding='same')(inception_module3)
    inception_module3 = LeakyReLU(alpha=leaky_relu_alpha)(inception_module3)
    
    inception_module_final = concatenate([inception_module1, inception_module2, inception_module3], axis=3)
    inception_module_final = Reshape((inception_module_final.shape[1], inception_module_final.shape[3]))(inception_module_final)

    LSTM_output = LSTM(LSTM_num)(inception_module_final)

    model_output = Dense(3, activation='softmax')(LSTM_output)
    
    DeepLOB_model = Model(inputs=input_tensor, outputs=model_output)
    
    DeepLOB_model.compile(optimizer=optimizer, loss=loss, metrics=metrics)

    return DeepLOB_model
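To see why the `Reshape` before the LSTM works, it helps to trace the tensor shape through the network by hand. The sketch below does this in plain Python using the standard output-size rules for `'valid'` and `'same'` convolutions; `conv_out` is a hypothetical helper written for this illustration, not part of Keras.

```python
def conv_out(size, kernel, stride=1, padding="valid"):
    """Output length of one convolution axis under 'valid' or 'same' padding."""
    if padding == "same":
        return -(-size // stride)  # ceil(size / stride)
    return (size - kernel) // stride + 1

time_steps, features = 100, 40

# Each block opens with a 'valid' convolution along the feature axis;
# the (4,1) 'same' convolutions that follow leave both axes unchanged.
for kernel, stride in [(2, 2), (2, 2), (10, 1)]:
    features = conv_out(features, kernel, stride)
    time_steps = conv_out(time_steps, 4, 1, padding="same")
    print(time_steps, features)  # 100 20, then 100 10, then 100 1

inception_num = 32
channels = 3 * inception_num   # three inception branches concatenated on the channel axis
print((time_steps, channels))  # (100, 96): the per-sample input to the LSTM
```

The feature axis collapses to exactly 1 only because `feature_num` is 40; with a different feature count the `(1,10)` convolution would leave a wider axis and the `Reshape` would no longer be valid.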

In [8]:
lookback_timestep = 100
feature_num = 40
conv_filter_num = 16
inception_num = 32
LSTM_num = 64
leaky_relu_alpha = 0.01
loss = 'categorical_crossentropy'
learning_rate = 0.01
adam_epsilon = 1
optimizer = Adam(learning_rate=learning_rate, epsilon=adam_epsilon)
batch_size = 32
metrics = ['accuracy']

In [13]:
model1 = initiate_DeepLOB_model(lookback_timestep, feature_num, conv_filter_num, inception_num, LSTM_num, leaky_relu_alpha,
                          loss, optimizer, metrics)
model1.load_weights("10000nmodel.epoch140-loss0.40.hdf5")
loss1, acc1 = model1.evaluate(xtest, ytest)

2023-01-23 21:21:25.960714: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 14025104000 exceeds 10% of free system memory.
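The warning suggests the whole test set is being materialised as a single tensor. One way to sidestep that, sketched here under assumptions, is to score the data in chunks so that only a slice is in memory at a time. `chunked_accuracy` and `predict_fn` are hypothetical names; `predict_fn` stands in for any callable that maps a batch to class probabilities. The sketch computes accuracy only, not the loss.

```python
import numpy as np

def chunked_accuracy(predict_fn, x, y, chunk_size=4096):
    """Accuracy of predict_fn over (x, y), touching only chunk_size rows at a time."""
    correct = 0
    for start in range(0, len(x), chunk_size):
        stop = start + chunk_size
        # Slicing a memory-mapped array reads just this chunk from disk.
        probs = predict_fn(np.asarray(x[start:stop], dtype=np.float32))
        hits = np.argmax(probs, axis=1) == np.argmax(y[start:stop], axis=1)
        correct += int(np.sum(hits))
    return correct / len(x)

# Toy check with a stand-in "model" that always predicts class 0.
rng = np.random.default_rng(0)
x_toy = rng.normal(size=(10, 100, 40, 1))
y_toy = np.eye(3)[[0, 0, 1, 2, 0, 1, 0, 2, 0, 0]]
always_class0 = lambda batch: np.tile([1.0, 0.0, 0.0], (len(batch), 1))

toy_acc = chunked_accuracy(always_class0, x_toy, y_toy, chunk_size=4)
print(toy_acc)  # 0.6
```

With the real data, one option would be to open the file with `np.load("xtest.npy", mmap_mode="r")` and pass something like `lambda b: model1.predict(b, verbose=0)` as `predict_fn`; this has not been run against the notebook's models.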




In [14]:
model2 = initiate_DeepLOB_model(lookback_timestep, feature_num, conv_filter_num, inception_num, LSTM_num, leaky_relu_alpha,
                          loss, optimizer, metrics)
model2.load_weights("50000nmodel.epoch89-loss0.50.hdf5")
loss2, acc2 = model2.evaluate(xtest, ytest)

2023-01-23 21:35:20.540810: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 14025104000 exceeds 10% of free system memory.




In [None]:
model3 = initiate_DeepLOB_model(lookback_timestep, feature_num, conv_filter_num, inception_num, LSTM_num, leaky_relu_alpha,
                          loss, optimizer, metrics)
model3.load_weights("allnmodel.epoch33-loss0.70.hdf5")
loss3, acc3 = model3.evaluate(xtest, ytest)

2023-01-23 21:49:21.546859: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 14025104000 exceeds 10% of free system memory.




**Conclusion:** The results suggest that the more training data the model is given, the higher its accuracy on the same test data. Hyperparameter tuning is the next logical step.
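When comparing `acc1`, `acc2` and `acc3`, it can help to know what accuracy a constant prediction would reach, since the three classes need not be balanced. A minimal sketch on toy labels (not the real `ytest`), with `majority_baseline` as a hypothetical helper:

```python
import numpy as np

def majority_baseline(y_onehot):
    """Accuracy obtained by always predicting the most frequent class."""
    return float(y_onehot.mean(axis=0).max())

# Toy one-hot labels: class 0 appears in 5 of 8 rows.
y_toy = np.eye(3)[[0, 0, 0, 1, 2, 0, 1, 0]]
baseline = majority_baseline(y_toy)
print(baseline)  # 0.625
```

A model accuracy is only informative to the extent that it exceeds this value on the same labels.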