## Attempting to Implement Distributed Training to Maximize GPU Usage  


When a single model is trained, GPU utilization is limited by batch size, assuming there are no other bottlenecks. That assumption should hold here, since data processing is done beforehand and the dataset is relatively small. One idea is to split the physical GPU into two logical devices and to use distributed training to run both of them concurrently.
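
As a toy illustration of what synchronous data parallelism (the scheme `tf.distribute.MirroredStrategy` implements) does with a batch, the sketch below computes a gradient once over a full batch and once as the average of two half-batch gradients. The one-parameter model and the numbers are made up for illustration; they are not the notebook's model or data.

```python
import math

def mse_gradient(w, xs, ys):
    """Gradient of the mean squared error for the 1-D model y_hat = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

# Toy global batch of four examples (made-up numbers).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 7.0, 9.0]
w = 0.5

# Single device: one gradient over the whole batch.
full_grad = mse_gradient(w, xs, ys)

# Two replicas: each sees half of the batch, then the gradients are averaged.
grad_a = mse_gradient(w, xs[:2], ys[:2])
grad_b = mse_gradient(w, xs[2:], ys[2:])
averaged_grad = (grad_a + grad_b) / 2.0

print(full_grad, averaged_grad)
```

With equal-sized halves the two results agree up to floating-point rounding, so splitting the batch changes where the work runs rather than what is learned. Any benefit has to come from throughput.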

In [1]:
%load_ext autoreload
%autoreload 2

import pandas as pd
import time
import gc

import numpy as np
from google.cloud import bigquery
from google.cloud import storage

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)


from tensorflow import keras
from tensorflow.keras import layers
import seaborn as sns
from pandas.tseries.offsets import BDay

from tensorflow.keras.layers import Embedding
from tensorflow.keras import activations
from tensorflow.keras import backend as K
from tensorflow.keras import initializers
from tensorflow.keras.layers.experimental.preprocessing import Normalization
from sklearn import preprocessing
from datetime import datetime
import matplotlib.pyplot as plt
import pickle5 as pickle


from ficc.utils.nelson_siegel_model import *
from ficc.utils.diff_in_days import *
from ficc.utils.auxiliary_functions import sqltodf


from IPython.display import display, HTML
import os


from ficc.data.process_data import process_data
from ficc.utils.auxiliary_variables import PREDICTORS, NON_CAT_FEATURES, BINARY, CATEGORICAL_FEATURES, IDENTIFIERS, PURPOSE_CLASS_DICT, NUM_OF_DAYS_IN_YEAR
from ficc.utils.gcp_storage_functions import upload_data, download_data
from ficc.utils.auxiliary_variables import RELATED_TRADE_BINARY_FEATURES, RELATED_TRADE_NON_CAT_FEATURES, RELATED_TRADE_CATEGORICAL_FEATURES

from ficc_keras_utils import *
import ficc_keras_utils

pd.set_option('display.float_format', lambda x: '%.3f' % x)

2023-04-26 18:19:08.870037: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 18:19:08.886165: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 18:19:08.888044: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero


Initializing pandarallel with 8.0 cores
INFO: Pandarallel will run on 8 workers.
INFO: Pandarallel will use Memory file system to transfer data between the main process and workers.


In [2]:
gpus = tf.config.list_physical_devices('GPU')
# 15360 MB split in two would be 7680 MB each; the limit actually used is 7000 MB per logical device
MEMORY_LIMIT_MB = 7000
if gpus:
    try:
        tf.config.set_logical_device_configuration(
            gpus[0],
            [tf.config.LogicalDeviceConfiguration(memory_limit=MEMORY_LIMIT_MB),
             tf.config.LogicalDeviceConfiguration(memory_limit=MEMORY_LIMIT_MB)])

        logical_gpus = tf.config.list_logical_devices('GPU')
        print(len(gpus), "Physical GPU,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Virtual devices must be set before GPUs have been initialized
        print(e)

1 Physical GPU, 2 Logical GPUs


2023-04-26 18:19:11.385579: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-04-26 18:19:11.387458: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 18:19:11.389365: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-04-26 18:19:11.390901: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zer
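
The memory limits above can be derived rather than hard-coded. The helper below is a hypothetical sketch, not part of TensorFlow or of this notebook: it splits a total evenly and subtracts an optional per-device headroom. The 680 MB headroom is an assumption chosen only so the result matches the 7000 MB used in the cell; the notebook does not say why 7000 was picked.

```python
def logical_device_limits(total_mb, num_devices, headroom_mb=0):
    """Split total_mb evenly across num_devices, leaving headroom_mb unallocated per device."""
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    per_device = total_mb // num_devices - headroom_mb
    if per_device <= 0:
        raise ValueError("headroom leaves no memory for the devices")
    return [per_device] * num_devices

# 15360 MB card, two logical devices, assumed 680 MB headroom each.
limits = logical_device_limits(15360, 2, headroom_mb=680)
print(limits)
```

Each value in `limits` could then be passed as `memory_limit` to one `tf.config.LogicalDeviceConfiguration`.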

In [3]:
print(tf.__version__)

2.7.0


In [27]:
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="/home/jupyter/ficc/isaac_creds.json"
os.environ['TF_GPU_THREAD_MODE'] = 'gpu_private'
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
pd.options.mode.chained_assignment = None

bq_client = bigquery.Client()
storage_client = storage.Client()

TRAIN_TEST_SPLIT = 0.85
LEARNING_RATE = 0.0001
BATCH_SIZE = 1000
NUM_EPOCHS = 20
ficc_keras_utils.NUM_EPOCHS = NUM_EPOCHS

DROPOUT = 0.01
SEQUENCE_LENGTH = 5
NUM_FEATURES = 7

## Load Data

In [5]:
%%time

path = '../processed_file_FULL_2023-04-12-20:44.pkl'
if os.path.isfile(path):
    print('File available, loading pickle')
    with open(path, 'rb') as f:
        data = pickle.load(f)
else:
    print('File not available, downloading from cloud storage')
    import gcsfs
    fs = gcsfs.GCSFileSystem(project='eng-reactor-287421')
    # Read from the bucket and write the local cache through separate file handles
    with fs.open('isaac_data/processed_file_FULL_2023-04-12-20:44.pkl', 'rb') as remote_file:
        data = pd.read_pickle(remote_file)
    # Cache at the same path that is checked above so the next run finds the file
    with open(path, 'wb') as local_file:
        pickle.dump(data, local_file)

File available, loading pickle
CPU times: user 19.8 s, sys: 8.08 s, total: 27.9 s
Wall time: 27.9 s
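
The load cell follows a "use the local pickle if present, otherwise fetch and cache" pattern. A minimal standard-library sketch of that pattern is shown below. `load_or_fetch` and `fake_fetch` are hypothetical names, and the fetch step stands in for the cloud-storage download.

```python
import os
import pickle
import tempfile

def load_or_fetch(path, fetch):
    """Return the object pickled at path; if the file is missing, call fetch() and cache its result."""
    if os.path.isfile(path):
        with open(path, "rb") as cached:
            return pickle.load(cached)
    obj = fetch()
    with open(path, "wb") as cached:
        pickle.dump(obj, cached)
    return obj

calls = []

def fake_fetch():
    """Stand-in for the remote download; records how often it is called."""
    calls.append(1)
    return {"rows": 3}

with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "data.pkl")
    first = load_or_fetch(cache_path, fake_fetch)   # cache miss: fetches and writes
    second = load_or_fetch(cache_path, fake_fetch)  # cache hit: reads the pickle
```

Keeping the read and write handles in separate `with` blocks avoids the two streams shadowing each other.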


In [6]:
data['new_ys'] = data['yield'] - data['new_ficc_ycl']
data['new_ys_realtime'] = data['yield'] - data['new_real_time_ficc_ycl']
data.dropna(subset=['new_ys', 'new_ys_realtime'], inplace=True)

In [7]:
auxiliary_features = ['dollar_price',
                      'last_calc_date',
                     'calc_date', 
                     'trade_date',
                      'last_trade_date',
                     'trade_datetime', 
                     'purpose_sub_class', 
                     'called_redemption_type', 
                     'calc_day_cat',
                     'yield',
                     'ficc_ycl',
                     #'same_ys',
                     #'trade_history_sum',
                     'new_ficc_ycl',
                      'new_real_time_ficc_ycl',
                     'days_to_refund',
                      'last_dollar_price',
                      'last_rtrs_control_number',
                     'is_called',
                     ]

In [8]:
if 'target_attention_features' not in PREDICTORS:
    PREDICTORS.append('target_attention_features')
    
if 'ficc_treasury_spread' not in PREDICTORS:
    PREDICTORS.append('ficc_treasury_spread')
    NON_CAT_FEATURES.append('ficc_treasury_spread')
    
for col in ['new_ficc_ycl', 'new_real_time_ficc_ycl']:     
    if col not in PREDICTORS:
        PREDICTORS.append(col)
        NON_CAT_FEATURES.append(col)

for col in ['extraordinary_make_whole_call', 'make_whole_call', 'has_unexpired_lines_of_credit']:     
    if col not in data.columns:
        print(f'Removing {col} from PREDICTORS and BINARY')
        # Check each list separately so a miss in one does not skip the other
        if col in BINARY:
            BINARY.remove(col)
        if col in PREDICTORS:
            PREDICTORS.remove(col)

Removing extraordinary_make_whole_call from PREDICTORS and BINARY
Removing make_whole_call from PREDICTORS and BINARY
Removing has_unexpired_lines_of_credit from PREDICTORS and BINARY


In [9]:
# Note: this definition replaces the process_data imported from ficc.data.process_data in the first cell
def process_data(data): 
    data['ted-rate'] = (data['t_rate_10'] - data['t_rate_2']) * 100
    
    # Exclusions we are experimenting with (discussed with a team member); the model is trained with them applied.
    # The thresholds are written as np.log10(days), which suggests the days_* columns are stored on a log10 scale.
    # Dropped: call, refund, or maturity at most 400 days away (a value of 0 is kept), and maturity 30000 days or more away.
    
    data = data[(data.days_to_call == 0) | (data.days_to_call > np.log10(400))]
    data = data[(data.days_to_refund == 0) | (data.days_to_refund > np.log10(400))]
    data = data[(data.days_to_maturity == 0) | (data.days_to_maturity > np.log10(400))]
    data = data[data.days_to_maturity < np.log10(30000)]
    data['trade_history_sum'] = data.trade_history.parallel_apply(lambda x: np.sum(x))
    data.issue_amount = data.issue_amount.replace([np.inf, -np.inf], np.nan)
    data.dropna(inplace=True, subset=PREDICTORS+['trade_history_sum'])
    data.purpose_sub_class.fillna(0, inplace=True)
    
    # data['calc_date_duration'] = data[['last_calc_date','last_trade_date']].parallel_apply(get_calc_date_duration, axis=1)
    # data['new_ficc_ycl_fixed_shape'] = data[['trade_date', 'calc_date_duration']].parallel_apply(lambda x: calculate_ycl(x, new_yc_params), axis = 1)
    # data['new_ficc_ycl_prev_day'] = data[['last_calc_date', 'last_trade_date' ,'calc_date_duration','trade_date']].parallel_apply(get_yield_for_last_duration, axis=1)
    
    return data
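
The row filter in `process_data` can be read as a single predicate. The sketch below restates it in plain Python, assuming (as the `np.log10` thresholds suggest, though the notebook does not state it) that the `days_*` columns hold log10 of a day count. `keep_trade` is a hypothetical name used only here.

```python
import math

MIN_DAYS = math.log10(400)             # lower threshold used for call, refund, and maturity
MAX_MATURITY_DAYS = math.log10(30000)  # upper threshold used for maturity only

def keep_trade(days_to_call, days_to_refund, days_to_maturity):
    """Mirror the filter: each horizon is exactly 0 or above log10(400); maturity is below log10(30000)."""
    def horizon_ok(value):
        return value == 0 or value > MIN_DAYS
    return (horizon_ok(days_to_call)
            and horizon_ok(days_to_refund)
            and horizon_ok(days_to_maturity)
            and days_to_maturity < MAX_MATURITY_DAYS)

# A bond with no call or refund feature, maturing in 3650 days, is kept.
print(keep_trade(0, 0, math.log10(3650)))
# A bond callable in 200 days is dropped.
print(keep_trade(math.log10(200), 0, math.log10(3650)))
```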

In [10]:
%%time

processed_data = process_data(data) 
# processed_data = processed_data[IDENTIFIERS + PREDICTORS + auxiliary_features]

CPU times: user 40.6 s, sys: 12.3 s, total: 52.9 s
Wall time: 58 s


In [11]:
encoders = {}
fmax = {}
for f in CATEGORICAL_FEATURES:
    print(f)
    fprep = preprocessing.LabelEncoder().fit(processed_data[f].drop_duplicates()) #note that there are apparently no trades with CC 
    fmax[f] = np.max(fprep.transform(fprep.classes_))
    encoders[f] = fprep
    
with open('encoders.pkl','wb') as file:
    pickle.dump(encoders,file)

rating
incorporated_state_code
trade_type
purpose_class


In [12]:
# Both comparisons are strict, so trades dated exactly 2023-02-01 land in neither set
train_dataframe = processed_data[(processed_data.trade_date <
                                  '2023-02-01')].sort_values(by='trade_date', ascending=True).reset_index(drop=True)

test_dataframe = processed_data[(processed_data.trade_date > '2023-02-01')].sort_values(by='trade_date', ascending=True).reset_index(drop=True)
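
A small check of how strict comparisons treat the cutoff date, using ISO date strings (which sort in calendar order). The three dates are made up, and the inclusive variant is shown only as one possible alternative, not as what the notebook does.

```python
CUTOFF = "2023-02-01"
trade_dates = ["2023-01-31", "2023-02-01", "2023-02-02"]  # made-up example dates

train = [d for d in trade_dates if d < CUTOFF]
test = [d for d in trade_dates if d > CUTOFF]
dropped = [d for d in trade_dates if d not in train and d not in test]

# Alternative: make the test side inclusive so that no date is lost.
test_inclusive = [d for d in trade_dates if d >= CUTOFF]

print(train, test, dropped, test_inclusive)
```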

In [13]:
def create_input(df):
    global encoders
    datalist = []
    datalist.append(np.stack(df['trade_history'].to_numpy()))
    datalist.append(np.stack(df['target_attention_features'].to_numpy()))

    noncat_and_binary = []
    for f in NON_CAT_FEATURES + BINARY:
        noncat_and_binary.append(np.expand_dims(df[f].to_numpy().astype('float32'), axis=1))
    datalist.append(np.concatenate(noncat_and_binary, axis=-1))
    
    for f in CATEGORICAL_FEATURES:
        encoded = encoders[f].transform(df[f])
        datalist.append(encoded.astype('float32'))
    
    return datalist

In [14]:
%%time
x_train = create_input(train_dataframe)
x_train[0] = x_train[0][:,:,[0,2,3,4,5,6]]
y_train = train_dataframe.new_ys

x_test = create_input(test_dataframe)
x_test[0] = x_test[0][:,:,[0,2,3,4,5,6]]
y_test = test_dataframe.new_ys

CPU times: user 22.4 s, sys: 944 ms, total: 23.4 s
Wall time: 23.4 s


In [15]:
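# Keep only the second half of the training rows, i.e. the most recent trades, since the frame is sorted by trade_date
cutoff_idx = int(len(x_train[0])*0.5)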
x_train = [x[cutoff_idx:] for x in x_train]
y_train = y_train[cutoff_idx:]

## Model Training and Testing

In [16]:
# Normalization layer for the trade history
trade_history_normalizer = Normalization(name='Trade_history_normalizer')
trade_history_normalizer.adapt(x_train[0],batch_size=BATCH_SIZE)

# Normalization layer for the non-categorical and binary features
noncat_binary_normalizer = Normalization(name='Numerical_binary_normalizer')
noncat_binary_normalizer.adapt(x_train[2], batch_size = BATCH_SIZE)

tf.keras.utils.set_random_seed(10)

In [17]:
def create_tf_data(x_train, y_train, shuffle=False, shuffle_buffer=1):

    train_size = int(0.8*len(x_train[0]))
                     
    X=()
    for x in x_train:
        X += (tf.data.Dataset.from_tensor_slices(x),)
        

    temp = tf.data.Dataset.zip((X))
    del X
    dataset = tf.data.Dataset.zip((temp,
                        tf.data.Dataset.from_tensor_slices(y_train)))
    del temp
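    if shuffle:
        shuffle_buffer = int(len(x_train[0])*shuffle_buffer)
        # Keep the shuffled order fixed across iterations so that take/skip below yield disjoint train and validation sets
        dataset = dataset.shuffle(shuffle_buffer, reshuffle_each_iteration=False)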
            
    train_ds = dataset.take(train_size)
    val_ds = dataset.skip(train_size)                 
    return train_ds, val_ds
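
One thing to watch in a shuffle-then-split pipeline: if the shuffle produces a fresh permutation on every pass (the default for `Dataset.shuffle`), the `take` and `skip` halves are read from different permutations and can overlap. The simulation below uses plain Python lists rather than `tf.data`, so it is an analogy under that assumption, not a reproduction of TensorFlow's behavior.

```python
import random

def split_after_shuffle(items, train_size, rng):
    """One pass over a pipeline that shuffles first and splits afterwards."""
    order = list(items)
    rng.shuffle(order)  # a fresh permutation on every pass
    return order[:train_size], order[train_size:]

items = range(100)
train_size = 80
rng = random.Random(0)

# Training and validation are read in separate passes, so each sees its own permutation.
train_pass, _ = split_after_shuffle(items, train_size, rng)
_, val_pass = split_after_shuffle(items, train_size, rng)
leaked = set(train_pass) & set(val_pass)

# Splitting before shuffling keeps the sets disjoint however often the training part is reshuffled.
fixed_train = list(items)[:train_size]
fixed_val = list(items)[train_size:]
rng.shuffle(fixed_train)
disjoint = set(fixed_train) & set(fixed_val)

print(len(leaked), len(disjoint))
```

Either splitting before the shuffle or fixing the shuffle order across passes removes the overlap. Which one fits depends on whether the validation set should be a random sample or the chronologically last slice.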

In [20]:
# tf.keras.backend.clear_session()
# gc.collect()

# timestamp = datetime.now().strftime('%Y-%m-%d %H-%M')

# fit_callbacks = fit_callbacks = [
# keras.callbacks.EarlyStopping(
#     monitor="val_loss",
#     patience=10,
#     verbose=0,
#     mode="auto",
#     restore_best_weights=True),
#     CSVLoggerTimeHistory(timestamp+'_training_logs.csv', separator=",", append=False)]

# # with tf.device('/cpu:0'):
    


# gpus = tf.config.list_logical_devices('GPU')

# strategy = tf.distribute.MirroredStrategy(gpus)

# with strategy.scope():
#     train_ds, val_ds = create_tf_data(x_train, y_train, shuffle = True)
#     train_ds = train_ds.batch(BATCH_SIZE).prefetch(2).cache()
#     val_ds = val_ds.batch(BATCH_SIZE).prefetch(2).cache()
    
#     model_new_ys = generate_model(SEQUENCE_LENGTH=5, 
#                               NUM_FEATURES=6, 
#                               trade_history_normalizer=trade_history_normalizer
#                              )

#     fit_callbacks = fit_callbacks = [
#     keras.callbacks.EarlyStopping(
#         monitor="val_loss",
#         patience=10,
#         verbose=0,
#         mode="auto",
#         restore_best_weights=True),
#         # time_callback,
#         CSVLoggerTimeHistory('_'.join([model_new_ys.name,timestamp,'training_logs.csv'])
#                              , separator=",", 
#                              append=False)
#     ]
    
    
#     model_new_ys.compile(optimizer=keras.optimizers.Adam(learning_rate=0.0001),
#           loss=keras.losses.MeanAbsoluteError(),
#           metrics=[keras.metrics.MeanAbsoluteError()])

# #     history_new_ys = model_new_ys.fit(train_ds,
# #                                       validation_data=val_ds,
# #                                         epochs=NUM_EPOCHS,     
# #                                         verbose=1, 
# #                                         callbacks=fit_callbacks)

In [28]:
tf.keras.backend.clear_session()
gc.collect()

gpus = tf.config.list_logical_devices('GPU')

strategy = tf.distribute.MirroredStrategy(gpus,
                                         cross_device_ops=tf.distribute.HierarchicalCopyAllReduce()
                                         )

INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0', '/job:localhost/replica:0/task:0/device:GPU:1')


In [29]:
tf.config.list_logical_devices('GPU')

[LogicalDevice(name='/device:GPU:0', device_type='GPU'),
 LogicalDevice(name='/device:GPU:1', device_type='GPU')]

In [30]:
with strategy.scope():
    
    # Normalization layer for the trade history
    trade_history_normalizer = Normalization(name='Trade_history_normalizer')
    trade_history_normalizer.adapt(x_train[0],batch_size=BATCH_SIZE)

    # Normalization layer for the non-categorical and binary features
    noncat_binary_normalizer = Normalization(name='Numerical_binary_normalizer')
    noncat_binary_normalizer.adapt(x_train[2], batch_size = BATCH_SIZE)

    tf.keras.utils.set_random_seed(10)
    
    def create_tf_data(x_train, y_train, shuffle=False, shuffle_buffer=1):

        train_size = int(0.8*len(x_train[0]))

        X=()
        for x in x_train:
            X += (tf.data.Dataset.from_tensor_slices(x),)


        temp = tf.data.Dataset.zip((X))
        del X
        dataset = tf.data.Dataset.zip((temp,
                            tf.data.Dataset.from_tensor_slices(y_train)))
        del temp
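        if shuffle:
            shuffle_buffer = int(len(x_train[0])*shuffle_buffer)
            # Keep the shuffled order fixed across iterations so that take/skip below yield disjoint train and validation sets
            dataset = dataset.shuffle(shuffle_buffer, reshuffle_each_iteration=False)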

        train_ds = dataset.take(train_size)
        val_ds = dataset.skip(train_size)                 
        return train_ds, val_ds

    def generate_model(name = None, SEQUENCE_LENGTH = SEQUENCE_LENGTH ,NUM_FEATURES = NUM_FEATURES, trade_history_normalizer = trade_history_normalizer):
        inputs = []
        layer = []

        ############## INPUT BLOCK ###################
        trade_history_input = layers.Input(name="trade_history_input", 
                                           shape=(SEQUENCE_LENGTH,NUM_FEATURES), 
                                           dtype = tf.float32) 

        target_attention_input = layers.Input(name="target_attention_input", 
                                           shape=(SEQUENCE_LENGTH, 3), 
                                           dtype = tf.float32) 


        inputs.append(trade_history_input)
        inputs.append(target_attention_input)

        inputs.append(layers.Input(
            name="NON_CAT_AND_BINARY_FEATURES",
            shape=(len(NON_CAT_FEATURES + BINARY),)
        ))


        layer.append(noncat_binary_normalizer(inputs[2]))
        ####################################################


        ############## TRADE HISTORY MODEL #################

        lstm_layer = layers.LSTM(50, 
                                 activation='tanh',
                                 input_shape=(SEQUENCE_LENGTH,NUM_FEATURES),
                                 return_sequences = True,
                                 name='LSTM')

        lstm_attention_layer = CustomAttention(50)

        lstm_layer_2 = layers.LSTM(100, 
                                   activation='tanh',
                                   input_shape=(SEQUENCE_LENGTH,50),
                                   return_sequences = False,
                                   name='LSTM_2')


        features = lstm_layer(trade_history_normalizer(inputs[0]))
        features = lstm_attention_layer(features, features, inputs[1])
        features = layers.BatchNormalization()(features)
        # features = layers.Dropout(DROPOUT)(features)

        features = lstm_layer_2(features)
        features = layers.BatchNormalization()(features)
        # features = layers.Dropout(DROPOUT)(features)

        trade_history_output = layers.Dense(100, 
                                            activation='relu')(features)

        ####################################################

        ############## REFERENCE DATA MODEL ################
        global encoders
        for f in CATEGORICAL_FEATURES:
            fin = layers.Input(shape=(1,), name = f)
            inputs.append(fin)
            embedded = layers.Flatten(name = f + "_flat")( layers.Embedding(input_dim = fmax[f]+1,
                                                                            output_dim = max(30,int(np.sqrt(fmax[f]))),
                                                                            input_length= 1,
                                                                            name = f + "_embed")(fin))
            layer.append(embedded)


        reference_hidden = layers.Dense(400,
                                        activation='relu',
                                        name='reference_hidden_1')(layers.concatenate(layer, axis=-1))

        reference_hidden = layers.BatchNormalization()(reference_hidden)
        reference_hidden = layers.Dropout(DROPOUT)(reference_hidden)

        reference_hidden2 = layers.Dense(200,activation='relu',name='reference_hidden_2')(reference_hidden)
        reference_hidden2 = layers.BatchNormalization()(reference_hidden2)
        reference_hidden2 = layers.Dropout(DROPOUT)(reference_hidden2)

        reference_output = layers.Dense(100,activation='tanh',name='reference_hidden_3')(reference_hidden2)

        ####################################################

        feed_forward_input = layers.concatenate([reference_output, trade_history_output])

        hidden = layers.Dense(300,activation='relu')(feed_forward_input)
        hidden = layers.BatchNormalization()(hidden)
        hidden = layers.Dropout(DROPOUT)(hidden)

        hidden2 = layers.Dense(100,activation='tanh')(hidden)
        hidden2 = layers.BatchNormalization()(hidden2)
        hidden2 = layers.Dropout(DROPOUT)(hidden2)

        final = layers.Dense(1)(hidden2)

        if name: model = keras.Model(inputs=inputs, outputs=final, name=name)
        else: model = keras.Model(inputs=inputs, outputs=final)

        return model
    
    
    train_ds, val_ds = create_tf_data(x_train, y_train, shuffle = True)
    train_ds = train_ds.batch(2*BATCH_SIZE).prefetch(2).cache()
    val_ds = val_ds.batch(2*BATCH_SIZE).prefetch(2).cache()
    
    model_new_ys = generate_model(SEQUENCE_LENGTH=5, 
                              NUM_FEATURES=6, 
                              trade_history_normalizer=trade_history_normalizer
                             )

    timestamp = datetime.now().strftime('%Y-%m-%d %H-%M')
    fit_callbacks = [
    keras.callbacks.EarlyStopping(
        monitor="val_loss",
        patience=10,
        verbose=0,
        mode="auto",
        restore_best_weights=True),
        # time_callback,
        CSVLoggerTimeHistory('_'.join([model_new_ys.name,timestamp,'training_logs.csv'])
                             , separator=",", 
                             append=False)
    ]
    
    
    model_new_ys.compile(optimizer=keras.optimizers.Adam(learning_rate=0.0001),
          loss=keras.losses.MeanAbsoluteError(),
          metrics=[keras.metrics.MeanAbsoluteError()])

    history_new_ys = model_new_ys.fit(train_ds,
                                      validation_data=val_ds,
                                        epochs=NUM_EPOCHS,     
                                        verbose=1, 
                                        callbacks=fit_callbacks,                                        
                                      use_multiprocessing=True,
                                        workers=-1)

2023-04-26 18:54:11.023963: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:766] AUTO sharding policy will apply DATA sharding policy as it failed to apply FILE sharding policy because of the following reason: Found an unshardable source dataset: name: "TensorSliceDataset/_1"
op: "TensorSliceDataset"
input: "Placeholder/_0"
attr {
  key: "Toutput_types"
  value {
    list {
      type: DT_DOUBLE
    }
  }
}
attr {
  key: "_cardinality"
  value {
    i: 2212912
  }
}
attr {
  key: "is_files"
  value {
    b: false
  }
}
attr {
  key: "metadata"
  value {
    s: "\n\026TensorSliceDataset:352"
  }
}
attr {
  key: "output_shapes"
  value {
    list {
      shape {
        dim {
          size: 5
        }
        dim {
          size: 6
        }
      }
    }
  }
}



Epoch 1/20


2023-04-26 18:54:33.847521: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:380] Filling up shuffle buffer (this may take a while): 1120999 of 2212912


  1/886 [..............................] - ETA: 7:51:17 - loss: 59.8879 - mean_absolute_error: 59.8879

2023-04-26 18:54:43.697307: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:405] Shuffle buffer filled.




2023-04-26 18:55:32.533271: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:380] Filling up shuffle buffer (this may take a while): 1084115 of 2212912
2023-04-26 18:55:42.533267: I tensorflow/core/kernels/data/sh

Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Model training time was 11.26 minutes (675.57 seconds).
Average time for each epoch was 0.56 minutes (33.78 seconds).
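
The `886` steps in the progress bar can be reproduced from numbers already in the notebook: the dataset cardinality in the auto-shard warning, the 0.8 training fraction in `create_tf_data`, and the global batch of `2*BATCH_SIZE`. The even per-replica split is how a batched dataset is normally divided under `MirroredStrategy`; it is stated here as an expectation, not something the logs confirm.

```python
import math

NUM_EXAMPLES = 2212912  # "_cardinality" reported in the auto-shard warning
TRAIN_FRACTION = 0.8    # train_size = int(0.8 * len(x_train[0]))
BATCH_SIZE = 1000
NUM_REPLICAS = 2        # two logical GPUs

global_batch = NUM_REPLICAS * BATCH_SIZE
train_size = int(TRAIN_FRACTION * NUM_EXAMPLES)
steps_per_epoch = math.ceil(train_size / global_batch)
per_replica_batch = global_batch // NUM_REPLICAS

print(train_size, steps_per_epoch, per_replica_batch)
```

Each replica therefore works on batches of `BATCH_SIZE` examples while the step count reflects the doubled global batch.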


In [31]:
tf.keras.backend.clear_session()
gc.collect()

# Normalization layer for the trade history
trade_history_normalizer = Normalization(name='Trade_history_normalizer')
trade_history_normalizer.adapt(x_train[0],batch_size=BATCH_SIZE)

# Normalization layer for the non-categorical and binary features
noncat_binary_normalizer = Normalization(name='Numerical_binary_normalizer')
noncat_binary_normalizer.adapt(x_train[2], batch_size = BATCH_SIZE)

tf.keras.utils.set_random_seed(10)

with tf.device('/cpu:0'):
    train_ds, val_ds = create_tf_data(x_train, y_train, shuffle = True)
    train_ds = train_ds.batch(2*BATCH_SIZE).prefetch(2).cache()
    val_ds = val_ds.batch(2*BATCH_SIZE).prefetch(2).cache()

with tf.device('/gpu:0'):
    model_new_ys = generate_model(SEQUENCE_LENGTH=5, 
                              NUM_FEATURES=6, 
                              trade_history_normalizer=trade_history_normalizer
                             )

    timestamp = datetime.now().strftime('%Y-%m-%d %H-%M')
    fit_callbacks = [
    keras.callbacks.EarlyStopping(
        monitor="val_loss",
        patience=10,
        verbose=0,
        mode="auto",
        restore_best_weights=True),
        # time_callback,
        CSVLoggerTimeHistory('_'.join([model_new_ys.name,timestamp,'training_logs.csv'])
                             , separator=",", 
                             append=False)
    ]


    model_new_ys.compile(optimizer=keras.optimizers.Adam(learning_rate=0.0001),
          loss=keras.losses.MeanAbsoluteError(),
          metrics=[keras.metrics.MeanAbsoluteError()])

    history_new_ys = model_new_ys.fit(train_ds,
                                      validation_data=val_ds,
                                        epochs=NUM_EPOCHS,     
                                        verbose=1, 
                                        callbacks=fit_callbacks,                                        
                                      use_multiprocessing=True,
                                        workers=-1)

Epoch 1/20


2023-04-26 19:05:58.597429: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:380] Filling up shuffle buffer (this may take a while): 1084517 of 2212912
2023-04-26 19:06:08.597450: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:380] Filling up shuffle buffer (this may take a while): 1779101 of 2212912


  1/886 [..............................] - ETA: 7:10:36 - loss: 59.8816 - mean_absolute_error: 59.8816

2023-04-26 19:06:13.677092: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:405] Shuffle buffer filled.




2023-04-26 19:06:43.228949: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:380] Filling up shuffle buffer (this may take a while): 1071653 of 2212912
2023-04-26 19:06:53.228943: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:380] Filling up shuffle buffer (this may take a while): 2145358 of 2212912
2023-04-26 19:06:53.851687: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:405] Shuffle buffer filled.


Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Model training time was 6.18 minutes (371.08 seconds).
Average time for each epoch was 0.31 minutes (18.55 seconds).
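
Putting the two logged totals side by side: in this run the mirrored setup on two logical GPUs took longer than the single-device run. The arithmetic below uses only the times printed above.

```python
mirrored_seconds = 675.57  # MirroredStrategy over two logical GPUs
single_seconds = 371.08    # single device, same global batch size
NUM_EPOCHS = 20

slowdown = mirrored_seconds / single_seconds
extra_seconds_per_epoch = (mirrored_seconds - single_seconds) / NUM_EPOCHS

print(f"slowdown: {slowdown:.2f}x, extra time per epoch: {extra_seconds_per_epoch:.2f} s")
```

This compares one run of each configuration, so the ratio is indicative, not a careful benchmark.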
