# 4.1 Build Model

In this notebook we will finally approach the task of building an actual model to make buy and sell decisions. We will write the code for the model to function, load the prepared train and test datasets, and then begin to train the model. 

We will begin with a very simple model copied straight from the TensorFlow [convolutional neural network tutorial](https://www.tensorflow.org/tutorials/images/cnn).

In [1]:
import tensorflow as tf

from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt
import numpy as np

In [None]:
X_train = np.load('./data/prepared/august25screenfixed/numpy_matrices/X_train.npy')
y_train = np.load('./data/prepared/august25screenfixed/numpy_matrices/y_train.npy')

In [47]:
X_test = np.load('./data/prepared/august25screenfixed/numpy_matrices/X_test.npy')
y_test = np.load('./data/prepared/august25screenfixed/numpy_matrices/y_test.npy')

In [2]:
y_train = np.load('./data/prepared/august25screenfixed/numpy_matrices/y_train.npy')
y_test = np.load('./data/prepared/august25screenfixed/numpy_matrices/y_test.npy')

In [3]:
np.unique(y_train, return_counts=True)

(array([0., 1., 2.]), array([34245, 16007,  9748]))

In [4]:
# Undersample to balance classes for training set
h = np.where(y_train == 0)[0]
b = np.where(y_train == 1)[0]
s = np.where(y_train == 2)[0]

hi = np.random.choice(h, size=8000, replace=False)
bi = np.random.choice(b, size=8000, replace=False)
si = np.random.choice(s, size=8000, replace=False)

In [5]:
ind = np.concatenate([hi,bi,si])
ind.shape

(24000,)

In [6]:
np.unique(y_train[ind], return_counts=True)

(array([0., 1., 2.]), array([8000, 8000, 8000]))
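The undersampling above can also be wrapped in a small helper so that the draw is repeatable. This is only a sketch: the function name `undersample_indices`, the `seed` argument, and the toy labels are my own assumptions and are not part of the original notebook.

```python
import numpy as np

def undersample_indices(labels, per_class, seed=0):
    """Return shuffled indices holding `per_class` samples of every label."""
    rng = np.random.default_rng(seed)        # seeded so the draw is repeatable
    labels = np.asarray(labels).ravel()      # accept (N,) or (N, 1) label arrays
    picked = []
    for cls in np.unique(labels):
        candidates = np.flatnonzero(labels == cls)
        picked.append(rng.choice(candidates, size=per_class, replace=False))
    idx = np.concatenate(picked)
    rng.shuffle(idx)                         # avoid class-ordered batches
    return idx

# Toy labels with a similar kind of imbalance (hypothetical data).
toy_labels = np.array([0] * 50 + [1] * 20 + [2] * 10, dtype=float).reshape(-1, 1)
toy_idx = undersample_indices(toy_labels, per_class=10, seed=42)
classes, counts = np.unique(toy_labels[toy_idx], return_counts=True)
print(classes, counts)  # [0. 1. 2.] [10 10 10]
```

Keras shuffles the training data each epoch by default, so the extra shuffle mainly matters if the indices are later sliced or split by hand.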

In [43]:
X_train[ind].shape

(24000, 1, 116, 60)

In [44]:
y_train = y_train[ind]
X_train = X_train[ind]

In [55]:
np.shape(y_test)

(6000, 1)

In [7]:
# decrease test size for runtime and memory concerns
indt = np.random.choice(np.arange(np.shape(y_test)[0]), size=6000, replace=False)

In [None]:
y_test = y_test[indt]
X_test = X_test[indt]
y_test.shape, X_test.shape

In [57]:
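# Because of memory constraints, I chose channels-first when preparing the data.
# However, the model now explicitly calls for channels-last.
# With a single channel, reassigning the shape is equivalent to moving the channel axis.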
X_train.shape = (24000, 116, 60, 1)

In [60]:
X_test.shape = (6000, 116, 60, 1)

In [61]:
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((24000, 116, 60, 1), (6000, 116, 60, 1), (24000, 1), (6000, 1))
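Reassigning `.shape` works here only because there is a single channel. As a hedged illustration on a small made-up array (the array names and sizes below are hypothetical), `np.moveaxis` is the general way to go from channels-first to channels-last, and it agrees with a plain reshape only in the one-channel case:

```python
import numpy as np

# Hypothetical small batch in channels-first layout: (samples, channels, height, width).
x_cf = np.arange(2 * 1 * 4 * 3, dtype=np.float32).reshape(2, 1, 4, 3)

# General conversion: move the channel axis to the end.
x_moved = np.moveaxis(x_cf, 1, -1)

# Shortcut: a plain reshape, valid only because channels == 1.
x_reshaped = x_cf.reshape(2, 4, 3, 1)

print(x_moved.shape)                        # (2, 4, 3, 1)
print(np.array_equal(x_moved, x_reshaped))  # True

# With more than one channel the two are no longer the same.
x_multi = np.arange(2 * 3 * 4 * 3).reshape(2, 3, 4, 3)
same_multi = np.array_equal(np.moveaxis(x_multi, 1, -1), x_multi.reshape(2, 4, 3, 3))
print(same_multi)                           # False
```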

In [62]:
model = models.Sequential()
model.add(layers.Conv2D(64, (5, 5), activation='relu', data_format='channels_last', input_shape=(116, 60, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (4, 4), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(3))
model.summary()

Model: "sequential_10"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_32 (Conv2D)           (None, 112, 56, 64)       1664      
_________________________________________________________________
max_pooling2d_17 (MaxPooling (None, 56, 28, 64)        0         
_________________________________________________________________
conv2d_33 (Conv2D)           (None, 53, 25, 64)        65600     
_________________________________________________________________
max_pooling2d_18 (MaxPooling (None, 26, 12, 64)        0         
_________________________________________________________________
conv2d_34 (Conv2D)           (None, 24, 10, 64)        36928     
_________________________________________________________________
conv2d_35 (Conv2D)           (None, 22, 8, 64)         36928     
_________________________________________________________________
flatten_4 (Flatten)          (None, 11264)           
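The output shapes and parameter counts in the summary can be checked by hand. The sketch below assumes stride-1 convolutions with "valid" padding and pooling with a stride equal to the window size, which are the Keras defaults for the layers defined above; the helper names are my own.

```python
def conv_valid(size, kernel):
    """Output length of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def max_pool(size, window):
    """Output length of max pooling whose stride equals the window size."""
    return size // window

def conv_params(kernel, in_channels, filters):
    """Weights of a square kernel plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

# (layer kind, kernel or window size) in the order the layers are added above.
layer_specs = [("conv", 5), ("pool", 2), ("conv", 4), ("pool", 2), ("conv", 3), ("conv", 3)]

height, width, channels = 116, 60, 1
filters = 64
shapes, params = [], []
for kind, k in layer_specs:
    if kind == "conv":
        params.append(conv_params(k, channels, filters))
        height, width, channels = conv_valid(height, k), conv_valid(width, k), filters
    else:
        params.append(0)
        height, width = max_pool(height, k), max_pool(width, k)
    shapes.append((height, width, channels))

flattened = height * width * channels
for shape, count in zip(shapes, params):
    print(shape, count)
print("flattened:", flattened)  # flattened: 11264
```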

In [63]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

In [64]:
history = model.fit(X_train, y_train, epochs=10, 
                    validation_data=(X_test, y_test))

Train on 24000 samples, validate on 6000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
 1888/24000 [=>............................] - ETA: 3:03 - loss: 1.0988 - accuracy: 0.3335

KeyboardInterrupt: 

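My first attempt at training does not appear to produce results: the last logged loss (1.0988) and accuracy (0.3335) are at chance level for three balanced classes. There are several assumptions that I made which could be questioned.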

First, do we really need previous data? Could a column of the current timestep do just as well on its own? If so, why should the data be formatted as images? Couldn't it just be a column leading to a densely connected feed-forward neural network? This would train faster, allow for a deeper network, and allow larger samples to be trained on.

If previous price action is helpful, do I need 60 timesteps back? That's an entire hour. Why start at the longer range and work down? Why not start from 1 and work towards more data?
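One way to try shorter histories without regenerating the data is to slice the time axis. This is a sketch under assumptions: the helper `trim_history` is hypothetical, the axis of length 60 is taken to be time, and whether the most recent timestep is the last or the first column is not stated in this notebook, so it is left as a parameter.

```python
import numpy as np

def trim_history(x, n_steps, time_axis=2, most_recent_last=True):
    """Keep only `n_steps` timesteps along `time_axis`."""
    length = x.shape[time_axis]
    if not 1 <= n_steps <= length:
        raise ValueError(f"n_steps must be between 1 and {length}")
    if most_recent_last:
        window = slice(length - n_steps, length)  # keep the trailing columns
    else:
        window = slice(0, n_steps)                # keep the leading columns
    index = [slice(None)] * x.ndim
    index[time_axis] = window
    return x[tuple(index)]

# Hypothetical stand-in with the same layout as X_train: (samples, 116, 60, 1).
toy = np.arange(8 * 116 * 60, dtype=np.float32).reshape(8, 116, 60, 1)
for n in (1, 5, 15, 60):
    print(n, trim_history(toy, n).shape)
```

The `input_shape` of the first layer has to be changed to match whatever window is chosen.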

Parameters such as kernel size, number of layers, number of filters, etc. all need to be questioned and tuned.

In [67]:
np.unique(y_test, return_counts=True)[1]/y_test.shape[0]

array([0.5705, 0.244 , 0.1855])
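These class shares give useful reference points for judging any model here. The short calculation below is a sketch, not part of the original run: it compares the loss of a uniform three-way guess, the accuracy of guessing on balanced classes, and the accuracy of always predicting the most common test class.

```python
import numpy as np

n_classes = 3

# Cross-entropy of a model that assigns probability 1/3 to every class.
chance_loss = float(np.log(n_classes))

# Accuracy of a constant or random guess when classes are balanced (training set).
chance_accuracy = 1 / n_classes

# Class shares of the unbalanced test set, copied from the output above.
test_share = np.array([0.5705, 0.244, 0.1855])

# Accuracy on the test set from always predicting the most common class.
majority_accuracy = float(test_share.max())

print(f"chance loss:       {chance_loss:.4f}")        # 1.0986
print(f"chance accuracy:   {chance_accuracy:.4f}")    # 0.3333
print(f"majority accuracy: {majority_accuracy:.4f}")  # 0.5705
```

A validation accuracy below about 0.57 on this test set is therefore worse than always predicting class 0.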

Options:  
    1) Decrease the training set size and run more epochs  
    2) Change the image size to make the images shorter  
    3) Change the filters, kernel size, and layers  

#### Attempt 3 columns only

In [8]:
X_train = np.load('./data/prepared/august25screenfixed/numpy_matrices/Xr3.npy')
X_test = np.load('./data/prepared/august25screenfixed/numpy_matrices/Xe3.npy')

X_train = X_train[ind]
X_test = X_test[indt]

In [9]:
X_train.shape = (24000, 116, 3, 1)
X_test.shape = (6000, 116, 3, 1)

In [11]:
y_train = y_train[ind]
y_test = y_test[indt]

In [12]:
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((24000, 116, 3, 1), (6000, 116, 3, 1), (24000, 1), (6000, 1))

In [19]:
model = models.Sequential()
model.add(layers.Conv2D(64, (5, 1), activation='relu', padding = "same", data_format='channels_last', input_shape=(116, 3, 1)))
model.add(layers.MaxPooling2D((2, 1)))
model.add(layers.Conv2D(64, (4, 1), activation='relu', padding = "same",))
model.add(layers.MaxPooling2D((2, 1)))
model.add(layers.Conv2D(64, (3, 1), activation='relu', padding = "same"))
model.add(layers.Conv2D(64, (3, 1), activation='relu', padding = "same"))

model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(3))
model.summary()

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(X_train, y_train, epochs=10, 
                    validation_data=(X_test, y_test))

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_10 (Conv2D)           (None, 116, 3, 64)        384       
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 58, 3, 64)         0         
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 58, 3, 64)         16448     
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 29, 3, 64)         0         
_________________________________________________________________
conv2d_12 (Conv2D)           (None, 29, 3, 64)         12352     
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 29, 3, 64)         12352     
_________________________________________________________________
flatten_1 (Flatten)          (None, 5568)             

KeyboardInterrupt: 

In [20]:
model = models.Sequential()
model.add(layers.Conv2D(128, (10, 1), activation='relu', padding = "same", data_format='channels_last', input_shape=(116, 3, 1)))
model.add(layers.Conv2D(64, (4, 1), activation='relu', padding = "same",))
model.add(layers.Conv2D(64, (3, 1), activation='relu', padding = "same"))
model.add(layers.Conv2D(64, (3, 1), activation='relu', padding = "same"))

model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(3))
model.summary()



Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_14 (Conv2D)           (None, 116, 3, 128)       1408      
_________________________________________________________________
conv2d_15 (Conv2D)           (None, 116, 3, 64)        32832     
_________________________________________________________________
conv2d_16 (Conv2D)           (None, 116, 3, 64)        12352     
_________________________________________________________________
conv2d_17 (Conv2D)           (None, 116, 3, 64)        12352     
_________________________________________________________________
flatten_2 (Flatten)          (None, 22272)             0         
_________________________________________________________________
dense_4 (Dense)              (None, 128)               2850944   
_________________________________________________________________
dense_5 (Dense)              (None, 3)                

In [21]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(X_train, y_train, epochs=10, 
                    validation_data=(X_test, y_test))

Train on 24000 samples, validate on 6000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10

KeyboardInterrupt: 

In [22]:
model = models.Sequential()
model.add(layers.Conv2D(64, (5, 1), activation='relu', padding = "same", data_format='channels_last', input_shape=(116, 3, 1)))
model.add(layers.MaxPooling2D((2, 1)))
model.add(layers.Conv2D(64, (4, 1), activation='relu', padding = "same",))
model.add(layers.MaxPooling2D((2, 1)))
model.add(layers.Conv2D(64, (3, 1), activation='relu', padding = "same"))
model.add(layers.Conv2D(128, (3, 1), activation='relu', padding = "same"))

model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(3))
model.summary()

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(X_train, y_train, epochs=10, 
                    validation_data=(X_test, y_test))

Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_18 (Conv2D)           (None, 116, 3, 64)        384       
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 58, 3, 64)         0         
_________________________________________________________________
conv2d_19 (Conv2D)           (None, 58, 3, 64)         16448     
_________________________________________________________________
max_pooling2d_10 (MaxPooling (None, 29, 3, 64)         0         
_________________________________________________________________
conv2d_20 (Conv2D)           (None, 29, 3, 64)         12352     
_________________________________________________________________
conv2d_21 (Conv2D)           (None, 29, 3, 128)        24704     
_________________________________________________________________
flatten_3 (Flatten)          (None, 11136)            

In [23]:
model = models.Sequential()
model.add(layers.Conv2D(64, (5, 1), activation='relu', padding = "same", data_format='channels_last', input_shape=(116, 3, 1)))
model.add(layers.MaxPooling2D((2, 1)))
model.add(layers.Conv2D(64, (4, 1), activation='relu', padding = "same",))
model.add(layers.MaxPooling2D((2, 1)))
model.add(layers.Conv2D(64, (3, 1), activation='relu', padding = "same"))
model.add(layers.Conv2D(128, (3, 1), activation='relu', padding = "same"))
model.add(layers.Conv2D(128, (3, 1), activation='relu', padding = "same"))

model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(3))
model.summary()

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(X_train, y_train, epochs=10, 
                    validation_data=(X_test, y_test))

Model: "sequential_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_22 (Conv2D)           (None, 116, 3, 64)        384       
_________________________________________________________________
max_pooling2d_11 (MaxPooling (None, 58, 3, 64)         0         
_________________________________________________________________
conv2d_23 (Conv2D)           (None, 58, 3, 64)         16448     
_________________________________________________________________
max_pooling2d_12 (MaxPooling (None, 29, 3, 64)         0         
_________________________________________________________________
conv2d_24 (Conv2D)           (None, 29, 3, 64)         12352     
_________________________________________________________________
conv2d_25 (Conv2D)           (None, 29, 3, 128)        24704     
_________________________________________________________________
conv2d_26 (Conv2D)           (None, 29, 3, 128)       

KeyboardInterrupt: 

In [25]:
X_train.shape

(24000, 116, 3, 1)

In [32]:
X_train = X_train[:, :, :1, :]
X_test = X_test[:, :, :1, :]

X_train.shape, X_test.shape

((24000, 116, 1, 1), (6000, 116, 1, 1))

In [33]:
X_train.shape = (24000, 116)
X_test.shape = (6000, 116)

In [37]:
X_train.shape

(24000, 116)
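The slice-then-reshape above can also be written as a single indexing step. A minimal sketch on a made-up array (the names and sizes are hypothetical) showing that the two give the same result:

```python
import numpy as np

# Hypothetical stand-in with the same layout as X_train: (samples, rows, columns, channels).
x = np.arange(4 * 5 * 3 * 1).reshape(4, 5, 3, 1)

# Two steps, as above: keep the first column, then drop the singleton axes.
two_step = x[:, :, :1, :].reshape(4, 5)

# One step: integer indices select the column and channel and drop both axes.
one_step = x[:, :, 0, 0]

print(one_step.shape)                     # (4, 5)
print(np.array_equal(two_step, one_step)) # True
```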

In [36]:
# Flatten the labels to 1-D; note the trailing comma that makes these tuples.
y_train.shape = (24000,)
y_test.shape = (6000,)

In [40]:
from tensorflow import keras

In [43]:
model = keras.Sequential([
    keras.layers.Dense(116, activation='relu'),
    keras.layers.Dense(3)
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=10,
         validation_data=(X_test, y_test))

Train on 24000 samples, validate on 6000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7ff95536eb50>

So far, the very simple feed-forward network seems to perform just as well as the CNN.

In [48]:
model = keras.Sequential([
    keras.layers.Dense(116, activation='relu'),
    keras.layers.Dense(116, activation='relu'),
    keras.layers.Dense(116, activation='relu'),
    keras.layers.Dense(3)
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=10,
         validation_data=(X_test, y_test))

Train on 24000 samples, validate on 6000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7ff954fdf550>

In [53]:
model.fit(X_train, y_train, epochs=20,
         validation_data=(X_test, y_test))

Train on 24000 samples, validate on 6000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<tensorflow.python.keras.callbacks.History at 0x7ff9287bc150>

In [54]:
np.unique(model.predict(X_train), return_counts=True)

(array([-27.306562, -26.922873, -25.991856, ...,  10.438577,  10.866445,
         11.300644], dtype=float32),
 array([1, 1, 1, ..., 1, 1, 1]))

In [55]:
model.predict(X_train)

array([[-0.0461382 , -2.2631397 , -1.6560884 ],
       [-0.65172607, -2.8611639 ,  0.95060426],
       [ 2.467665  ,  0.3657786 , -7.543315  ],
       ...,
       [-1.4604993 ,  0.7621466 ,  0.76671624],
       [ 0.86103374, -1.8974257 ,  1.3878165 ],
       [-2.708256  , -2.6481233 ,  2.9754574 ]], dtype=float32)

In [56]:
probability_model = tf.keras.Sequential([model, 
                                         tf.keras.layers.Softmax()])

predictions = probability_model.predict(X_train)

In [58]:
predictions[0]

array([0.7640396 , 0.08323085, 0.15272966], dtype=float32)

In [59]:
np.argmax(predictions[0])

0
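The `Softmax` layer only rescales the logits into probabilities; it does not change which class wins. As a check, here is a plain NumPy softmax (my own helper, not part of the notebook) applied to the first row of logits printed earlier:

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# First row of logits printed by model.predict(X_train) above.
first_logits = np.array([-0.0461382, -2.2631397, -1.6560884])
probs = softmax(first_logits)

print(np.round(probs, 4))      # close to predictions[0] shown above
print(int(np.argmax(probs)))   # 0
```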

In [60]:
predictions = probability_model.predict(X_test)

In [61]:
np.argmax(predictions[0])

0

In [62]:
predictions[0]

array([9.8680556e-01, 1.2953308e-02, 2.4112873e-04], dtype=float32)

In [63]:
y_test[0]

0.0

The model appears interesting now. It trains very quickly and does well on the training set but not the test set. We can try a couple of new approaches.

1) Let's increase the training set size.   
2) Let's increase the size of the test set and balance it.   
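While the test set is still unbalanced, overall accuracy can hide how each class is doing. A per-class breakdown is one way to see that. The sketch below uses made-up labels and predictions; the helper `confusion_matrix` is my own and not part of the notebook.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=3):
    """Rows are true classes, columns are predicted classes."""
    rows = np.asarray(y_true).astype(int).ravel()
    cols = np.asarray(y_pred).astype(int).ravel()
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (rows, cols), 1)   # count each (true, predicted) pair
    return cm

# Hypothetical labels and predictions; real ones would come from
# y_test and np.argmax(predictions, axis=1).
toy_true = np.array([0, 0, 0, 0, 1, 1, 2, 2], dtype=float)
toy_pred = np.array([0, 0, 0, 1, 1, 0, 2, 0])

cm = confusion_matrix(toy_true, toy_pred)
recall = cm.diagonal() / cm.sum(axis=1)      # share of each true class found
accuracy = cm.diagonal().sum() / cm.sum()
balanced_accuracy = recall.mean()            # treats every class equally

print(cm)
print(recall)              # [0.75 0.5  0.5 ]
print(accuracy)            # 0.625
```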