Here’s an example of my script in which I build a multilayer perceptron (MLP) on some financial data. The data has 53 independent variables (X) and one dependent variable (y). It is time-series data, and X carries plenty of time-series-related information. The goal is to fit the MLP as well as possible.

In [12]:
### Import packages:
import numpy as np
import sys
import theano
import theano.tensor as T
from keras.models import Sequential
import keras

In [13]:
### Load data and count the number of samples.
input_data_all = np.load('train_sample.npy')
output_data_all= np.load('train_target.npy')
n_samples= input_data_all.shape[0]
print(n_samples)

96674


In [14]:
### Shuffle two arrays with the same random permutation so that rows stay paired.
def shuffle_together(input_1, input_2):
    if input_1.shape[0] != input_2.shape[0]:
        # Raise instead of returning None, which the caller could not unpack.
        raise ValueError("x and y arrays do not have the same number of rows.")
    c = np.arange(input_1.shape[0])
    np.random.shuffle(c)
    return input_1[c], input_2[c], c
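
A quick way to check that this kind of shuffle keeps each row of X paired with its y is to run it on a tiny array where the pairing is easy to verify. The sketch below is only an illustration: the name `shuffle_rows_together` and the `seed` parameter are hypothetical and not part of the notebook, and it uses NumPy's `default_rng` rather than the global `np.random.shuffle` used above.

```python
import numpy as np

def shuffle_rows_together(features, targets, seed=0):
    """Apply one random permutation to both arrays so rows stay paired."""
    if features.shape[0] != targets.shape[0]:
        raise ValueError("features and targets must have the same number of rows")
    rng = np.random.default_rng(seed)  # hypothetical fixed seed, for repeatability
    order = rng.permutation(features.shape[0])
    return features[order], targets[order], order

# Toy data: row i of features is [2*i, 2*i + 1] and target i is 10*i.
features = np.arange(10).reshape(5, 2)
targets = np.arange(5) * 10

shuffled_x, shuffled_y, order = shuffle_rows_together(features, targets)

# Each shuffled row must still match its original target: (2*i) * 5 == 10*i.
print(np.array_equal(shuffled_x[:, 0] * 5, shuffled_y))
```

Returning the permutation as well, as the notebook's function does, makes it possible to map shuffled rows back to their original time order later.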

In [15]:
### Split into training and validation sets (note that the split is random).

input_data_all, output_data_all, _ = shuffle_together(input_data_all, output_data_all)
n_validation_samples= int(0.1*n_samples)
validation_input= input_data_all[-1*n_validation_samples:]
validation_output= output_data_all[-1*n_validation_samples:]
n_train_samples= n_samples-n_validation_samples
train_input= input_data_all[:n_train_samples]
train_output= output_data_all[:n_train_samples]
print(train_input.shape)
print(train_output.shape)
### Now we define matrix x and vector y in theano.
x=T.matrix('x', dtype= theano.config.floatX)
y=T.vector('y', dtype= theano.config.floatX)


(87007, 53)
(87007,)


In [16]:
train_input.shape[1]

53

In [17]:
# initialize model
model = keras.models.Sequential()
### First layer.  Linear activation layer.
model.add(keras.layers.Dense(input_data_all.shape[1], input_shape=(train_input.shape[1],),activation='linear',
                bias_initializer=keras.initializers.RandomUniform(minval=-1, maxval=1, seed=None),
                kernel_initializer=keras.initializers.RandomUniform(minval=-1, maxval=1, seed=None)))

# add hidden layer(s) with relu and tanh activation functions.
model.add(keras.layers.Dense(units=40,kernel_initializer=keras.initializers.RandomUniform(minval=-1, maxval=1, seed=None),
        bias_initializer=keras.initializers.RandomUniform(minval=-0.05, maxval=0.05, seed=None),
        activation='relu')
    )          
model.add(keras.layers.Dense(units=25,kernel_initializer=keras.initializers.RandomUniform(minval=-0.05, maxval=0.05, seed=None),
        bias_initializer=keras.initializers.RandomUniform(minval=-0.05, maxval=0.05, seed=None),
        activation='tanh')
    )          
model.add(keras.layers.Dense(units=15,kernel_initializer=keras.initializers.RandomUniform(minval=-0.05, maxval=0.05, seed=None),
        bias_initializer=keras.initializers.RandomUniform(minval=-0.05, maxval=0.05, seed=None),
        activation='tanh')
    )          
model.add(keras.layers.Dense(units=8,kernel_initializer=keras.initializers.RandomUniform(minval=-0.05, maxval=0.05, seed=None),
        bias_initializer=keras.initializers.RandomUniform(minval=-0.05, maxval=0.05, seed=None),
        activation='tanh')
    )          
model.add(keras.layers.Dense(units=4,kernel_initializer=keras.initializers.RandomUniform(minval=-0.05, maxval=0.05, seed=None),
        bias_initializer=keras.initializers.RandomUniform(minval=-1, maxval=1, seed=None),
        activation='relu')
    )          
### Final layer is again linear.
model.add(keras.layers.Dense(units=1,kernel_initializer=keras.initializers.RandomUniform(minval=-0.05, maxval=0.05, seed=None),
        bias_initializer=keras.initializers.RandomUniform(minval=-0.05, maxval=0.05, seed=None),
        activation='linear')
    )          
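
To get a feel for the size of this network, the number of trainable parameters can be worked out by hand from the layer widths used above (53 inputs, then 53, 40, 25, 15, 8, 4 and 1 units). The sketch below assumes every `Dense` layer has a full weight matrix plus one bias per unit; `layer_sizes` and `dense_param_count` are illustrative names, not part of the notebook.

```python
# Layer widths taken from the model definition: input size, then units per layer.
layer_sizes = [53, 53, 40, 25, 15, 8, 4, 1]

def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

per_layer = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total = sum(per_layer)

print(per_layer)  # [2862, 2160, 1025, 390, 128, 36, 5]
print(total)      # 6606
```

With roughly 6,600 parameters against about 87,000 training rows, the model is small relative to the data. The total should agree with what `model.summary()` reports.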


In [18]:
# define the optimizer (Adam, despite the variable name sgd_optimizer)
sgd_optimizer = keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-012)

# compile model
model.compile(
    optimizer=sgd_optimizer,
    loss='mean_squared_error'
)

In [19]:
# train model
history = model.fit(
    input_data_all, output_data_all,
    batch_size=300, epochs=100000,
    verbose=1, validation_split=0.1,validation_freq=100
)

Train on 87006 samples, validate on 9668 samples
Epoch 1/100000
Epoch 2/100000
Epoch 3/100000
Epoch 4/100000
Epoch 5/100000
Epoch 6/100000
Epoch 7/100000
Epoch 8/100000
Epoch 9/100000
Epoch 10/100000
Epoch 11/100000
Epoch 12/100000
Epoch 13/100000
  900/87006 [..............................] - ETA: 26s - loss: 0.0381



 1800/87006 [..............................] - ETA: 35s - loss: 0.0378



Epoch 14/100000
Epoch 15/100000
Epoch 16/100000
Epoch 17/100000
Epoch 18/100000
Epoch 19/100000
Epoch 20/100000
Epoch 21/100000
Epoch 22/100000
Epoch 23/100000
Epoch 24/100000
Epoch 25/100000
Epoch 26/100000
Epoch 27/100000
Epoch 28/100000
Epoch 29/100000
Epoch 30/100000
Epoch 31/100000
Epoch 32/100000
Epoch 33/100000
Epoch 34/100000
Epoch 35/100000
Epoch 36/100000
Epoch 37/100000
Epoch 38/100000
Epoch 39/100000
Epoch 40/100000
Epoch 41/100000
Epoch 42/100000
Epoch 43/100000
Epoch 44/100000
Epoch 45/100000
Epoch 46/100000
Epoch 47/100000
Epoch 48/100000
 1200/87006 [..............................] - ETA: 39s - loss: 0.0012



Epoch 49/100000
  900/87006 [..............................] - ETA: 23s - loss: 0.0040



Epoch 50/100000
Epoch 51/100000
Epoch 52/100000
Epoch 53/100000
  600/87006 [..............................] - ETA: 21s - loss: 0.0012



Epoch 54/100000
Epoch 55/100000
Epoch 56/100000
 7800/87006 [=>............................] - ETA: 23s - loss: 6.5052e-04



Epoch 57/100000
Epoch 58/100000
Epoch 59/100000
Epoch 60/100000
Epoch 61/100000
Epoch 62/100000
Epoch 63/100000
Epoch 64/100000
Epoch 65/100000
Epoch 66/100000
Epoch 67/100000
Epoch 68/100000
 1200/87006 [..............................] - ETA: 22s - loss: 2.7793e-04



Epoch 69/100000
Epoch 70/100000
Epoch 71/100000
Epoch 72/100000
Epoch 73/100000
Epoch 74/100000
Epoch 75/100000
Epoch 76/100000
Epoch 77/100000
Epoch 78/100000
Epoch 79/100000
Epoch 80/100000
Epoch 81/100000
Epoch 82/100000
  900/87006 [..............................] - ETA: 22s - loss: 8.9690e-05



 1500/87006 [..............................] - ETA: 34s - loss: 1.1607e-04



Epoch 83/100000
Epoch 84/100000
Epoch 85/100000
Epoch 86/100000
Epoch 87/100000
Epoch 88/100000
Epoch 89/100000
Epoch 90/100000
Epoch 91/100000
Epoch 92/100000
Epoch 93/100000
Epoch 94/100000
Epoch 95/100000
Epoch 96/100000
Epoch 97/100000
Epoch 98/100000
Epoch 99/100000
Epoch 100/100000
Epoch 101/100000
Epoch 102/100000
Epoch 103/100000
Epoch 104/100000
Epoch 105/100000
Epoch 106/100000
Epoch 107/100000
Epoch 108/100000
Epoch 109/100000
Epoch 110/100000
Epoch 111/100000
Epoch 112/100000
Epoch 113/100000
Epoch 114/100000
  900/87006 [..............................] - ETA: 34s - loss: 7.9534e-04



 3300/87006 [>.............................] - ETA: 18s - loss: 6.3547e-04



Epoch 115/100000
Epoch 116/100000
Epoch 117/100000
Epoch 118/100000
Epoch 119/100000
Epoch 120/100000
Epoch 121/100000
Epoch 122/100000
Epoch 123/100000
 3900/87006 [>.............................] - ETA: 5s - loss: 7.7110e-0



Epoch 124/100000
Epoch 125/100000
Epoch 126/100000
Epoch 127/100000
Epoch 128/100000
Epoch 129/100000
Epoch 130/100000
Epoch 131/100000
Epoch 132/100000
Epoch 133/100000
Epoch 134/100000
Epoch 135/100000
Epoch 136/100000
Epoch 137/100000
Epoch 138/100000
Epoch 139/100000
Epoch 140/100000
Epoch 141/100000
Epoch 142/100000
Epoch 143/100000
Epoch 144/100000
Epoch 145/100000
Epoch 146/100000
Epoch 147/100000
Epoch 148/100000
Epoch 149/100000
  900/87006 [..............................] - ETA: 27s - loss: 6.7494e-04



Epoch 150/100000
Epoch 151/100000
 1200/87006 [..............................] - ETA: 23s - loss: 5.6752e-04



Epoch 152/100000
Epoch 153/100000
Epoch 154/100000
Epoch 155/100000
Epoch 156/100000
Epoch 157/100000
Epoch 158/100000
Epoch 159/100000
Epoch 160/100000
Epoch 161/100000
Epoch 162/100000
Epoch 163/100000
Epoch 164/100000
Epoch 165/100000
Epoch 166/100000
Epoch 167/100000
Epoch 168/100000
Epoch 169/100000
Epoch 170/100000
Epoch 171/100000
Epoch 172/100000
Epoch 173/100000
Epoch 174/100000
Epoch 175/100000
Epoch 176/100000
Epoch 177/100000
 6900/87006 [=>............................] - ETA: 17s - loss: 3.4527e-04



Epoch 178/100000
Epoch 179/100000
 1500/87006 [..............................] - ETA: 14s - loss: 7.4689e-05



Epoch 180/100000
Epoch 181/100000
Epoch 182/100000
Epoch 183/100000
Epoch 184/100000
Epoch 185/100000
Epoch 186/100000
Epoch 187/100000
Epoch 188/100000
Epoch 189/100000
Epoch 190/100000
Epoch 191/100000
Epoch 192/100000
Epoch 193/100000
Epoch 194/100000
Epoch 195/100000
Epoch 196/100000
Epoch 197/100000
Epoch 198/100000
Epoch 199/100000
Epoch 200/100000
Epoch 201/100000
Epoch 202/100000
Epoch 203/100000
Epoch 204/100000
Epoch 205/100000
Epoch 206/100000
Epoch 207/100000
Epoch 208/100000
Epoch 209/100000
Epoch 210/100000
Epoch 211/100000
Epoch 212/100000
Epoch 213/100000
Epoch 214/100000
Epoch 215/100000
Epoch 216/100000
Epoch 217/100000
Epoch 218/100000
Epoch 219/100000
Epoch 220/100000
Epoch 221/100000
Epoch 222/100000
Epoch 223/100000
Epoch 224/100000
Epoch 225/100000
Epoch 226/100000
Epoch 227/100000
Epoch 228/100000
Epoch 229/100000
Epoch 230/100000
Epoch 231/100000
Epoch 232/100000
Epoch 233/100000
Epoch 234/100000
Epoch 235/100000
Epoch 236/100000
Epoch 237/100000
 3000/87006 [>.............................] - ETA: 12s - loss: 3.6157e-04



Epoch 238/100000
Epoch 239/100000
Epoch 240/100000
Epoch 241/100000
Epoch 242/100000
Epoch 243/100000
Epoch 244/100000
Epoch 245/100000
Epoch 246/100000
Epoch 247/100000
Epoch 248/100000
Epoch 249/100000
Epoch 250/100000
Epoch 251/100000
Epoch 252/100000
Epoch 253/100000
Epoch 254/100000
Epoch 255/100000
Epoch 256/100000
Epoch 257/100000
Epoch 258/100000
Epoch 259/100000
Epoch 260/100000
Epoch 261/100000
Epoch 262/100000
Epoch 263/100000
Epoch 264/100000
Epoch 265/100000
Epoch 266/100000
Epoch 267/100000
Epoch 268/100000
Epoch 269/100000
Epoch 270/100000
Epoch 271/100000
Epoch 272/100000
Epoch 273/100000
Epoch 274/100000
Epoch 275/100000
 2700/87006 [..............................] - ETA: 17s - loss: 1.3349e-04



Epoch 276/100000
Epoch 277/100000
Epoch 278/100000
 1200/87006 [..............................] - ETA: 33s - loss: 5.1261e-05



Epoch 279/100000
Epoch 280/100000
Epoch 281/100000
Epoch 282/100000
Epoch 283/100000
Epoch 284/100000
Epoch 285/100000
Epoch 286/100000
Epoch 287/100000
Epoch 288/100000
Epoch 289/100000
Epoch 290/100000
Epoch 291/100000
 1200/87006 [..............................] - ETA: 19s - loss: 2.9060e-05



Epoch 292/100000
Epoch 293/100000
Epoch 294/100000
Epoch 295/100000
Epoch 296/100000
Epoch 297/100000
Epoch 298/100000
Epoch 299/100000
Epoch 300/100000
Epoch 301/100000
  600/87006 [..............................] - ETA: 28s - loss: 9.9142e-06



 3000/87006 [>.............................] - ETA: 14s - loss: 1.5270e-



Epoch 302/100000
Epoch 303/100000
Epoch 304/100000
Epoch 305/100000
Epoch 306/100000
Epoch 307/100000
Epoch 308/100000
Epoch 309/100000
Epoch 310/100000
Epoch 311/100000
Epoch 312/100000
Epoch 313/100000
Epoch 314/100000
Epoch 315/100000
Epoch 316/100000
Epoch 317/100000
Epoch 318/100000
Epoch 319/100000
Epoch 320/100000
Epoch 321/100000
Epoch 322/100000
 1200/87006 [..............................] - ETA: 29s - loss: 3.5700e-



Epoch 323/100000
Epoch 324/100000
Epoch 325/100000
Epoch 326/100000
Epoch 327/100000
Epoch 328/100000
Epoch 329/100000
Epoch 330/100000
Epoch 331/100000
Epoch 332/100000
Epoch 333/100000
Epoch 334/100000
Epoch 335/100000
Epoch 336/100000
Epoch 337/100000

KeyboardInterrupt: 
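
The log above reports 87006 training and 9668 validation samples, while the manual split in the earlier cell produced 87007 training rows. The difference comes down to which side of the split is rounded. The sketch below reproduces both counts, assuming Keras computes the split index as `int(n_samples * (1 - validation_split))`; that formula is an assumption here, chosen because it matches the numbers in the log.

```python
n_samples = 96674
validation_fraction = 0.1

# Manual split from the earlier cell: size the validation set first.
manual_val = int(validation_fraction * n_samples)
manual_train = n_samples - manual_val

# Assumed Keras behaviour: size the training set first, validate on the rest.
split_at = int(n_samples * (1.0 - validation_fraction))
keras_train, keras_val = split_at, n_samples - split_at

print(manual_train, manual_val)  # 87007 9667
print(keras_train, keras_val)    # 87006 9668
```

Note also that `model.fit` is called on `input_data_all` with `validation_split=0.1`, so the `train_input` and `validation_input` arrays built by hand are not what the network actually trains on.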

In [20]:
### Now we load the test data and run the model's predictions on it.
### Next we calculate R-squared (as a percentage) and display it.
test_x = np.load('test_sample.npy')
test_y = np.load('test_target.npy')
sample_out = model.predict(test_x)
ss_tot = np.sum((test_y - np.mean(test_y))**2)
ss_res = np.sum((sample_out.reshape(-1) - test_y)**2)
print("Rsquared:", (1 - ss_res/ss_tot)*100)

Rsquared: 99.95008688561789


An R-squared of 99.95% is not great, but it is not bad for a first training run. We can still improve on it by retraining, just as we did when we used the lasagne package!
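
For reference, the R-squared reported here is one minus the ratio of the residual sum of squares to the total sum of squares, multiplied by 100 in the notebook to give a percentage. A minimal sketch of that calculation as a reusable function follows; `r_squared` is an illustrative helper name, not something the notebook defines.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    # Flatten both inputs so a (n, 1) prediction array lines up with a (n,) target.
    y_true = np.asarray(y_true, dtype=float).reshape(-1)
    y_pred = np.asarray(y_pred, dtype=float).reshape(-1)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

y_true = np.array([1.0, 2.0, 3.0, 4.0])

perfect = r_squared(y_true, y_true)                  # exact predictions
column = r_squared(y_true, y_true.reshape(-1, 1))    # same, shaped like model.predict output
mean_only = r_squared(y_true, np.full(4, 2.5))       # always predicting the mean

print(perfect, column, mean_only)
```

A perfect fit scores 1 and a model that always predicts the mean of the targets scores 0, which gives two useful anchors for judging a value such as 0.9995.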

In [21]:
model.predict(test_x)

array([[1.505761 ],
       [1.5068358],
       [1.5069517],
       ...,
       [1.2960402],
       [1.2948159],
       [1.294792 ]], dtype=float32)

In [22]:
### Finally, we save the model (using the .h5 extension for the HDF5 format).

model.save("model.h5")