## Deep Learning Completely Inelastic Collision Solver

This notebook implements a recurrent neural network to solve for the post collision velocity of a two-mass system where the collision is completely inelastic and the second mass is initially at rest. The model is formulated using a recurrent neural network (RNN) and its input is represented as a text string giving the masses and the initial velocity of the first mass. Exact solutions calculated using conservation of momentum equations are used to train the RNN and evaluate the accuracy of predictions. This notebook is inspired by the addition_rnn.py example included with Keras.

### Import useful packaged including TensorFlow 2.0 and Keras.
The notebook utilizes tensorflow >= 2.0, which now includes keras, a package of high level wrappers designed to make building and training deep learning models easier. 

In [6]:
# import packages
import tensorflow as tf
import tensorflow.keras as keras
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

from tensorflow.keras.layers import RNN, LSTM, TimeDistributed, RepeatVector, Dense, LSTMCell, Dropout
# LSTM doesn't work well since cuDNN is compiled with certain restrictions. 
# So, here I will create LSTM layers by wrapping LSTMCell in RNN
from tensorflow.keras.models import Sequential
from tensorflow.keras import optimizers

import pickle
import os

### Check to see if a Tensorflow is installed with GPU support and if a GPU is available.

In [2]:
if not tf.test.is_gpu_available():
    print('No GPU found. Training will be slower.')
else:
    print('Default GPU {} found.'.format(tf.test.gpu_device_name()))

Default GPU /device:GPU:0 found.


### Define a class to encode and decode between a selection of characters and one-hot integer representations.

In [3]:
class CharacterTable(object):
    """Given a set of characters:
    + Encode them to a one-hot integer representation
    + Decode the one-hot or integer representation to their character output
    + Decode a vector of probabilities to their character output
    """
    def __init__(self, chars):
        """Initialize character table.
        # Arguments
            chars: Characters that can appear in the input.
        """
        self.chars = sorted(set(chars))
        self.char_indices = dict((c, i) for i, c in enumerate(self.chars))
        self.indices_char = dict((i, c) for i, c in enumerate(self.chars))

    def encode(self, C, num_rows):
        """One-hot encode given string C.
        # Arguments
            C: string, to be encoded.
            num_rows: Number of rows in the returned one-hot encoding. This is
                used to keep the # of rows for each data the same.
        """
        x = np.zeros((num_rows, len(self.chars)))
        for i, c in enumerate(C):
            x[i, self.char_indices[c]] = 1
        return x

    def decode(self, x, calc_argmax=True):
        """Decode the given vector or 2D array to their character output.
        # Arguments
            x: A vector or a 2D array of probabilities or one-hot representations;
                or a vector of character indices (used with `calc_argmax=False`).
            calc_argmax: Whether to find the character index with maximum
                probability, defaults to `True`.
        """
        if calc_argmax:
            x = x.argmax(axis=-1)
        return ''.join(self.indices_char[x] for x in x)
    
class colors:
    ok = '\033[92m'
    fail = '\033[91m'
    close = '\033[0m'

### Generate input and output character sequences.

The input is in the form {}m{}v{} where the values in the {}s are randomly drawn one or two character sequences representing, from left to right, the mass of object one, the mass of object two, and the initial velocity of object one. Object two is initially at rest.

The output is a character string representing the post collision velocity of the system after a head on totally inelastic collision.

In [9]:
# Parameters for the model and dataset.
num_problems = 80000 # number of sequences in the training and validation sets
digits = 2 # maximum number of digits for mass and velocity in the input sequence 
lenans = 4 # number of characters in the answer sequence 

REVERSE=False

# Maximum length of input is 'int + int' (e.g., '345+678'). Maximum length of
# int is DIGITS.
maxlen = digits + 1 + digits + 1 + digits

# All the numbers, plus sign and space for padding.
chars = '0123456789mv. '
ctable = CharacterTable(chars)


filename = 'dl-completely-inelastic-collison-rnn-data.pickle'

if os.path.exists(filename):
    print('Loading saved data...')
    with open(filename, 'rb') as f:
        questions, expected, x_train, x_val, x_test, y_train, y_val, y_test = pickle.load(f)
        
else:


    questions = []
    expected = []
    seen = set()
    print('Generating data...')
    while len(questions) < num_problems:
        f = lambda: int(''.join(np.random.choice(list('123456789'))
                    for i in np.arange(np.random.randint(1, digits + 1))))
        a, b, c = f(), f(), f()
        # Skip any questions we've already seen
        key = tuple(sorted((a, b, c)))
        if key in seen:
            continue
        seen.add(key)
        # Pad the data with spaces such that it is always MAXLEN.
        q = '{}m{}v{}'.format(a, b, c)
        query = q + ' ' * (maxlen - len(q))
        ans = a*c/(a+b)
        if ans < 10:
            r = lenans-2
        elif ans < 100:
            r = lenans-3
        ans = str(round(a*c/(a+b),r))
        # Answers can be of maximum size LENANS.
        ans += '0' * (lenans - len(ans))
        questions.append(query)
        expected.append(ans)
    print('Total momentum questions:', len(questions))

    print('Vectorization...')
    x = np.zeros((len(questions), maxlen, len(chars)), dtype=np.bool)
    y = np.zeros((len(questions), lenans, len(chars)), dtype=np.bool)
    for i, sentence in enumerate(questions):
        x[i] = ctable.encode(sentence, maxlen)
    for i, sentence in enumerate(expected):
        y[i] = ctable.encode(sentence, lenans)

    # Shuffle (x, y) in unison as the later parts of x will almost all be larger
    # digits.
    indices = np.arange(len(y))
    np.random.shuffle(indices)
    x = x[indices]
    y = y[indices]

    # set apart 20% for validation and text data that we never train over.
    split_at = len(x) - len(x) // 5
    (x_train, x_val) = x[:split_at], x[split_at:]
    (y_train, y_val) = y[:split_at], y[split_at:]

    split_at =  len(x_val) - len(x_val) // 2 
    (x_val, x_test) = x_val[:split_at], x_val[split_at:]
    (y_val, y_test) = y_val[:split_at], y_val[split_at:]
    
    with open(filename, 'wb') as f:
        print('Saving data...')
        pickle.dump([questions, expected, x_train, x_val, x_test, y_train, y_val, y_test], f)

print('Training Data:')
print(x_train.shape)
print(y_train.shape)

print('Validation Data:')
print(x_val.shape)
print(y_val.shape)

print('Test Data:')
print(x_test.shape)
print(y_test.shape)

Loading saved data...
Training Data:
(64000, 8, 14)
(64000, 4, 14)
Validation Data:
(8000, 8, 14)
(8000, 4, 14)
Test Data:
(8000, 8, 14)
(8000, 4, 14)


In [10]:
# show an example of the input and output character strings
ii = np.random.randint(0, 1000)
print('input:', questions[ii], 'output:', expected[ii])

input: 88m48v86 output: 55.6


In [11]:
# hyperparameters 
HIDDEN_SIZE = 128
BATCH_SIZE = 256
LAYERS = 2
learning_rate = 0.001

print('Build model...')
model = Sequential()
# "encode" the input sequence using an RNN, producing an output of HIDDEN_SIZE.
# note: in a situation where your input sequences have a variable length, use input_shape=(None, num_feature).
# model.add(LSTM(HIDDEN_SIZE, input_shape=(MAXLEN, len(chars))))
model.add(RNN(LSTMCell(HIDDEN_SIZE), input_shape=(maxlen, len(chars))))

# as the decoder RNN's input, repeatedly provide with the last output of
# RNN for each time step. Repeat 'lenans' times as that's the maximum length of output.
model.add(RepeatVector(lenans))
# the decoder RNN could be multiple layers stacked or a single layer.
for _ in range(LAYERS):
    # by setting return_sequences to True, return not only the last output but
    # all the outputs so far in the form of (num_samples, timesteps,
    # output_dim). This is necessary as TimeDistributed in the below expects
    # the first dimension to be the timesteps.
    # model.add(LSTM(HIDDEN_SIZE, return_sequences=True))
    model.add(RNN(LSTMCell(HIDDEN_SIZE), return_sequences=True))

# add a dropout layer to prevent overfitting
# model.add(Dropout(rate=0.1))
# add a dense layer to every temporal slice of an input. for each of step of the output sequence, 
# decide which character should be chosen.
model.add(TimeDistributed(Dense(len(chars), activation='softmax')))
model.compile(loss='categorical_crossentropy',
              # optimizer=keras.optimizers.Adam(lr=learning_rate),
              optomizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
model.summary()


Build model...
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
rnn (RNN)                    (None, 128)               73216     
_________________________________________________________________
repeat_vector (RepeatVector) (None, 4, 128)            0         
_________________________________________________________________
rnn_1 (RNN)                  (None, 4, 128)            131584    
_________________________________________________________________
rnn_2 (RNN)                  (None, 4, 128)            131584    
_________________________________________________________________
time_distributed (TimeDistri (None, 4, 14)             1806      
Total params: 338,190
Trainable params: 338,190
Non-trainable params: 0
_________________________________________________________________


### Train the model and generate example predictions every 3 epochs.

In [12]:
# train the model and show predictions against the validation dataset.
for iteration in range(1, 36):
    print()
    print('-' * 50)
    print('Iteration', iteration)
    model.fit(x_train, y_train,
              batch_size=BATCH_SIZE,
              epochs=3,
              validation_data=(x_val, y_val))
    score = model.evaluate(x_test, y_test, verbose=0)
    print('score:', score)
    # select 10 samples from the validation set at random so we can visualize errors.
    for i in range(10):
        ind = np.random.randint(0, len(x_val))
        rowx, rowy = x_test[np.array([ind])], y_test[np.array([ind])]
        preds = model.predict_classes(1.0*rowx, verbose=0) # mult by 1.0 to get data types to jive
        q = ctable.decode(rowx[0])
        correct = ctable.decode(rowy[0])
        guess = ctable.decode(preds[0], calc_argmax=False)
        print('Q', q[::-1] if REVERSE else q, end=' ')
        print('A', correct, end=' ')
        
        for ii in np.arange(len(correct)):
            if correct[ii] == guess[ii]:
                print(colors.ok + '☑' + colors.close, end='')
            else:
                print(colors.fail + '☒' + colors.close, end='')
        
        print(' P', guess)


--------------------------------------------------
Iteration 1
Train on 64000 samples, validate on 8000 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
score: [1.6199826693534851, 0.376375]
Q 33m24v42 A 24.3 [91m☒[0m[91m☒[0m[92m☑[0m[92m☑[0m P 19.3
Q 37m8v42  A 34.5 [91m☒[0m[91m☒[0m[92m☑[0m[91m☒[0m P 12.0
Q 3m76v71  A 2.70 [91m☒[0m[92m☑[0m[91m☒[0m[92m☑[0m P 0.00
Q 63m72v47 A 21.9 [91m☒[0m[91m☒[0m[92m☑[0m[91m☒[0m P 13.3
Q 95m77v7  A 3.87 [91m☒[0m[92m☑[0m[91m☒[0m[91m☒[0m P 1.90
Q 67m96v54 A 22.2 [91m☒[0m[91m☒[0m[92m☑[0m[91m☒[0m P 13.3
Q 31m58v28 A 9.75 [91m☒[0m[92m☑[0m[91m☒[0m[91m☒[0m P 1.90
Q 19m6v15  A 11.4 [92m☑[0m[91m☒[0m[92m☑[0m[91m☒[0m P 1..0
Q 68m25v71 A 51.9 [91m☒[0m[91m☒[0m[92m☑[0m[91m☒[0m P 33.3
Q 41m89v5  A 1.58 [92m☑[0m[92m☑[0m[91m☒[0m[91m☒[0m P 1.30

--------------------------------------------------
Iteration 2
Train on 64000 samples, validate on 8000 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
score: [1.314008

Q 75m9v36  A 32.1 [92m☑[0m[91m☒[0m[92m☑[0m[91m☒[0m P 31.0
Q 94m37v25 A 17.9 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 17.0
Q 55m62v31 A 14.6 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 14.0
Q 95m16v63 A 53.9 [92m☑[0m[91m☒[0m[92m☑[0m[91m☒[0m P 58.3
Q 53m57v14 A 6.75 [92m☑[0m[92m☑[0m[91m☒[0m[91m☒[0m P 6.00

--------------------------------------------------
Iteration 8
Train on 64000 samples, validate on 8000 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
score: [1.0746028776168823, 0.5726875]
Q 16m88v17 A 2.62 [91m☒[0m[92m☑[0m[91m☒[0m[91m☒[0m P 3.88
Q 37m21v34 A 21.7 [92m☑[0m[91m☒[0m[92m☑[0m[91m☒[0m P 22.8
Q 78m81v74 A 36.3 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 36.8
Q 3m94v21  A 0.65 [92m☑[0m[92m☑[0m[91m☒[0m[91m☒[0m P 0.88
Q 41m5v18  A 16.0 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 16.2
Q 55m93v36 A 13.4 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 13.2
Q 57m93v82 A 31.2 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 31.8
Q 33m26v5  A 2.80 [92m☑[

Epoch 3/3
score: [0.9349445757865906, 0.6245625]
Q 7m75v6   A 0.51 [92m☑[0m[92m☑[0m[91m☒[0m[91m☒[0m P 0.49
Q 47m7v79  A 68.8 [92m☑[0m[91m☒[0m[92m☑[0m[91m☒[0m P 69.6
Q 59m11v82 A 69.1 [91m☒[0m[91m☒[0m[92m☑[0m[91m☒[0m P 70.0
Q 88m79v28 A 14.8 [92m☑[0m[91m☒[0m[92m☑[0m[91m☒[0m P 15.2
Q 1m76v78  A 1.01 [91m☒[0m[92m☑[0m[91m☒[0m[91m☒[0m P 0.94
Q 62m45v87 A 50.4 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 50.0
Q 78m47v53 A 33.1 [92m☑[0m[91m☒[0m[92m☑[0m[91m☒[0m P 34.0
Q 5m75v38  A 2.38 [92m☑[0m[92m☑[0m[91m☒[0m[91m☒[0m P 2.22
Q 59m17v98 A 76.1 [92m☑[0m[91m☒[0m[92m☑[0m[91m☒[0m P 70.0
Q 27m64v8  A 2.37 [92m☑[0m[92m☑[0m[91m☒[0m[91m☒[0m P 2.20

--------------------------------------------------
Iteration 15
Train on 64000 samples, validate on 8000 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
score: [0.9121813957691193, 0.63509375]
Q 42m72v49 A 18.1 [92m☑[0m[91m☒[0m[92m☑[0m[91m☒[0m P 17.5
Q 8m89v38  A 3.13 [92m☑[0m[92m☑[0m[9

Q 54m61v68 A 31.9 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 31.5
Q 93m29v29 A 22.1 [92m☑[0m[91m☒[0m[92m☑[0m[91m☒[0m P 21.2
Q 51m63v57 A 25.5 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 25.7
Q 73m18v72 A 57.8 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 57.2
Q 33m27v78 A 42.9 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 42.5

--------------------------------------------------
Iteration 21
Train on 64000 samples, validate on 8000 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
score: [0.8575352659225464, 0.66515625]
Q 7m78v56  A 4.61 [92m☑[0m[92m☑[0m[91m☒[0m[91m☒[0m P 4.80
Q 49m27v27 A 17.4 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 17.9
Q 16m51v35 A 8.36 [92m☑[0m[92m☑[0m[91m☒[0m[91m☒[0m P 8.80
Q 86m76v86 A 45.7 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 45.5
Q 28m64v86 A 26.2 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 26.5
Q 9m94v78  A 6.82 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 6.84
Q 81m14v15 A 12.8 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 12.9
Q 72m83v8  A 3.72 [92m☑

Epoch 3/3
score: [0.8122373790740967, 0.6831875]
Q 52m37v31 A 18.1 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 18.9
Q 7m5v46   A 26.8 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 26.0
Q 3m39v86  A 6.14 [91m☒[0m[92m☑[0m[91m☒[0m[91m☒[0m P 5.00
Q 17m49v48 A 12.4 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 12.0
Q 31m4v28  A 24.8 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 24.1
Q 38m48v21 A 9.28 [92m☑[0m[92m☑[0m[91m☒[0m[91m☒[0m P 9.10
Q 3m46v6   A 0.37 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 0.31
Q 56m37v9  A 5.42 [92m☑[0m[92m☑[0m[91m☒[0m[92m☑[0m P 5.82
Q 34m44v94 A 41.0 [92m☑[0m[92m☑[0m[92m☑[0m[92m☑[0m P 41.0
Q 19m8v31  A 21.8 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 21.0

--------------------------------------------------
Iteration 28
Train on 64000 samples, validate on 8000 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
score: [0.8619791390895843, 0.6553125]
Q 47m61v15 A 6.53 [92m☑[0m[92m☑[0m[91m☒[0m[91m☒[0m P 6.40
Q 1m16v87  A 5.12 [92m☑[0m[92m☑[0m[91

Q 25m21v2  A 1.09 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 1.00
Q 49m11v54 A 44.1 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 44.6
Q 68m99v1  A 0.41 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 0.44
Q 74m11v76 A 66.2 [92m☑[0m[91m☒[0m[92m☑[0m[91m☒[0m P 67.5
Q 34m85v46 A 13.1 [92m☑[0m[91m☒[0m[92m☑[0m[91m☒[0m P 12.9

--------------------------------------------------
Iteration 34
Train on 64000 samples, validate on 8000 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
score: [0.7884940972328186, 0.6900625]
Q 61m81v36 A 15.5 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 15.0
Q 9m42v63  A 11.1 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 11.3
Q 86m25v56 A 43.4 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 43.5
Q 73m29v3  A 2.15 [92m☑[0m[92m☑[0m[91m☒[0m[92m☑[0m P 2.25
Q 9m9v94   A 47.0 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 47.5
Q 69m1v31  A 30.6 [92m☑[0m[92m☑[0m[92m☑[0m[91m☒[0m P 30.5
Q 73m34v85 A 58.0 [92m☑[0m[91m☒[0m[92m☑[0m[91m☒[0m P 57.5
Q 2m66v55  A 1.62 [92m☑