<a href="https://colab.research.google.com/github/karnamohit/kranka_ucm/blob/master/tf1_hb_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Importing all the useful libraries...

In [0]:
%tensorflow_version 1.x
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

Checking the version of TensorFlow, NumPy...

In [0]:
#print('TensorFlow version info:\t',tf.__version__)
!pip show tensorflow
print(' ')
print('------------------------------------------------------------')
print(' ')
!pip show numpy
#print('NumPy version info:\t \t',np.__version__)

Name: tensorflow
Version: 1.15.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: /usr/local/lib/python3.6/dist-packages
Requires: opt-einsum, termcolor, tensorflow-estimator, wrapt, google-pasta, keras-applications, gast, numpy, tensorboard, astor, keras-preprocessing, six, wheel, grpcio, absl-py, protobuf
Required-by: stable-baselines, magenta, fancyimpute
 
------------------------------------------------------------
 
Name: numpy
Version: 1.17.3
Summary: NumPy is the fundamental package for array computing with Python.
Home-page: https://www.numpy.org
Author: Travis E. Oliphant et al.
Author-email: None
License: BSD
Location: /usr/local/lib/python3.6/dist-packages
Requires: 
Required-by: yellowbrick, xgboost, xarray, wordcloud, umap-learn, torchvision, torchtext, torch, thinc, Theano, tflearn, tensorflow, tensorflow-probability

**Making, processing raw data**

Make up data (2x2, sequentially indexed matrices)...

In [0]:
series_data = np.zeros((100,2,2), np.float64)        # "time"-series data for a 2x2 evolving matrix with 100 time-steps
# the first three time-steps are set by hand (they are not random)...
series_data[0,:,:] = np.array([[3.14,-2.8], [-1.2,4.5]])
series_data[1,:,:] = np.array([[-0.56, -0.21], [1.7, 1.8]])
series_data[2,:,:] = np.array([[1.43,-8.2], [-2.1,5.4]])
# ...and every later time-step is a fixed linear combination of the three steps before it
for i in range(3, series_data.shape[0]):
    series_data[i,:,:] = 0.7*series_data[i-1,:,:] + 0.3*series_data[i-2,:,:] - 0.1*series_data[i-3,:,:]
#    series_data[i,:,:] = (i**0.5)*3 + 5
#print(series_data)
tsteps = int(series_data.shape[0])         # total number of time-steps in the series
tsteps_train = int(tsteps/2)               # number of time-steps used for training
mat_row = int(series_data.shape[1])
mat_col = int(series_data.shape[2])

#print(series_data[tsteps_train-2,:,:])
#print(series_data[tsteps_train-1,:,:])
#print(series_data[tsteps_train,:,:])

In [0]:
print(series_data[0,:,:])
print(series_data[1,:,:])

[[ 3.14 -2.8 ]
 [-1.2   4.5 ]]
[[-0.56 -0.21]
 [ 1.7   1.8 ]]
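
As a quick sanity check of the recurrence in the data cell, the sketch below rebuilds the series independently and compares the top-left entry at step 3 with a hand computation. The helper name `make_series` is an assumption introduced here for illustration; it is not part of the notebook.

```python
import numpy as np

def make_series(n_steps=100):
    """Rebuild the toy series: x[i] = 0.7*x[i-1] + 0.3*x[i-2] - 0.1*x[i-3]."""
    x = np.zeros((n_steps, 2, 2), np.float64)
    x[0] = [[3.14, -2.8], [-1.2, 4.5]]
    x[1] = [[-0.56, -0.21], [1.7, 1.8]]
    x[2] = [[1.43, -8.2], [-2.1, 5.4]]
    for i in range(3, n_steps):
        x[i] = 0.7 * x[i - 1] + 0.3 * x[i - 2] - 0.1 * x[i - 3]
    return x

series = make_series()
# hand computation of the top-left entry at step 3:
# 0.7*1.43 + 0.3*(-0.56) - 0.1*3.14 = 1.001 - 0.168 - 0.314 = 0.519
top_left = 0.7 * 1.43 + 0.3 * (-0.56) - 0.1 * 3.14
print(series[3])
```

Step 3 is the first matrix produced by the recurrence, and it is also the first training target further down.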


Define the sequence size, ```len_seq``` (the number of consecutive 2x2 matrices in one sequence: the first ```len_seq - 1``` are fed in as input, which is the amount of memory of previous time-steps in our case, and the matrix that follows them is the prediction target), and the amount of overlap, ```n_overlap```, between any two adjacent sequences...

In [0]:
len_seq = 4
n_overlap = 3

if (n_overlap >= len_seq):
    raise Exception("n_overlap must be less than len_seq")
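
To make the relation between ```len_seq```, ```n_overlap``` and the offset between adjacent sequences concrete, here is a minimal sketch that lists the start indices of the windows for a short series. The function `window_starts` is hypothetical, and its counting convention (every window must fit entirely inside the series) is an assumption that need not match how the notebook counts its sequences.

```python
def window_starts(n_steps, len_seq, n_overlap):
    """Start indices of windows of length len_seq that share n_overlap steps with their neighbour."""
    if n_overlap >= len_seq:
        raise ValueError("n_overlap must be less than len_seq")
    stride = len_seq - n_overlap
    # a window starting at s covers s .. s+len_seq-1, so the last valid start is n_steps - len_seq
    return list(range(0, n_steps - len_seq + 1, stride))

starts_dense = window_starts(10, len_seq=4, n_overlap=3)    # stride 1: windows shifted by one step
starts_sparse = window_starts(10, len_seq=4, n_overlap=2)   # stride 2: windows shifted by two steps
print(starts_dense)
print(starts_sparse)
```

With the values chosen above (```len_seq = 4```, ```n_overlap = 3```) the stride is 1, so each sequence starts one time-step after the previous one.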

Build the tensor of sequences, ```seq_series_data```, together with the matching true outputs, ```seq_series_data_pred```...

In [0]:
stride = len_seq - n_overlap                        # offset between the starts of two adjacent sequences
n_seq = (tsteps_train - stride) // stride           # number of sequences of size len_seq
seq_series_data = np.zeros((n_seq,len_seq,mat_row,mat_col), np.float64)
seq_series_data_pred = np.zeros((n_seq,mat_row,mat_col), np.float64)    # true output, one 2x2 matrix per sequence
for k in range(n_seq):        # the training sequences start at time-step 1 (series_data[0] element) in this case
    i = k*stride              # first time-step of sequence k
    # the first len_seq-1 slots hold the inputs; the last slot stays zero and is dropped when building xtrain
    seq_series_data[k,:len_seq-1,:,:] = series_data[i:i+len_seq-1,:,:]
    # the matrix that follows the inputs is the true output for sequence k
    seq_series_data_pred[k,:,:] = series_data[i+len_seq-1,:,:]
#print(seq_series_data_pred)
#print(type(seq_series_data))

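xtrain = np.reshape(seq_series_data[:, :-1, :, :], [n_seq, len_seq - 1, mat_row*mat_col])   # inputs: each 2x2 matrix flattened to 4 entries
ytrain = np.reshape(seq_series_data_pred, [n_seq, mat_row*mat_col])                         # true outputs, flattened the same way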
print(ytrain[0,:])
print(seq_series_data_pred[0,:,:])

[ 0.519 -5.523 -0.84   3.87 ]
[[ 0.519 -5.523]
 [-0.84   3.87 ]]
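
The two printed results above differ only in layout: the reshape flattens each 2x2 matrix row by row. A small sketch of that flattening, using the matrix printed above plus a made-up batch of three matrices (the batch values are arbitrary and only illustrate the shapes):

```python
import numpy as np

m = np.array([[0.519, -5.523], [-0.84, 3.87]])
flat = np.reshape(m, [4])            # row-major (C order): [m00, m01, m10, m11]
restored = np.reshape(flat, [2, 2])  # reshaping back recovers the original matrix

batch = np.arange(12, dtype=np.float64).reshape(3, 2, 2)   # arbitrary stand-in for three 2x2 matrices
batch_flat = np.reshape(batch, [3, 4])                     # one flattened matrix per row
print(flat)
print(batch_flat.shape)
```

Because the flattening is reversible, a flattened prediction can be reshaped back to 2x2 for comparison with the original series.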




In [0]:
# check the shapes of seq_series_data, seq_series_data_pred, and a slice of seq_series_data

print('shape of X (excluding the true output): ',seq_series_data[0,:-1,:,:].shape) 
print('no. of batches of X: ', seq_series_data.shape[0]) # 1 - prints the number of time-steps, each of these associated with a sequence of 2x2 matrices
#print(seq_series_data[0].shape)
print('shape of Y_pred and Y_true: ', seq_series_data_pred[:1,:,:].shape)
print('no. of predictions (equal to the no. of batches of X): ', seq_series_data_pred.shape[0]) 
            # 2 - prints the number of predictions, each of these associated with one 2x2 matrix (model truth); must match 1 above
print('shape of X (excluding the true output): ', seq_series_data[0,:-1,:,:].shape) # prints the sequence-size used per prediction

shape of X (excluding the true output):  (3, 2, 2)
no. of batches of X:  49
shape of Y_pred and Y_true:  (1, 2, 2)
no. of predictions (equal to the no. of batches of X):  49
shape of X (excluding the true output):  (3, 2, 2)


**Declaring parameters and variables to be used in the model**

Define architectural (hyper-)parameters...

In [0]:
# define layers with #(nodes/layer), n_units_i, connected by weight matrices, w_i, and bias matrices, b_i

n_inp_n = series_data[:tsteps_train,:,:].shape[0]   # no. of 2x2 matrices fed as input ("units" per INPUT layer)
n_units_1 = 29    # no. of units in the first hidden layer


In [0]:
def reset_graph(seed=42):
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)


Construct input placeholders for the model...

In [0]:
reset_graph()
    
X = tf.placeholder(tf.float64, shape=(None, len_seq - 1, mat_row * mat_col), name='input')   # raw input, (?, 3, 4): len_seq-1 flattened 2x2 matrices per sequence
Y_pred = tf.placeholder(tf.float64, shape=(None, mat_row*mat_col), name='model_output')
            # model output: one flattened 2x2 matrix per input sequence (the name Y_pred is rebound to the output layer below, so this placeholder is never fed)
Y_true = tf.placeholder(tf.float64, shape=(None, mat_row*mat_col), name='true_output')      # true output: same shape as the model output
print(X.shape)

with tf.name_scope("ann_1"):    # test ANN model, "ann_1"
    h_1 = tf.layers.dense(X, n_units_1, activation=None, use_bias=False, name="hidden_layer_1")   # first hidden layer (linear activation)
    #h_2 = tf.layers.dense(h_1, n_units_1, activation=tf.nn.relu, use_bias=False, name="hidden_layer_2")   # optional second hidden layer (ReLU activation)
    #h_3 = tf.layers.dense(h_2, n_units_1, activation=tf.nn.relu, use_bias=False, name="hidden_layer_3")   # optional third hidden layer (ReLU activation)
    print(h_1.shape)
    f_1 = tf.get_variable("inp_to_outp_1", [len_seq - 1, 1], trainable=True, dtype=tf.float64)    # "filter" with one weight per input time-step, for "one-shot convolution"
    print(f_1.shape)
    hf_1 = tf.squeeze(tf.tensordot(h_1, f_1, axes = [1, 0], name="hidden_funnel_layer_1"), 2) # funneling layer: weighted sum over the time-step axis, (?, 3, 29) -> (?, 29)
    print(hf_1.shape)
    Y_pred = tf.layers.dense(hf_1, mat_row*mat_col, activation=None, name="output_layer")          # output layer
    print(Y_pred.shape)

(?, 3, 4)
Instructions for updating:
Use keras.layers.Dense instead.
Instructions for updating:
Please use `layer.__call__` method instead.
(?, 3, 29)
(3, 1)
(?, 29)
(?, 4)
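
The shapes printed above can be reproduced without TensorFlow. The sketch below mimics the funneling step with NumPy on random stand-in arrays (the batch size of 5 and the random values are assumptions for illustration only) and shows that the `tensordot` plus `squeeze` combination is a weighted sum over the time-step axis.

```python
import numpy as np

rng = np.random.default_rng(0)
h = rng.normal(size=(5, 3, 29))    # stand-in for h_1: (batch, time-steps, hidden units)
f = rng.normal(size=(3, 1))        # stand-in for f_1: one weight per time-step

# contract the time-step axis of h (axis 1) with axis 0 of f, then drop the trailing axis of size 1
hf = np.squeeze(np.tensordot(h, f, axes=[1, 0]), axis=2)

# the same operation written as an explicit weighted sum over the three time-steps
hf_sum = sum(f[t, 0] * h[:, t, :] for t in range(3))
print(hf.shape)
```

In other words, the funnel collapses the three time-steps of each sequence into a single 29-dimensional vector before the output layer maps it to 4 values.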


**Building a model**

In [0]:
defgrf = tf.get_default_graph()
defgrf.get_tensor_by_name('inp_to_outp_1:0')

<tf.Tensor 'inp_to_outp_1:0' shape=(3, 1) dtype=float64_ref>

Define the loss function...

In [0]:
with tf.name_scope("loss_ann_1"):   # loss function for the model "ann_1"
    loss = tf.losses.mean_squared_error(labels=Y_true, predictions=Y_pred)   # scalar MSE, averaged over all samples and matrix elements
    #loss = tf.math.reduce_mean(accuracy)    # not needed: the loss above is already reduced to a scalar mean
    
learn_rate = 0.1    # hyper-parameter for updating a variable v: v_new = v_old + delta_v = v_old - learn_rate*grad_v(loss)

with tf.name_scope("opt_ann_1"):    # optimizer for "ann_1"
    opt = tf.train.GradientDescentOptimizer(learn_rate)   # use the gradient descent optimization algorithm
    loss_min = opt.minimize(loss)   # operation to minimize "loss"
    


Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
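
The update rule quoted in the comment on `learn_rate`, v_new = v_old - learn_rate*grad_v(loss), can be written out by hand for a small problem. The following is a minimal NumPy sketch on a made-up linear regression task (the data, `w_true`, and the number of steps are assumptions chosen for illustration); it is not the notebook's model, only the same optimization rule applied to an MSE loss.

```python
import numpy as np

rng = np.random.default_rng(42)
X_toy = rng.normal(size=(50, 3))
w_true = np.array([0.5, -1.0, 2.0])
y_toy = X_toy @ w_true                      # noise-free targets

def mse(w):
    """Mean squared error of the linear model X_toy @ w against y_toy."""
    return np.mean((X_toy @ w - y_toy) ** 2)

learn_rate = 0.1
w = np.zeros(3)
loss_start = mse(w)
for _ in range(500):
    grad = 2.0 / len(y_toy) * X_toy.T @ (X_toy @ w - y_toy)   # gradient of the MSE with respect to w
    w = w - learn_rate * grad                                  # v_new = v_old - learn_rate*grad_v(loss)
loss_end = mse(w)
print(loss_start, loss_end)
```

`tf.train.GradientDescentOptimizer` applies this same rule to every trainable variable in the graph, with the gradients obtained by automatic differentiation instead of a hand-written formula.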


Set up the initializer and session-saver objects (required for assigning initial values to the ```tf.Graph``` variables and for saving the values of the variables optimized in the session), then run the training loop...

In [0]:
n_epoch = 5000   # number of optimization steps on the WHOLE training dataset

init = tf.global_variables_initializer()    # initialize the model variables
saver = tf.train.Saver()    # saves the updated parameters from a session run

f_1_tensor = tf.get_default_graph().get_tensor_by_name('inp_to_outp_1:0')   # the "filter" variable, looked up once by name

with tf.Session() as sess:
    init.run()
    for train_epoch in range(n_epoch):
        # one gradient-descent step on the whole training set; also fetch the loss and the current filter values
        _, lossval, threeten = sess.run([loss_min, loss, f_1_tensor], feed_dict={X: xtrain, Y_true: ytrain})
        if (train_epoch % 1000 == 0):
            print(train_epoch, "Loss (training):", lossval, Y_pred.eval(feed_dict={X: xtrain}).shape)
        
    saver.save(sess, "./trained_model")

0 Loss (training): 1.8862945 (49, 4)
1000 Loss (training): 0.00024184286 (49, 4)
2000 Loss (training): 3.6362042e-05 (49, 4)
3000 Loss (training): 5.6323343e-06 (49, 4)
4000 Loss (training): 2.0003706e-06 (49, 4)


In [0]:
threeten

array([[-0.13148336],
       [ 0.40517852],
       [ 0.87507961]])
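
One possible reading of these values, offered here as an interpretation and not as a claim from the notebook: the three learned filter weights have the same signs and ordering as the recurrence coefficients used to generate the data (-0.1, 0.3, 0.7 for the steps i-3, i-2, i-1). Since the model has no nonlinearity, its output is a linear function of the three input matrices, so this resemblance is plausible. As a point of comparison, the sketch below recovers the coefficients directly by ordinary least squares, using only NumPy.

```python
import numpy as np

# rebuild the toy series exactly as in the data cell
x = np.zeros((100, 2, 2), np.float64)
x[0] = [[3.14, -2.8], [-1.2, 4.5]]
x[1] = [[-0.56, -0.21], [1.7, 1.8]]
x[2] = [[1.43, -8.2], [-2.1, 5.4]]
for i in range(3, 100):
    x[i] = 0.7 * x[i - 1] + 0.3 * x[i - 2] - 0.1 * x[i - 3]

flat = x.reshape(100, 4)
# each row of A holds one matrix entry at steps (i-3, i-2, i-1); b holds the same entry at step i
A = np.stack([flat[0:-3], flat[1:-2], flat[2:-1]], axis=-1).reshape(-1, 3)
b = flat[3:].reshape(-1)
coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
print(coeffs)
```

The least-squares solution gives the coefficients in the order (i-3, i-2, i-1), the same order in which the three rows of the filter act on the input time-steps.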