
Importing all the useful libraries...

In [0]:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

Checking the version of TensorFlow, NumPy...

In [2]:
#print('TensorFlow version info:\t',tf.__version__)
!pip show tensorflow
print(' ')
print('------------------------------------------------------------')
print(' ')
!pip show numpy
#print('NumPy version info:\t \t',np.__version__)

Name: tensorflow
Version: 1.15.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: /usr/local/lib/python3.6/dist-packages
Requires: keras-applications, six, wrapt, google-pasta, keras-preprocessing, grpcio, tensorflow-estimator, protobuf, absl-py, opt-einsum, numpy, termcolor, tensorboard, astor, gast, wheel
Required-by: stable-baselines, mesh-tensorflow, magenta, fancyimpute
 
------------------------------------------------------------
 
Name: numpy
Version: 1.16.5
Summary: NumPy is the fundamental package for array computing with Python.
Home-page: https://www.numpy.org
Author: Travis E. Oliphant et al.
Author-email: None
License: BSD
Location: /usr/local/lib/python3.6/dist-packages
Requires: 
Required-by: yellowbrick, xgboost, xarray, wordcloud, umap-learn, torchvision, torchtext, torch, thinc, Theano, tflearn, tensorflow, tenso

**Making, processing raw data**

Make up data (2x2, sequentially indexed matrices)...

In [0]:
series_data = np.zeros((100,2,2), np.float64)        # "time"-series data for a 2x2 evolving matrix with 100 time-steps, zero-initialized and filled in below
for i in range(series_data.shape[0]):
    series_data[i,:,:] = (i**0.5)*3 + 5
#print(series_data)
tsteps = int(series_data.shape[0])         # total number of time-steps in the series
tsteps_train = int(tsteps/2)               # number of time-steps used for training
mat_row = int(series_data.shape[1])
mat_col = int(series_data.shape[2])

#print(series_data[tsteps_train-2,:,:])
#print(series_data[tsteps_train-1,:,:])
#print(series_data[tsteps_train,:,:])

Define a sequence size, ```len_seq``` (the number of consecutive 2x2 matrices fed in as one input, equivalent in our case to the amount of memory of previous time-steps), and the amount of overlap, ```n_overlap```, between any two adjacent input sequences...

In [0]:
len_seq = 4
n_overlap = 3

if (n_overlap >= len_seq):
    raise Exception("n_overlap must be less than len_seq")
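
To make the roles of ```len_seq``` and ```n_overlap``` concrete, here is a minimal NumPy-only sketch. It assumes a toy series of 10 scalars instead of the 2x2 matrices used in this notebook, and the helper name `window_starts` is hypothetical (it does not appear elsewhere in the notebook). Adjacent windows start `len_seq - n_overlap` steps apart, so a larger overlap gives more, and more similar, sequences.

```python
import numpy as np

def window_starts(n_steps, len_seq, n_overlap):
    """Start indices of all full windows of length len_seq whose
    neighbours share n_overlap elements (hypothetical helper)."""
    if n_overlap >= len_seq:
        raise ValueError("n_overlap must be less than len_seq")
    stride = len_seq - n_overlap          # shift between adjacent windows
    return list(range(0, n_steps - len_seq + 1, stride))

toy = np.arange(10)                       # toy series: 0, 1, ..., 9
starts = window_starts(len(toy), len_seq=4, n_overlap=3)
windows = np.stack([toy[s:s + 4] for s in starts])
print(windows[:2])                        # first two windows share 3 elements
```

With `n_overlap=2` the same call would yield windows starting at every second time-step instead of every time-step.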

Build tensor of sequences, call it ```seq_series_data```...

In [5]:
stride = len_seq - n_overlap                     # shift between the starts of adjacent sequences
n_seq = int((tsteps_train - stride) / stride)    # number of sequences of size len_seq
seq_series_data = np.zeros((n_seq,len_seq,mat_row,mat_col), np.float64)
seq_series_data_pred = np.zeros((n_seq,mat_row,mat_col), np.float64)    # true output, one 2x2 matrix per sequence
for k in range(n_seq):        # the training sequences start at time-step 1 (series_data[0] element) in this case
    i = k*stride              # index of the first time-step in sequence k
    seq_series_data[k,:,:,:] = series_data[i:i+len_seq,:,:]
    seq_series_data_pred[k,:,:] = series_data[i+len_seq-1,:,:]    # the last matrix of each sequence is the target



In [6]:
# test to check data-type and shape of seq_series_data, seq_series_data_pred, and a slice seq_series_data

print('shape of X (excluding the true output): ',seq_series_data[0,:-1,:,:].shape) 
print('no. of batches of X: ', seq_series_data.shape[0]) # 1 - prints the number of time-steps, each of these associated with a sequence of 2x2 matrices
#print(seq_series_data[0].shape)
print('shape of Y_pred and Y_true: ', seq_series_data_pred[:1,:,:].shape)
print('no. of predictions (equal to the no. of batches of X): ', seq_series_data_pred.shape[0]) 
            # 2 - prints the number of time-steps, each of these associated with one 2x2 matrix (model truth), must match 1 above
print('shape of X (excluding the true output): ', seq_series_data[0,:-1,:,:].shape) # prints the sequence-size used per prediction

shape of X (excluding the true output):  (3, 2, 2)
no. of batches of X:  49
shape of Y_pred and Y_true:  (1, 2, 2)
no. of predictions (equal to the no. of batches of X):  49
shape of X (excluding the true output):  (3, 2, 2)


**Declaring parameters and variables to be used in the model**

Construct input placeholders for the model...

In [7]:
X = tf.placeholder(tf.float64, shape=(None, mat_row, mat_col), name='input')    # raw input (?x2x2)
X = tf.expand_dims(X,-1)
#x1 = tf.math.exp(X)    # model input (modified raw input) (3x2x2)
Y_pred = tf.placeholder(tf.float64, shape=(None, mat_row, mat_col),name='model_output')
            # model output: predicting a 2x2 matrix "?" time-step(s) at a time
Y_true = tf.placeholder(tf.float64, shape=(None, mat_row,mat_col),name='true_output')   # true output: same shape as the model output
X.get_shape()

TensorShape([Dimension(None), Dimension(2), Dimension(2), Dimension(1)])
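
The shape printed above has a trailing axis of size 1 because of the `tf.expand_dims(X,-1)` call. As a rough NumPy analogue (a sketch only, using a zero-filled stand-in for one input sequence), the same operation looks like this:

```python
import numpy as np

batch = np.zeros((3, 2, 2))             # three 2x2 matrices, like one input sequence
expanded = np.expand_dims(batch, -1)    # NumPy analogue of tf.expand_dims(X, -1)
print(batch.shape, expanded.shape)      # (3, 2, 2) (3, 2, 2, 1)
```

Since the name `X` is rebound to the expanded tensor, any array later fed for `X` has to carry this extra axis as well.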

**Building a model**

Define activation functions, operations, etc. ...

In [8]:
# softplus activation for hidden layer h_1 (passed to tf.layers.dense below)

def softplus(z, name="softplus"):
    return tf.math.softplus(z, name=name)

'''# using a convolution operation (with a custom filter), conv_3D, for hidden layer h_2

# define a custom filter, filter_1, to be used with the one-shot (stride-less) convolution operation, conv_3D_oneshot
#     ignoring padding for now
def filter_1(d, h, w):
    # the latest sequence (largest depth index i) gets the highest contribution
    filter = np.zeros((d, h, w), np.float64)
    for i in range(d):
        filter[i,:,:] = np.sqrt(np.exp(i))
    return tf.convert_to_tensor(filter, dtype=tf.float64)

# define a (stride-less) function for the convolution operation
def conv_3D_oneshot(z, name="custom_conv_3D_1"):
    depth, height, width = z.get_shape().as_list()[:3]    # as_list() already returns plain ints (or None)
    filter = filter_1(depth, height, width)
    return tf.math.reduce_sum(tf.math.multiply(z, filter), 0, name=name)'''

Define architectural (hyper-)parameters...

In [0]:
# define layers with #(nodes/layer), n_units_i, connected by weight matrices, w_i, and bias matrices, b_i

n_inp_n = seq_series_data[0,:-1,:,:].shape[0]   # no. of 2x2 matrices fed as input ("units" per INPUT layer)
n_units_1 = 4   # no. of units in the first hidden layer
n_units_1_conv = n_units_1
            # no. of (convolution) units in the output layer; must equal the number of units in the immediately preceding hidden layer
n_outp_n  = seq_series_data_pred[:1,:,:].shape[0]
            # number of (2x2) tensors the model must output (the same size as the sequence-size of the model output, 1 in the current case)

Build the layers...

In [10]:
with tf.name_scope("ann_1"):    # test ANN model, "ann_1"
    h_1 = tf.layers.dense(X, n_units_1, activation=softplus, use_bias=False, name="hidden_layer_1")    # first hidden layer (softplus activation, no bias)
    f_1 = tf.get_variable("inp_to_outp_1", [n_outp_n, n_inp_n], trainable=True, dtype=tf.float64)    # "filter" with zero strides, for "one-shot convolution"
    hf_1 = tf.tensordot(f_1, h_1, axes = 1, name="hidden_funnel_layer_1")   # funneling layer: contracts the n_inp_n input matrices down to n_outp_n
    h_2 = tf.layers.dense(hf_1, n_outp_n, activation=None,use_bias=False, name="output_layer")    # output layer (linear, no bias)
    Y_pred = tf.squeeze(h_2, [3])    # model output (drops the trailing axis of size 1)

Instructions for updating:
Use keras.layers.Dense instead.
Instructions for updating:
Please use `layer.__call__` method instead.


In [11]:
Y_pred.get_shape()

TensorShape([Dimension(1), Dimension(2), Dimension(2)])
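
To see how the layers above arrive at this output shape, the forward pass can be mimicked in plain NumPy. This is only a shape sketch under stated assumptions: the weights are random stand-ins (the names `w_1` and `w_2` are hypothetical labels for the kernels that `tf.layers.dense` creates internally), and `np.logaddexp(0, z)` is used as softplus, since log(1 + exp(z)) = logaddexp(0, z).

```python
import numpy as np

rng = np.random.RandomState(0)
n_inp_n, n_units_1, n_outp_n = 3, 4, 1

x = rng.normal(size=(n_inp_n, 2, 2, 1))        # one input sequence after expand_dims
w_1 = rng.normal(size=(1, n_units_1))          # stand-in kernel of hidden_layer_1 (acts on the last axis)
f_1 = rng.normal(size=(n_outp_n, n_inp_n))     # stand-in funnel weights
w_2 = rng.normal(size=(n_units_1, n_outp_n))   # stand-in kernel of output_layer

h_1 = np.logaddexp(0.0, x @ w_1)               # softplus(x W_1)        -> (3, 2, 2, 4)
hf_1 = np.tensordot(f_1, h_1, axes=1)          # contract sequence axis -> (1, 2, 2, 4)
h_2 = hf_1 @ w_2                               # linear output layer    -> (1, 2, 2, 1)
y_pred = np.squeeze(h_2, axis=3)               # drop trailing axis     -> (1, 2, 2)

for name, arr in [("h_1", h_1), ("hf_1", hf_1), ("h_2", h_2), ("y_pred", y_pred)]:
    print(name, arr.shape)
```

The dense layers only mix values along the last axis, while the funnel weights are the only place where the three input matrices of a sequence are combined.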

Define the loss function...

In [17]:
with tf.name_scope("loss_ann_1"):   # loss function for the model "ann_1"
    accuracy = tf.losses.mean_squared_error(labels=Y_true, predictions=Y_pred)   # MSE, already averaged over all elements (a scalar)
    loss = tf.math.reduce_mean(accuracy)    # no-op on a scalar; kept so that "loss" names the quantity being minimized

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
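
For reference, the mean squared error between a predicted and a true 2x2 matrix can be worked out by hand in NumPy. The numbers below are made up for illustration and are not taken from the training data.

```python
import numpy as np

y_true = np.array([[[1.0, 2.0], [3.0, 4.0]]])   # shape (1, 2, 2), illustrative values
y_pred = np.array([[[1.5, 2.0], [2.0, 4.0]]])   # shape (1, 2, 2), illustrative values

squared_error = (y_pred - y_true) ** 2   # element-wise squared error, shape (1, 2, 2)
mse = squared_error.mean()               # average over all four elements
print(mse)                               # 0.3125 = (0.25 + 0 + 1 + 0) / 4
```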


Choose the optimization algorithm to be used...

In [0]:
learn_rate = 0.1    # hyper-parameter for updating a variable v: v_new = v_old + delta_v = v_old - learn_rate*grad_v(loss)

with tf.name_scope("opt_ann_1"):    # optimizer for "ann_1"
    opt = tf.train.GradientDescentOptimizer(learn_rate)   # use the gradient descent optimization algorithm
    loss_min = opt.minimize(loss)   # operation to minimize "loss"
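
The update rule in the comment above, v_new = v_old - learn_rate*grad_v(loss), can be tried out on a toy problem without TensorFlow. The sketch below assumes a simple quadratic loss with its minimum at 3 (the functions `loss_fn` and `grad_fn` are hypothetical stand-ins, not part of the model), and applies the rule by hand.

```python
import numpy as np

def loss_fn(v):
    # toy quadratic loss with its minimum at v = 3 (illustrative only)
    return np.sum((v - 3.0) ** 2)

def grad_fn(v):
    # analytic gradient of loss_fn with respect to v
    return 2.0 * (v - 3.0)

learn_rate = 0.1
v = np.zeros(2)
history = [loss_fn(v)]
for _ in range(50):
    v = v - learn_rate * grad_fn(v)   # v_new = v_old - learn_rate * grad_v(loss)
    history.append(loss_fn(v))

print(v, history[0], history[-1])
```

On this toy loss each step shrinks the distance to the minimum by a factor of 0.8; on the actual model the right `learn_rate` depends on the scale of the data, so a smaller value may be needed if the loss grows instead of shrinking.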

Set up initializer and session-saver objects (required for assigning initial values to the ```tf.Graph``` variables and saving the values of variables optimized in the session)...

In [0]:
init = tf.global_variables_initializer()    # initialize the model variables
saver = tf.train.Saver()    # saves the updated parameters from a session run

**Running the model**

Assigning no. of training cycles, batch-sizes, etc. ...

In [0]:
n_epoch = 100   # number of passes (epochs) over the WHOLE training dataset; each pass takes one optimization step per sequence

Use a context manager to run the ```tf.Session```...

In [21]:
with tf.Session() as sess:
    init.run()
    for train_epoch in range(n_epoch):
        for batch_iter in range(seq_series_data.shape[0]):
            # X refers to the expanded tensor of shape (?, 2, 2, 1), so the fed value needs a trailing axis
            inp = np.expand_dims(seq_series_data[batch_iter,:-1,:,:], -1)
            out = seq_series_data_pred[batch_iter:batch_iter+1,:,:]    # true output, shape (1, 2, 2)
            sess.run(loss_min, feed_dict={X: inp, Y_true: out})
        if (train_epoch % 5 == 0):
            acc_train = accuracy.eval(feed_dict={X: inp, Y_true: out})
            print(train_epoch, "Batch MSE (training):", acc_train)