# Coding an RNN GRU layer from scratch

This notebook takes the weights and biases from a trained GRU layer in TensorFlow and reproduces the layer's forward pass in NumPy. For now the layer is hard-wired to 6 nodes and 6 inputs; both can be made variable later. The goal at this stage is only to test the GRU code.

### GRU Architecture

![GRU architecture](images/gru.png)

The equations below are used for the GRU.

![GRU equations](images/GRUeq.png)
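
For reference, here is a transcription of what the code cells in this notebook set out to compute. It is written from the code rather than from the image, so the symbol names (in particular $b^{i}$ for the input bias and $b^{r}$ for the recurrent bias) are assumptions, and the image may use a different convention:

$$
\begin{aligned}
z_t &= \sigma\left(W_z^\top x_t + U_z^\top h_{t-1} + b^{i}_z + b^{r}_z\right) \\
r_t &= \sigma\left(W_r^\top x_t + U_r^\top h_{t-1} + b^{i}_r + b^{r}_r\right) \\
\hat{h}_t &= \tanh\left(W_h^\top x_t + r_t \odot \left(U_h^\top h_{t-1}\right) + b^{i}_h + b^{r}_h\right) \\
h_t &= z_t \odot \hat{h}_t + (1 - z_t) \odot h_{t-1}
\end{aligned}
$$

Here $\odot$ is element-wise multiplication, $z_t$ is the update gate, $r_t$ is the reset gate and $\hat{h}_t$ is the candidate state.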

In [1]:
import numpy as np
import math

In [2]:
# define a vectorised sigmoid function
def sigmoid(x):
  return 1 / (1 + math.exp(-x))

sigmoid_v = np.vectorize(sigmoid)

# define a vectorised tanh function
def mytanh(x):
  return math.tanh(x)

tanh_v =  np.vectorize(mytanh)
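
As an aside, `math.exp(-x)` raises `OverflowError` for very negative `x`, and `np.vectorize` is a Python-level loop. A minimal sketch of an array-native alternative is shown below; the name `sigmoid_np` is an assumption introduced here and is not used elsewhere in the notebook.

```python
import numpy as np

def sigmoid_np(x):
    """Element-wise logistic sigmoid that avoids overflow for large |x|."""
    x = np.asarray(x, dtype=float)
    # exp(-|x|) always lies in (0, 1], so neither branch can overflow
    e = np.exp(-np.abs(x))
    return np.where(x >= 0, 1.0 / (1.0 + e), e / (1.0 + e))

# np.tanh is already element-wise, so no wrapper is needed for tanh
print(sigmoid_np(np.array([-1000.0, 0.0, 1000.0])))
print(np.tanh(np.array([-1.0, 0.0, 1.0])))
```

For the small pre-activations in this notebook, both versions should give the same values; the difference only matters for extreme inputs.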

In [3]:
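# Load matrices
# The layer is hard-wired with 6 GRU nodes that take 6 inputs.
# The stored arrays hold the z, r and h gates side by side:
#   wMatrix is num_inputs x (3 * num_nodes), i.e. 6 x 18
#   uMatrix is num_nodes x (3 * num_nodes), i.e. 6 x 18
#   biases is 2 x (3 * num_nodes), i.e. 2 x 18 (row 0 feeds ib*, row 1 feeds rb*)
# After splitting, each per-gate W and U is 6 x 6 and each per-gate bias has 6 entries.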

matrices = np.load('GRUMatrices.npz', mmap_mode=None, allow_pickle=False, fix_imports=True)

# Get the W, U and bias matrices
wMatrix = matrices['wMatrix']
uMatrix = matrices['uMatrix']
biases = matrices['biases']
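
The contents of `GRUMatrices.npz` are not shown here, so the following is only a sketch of the layout the rest of the notebook appears to rely on. It writes placeholder random arrays under the same three keys to a temporary file (the file name `demo_gru_matrices.npz` is hypothetical) and reads the shapes back. The shapes are inferred from the slicing used later, not taken from the real file.

```python
import os
import tempfile
import numpy as np

num_inputs, num_nodes = 6, 6
rng = np.random.default_rng(0)

# Placeholder arrays with the shapes that the slicing in this notebook implies
demo = {
    "wMatrix": rng.normal(size=(num_inputs, 3 * num_nodes)),
    "uMatrix": rng.normal(size=(num_nodes, 3 * num_nodes)),
    "biases": rng.normal(size=(2, 3 * num_nodes)),
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_gru_matrices.npz")  # hypothetical file name
    np.savez(path, **demo)
    # Close the archive before the temporary directory is removed
    with np.load(path, allow_pickle=False) as loaded:
        shapes = {key: loaded[key].shape for key in loaded.files}

print(shapes)
```

Checking the shapes immediately after loading is a cheap way to catch a mismatch between the exported layer and the hard-wired sizes.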

In [4]:
#Separate matrices into their z,r and h components
wZ = wMatrix[:,0:6]
wR = wMatrix[:,6:12]
wH = wMatrix[:,12:18]

uZ = uMatrix[:,0:6]
uR = uMatrix[:,6:12]
uH = uMatrix[:,12:18]

ibZ = biases[0,0:6]
ibR = biases[0,6:12]
ibH = biases[0,12:18]

rbZ = biases[1,0:6]
rbR = biases[1,6:12]
rbH = biases[1,12:18]
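
Since the layer size is meant to become variable, the hard-coded `0:6`, `6:12` and `12:18` slices could be replaced by a split driven by `num_nodes`. The sketch below assumes the same z, r, h column order as the cell above; `split_gates` is a hypothetical helper name and the demo matrix is made up.

```python
import numpy as np

def split_gates(matrix, num_nodes):
    """Split the last axis into z, r and h blocks of num_nodes columns each."""
    if matrix.shape[-1] != 3 * num_nodes:
        raise ValueError("expected 3 * num_nodes columns")
    return np.split(matrix, 3, axis=-1)

# Made-up 2 x 12 matrix standing in for a stacked weight matrix
num_nodes = 4
demo_w = np.arange(2 * 3 * num_nodes).reshape(2, 3 * num_nodes)
w_z, w_r, w_h = split_gates(demo_w, num_nodes)
print(w_z.shape, w_r.shape, w_h.shape)  # (2, 4) (2, 4) (2, 4)
```

The same helper works on a 1-D bias row, because it splits along the last axis.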

In [5]:
print('\nW matrices')
print(wMatrix)
print('\nwZ')
print(wZ)
print('\nwR')
print(wR)
print('\nwH')
print(wH)

print('\nU matrices')
print(uMatrix)
print('\nuZ')
print(uZ)
print('\nuR')
print(uR)
print('\nuH')
print(uH)

print('\nbias matrices')
print(biases)
print('\nibZ')
print(ibZ)
print('\nibR')
print(ibR)
print('\nibH')
print(ibH)
print('\nrbZ')
print(rbZ)
print('\nrbR')
print(rbR)
print('\nrbH')
print(rbH)


W matrices
[[ 1.47784948e-01 -6.36916831e-02 -6.44669294e-01 -5.29620498e-02
  -7.21437693e-01  5.82585454e-01  2.67814445e+00  1.32287696e-01
   2.04405591e-01  5.24857268e-02 -1.67315412e+00 -1.64620161e-01
   3.11126560e-01  4.08386141e-02  2.17992533e-02  1.02607355e-01
  -1.30847782e-01  9.23282206e-02]
 [-4.45018888e-01  8.12738612e-02 -2.89095491e-01  1.04009081e-02
  -4.20637697e-01 -3.83221179e-01 -9.99210104e-02  1.30941975e+00
   4.58170295e-01  9.33872938e-01  7.19227314e-01  1.71377391e-01
   6.26898646e-01 -3.17922011e-02  1.55206606e-01 -9.77380574e-02
  -1.81043684e-01 -1.86099619e-01]
 [ 8.21353868e-02  3.37142386e-02 -2.23273069e-01 -2.80014798e-02
   1.84208617e-01 -2.78432984e-02 -2.22398847e-01 -2.13686633e+00
  -9.05274212e-01 -4.50605661e-01 -3.14826757e-01 -1.20429240e-01
  -5.05215935e-02 -4.50068980e-01 -4.36236531e-01  3.04394886e-02
  -3.69305104e-01 -4.79589961e-02]
 [-3.19579542e-01  1.89596284e-02  1.17699847e-01 -1.89386141e-02
  -3.35439444e-02  2.5508

The output of each node is a scalar, and h is the vector of those outputs. For a layer of 6 nodes, h is therefore a vector of size 6 at each time step.

In [6]:
# Calculate the components of the GRU for a single time step
x_t = np.array([-0.005069427657872438, -0.1757027953863144, -0.9304834008216858, 0.524120032787323, 0.1267850697040558, 0.1844479888677597])
h_tminus1 = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0])

# Update gate
z_t = np.matmul(np.transpose(wZ), x_t) + np.matmul(np.transpose(uZ), h_tminus1) + ibZ + rbZ
z_t = sigmoid_v(z_t)

# Reset gate
r_t = np.matmul(np.transpose(wR), x_t) + np.matmul(np.transpose(uR), h_tminus1) + ibR + rbR
r_t = sigmoid_v(r_t)

# Scale column i of uH by r_t[i]; broadcasting applies this to all columns at once
uH_update = uH * r_t

# Candidate state
h_that = np.matmul(np.transpose(wH), x_t) + np.matmul(np.transpose(uH_update), h_tminus1) + ibH + rbH
h_that = tanh_v(h_that)

# New hidden state
h = np.multiply(z_t, h_that) + np.multiply((1 - z_t), h_tminus1)
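
To run more than one time step, the calculation can be wrapped in a function and applied in a loop. The sketch below is self-contained and uses random placeholder weights rather than the trained ones. It assumes the z, r, h column order used in this notebook, sums the two bias rows per gate, and applies the reset gate as `r * (h_prev @ u_h)`. The name `gru_step` and all sizes are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    # Overflow-safe logistic function
    e = np.exp(-np.abs(x))
    return np.where(x >= 0, 1.0 / (1.0 + e), e / (1.0 + e))

def gru_step(x_t, h_prev, w, u, b):
    """One GRU step. w: (inputs, 3*nodes), u: (nodes, 3*nodes), b: (2, 3*nodes)."""
    w_z, w_r, w_h = np.split(w, 3, axis=1)
    u_z, u_r, u_h = np.split(u, 3, axis=1)
    b_z, b_r, b_h = np.split(b.sum(axis=0), 3)  # input bias + recurrent bias
    z = sigmoid(x_t @ w_z + h_prev @ u_z + b_z)            # update gate
    r = sigmoid(x_t @ w_r + h_prev @ u_r + b_r)            # reset gate
    h_hat = np.tanh(x_t @ w_h + r * (h_prev @ u_h) + b_h)  # candidate state
    return z * h_hat + (1.0 - z) * h_prev

# Random placeholder weights and inputs, not the trained layer
rng = np.random.default_rng(0)
num_inputs, num_nodes, num_steps = 6, 6, 5
w = rng.normal(scale=0.5, size=(num_inputs, 3 * num_nodes))
u = rng.normal(scale=0.5, size=(num_nodes, 3 * num_nodes))
b = rng.normal(scale=0.1, size=(2, 3 * num_nodes))
xs = rng.normal(size=(num_steps, num_inputs))

h = np.zeros(num_nodes)  # initial hidden state
for x_t in xs:
    h = gru_step(x_t, h, w, u, b)
print(h.shape)  # (6,)
```

Note that `x_t @ w_z` is the same product as `np.matmul(np.transpose(wZ), x_t)` for a 1-D `x_t`, so this form should agree with the cell above when `h_prev` is zero.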

In [7]:
print('h')
print(h)

h
[ 0.47965408 -0.34678561  0.27300359 -0.18528792 -0.26177192 -0.23279142]


To do: include input and output data from the TensorFlow layer to compare this result against.
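
Once reference data is available, the comparison itself could look something like the sketch below. The two vectors are made-up values, not outputs of this notebook or of TensorFlow, and `compare_outputs` is a hypothetical helper. The tolerance is also an assumption; a loose one is probably appropriate if the reference layer runs in 32-bit floats.

```python
import numpy as np

def compare_outputs(computed, reference, atol=1e-6):
    """Return (within_tolerance, max_abs_diff) for two equally shaped vectors."""
    computed = np.asarray(computed, dtype=float)
    reference = np.asarray(reference, dtype=float)
    max_abs_diff = float(np.max(np.abs(computed - reference)))
    return max_abs_diff <= atol, max_abs_diff

# Made-up example values for illustration only
h_computed = np.array([0.1, -0.2, 0.3])
h_reference = np.array([0.1, -0.2, 0.3000004])
ok, diff = compare_outputs(h_computed, h_reference)
print(ok, diff)
```

Reporting the maximum absolute difference alongside the pass/fail flag makes it easier to tell a rounding discrepancy from a real mismatch in the equations.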