Title: Circular interconnected network: a new type of neural network designed for translation initiation site prediction  

Readme:
* The TIS_dataset folder is assumed to be in the same directory as this notebook.
* Python version: 3.8.x / numpy version: 1.24.3
* Please refer to the describing document; this notebook assumes you have read it: https://docs.google.com/document/d/1Z1xFTYC3RXYgH9YIxCmCeJJEO783fIz0KfJQoNL2dC0/edit?usp=sharing
* This code is modified from    https://github.com/TheIndependentCode/Neural-Network
    * The theoretical background and implementation of the original version of this code are described in
        * https://youtu.be/pauPCy_s0Ok?si=oYWBM5mMPHMc6vVf    (Overall code, including the Dense layer)
        * https://youtu.be/Lakz2MoHy6o?si=JjQ39sk87yFRvuhm    (Convolutional layer)
        * https://youtu.be/AbLvJVwySEo?si=zsEGSYAGg7HGAfM4    (Softmax layer)
* Because execution takes too long, the models and hyperparameters are not optimized. Anyone who is willing can probably find an improved version of each.

This code implements a neural network as a set of objects, first applies it to a simplified MNIST dataset, and then applies it to the DNA sequence dataset for translation initiation site prediction.

To describe the neural network, let  
Y be the output of a layer  
X be the input of a layer  
W be the weights of a layer  
B be the bias of a layer  
E be the error of the network (all layers), which is finally calculated by the error function  

del_x/del_y denotes the partial derivative of x with respect to y.  
For each backpropagation step, the code assumes that del_E/del_Y is given by the later layer (= the previous layer in the backward direction).  
(The initial del_E/del_Y is calculated by taking the derivative of the error function.)  
This given del_E/del_Y is called output_gradient.  

Thus each layer should  
1\) return del_E/del_X (the input gradient) from its backward method, so that it can be passed as output_gradient to the next layer in the backward calculation.
 
To optimize its parameters, it should also calculate  
2) del_E/del_W  
3) del_E/del_B  
and use them to adjust the weights and biases.  

This class will be the base class for all layers.

In [1]:
# Base Layer class
class Layer:
    def __init__(self):
        self.input = None
        self.output = None

    def forward(self, input):
        #TODO: return output
        pass

    def backward(self, output_gradient, learning_rate):
        #TODO: 1)update parameters, 2) return input gradient
        pass

First build the Dense layer

In [2]:
import numpy as np
from scipy import signal

In [3]:
# Layer: Dense

# let input size(= n_inputs) i
# let output_size j

#forward
# input(= X) dimension: (i x 1), column vector
# output(= Y) dimension: (j x 1), column vector
# weight(= W) dimension: (j, i)
# bias(= B) dimension: (j x 1), column vector

# Y = W (dot) X + B ((dot) is dot product)

#backward (given: del_E/del_Y as output_gradient, dimension: (j x 1))
# del_E/del_W(dimension: (j, i)) = del_E/del_Y (dot) X_transpose 
# del_E/del_B(dimension: (j x 1), column vector) = del_E/del_Y 
# del_E/del_X(dimension: (i x 1), column vector) = W_transpose (dot) del_E/del_Y 

class Dense(Layer):
    def __init__(self, input_size, output_size):
        self.weights = np.random.randn(output_size, input_size) # W : (j, i)
        self.bias = np.random.randn(output_size, 1) # B : (j, 1)

    def forward(self, input):
        self.input = input # X : (i, 1)
        return np.dot(self.weights, self.input) + self.bias # (j, i) (dot) (i, 1) + (j, 1)

    def backward(self, output_gradient, learning_rate):
        weights_gradient = np.dot(output_gradient, self.input.T) # (j, i) = (j, 1) (dot) (1, i)
        input_gradient = np.dot(self.weights.T, output_gradient) # del_E/del_X: (i, 1), computed before W is updated
        self.weights -= learning_rate * weights_gradient
        self.bias -= learning_rate * output_gradient
        return input_gradient
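
As a quick sanity check of the Dense formulas above, the following sketch compares the analytic gradients with finite differences on a tiny layer. The scalar error E = sum(Y), the helper name `numerical_grad`, the seed, and the sizes are assumptions made only for this illustration; it is not part of the original notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
i, j = 3, 2                      # illustrative input/output sizes
W = rng.standard_normal((j, i))  # weights, (j, i)
B = rng.standard_normal((j, 1))  # bias, (j, 1)
X = rng.standard_normal((i, 1))  # input column vector, (i, 1)

def error(W, B, X):
    # toy scalar error: E = sum(Y) with Y = W (dot) X + B
    return float(np.sum(W @ X + B))

# with E = sum(Y), del_E/del_Y is a column of ones
output_gradient = np.ones((j, 1))
dE_dW = output_gradient @ X.T    # (j, i)
dE_dB = output_gradient          # (j, 1)
dE_dX = W.T @ output_gradient    # (i, 1)

def numerical_grad(f, A, eps=1e-6):
    # central finite differences, perturbing one element of A at a time
    grad = np.zeros_like(A)
    for idx in np.ndindex(A.shape):
        old = A[idx]
        A[idx] = old + eps
        up = f()
        A[idx] = old - eps
        down = f()
        A[idx] = old
        grad[idx] = (up - down) / (2 * eps)
    return grad

num_dW = numerical_grad(lambda: error(W, B, X), W)
num_dX = numerical_grad(lambda: error(W, B, X), X)
print(np.allclose(dE_dW, num_dW, atol=1e-5), np.allclose(dE_dX, num_dX, atol=1e-5))
```

If both comparisons agree, the shapes (j, i) for del_E/del_W and (i, 1) for del_E/del_X are consistent with the comments in the Dense cell.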

Build the Circular layer. Note that n_together is the size of each relation.

In [4]:
# Layer: Circular

# let input size(= n_inputs) i
# let n_together t
# then output_size will be i_C_t (i combination t), let this as j

#forward
# input(= X) dimension: (i x 1), column vector / combinations of n_together elements are formed from this input X,
# which I will call "relations", abbreviated as X_comb.
# relations(= X_comb) dimension: (j x 1), column vector
# output(= Y) dimension: (j x 1), column vector
# weight(= W) dimension: (j x 1), column vector
# bias(= B) dimension: (j x 1), column vector
# Y = W * X_comb + B (elementwise operation)


#backward (given: del_E/del_Y as output_gradient, dimension: (j x 1))
# del_E/del_W(dimension: (j x 1), column vector) = del_E/del_Y * X_comb (elementwise)
# del_E/del_B(dimension: (j x 1), column vector) = del_E/del_Y 
# del_E/del_X(dimension: (i x 1), column vector) : (described in the documentation; note that the subscript k is a combination, not an integer)
# let k = (a combination). Each element del_E/del_xi in del_E/del_X
# = sum(del_E/del_yk * X_comb[k_th element]) over all possible combinations k that contain i

from itertools import combinations
import math

def product(iterable): # for forward method
    result = 1
    for x in iterable:
        result = result*x 
    return result
        

class Circular(Layer):
    def __init__(self, n_inputs, n_together=2):
        self.n_together = n_together
        self.len_weight = math.comb(n_inputs, n_together) # self.len_weight = j = i_C_t, computed with exact integer arithmetic
        self.weights = np.random.randn(self.len_weight, 1) # W : (j, 1)
        self.bias = np.random.randn(self.len_weight, 1) # B : (j, 1)
        self.index = list(combinations(range(n_inputs), n_together)) # index tuple of each relation

    def forward(self, input): # input(= X): (i x 1)
        self.input = input 
        self.relations = np.array([product(i) for i in combinations(input, self.n_together)]).reshape(self.len_weight, 1) # X_comb: (j, 1)
        return self.relations * self.weights + self.bias # (j, 1)

    def backward(self, output_gradient, learning_rate): #output_gradient dimension: (j x 1)        
        weights_gradient = output_gradient * self.relations # (j, 1)
        self.weights -= learning_rate * weights_gradient
        self.bias -= learning_rate * output_gradient # (j, 1)
        
        input_gradient = []
        for i in range(self.input.shape[0]): # for each input variable, for example x1, x2, ...
            summed_term = 0
            select_positions = [p for p, t in enumerate(self.index) if i in t] # positions of the combinations that contain i
            for j in select_positions: 
                summed_term += float(output_gradient[j, 0] * self.relations[j, 0])
            input_gradient.append(summed_term)
        input_gradient = np.array(input_gradient).reshape(self.input.shape[0], 1) # (i, 1)
        return input_gradient
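
To make the notion of "relations" concrete, the sketch below builds X_comb for a three-element input with n_together = 2 and confirms the output sizes quoted later in this notebook (435, 4060, 1225). The input values, the weights `w`, the zero bias, the toy error E = sum(Y), and the helper names `relations_of` and `forward` are all assumptions for illustration. The second half is offered only for comparison: it evaluates the exact partial derivative that the chain rule gives for Y = W * X_comb + B (which involves the weights and the other inputs of each combination) and checks it by finite differences. It is not the rule used in the backward method above, which follows the describing document.

```python
import math
from itertools import combinations
import numpy as np

# number of relations j = i_C_t for the layer sizes used in this notebook
print(math.comb(30, 2), math.comb(30, 3), math.comb(50, 2))  # 435 4060 1225

x = np.array([[1.0], [2.0], [3.0]])          # X: (3, 1), illustrative values
index = list(combinations(range(3), 2))      # [(0, 1), (0, 2), (1, 2)]
w = np.array([[0.5], [-1.0], [2.0]])         # illustrative weights, (3, 1)
b = np.zeros((3, 1))                         # illustrative bias, (3, 1)

def relations_of(x):
    # product of the inputs selected by each combination -> X_comb: (j, 1)
    return np.array([[np.prod(x[list(k), 0])] for k in index])

def forward(x):
    return w * relations_of(x) + b           # elementwise, (j, 1)

relations = relations_of(x)
print(relations.ravel())                     # [2. 3. 6.]

# exact chain rule for the toy error E = sum(Y):
# dE/dx_i = sum over combinations k containing i of w_k * (product of the other inputs in k)
exact = np.zeros_like(x)
for pos, k in enumerate(index):
    for i in k:
        others = [m for m in k if m != i]
        exact[i, 0] += w[pos, 0] * np.prod(x[others, 0])

# finite-difference check of the same quantity
eps = 1e-6
numeric = np.zeros_like(x)
for i in range(len(x)):
    step = np.zeros_like(x)
    step[i, 0] = eps
    numeric[i, 0] = (np.sum(forward(x + step)) - np.sum(forward(x - step))) / (2 * eps)

print(exact.ravel(), np.allclose(exact, numeric, atol=1e-6))
```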

                

Build the Convolutional layer

In [5]:
# Layer: Convolution
class Convolutional(Layer):
    def __init__(self, input_shape, kernel_size, depth):
        input_depth, input_height, input_width = input_shape
        self.depth = depth
        self.input_shape = input_shape
        self.input_depth = input_depth
        self.output_shape = (depth, input_height - kernel_size + 1, input_width - kernel_size + 1)
        self.kernels_shape = (depth, input_depth, kernel_size, kernel_size)
        self.kernels = np.random.randn(*self.kernels_shape)
        self.biases = np.random.randn(*self.output_shape)

    def forward(self, input):
        self.input = input
        self.output = np.copy(self.biases)
        for i in range(self.depth):
            for j in range(self.input_depth):
                self.output[i] += signal.correlate2d(self.input[j], self.kernels[i, j], "valid")
        #print(self.output.shape)
        return self.output

    def backward(self, output_gradient, learning_rate):
        kernels_gradient = np.zeros(self.kernels_shape)
        input_gradient = np.zeros(self.input_shape)

        for i in range(self.depth):
            for j in range(self.input_depth):
                kernels_gradient[i, j] = signal.correlate2d(self.input[j], output_gradient[i], "valid")
                input_gradient[j] += signal.convolve2d(output_gradient[i], self.kernels[i, j], "full")

        self.kernels -= learning_rate * kernels_gradient
        self.biases -= learning_rate * output_gradient
        return input_gradient
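
The output shape used in `Convolutional.__init__` follows from "valid" cross-correlation: each spatial dimension shrinks by kernel_size - 1. The sketch below recomputes the shapes that appear later in this notebook and spells out a naive valid cross-correlation in plain numpy. The helper names `valid_output_shape` and `correlate2d_valid` are hypothetical and exist only for this illustration; the layer itself uses `signal.correlate2d`.

```python
import numpy as np

def valid_output_shape(input_shape, kernel_size, depth):
    # same formula as Convolutional.__init__: each spatial size shrinks by kernel_size - 1
    _, height, width = input_shape
    return (depth, height - kernel_size + 1, width - kernel_size + 1)

print(valid_output_shape((1, 28, 28), 3, 5))   # (5, 26, 26)
print(valid_output_shape((1, 300, 4), 4, 4))   # (4, 297, 1)

def correlate2d_valid(image, kernel):
    # naive "valid" cross-correlation, written out for clarity rather than speed
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

image = np.arange(16.0).reshape(4, 4)
kernel = np.ones((3, 3))
result = correlate2d_valid(image, kernel)
print(result.shape)   # (2, 2)
```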

This class will be the base class for all activation layers

In [6]:
# Base Activation layer class
class Activation(Layer):
    def __init__(self, activation, activation_prime):
        self.activation = activation
        self.activation_prime = activation_prime

    def forward(self, input): #input : (i, 1), output : (i, 1)
        self.input = input
        return self.activation(self.input)

    def backward(self, output_gradient, learning_rate):
        # TODO: 1) update parameters(but no trainable parameters), 2) return input gradient
        return np.multiply(output_gradient, self.activation_prime(self.input))

Build several activation functions

In [7]:
# Activations
class Tanh(Activation):
    def __init__(self):
        tanh = lambda x: np.tanh(x)
        tanh_prime = lambda x: 1 - np.tanh(x)**2
        super().__init__(tanh, tanh_prime)

class ReLU(Activation):
    def __init__(self):
        relu = lambda x: np.maximum(0, x)
        relu_prime = lambda x: (x > 0).astype(float)
        super().__init__(relu, relu_prime)

class Sigmoid(Activation):
    def __init__(self):
        def safe_sigmoid(x):
            # For preventing overflow, limit the input range
            x_clipped = np.clip(x, -500, 500)
            return 1 / (1 + np.exp(-x_clipped))

        def safe_sigmoid_prime(x):
            sigmoid_output = safe_sigmoid(x)
            return sigmoid_output * (1 - sigmoid_output)

        super().__init__(safe_sigmoid, safe_sigmoid_prime)

class Softmax(Layer):
    def forward(self, input):
        # Subtract the max value from the input array for numerical stability
        shift = input - np.max(input)
        exps = np.exp(shift)
        self.output = exps / np.sum(exps)
        return self.output
    
    def backward(self, output_gradient, learning_rate):
        n = np.size(self.output)
        # Adjusted backward computation with the safe forward output
        return np.dot((np.identity(n) - self.output.T) * self.output, output_gradient)
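
The expression `(np.identity(n) - self.output.T) * self.output` in the Softmax backward method is the softmax Jacobian, whose entry (a, b) is s_a * (delta_ab - s_b). The sketch below checks this against finite differences; the input values and the standalone `softmax` helper are assumptions for illustration only.

```python
import numpy as np

def softmax(x):
    # shifted by the maximum for numerical stability, as in the layer above
    exps = np.exp(x - np.max(x))
    return exps / np.sum(exps)

x = np.array([[0.2], [-1.0], [0.7]])       # illustrative input, (3, 1)
s = softmax(x)
n = s.size
jacobian = (np.identity(n) - s.T) * s      # entry (a, b) = s_a * (delta_ab - s_b)

# central finite differences, one input component at a time
eps = 1e-6
numeric = np.zeros((n, n))
for b in range(n):
    step = np.zeros_like(x)
    step[b, 0] = eps
    numeric[:, b] = ((softmax(x + step) - softmax(x - step)) / (2 * eps)).ravel()

print(np.allclose(jacobian, numeric, atol=1e-6))
```

Because the softmax outputs always sum to 1, each column of this Jacobian sums to zero, which is a second quick check on the formula.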

Build the error functions; here mse and binary_cross_entropy are implemented

In [8]:
# Error
def mse(y_true, y_pred):
    return np.mean(np.power(y_true - y_pred, 2))

def mse_prime(y_true, y_pred):
    return 2 * (y_pred - y_true) / np.size(y_true)

def binary_cross_entropy(y_true, y_pred):
    # Avoid division by zero and log(0) by clipping y_pred values
    y_pred = np.clip(y_pred, 1e-15, 1 - 1e-15)
    return np.mean(-y_true * np.log(y_pred) - (1 - y_true) * np.log(1 - y_pred))

def binary_cross_entropy_prime(y_true, y_pred):
    # Avoid division by zero by clipping y_pred values
    y_pred = np.clip(y_pred, 1e-15, 1 - 1e-15)
    return ((1 - y_true) / (1 - y_pred) - y_true / y_pred) / np.size(y_true)

Build predict loop for forward propagation and train loop for back propagation

In [9]:
# def predict, train
def predict(network, input):
    output = input
    for layer in network:
        output = layer.forward(output)
    return output

def train(network, loss, loss_prime, x_train, y_train, x_val=None, y_val=None, epochs = 1000, learning_rate = 0.01, verbose = True):
    for e in range(epochs):
        #train
        error = 0
        for x, y in zip(x_train, y_train):
            # forward
            output = predict(network, x)

            # error
            error += loss(y, output)

            # backward
            grad = loss_prime(y, output)
            for layer in reversed(network):
                grad = layer.backward(grad, learning_rate)

        error = error / len(x_train)
        if verbose:
            if x_val is not None and len(x_val) > 0:
                print(f"{e + 1}/{epochs}, train error={error}", end = ' ')
            else:
                print(f"{e + 1}/{epochs}, train error={error}")    
                
        #validation
        if x_val is not None and len(x_val) > 0:            
            error_val = 0
            for x, y in zip(x_val, y_val):
                # forward
                output = predict(network, x)
                # error
                error_val += loss(y, output)
            error_val = error_val / len(x_val)
            if verbose:
                print(f"  validation error={error_val}")
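
As a minimal usage sketch of the forward and backward loops above, the following trains a single Dense layer on a toy linear problem (y = 2x + 1) with the same per-sample update scheme. The toy data, the learning rate, the epoch count, the `rng` constructor argument, and the inlined training loop are assumptions for this illustration; the Dense class is restated compactly so that the block runs on its own.

```python
import numpy as np

class Dense:
    def __init__(self, input_size, output_size, rng):
        # rng is an extra argument added here only for reproducibility
        self.weights = rng.standard_normal((output_size, input_size))
        self.bias = rng.standard_normal((output_size, 1))

    def forward(self, input):
        self.input = input
        return self.weights @ self.input + self.bias

    def backward(self, output_gradient, learning_rate):
        input_gradient = self.weights.T @ output_gradient
        self.weights -= learning_rate * (output_gradient @ self.input.T)
        self.bias -= learning_rate * output_gradient
        return input_gradient

def mse(y_true, y_pred):
    return np.mean(np.power(y_true - y_pred, 2))

def mse_prime(y_true, y_pred):
    return 2 * (y_pred - y_true) / np.size(y_true)

def predict(network, input):
    output = input
    for layer in network:
        output = layer.forward(output)
    return output

# toy data: y = 2x + 1, each sample a (1, 1) column vector
xs = [np.array([[v]]) for v in (0.0, 0.5, 1.0)]
ys = [2 * x + 1 for x in xs]

network = [Dense(1, 1, np.random.default_rng(0))]
history = []
for epoch in range(300):
    error = 0.0
    for x, y in zip(xs, ys):
        output = predict(network, x)          # forward
        error += mse(y, output)               # accumulate the error
        grad = mse_prime(y, output)           # initial del_E/del_Y
        for layer in reversed(network):       # backward
            grad = layer.backward(grad, 0.1)
    history.append(error / len(xs))

print(history[0], history[-1])
print(network[0].weights, network[0].bias)    # should approach 2 and 1
```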

Build reshaping layer

In [10]:
# reshape
class Reshape(Layer):
    def __init__(self, input_shape, output_shape):
        self.input_shape = input_shape
        self.output_shape = output_shape

    def forward(self, input):
        return np.reshape(input, self.output_shape)

    def backward(self, output_gradient, learning_rate):
        return np.reshape(output_gradient, self.input_shape)

To quickly check whether the implemented layers work, prepare the MNIST dataset of handwritten digit images

In [11]:
#import numpy as np
import keras
from keras.datasets import mnist
import np_utils

* version: np-utils                     0.6.0
* version: keras                        2.13.1

In [12]:
#from collections.abc import Iterable
#from keras.utils.np_utils import to_categorical
from tensorflow.keras.utils import to_categorical

As this is a simple check, exclude all data except the digits 0 and 1

In [13]:
def preprocess_data(x, y, limit):
    zero_index = np.where(y == 0)[0][:limit]
    one_index = np.where(y == 1)[0][:limit]
    all_indices = np.hstack((zero_index, one_index))
    all_indices = np.random.permutation(all_indices)
    x, y = x[all_indices], y[all_indices]
    x = x.reshape(len(x), 1, 28, 28)
    x = x.astype("float32") / 255
    y = to_categorical(y)
    y = y.reshape(len(y), 2, 1)
    return x, y

# load MNIST from server, limit to 500 images per class since we're not training on GPU
(m_x_train, m_y_train), (m_x_test, m_y_test) = mnist.load_data()
m_x_train, m_y_train = preprocess_data(m_x_train, m_y_train, 500)
m_x_test, m_y_test = preprocess_data(m_x_test, m_y_test, 500)
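
The label handling in `preprocess_data` turns each class index into a (2, 1) column vector, which matches the output shape of the networks below. A minimal numpy-only sketch of the same layout, assuming labels restricted to 0 and 1 (the sample labels are made up for illustration):

```python
import numpy as np

y = np.array([0, 1, 1, 0])                   # illustrative class labels
one_hot = np.eye(2)[y]                       # row n is the one-hot vector of label y[n]
columns = one_hot.reshape(len(y), 2, 1)      # one (2, 1) column vector per sample
print(columns.shape)                         # (4, 2, 1)
print(columns[1].ravel())                    # [0. 1.]
```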


Build a convolutional model to check whether the training and validation errors decrease as it is trained.

In [14]:
# Convolution network

# neural network
network = [
    Convolutional((1, 28, 28), 3, 5),
    Sigmoid(),
    Reshape((5, 26, 26), (5 * 26 * 26, 1)),
    Dense(5 * 26 * 26, 100),
    Sigmoid(),
    Dense(100, 2),
    Sigmoid()
]

# train
train(
    network,
    binary_cross_entropy,
    binary_cross_entropy_prime,
    m_x_train,
    m_y_train,
    m_x_test,
    m_y_test,
    epochs=20,
    learning_rate=0.01
)

# test
Right = 0
Wrong = 0
for x, y in zip(m_x_test, m_y_test):
    output = predict(network, x)
    if np.argmax(output) == np.argmax(y):
        Right += 1
    else:
        Wrong += 1
print(f"Test, Right: {Right} times, Wrong: {Wrong} times, Wrong/Right = {Wrong/Right}")

1/20, train error=0.39147647175344863   validation error=0.18406659619871465
2/20, train error=0.12700877654532094   validation error=0.11721094763286864
3/20, train error=0.08418077814935798   validation error=0.0869000686717636
4/20, train error=0.06363767849222185   validation error=0.06924793848096711
5/20, train error=0.05115577174635473   validation error=0.05844430175633433
6/20, train error=0.0432034682809363   validation error=0.05098965058524624
7/20, train error=0.03762562657276694   validation error=0.04607041820943734
8/20, train error=0.03339814601677857   validation error=0.04250611061566109
9/20, train error=0.029921225044868594   validation error=0.03938813119552638
10/20, train error=0.02702313439888719   validation error=0.036327639407849416
11/20, train error=0.024510698664234574   validation error=0.03374252108812848
12/20, train error=0.022368460194230975   validation error=0.03232467059298107
13/20, train error=0.020642817459206556   validation error=0.0318070368

Both the training and validation errors decrease well. Note that here the test set is the same as the validation set, not separate data.

Now build another model using the Circular layer to check whether it works. Because the Circular layer itself may produce a very large output, convolutional and Dense layers are placed before it to reduce the size of the input to the Circular layer.

In [15]:
# Convolution network, with Circular Layer
# Both models actually work, but I think there is no need to show both outputs

# neural network
'''
network = [
    Convolutional((1, 28, 28), 3, 5),
    Sigmoid(),
    Reshape((5, 26, 26), (5 * 26 * 26, 1)),
    Dense(5 * 26 * 26, 30),
    Sigmoid(),
    Circular(30, 2), # n_together=2, thus output will be 30_C_2 = 435
    ReLU(),
    Dense(435, 2),
    Softmax()

]

# train
train(
    network,
    binary_cross_entropy,
    binary_cross_entropy_prime,
    m_x_train,
    m_y_train,
    m_x_test,
    m_y_test,
    epochs=20,
    learning_rate=0.0012
)

# test
Right = 0
Wrong = 0
for x, y in zip(m_x_test, m_y_test):
    output = predict(network, x)
    if np.argmax(output) == np.argmax(y):
        Right += 1
    else:
        Wrong += 1
print(f"Test, Right: {Right} times, Wrong: {Wrong} times, Wrong/Right = {Wrong/Right}")

'''

# Convolution network, with Circular Layer

# neural network
network = [
    Convolutional((1, 28, 28), 3, 5),
    Sigmoid(),
    Reshape((5, 26, 26), (5 * 26 * 26, 1)),
    Dense(5 * 26 * 26, 30),
    Sigmoid(),
    Circular(30, 3), # n_together=3, thus output will be 30_C_3 = 4060
    ReLU(),
    Dense(4060, 2),
    Softmax()

]

# train
train(
    network,
    binary_cross_entropy,
    binary_cross_entropy_prime,
    m_x_train,
    m_y_train,
    m_x_test,
    m_y_test,
    epochs=10,
    learning_rate=0.0006
)

# test
Right = 0
Wrong = 0
for x, y in zip(m_x_test, m_y_test):
    output = predict(network, x)
    if np.argmax(output) == np.argmax(y):
        Right += 1
    else:
        Wrong += 1
print(f"Test, Right: {Right} times, Wrong: {Wrong} times, Wrong/Right = {Wrong/Right}")

1/10, train error=1.1619911340891247   validation error=0.3645011552236331
2/10, train error=0.22568269522445436   validation error=0.31406259337417175
3/10, train error=0.1538040279762966   validation error=0.27558185870328783
4/10, train error=0.10572494692151056   validation error=0.22982245756546826
5/10, train error=0.08359157008819565   validation error=0.19453750109677512
6/10, train error=0.06734858519735075   validation error=0.18963391283293485
7/10, train error=0.06181058752041263   validation error=0.16061836428355428
8/10, train error=0.05132044872437344   validation error=0.10942657847583305
9/10, train error=0.04232472020926567   validation error=0.09404775238975877
10/10, train error=0.0364895351147966   validation error=0.08717474975947688
Test, Right: 983 times, Wrong: 17 times, Wrong/Right = 0.017293997965412006


It seems to work fine. Test the layer again in a more complex model.

In [16]:
# Convolution network, with Circular Layer

# neural network
network = [
    Convolutional((1, 28, 28), 3, 5),
    Sigmoid(),
    Reshape((5, 26, 26), (5 * 26 * 26, 1)),
    Dense(5 * 26 * 26, 30),
    Sigmoid(),
    Circular(30, 2), # n_together=2, thus output will be 30_C_2 = 435
    ReLU(),
    Dense(435, 50),
    ReLU(),
    Dense(50, 2),
    Softmax()

]

# train
train(
    network,
    binary_cross_entropy,
    binary_cross_entropy_prime,
    m_x_train,
    m_y_train,
    m_x_test,
    m_y_test,
    epochs=10,
    learning_rate=0.0005
)

# test
Right = 0
Wrong = 0
for x, y in zip(m_x_test, m_y_test):
    output = predict(network, x)
    if np.argmax(output) == np.argmax(y):
        Right += 1
    else:
        Wrong += 1
print(f"Test, Right: {Right} times, Wrong: {Wrong} times, Wrong/Right = {Wrong/Right}")

1/10, train error=1.9343855213128818   validation error=1.6454316332799002
2/10, train error=1.0560433361368684   validation error=0.5413971950768551
3/10, train error=0.5557046178308283   validation error=0.45006505767932414
4/10, train error=0.32075517788667846   validation error=0.3579076298304094
5/10, train error=0.2713502388016024   validation error=0.609028501253306
6/10, train error=0.23071739309064407   validation error=0.2287461649941782
7/10, train error=0.2140869999916326   validation error=0.15705199304946862
8/10, train error=0.13127163905656228   validation error=0.16164471725819587
9/10, train error=0.14388122806349063   validation error=0.11859220105844318
10/10, train error=0.09879278956102391   validation error=0.14094382491561072
Test, Right: 975 times, Wrong: 25 times, Wrong/Right = 0.02564102564102564


This also shows a good decrease in the training and validation errors. Note again that the test set is the same as the validation set, not separate data.  

  
* Remark: This complex model may fail to learn with the same hyperparameters, depending on its random initialization. In my experience, more than half of the trials failed to learn, giving an error of about 17.xxx at every epoch. This may be because the circular layer model has a high degree of freedom. One possible reason I can think of is that each output term is associated with only 2 input variables, since the size of relation (n_together) used here is 2, so adjusting the parameters does not lead to changes as stable as in conventional layers.
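
One possible reading of the constant error near 17 mentioned above, offered as an interpretation rather than something this notebook verifies: if the final Softmax saturates to exactly 0 and 1, the clipping inside binary_cross_entropy caps the per-sample loss at roughly -log(1e-15), and a network that always predicts the same class on a balanced dataset is wrong about half of the time. The saturated predictions below are assumed values chosen to illustrate this.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred):
    # same definition as in the error-function cell, including the clipping
    y_pred = np.clip(y_pred, 1e-15, 1 - 1e-15)
    return np.mean(-y_true * np.log(y_pred) - (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([[0.0], [1.0]])
confident_wrong = np.array([[1.0], [0.0]])   # saturated output on the wrong class
confident_right = np.array([[0.0], [1.0]])   # saturated output on the right class

loss_wrong = binary_cross_entropy(y_true, confident_wrong)
loss_right = binary_cross_entropy(y_true, confident_right)
print(loss_wrong)                            # close to -log(1e-15), about 34.54
print(loss_right)                            # close to 0
print(0.5 * loss_wrong + 0.5 * loss_right)   # about 17.27 when half of the samples are wrong
```

Under this reading, a flat error of about 17 would indicate saturated outputs rather than slow learning, so a smaller initialization scale or learning rate might be worth trying.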

Thus the layers are working properly. Now apply them to the DNA dataset to predict translation initiation sites

Prepare the TIS data.   
The arabTIS_train.pos file contains DNA sequences that contain a TIS,   
and the arabTIS_train.neg file contains DNA sequences that do not contain a TIS.   
All sequences have the same length of 300 bp.   

Pos data will be labeled as True, Neg data will be labeled as False

Read each file:

In [17]:
# read both files and strip the trailing newline from every sequence
with open("TIS_dataset/arabTIS_train.pos", 'r') as f1:
    pos_lines = [line.strip() for line in f1]
with open("TIS_dataset/arabTIS_train.neg", 'r') as f2:
    neg_lines = [line.strip() for line in f2]
print(f"number of sequence at pos: {len(pos_lines)}, sequence length: {len(pos_lines[5])}")
print(f"number of sequence at neg: {len(neg_lines)}, sequence length: {len(neg_lines[5])}")

number of sequence at pos: 21342, sequence length: 300
number of sequence at neg: 21342, sequence length: 300


Initialize the arrays that will hold the one-hot encoded sequence data

In [18]:
TIS_pos = np.zeros((len(pos_lines), 1, 300, 4))
TIS_neg = np.zeros((len(neg_lines), 1, 300, 4))
print(f"TIS_pos.shape: {TIS_pos.shape}, TIS_neg.shape: {TIS_neg.shape}")

TIS_pos.shape: (21342, 1, 300, 4), TIS_neg.shape: (21342, 1, 300, 4)


One-hot encode the sequences:
([0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]) for (A, C, G, T, N)

In [19]:
for idx_0, i in enumerate(pos_lines):
    for idx_1, j in enumerate(i):
        if j == 'A':
            TIS_pos[idx_0][0][idx_1] = np.array([0, 0, 0, 1])
        elif j == 'C':
            TIS_pos[idx_0][0][idx_1] = np.array([0, 0, 1, 0])
        elif j == 'G':
            TIS_pos[idx_0][0][idx_1] = np.array([0, 1, 0, 0])
        elif j == 'T':
            TIS_pos[idx_0][0][idx_1] = np.array([1, 0, 0, 0])
        else:
            TIS_pos[idx_0][0][idx_1] = np.array([0, 0, 0, 0])

for idx_0, i in enumerate(neg_lines):
    for idx_1, j in enumerate(i):
        if j == 'A':
            TIS_neg[idx_0][0][idx_1] = np.array([0, 0, 0, 1])
        elif j == 'C':
            TIS_neg[idx_0][0][idx_1] = np.array([0, 0, 1, 0])
        elif j == 'G':
            TIS_neg[idx_0][0][idx_1] = np.array([0, 1, 0, 0])
        elif j == 'T':
            TIS_neg[idx_0][0][idx_1] = np.array([1, 0, 0, 0])
        else:
            TIS_neg[idx_0][0][idx_1] = np.array([0, 0, 0, 0])

print(TIS_pos.shape, TIS_neg.shape)

(21342, 1, 300, 4) (21342, 1, 300, 4)
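
The two encoding loops above could also be expressed with a lookup table. The sketch below uses the same mapping for A, C, G, T and all zeros for any other character such as N; the names `ONE_HOT` and `one_hot_encode` and the short test sequence are hypothetical and chosen only for this illustration.

```python
import numpy as np

# same encoding as above: A, C, G, T map to unit vectors, anything else (e.g. N) to zeros
ONE_HOT = {
    'A': [0, 0, 0, 1],
    'C': [0, 0, 1, 0],
    'G': [0, 1, 0, 0],
    'T': [1, 0, 0, 0],
}

def one_hot_encode(sequence):
    # returns an array of shape (1, len(sequence), 4), matching one sample of TIS_pos
    rows = [ONE_HOT.get(base, [0, 0, 0, 0]) for base in sequence]
    return np.array(rows, dtype=float).reshape(1, len(sequence), 4)

encoded = one_hot_encode("ACGTN")
print(encoded.shape)   # (1, 5, 4)
print(encoded[0])
```

With such a helper, a whole file could be encoded as `np.array([one_hot_encode(s) for s in pos_lines])`, assuming every sequence has the same length.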


Label the TIS positive data as True, [[1], [0]], a column vector,   
Label the TIS negative data as False, [[0], [1]], a column vector

In [20]:
TIS_data = []
for i in range(len(TIS_pos)):
    TIS_data.append((TIS_pos[i], np.array([[1], [0]])))
    TIS_data.append((TIS_neg[i], np.array([[0], [1]])))

Prepare the train, validation, test dataset.

In [21]:
import random

train_data_index = random.sample(range(42684), 42684)
val_data_index = []
[val_data_index.append(train_data_index.pop()) for i in range(4000)]
test_data_index = []
[test_data_index.append(val_data_index.pop()) for i in range(3500)]
train_data_index.sort()
val_data_index.sort()
test_data_index.sort()

In [22]:
x_train = []
y_train = []
x_val = []
y_val = []
x_test = []
y_test = []

for i in range(42684):
    if i in val_data_index:
        x_val.append(TIS_data[i][0])
        y_val.append(TIS_data[i][1])
    elif i in test_data_index:
        x_test.append(TIS_data[i][0])
        y_test.append(TIS_data[i][1])
    elif i in train_data_index:
        x_train.append(TIS_data[i][0])
        y_train.append(TIS_data[i][1])

print('train:', f'len(x_train): {len(x_train)}, len(y_train): {len(y_train)}')
print('validation:', f'len(x_val): {len(x_val)}, len(y_val): {len(y_val)}')
print('test:', f'len(x_test): {len(x_test)}, len(y_test): {len(y_test)}')

train: len(x_train): 38684, len(y_train): 38684
validation: len(x_val): 500, len(y_val): 500
test: len(x_test): 3500, len(y_test): 3500
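
The membership tests `i in val_data_index` above scan a Python list for every sample, which becomes slow for tens of thousands of entries. A possible alternative, sketched here under the assumption that a fixed seed is acceptable, draws one permutation and slices it into the same three sizes shown in the output above. The variable names and the seed are illustrative and not part of the original notebook.

```python
import numpy as np

n_total, n_val, n_test = 42684, 500, 3500   # sizes taken from the output above
rng = np.random.default_rng(0)              # fixed seed, an assumption for reproducibility
order = rng.permutation(n_total)            # every index exactly once, in random order

val_idx = np.sort(order[:n_val])
test_idx = np.sort(order[n_val:n_val + n_test])
train_idx = np.sort(order[n_val + n_test:])

print(len(train_idx), len(val_idx), len(test_idx))   # 38684 500 3500
```

If the samples and labels were first stacked into numpy arrays, each subset could then be selected with a single fancy-indexing step such as `samples[train_idx]`, where `samples` is a hypothetical array name.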


The dataset was prepared successfully.

Convert the lists into numpy arrays to enable numpy operations in the network

In [23]:
x_train = np.array(x_train)
y_train = np.array(y_train)
x_val = np.array(x_val)
y_val = np.array(y_val)
x_test = np.array(x_test)
y_test = np.array(y_test)

Check the dimension of prepared data

In [24]:
x_train[0].shape

(1, 300, 4)

In [25]:
y_train.shape

(38684, 2, 1)

In [26]:
y_train[0].shape

(2, 1)

This gives the desired shapes.

Now build a model containing the Circular layer to see whether it can capture the features of TIS.

In [27]:
network = [
    #Convolutional((1, 300, 4), 3, 4), # input shape (1, 300, 4) / kernel size 3 x 3 / number of kernels 4
    #Sigmoid(),
    Reshape((1, 300, 4), (1 * 300 * 4, 1)),
    Dense(1 * 300 * 4, 50),
    Sigmoid(),
    Circular(50, 2), # n_together=2, thus output will be 50_C_2 = 1225
    ReLU(),
    Dense(1225, 2),
    Softmax()

    

]

train(
    network,
    binary_cross_entropy,
    binary_cross_entropy_prime,
    x_train,
    y_train,
    x_val,
    y_val,
    epochs=80,
    learning_rate=0.00055
)

# test
Right = 0
Wrong = 0
for x, y in zip(x_test, y_test):
    output = predict(network, x)
    if np.argmax(output) == np.argmax(y):
        Right += 1
    else:
        Wrong += 1
print(f"Test, Right: {Right} times, Wrong: {Wrong} times, Wrong/Right = {Wrong/Right}")

1/80, train error=2.222203702719871   validation error=1.349319896815469
2/80, train error=1.4118358523443713   validation error=1.0799821464037012
3/80, train error=1.1439994298328755   validation error=0.9422419348575862
4/80, train error=0.9918070375472801   validation error=0.8505113747710742
5/80, train error=0.8959257211944895   validation error=0.7816558839202828
6/80, train error=0.8284857606498162   validation error=0.73750038672399
7/80, train error=0.777082605188581   validation error=0.7025743586482597
8/80, train error=0.7351011298541124   validation error=0.6695665844946849
9/80, train error=0.6994207235516217   validation error=0.642243930837283
10/80, train error=0.6685529811028743   validation error=0.617401625719925
11/80, train error=0.6410240146940213   validation error=0.5944500405382183
12/80, train error=0.6161521564531883   validation error=0.5708929153891944
13/80, train error=0.5933099369635566   validation error=0.5481784011537344
14/80, train error=0.5725332

The error decreases on both the training and validation sets, which suggests that the Circular layer can capture features of the TIS.

Test the Circular layer again in a more complex model

In [28]:
network = [
    #Convolutional((1, 300, 4), 3, 4), # input shape (1, 300, 4) / kernel size 3 x 3 / number of kernels 4
    #Sigmoid(),
    Reshape((1, 300, 4), (1 * 300 * 4, 1)),
    Dense(1 * 300 * 4, 500),
    ReLU(),
    Dense(500, 50),
    Sigmoid(),
    Circular(50, 2), # n_together=2, thus output will be 50_C_2 = 1225
    ReLU(),
    Dense(1225, 300),
    ReLU(),
    Dense(300, 2),
    Softmax()

]

train(
    network,
    binary_cross_entropy,
    binary_cross_entropy_prime,
    x_train,
    y_train,
    x_val,
    y_val,
    epochs=80,
    learning_rate=0.000004
)

# test
Right = 0
Wrong = 0
for x, y in zip(x_test, y_test):
    output = predict(network, x)
    if np.argmax(output) == np.argmax(y):
        Right += 1
    else:
        Wrong += 1
print(f"Test, Right: {Right} times, Wrong: {Wrong} times, Wrong/Right = {Wrong/Right}")

1/80, train error=15.20988437295994   validation error=16.700661469511044
2/80, train error=15.028111540875155   validation error=16.594196875464284
3/80, train error=14.752672267052429   validation error=16.303972926180027
4/80, train error=14.51329653374669   validation error=15.7364224104542
5/80, train error=14.274532174888344   validation error=15.70931995059532
6/80, train error=14.074482999459459   validation error=15.539106027408703
7/80, train error=13.950851361692784   validation error=15.64453875407856
8/80, train error=13.74583334793326   validation error=15.547577726369255
9/80, train error=13.557855455436014   validation error=15.462629517532513
10/80, train error=13.362731923319036   validation error=15.19895775268855
11/80, train error=13.186683544380854   validation error=14.978312147379299
12/80, train error=12.983959103740606   validation error=14.655859795517593
13/80, train error=12.818832776451124   validation error=14.60804326347095
14/80, train error=12.68648967

The model seems too complex for the dataset, or proper hyperparameters still need to be found, but its error does decrease.

Remark: I tried to build models that use only conventional layers such as Dense and Convolutional, for comparison with the model using the Circular layer, but failed to build such a model: in all of them the error did not decrease and stayed at a high value.

In [29]:
#control model
network = [
    #Convolutional((1, 300, 4), 3, 4), # input shape (1, 300, 4) / kernel size 3 x 3 / number of kernels 4
    #Sigmoid(),
    Reshape((1, 300, 4), (1 * 300 * 4, 1)),
    Dense(1 * 300 * 4, 500),
    ReLU(),
    Dense(500, 200),
    ReLU(),
    Dense(200, 50),
    ReLU(),
    Dense(50, 2),
    Softmax()

]

train(
    network,
    binary_cross_entropy,
    binary_cross_entropy_prime,
    x_train,
    y_train,
    x_val,
    y_val,
    epochs=10,
    learning_rate=0.0001
)

# test
Right = 0
Wrong = 0
for x, y in zip(x_test, y_test):
    output = predict(network, x)
    if np.argmax(output) == np.argmax(y):
        Right += 1
    else:
        Wrong += 1
print(f"Test, Right: {Right} times, Wrong: {Wrong} times, Wrong/Right = {Wrong/Right}")

1/10, train error=17.263584885506553   validation error=19.411017020817813
2/10, train error=17.256195282143214   validation error=19.411017020817813
3/10, train error=17.256195282143214   validation error=19.411017020817813
4/10, train error=17.256195282143214   validation error=19.411017020817813
5/10, train error=17.256195282143214   validation error=19.411017020817813
6/10, train error=17.256195282143214   validation error=19.411017020817813
7/10, train error=17.256195282143214   validation error=19.411017020817813
8/10, train error=17.256195282143214   validation error=19.411017020817813
9/10, train error=17.256195282143214   validation error=19.411017020817813
10/10, train error=17.256195282143214   validation error=19.411017020817813
Test, Right: 1766 times, Wrong: 1734 times, Wrong/Right = 0.9818799546998868


In [30]:
# control model
network = [
    Convolutional((1, 300, 4), 4, 4), # input shape (1, 300, 4) / kernel size 4 x 4 / number of kernels 4
    Sigmoid(),
    Reshape((4, 297, 1), (4 * 297 * 1, 1)),
    Dense(4 * 297 * 1, 200),
    ReLU(),
    Dense(200, 50),
    ReLU(),
    Dense(50, 2),
    Softmax()

]

train(
    network,
    binary_cross_entropy,
    binary_cross_entropy_prime,
    x_train,
    y_train,
    x_val,
    y_val,
    epochs=10,
    learning_rate=0.0001
)


# test
Right = 0
Wrong = 0
for x, y in zip(x_test, y_test):
    output = predict(network, x)
    if np.argmax(output) == np.argmax(y):
        Right += 1
    else:
        Wrong += 1
print(f"Test, Right: {Right} times, Wrong: {Wrong} times, Wrong/Right = {Wrong/Right}")

1/10, train error=17.27903915087038   validation error=15.128159172808166
2/10, train error=17.268537311153477   validation error=19.480095373205067
3/10, train error=17.252535362731177   validation error=19.411017020817813
4/10, train error=17.254409573519823   validation error=19.411017020817813
5/10, train error=17.254409573519823   validation error=19.411017020817813
6/10, train error=17.254409573519823   validation error=19.411017020817813
7/10, train error=17.254409573519823   validation error=19.411017020817813
8/10, train error=17.254409573519823   validation error=19.411017020817813
9/10, train error=17.254409573519823   validation error=19.411017020817813
10/10, train error=17.254409573519823   validation error=19.411017020817813
Test, Right: 1767 times, Wrong: 1733 times, Wrong/Right = 0.9807583474816073


In [31]:
#control model
network = [
    #Convolutional((1, 300, 4), 3, 4), # input shape (1, 300, 4) / kernel size 3 x 3 / number of kernels 4
    #Sigmoid(),
    Reshape((1, 300, 4), (1 * 300 * 4, 1)),
    Dense(1 * 300 * 4, 500),
    ReLU(),
    Dense(500, 500),
    ReLU(),
    Dense(500, 200),
    ReLU(),
    Dense(200, 200),
    ReLU(),
    Dense(200, 50),
    ReLU(),
    Dense(50, 2),
    Softmax()

]

train(
    network,
    binary_cross_entropy,
    binary_cross_entropy_prime,
    x_train,
    y_train,
    x_val,
    y_val,
    epochs=10,
    learning_rate=0.0001
)

# test
Right = 0
Wrong = 0
for x, y in zip(x_test, y_test):
    output = predict(network, x)
    if np.argmax(output) == np.argmax(y):
        Right += 1
    else:
        Wrong += 1
print(f"Test, Right: {Right} times, Wrong: {Wrong} times, Wrong/Right = {Wrong/Right}")

1/10, train error=17.290123745987664   validation error=15.128159172808166
2/10, train error=17.290123745987664   validation error=15.128159172808166
3/10, train error=17.290123745987664   validation error=15.128159172808166
4/10, train error=17.290123745987664   validation error=15.128159172808166
5/10, train error=17.290123745987664   validation error=15.128159172808166
6/10, train error=17.290123745987664   validation error=15.128159172808166
7/10, train error=17.290123745987664   validation error=15.128159172808166
8/10, train error=17.290123745987664   validation error=15.128159172808166
9/10, train error=17.290123745987664   validation error=15.128159172808166
10/10, train error=17.290123745987664   validation error=15.128159172808166
Test, Right: 1733 times, Wrong: 1767 times, Wrong/Right = 1.0196191575302942
