# Visualize RNA Input Data for Convolutional Neural Network

Convolutional neural networks (CNNs) are a class of neural networks designed specifically for image processing. Recently, [_Zhang et al. (2019)_](https://www.frontiersin.org/articles/10.3389/fgene.2019.00467/full) created a convolutional neural network to predict RNA secondary structures. I would like to implement their technique and then improve on their architecture with more advanced machine learning techniques. To begin, however, I will recreate their method of transforming RNA sequences into matrices that the CNN can interpret as images.

# Sequences to Matrices

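Given an RNA sequence of length N, I will turn it into an N × N matrix that can be viewed as a 2D image. Entry (i, j) is nonzero only if bases i and j can pair, and its value grows with the number of consecutive pairs stacked around them. The algorithm can be found on p. 5 of [_Zhang et al. (2019)_](https://www.frontiersin.org/articles/10.3389/fgene.2019.00467/full). My implementation of this algorithm is: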

In [1]:
import numpy as np

# Helper function: score for a possible base pair (0 if a and b cannot pair)
def pairs(a, b, x):
    if (a == 'A' and b == 'U') or (a == 'U' and b == 'A'):
        return 2.0
    elif (a == 'G' and b == 'C') or (a == 'C' and b == 'G'):
        return 3.0
    elif (a == 'U' and b == 'G') or (a == 'G' and b == 'U'):
        return x # Weight of the G-U wobble pair, supplied by the caller
    else:
        return 0.0

def RNAmatrix(seq):
    N = len(seq)
    rnaMatrix = np.zeros((N, N)) # Create an N x N matrix of zeros
    for i in range(N):
        for j in range(N):
            weight = 0
            rnaMatrix[i][j] = pairs(seq[i], seq[j], 0.8) # Can R_i pair with R_j ? 
            if rnaMatrix[i][j] > 0: # Yes
                alpha = 0
                while i - alpha >= 0 and j + alpha < N: # Walk along the anti-diagonal from (i, j), adding consecutive pairs
                    P = pairs(seq[i - alpha], seq[j + alpha], 0.8)
                    if P == 0:
                        break
                    else:
                        weight += np.exp(-0.5 * alpha * alpha) * P
                        alpha += 1
            if weight > 0: # R_i and R_j pair, so also walk the anti-diagonal in the opposite direction
                beta = 1
                while i + beta < N and j - beta >= 0:
                    P = pairs(seq[i + beta], seq[j - beta], 0.8)
                    if P == 0:
                        break
                    else:
                        weight += np.exp(-0.5 * beta * beta) * P
                        beta += 1
            rnaMatrix[i][j] = weight # Set the matrix element equal to the weight
    maxWeight = np.amax(rnaMatrix) if N > 0 else 0
    if maxWeight > 0: # Skip normalization when no bases can pair (avoids dividing by zero)
        rnaMatrix /= maxWeight # Normalize the matrix between 0 and 1
    return rnaMatrix
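
As a sanity check, the weighting can be traced by hand on a short hairpin. The sketch below is a dependency-free restatement of the same logic in plain Python, not the NumPy code above. The names `PAIR_SCORE`, `pair_score`, `stem_weight`, and `rna_matrix`, and the toy sequence `GGGAAACCC`, are assumptions made for illustration and do not come from the paper.

```python
import math

# Pairing scores matching the cell above: A-U = 2, G-C = 3, G-U wobble = 0.8.
PAIR_SCORE = {
    ("A", "U"): 2.0, ("U", "A"): 2.0,
    ("G", "C"): 3.0, ("C", "G"): 3.0,
    ("G", "U"): 0.8, ("U", "G"): 0.8,
}

def pair_score(a, b):
    """Return the score for bases a and b, or 0.0 if they cannot pair."""
    return PAIR_SCORE.get((a, b), 0.0)

def stem_weight(seq, i, j):
    """Gaussian-weighted sum of pair scores along the anti-diagonal through (i, j)."""
    if pair_score(seq[i], seq[j]) == 0.0:
        return 0.0
    n = len(seq)
    weight = 0.0
    # First direction: (i - k, j + k), including the pair (i, j) itself at k = 0.
    k = 0
    while i - k >= 0 and j + k < n:
        p = pair_score(seq[i - k], seq[j + k])
        if p == 0.0:
            break
        weight += math.exp(-0.5 * k * k) * p
        k += 1
    # Opposite direction: (i + k, j - k), starting at k = 1.
    k = 1
    while i + k < n and j - k >= 0:
        p = pair_score(seq[i + k], seq[j - k])
        if p == 0.0:
            break
        weight += math.exp(-0.5 * k * k) * p
        k += 1
    return weight

def rna_matrix(seq):
    """Build the weight matrix as nested lists, scaled so the largest entry is 1."""
    n = len(seq)
    weights = [[stem_weight(seq, i, j) for j in range(n)] for i in range(n)]
    peak = max((w for row in weights for w in row), default=0.0)
    if peak == 0.0:  # nothing can pair: leave the zeros alone
        return weights
    return [[w / peak for w in row] for row in weights]

seq = "GGGAAACCC"  # toy hairpin, chosen only for illustration
matrix = rna_matrix(seq)
for row in matrix:
    print(" ".join(f"{w:.2f}" for w in row))
```

For this toy sequence the largest entry should be the middle G-C pair of the stem at (1, 7). It collects its own score plus one stacked neighbor on each side, for an unnormalized weight of 3(1 + 2e^(-1/2)) ≈ 6.64, so it becomes 1.0 after normalization.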