### Nucleotide Chemical Property `NCP`

\begin{equation}
  x_i =
    \begin{cases}
      1 & \text{if $N_i \in \{A, G\} $ }\\
      0 & \text{if $N_i \in \{C, U\} $ }\\
    \end{cases}
\end{equation}

\begin{equation}
  y_i =
    \begin{cases}
      1 & \text{if $N_i \in \{A, C\} $ }\\
      0 & \text{if $N_i \in \{G, U\} $ }\\
    \end{cases}
\end{equation}

\begin{equation}
  z_i =
    \begin{cases}
      1 & \text{if $N_i \in \{A, U\} $ }\\
      0 & \text{if $N_i \in \{C, G\} $ }\\
    \end{cases}
\end{equation}

In [33]:
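def ncp(sequence):
    for n in sequence:
        # x: ring structure, purine (A, G) = 1, pyrimidine (C, U) = 0
        x = 1 if n in ('A', 'G') else 0
        # y: functional group, amino (A, C) = 1, keto (G, U) = 0
        y = 1 if n in ('A', 'C') else 0
        # z: hydrogen bonding, weak (A, U) = 1, strong (C, G) = 0
        z = 1 if n in ('A', 'U') else 0

        yield [x, y, z]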

In [34]:
list(ncp('ACGU'))  # should be equal to [[1, 1, 1], [0, 1, 0], [1, 0, 0], [0, 0, 1]]

[[1, 1, 1], [0, 1, 0], [1, 0, 0], [0, 0, 1]]
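
Note that the `ncp` generator above encodes any character outside `A`, `C`, `G`, `U` (for example a DNA `T` or an ambiguous `N`) as `[0, 0, 0]` without complaint. Below is a minimal sketch of a stricter variant, assuming that rejecting such input is the desired behaviour. The names `NCP_TABLE` and `ncp_strict` are hypothetical and not part of the notebook.

```python
# Hypothetical lookup table: nucleotide -> (x, y, z), following the equations above.
NCP_TABLE = {
    'A': (1, 1, 1),
    'C': (0, 1, 0),
    'G': (1, 0, 0),
    'U': (0, 0, 1),
}


def ncp_strict(sequence):
    """Yield [x, y, z] per nucleotide, raising ValueError on unknown symbols."""
    for position, n in enumerate(sequence):
        if n not in NCP_TABLE:
            raise ValueError(f'unexpected nucleotide {n!r} at position {position}')
        yield list(NCP_TABLE[n])


print(list(ncp_strict('ACGU')))  # [[1, 1, 1], [0, 1, 0], [1, 0, 0], [0, 0, 1]]
```

Because `ncp_strict` is a generator, the error surfaces only when the offending position is reached during iteration.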

### Nucleotide Density `ND`

**ND** captures how often a nucleotide occurs locally and how it is distributed along the **RNA** sequence. It can be formulated as
\begin{equation}
    d_i = \frac{1}{i} \sum_{j = 1}^{i} f(N_j),
    \qquad
    f(N_j) = \begin{cases}
        1 & \text{if $N_j = N_i$, where $N_i \in \{A, C, G, U\}$}\\
        0 & \text{otherwise}
    \end{cases}
\end{equation}
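
As a worked example, read $d_i$ as the frequency of the nucleotide at position $i$ within the first $i$ positions, which is what the implementation below computes. In the sequence `AGCGUAAC`, position 7 holds an `A`, and `A` occurs three times among the first seven nucleotides, so
\begin{equation}
    d_7 = \frac{3}{7} \approx 0.4286
\end{equation}
Likewise, position 4 holds the second `G` in a prefix of length four, giving $d_4 = 2/4 = 0.5$.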

In [10]:
def nd(sequence):
    count = {'A': 0, 'C': 0, 'U': 0, 'G': 0}
    for index, n in enumerate(sequence):
        # Running number of occurrences of n in sequence[:index + 1].
        count[n] += 1

        # Density: running count divided by the prefix length.
        yield count[n] / (index + 1)

In [30]:
seq = 'AGCGUAAC'
density = nd(seq)
for item in seq:
    print(item, next(density))
list(nd('AGCGUAAC'))

A 1.0
G 0.5
C 0.3333333333333333
G 0.5
U 0.2
A 0.3333333333333333
A 0.42857142857142855
C 0.25


[1.0,
 0.5,
 0.3333333333333333,
 0.5,
 0.2,
 0.3333333333333333,
 0.42857142857142855,
 0.25]
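
The running counter in `nd` can be cross-checked against a direct reading of the definition: for each position, count how often that nucleotide occurs in the prefix ending there and divide by the prefix length. This is a minimal sketch. `nd_direct` is a hypothetical name, and the quadratic-time slicing is chosen for clarity rather than speed.

```python
def nd_direct(sequence):
    """Density per position: occurrences of sequence[i] in sequence[:i + 1], over i + 1."""
    return [
        sequence[:i + 1].count(n) / (i + 1)
        for i, n in enumerate(sequence)
    ]


print(nd_direct('AGCGUAAC'))
```

For `'AGCGUAAC'` this should reproduce the eight values listed above, since both versions divide the same integer counts by the same prefix lengths.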

### Pseudo K-tuple Nucleotide Composition `PseKNC`

Combining **NCP** and **ND** gives **PseKNC**, which represents the $i$-th nucleotide as the four-dimensional vector
$$
N_i = (x_i, y_i, z_i, d_i)
$$

In [26]:
def pse_knc(sequence):
    # Pair each [x, y, z] triple with the density at the same position.
    for props, density in zip(ncp(sequence), nd(sequence)):
        yield props + [density]

In [27]:
list(pse_knc('AGCGUAAC'))

[[1, 1, 1, 1.0],
 [1, 0, 0, 0.5],
 [0, 1, 0, 0.3333333333333333],
 [1, 0, 0, 0.5],
 [0, 0, 1, 0.2],
 [1, 1, 1, 0.3333333333333333],
 [1, 1, 1, 0.42857142857142855],
 [0, 1, 0, 0.25]]
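
If a single flat feature vector per sequence is needed rather than a list of per-position vectors, the rows can be concatenated. Below is a minimal sketch of that step, assuming row-major order (all four values of position 1, then position 2, and so on). `flatten_features` is a hypothetical helper, and the sample rows are copied from the output above so that the block runs on its own.

```python
from itertools import chain


def flatten_features(rows):
    """Concatenate per-position feature vectors into one flat list."""
    return list(chain.from_iterable(rows))


# First three rows of the PseKNC output for 'AGCGUAAC'.
rows = [
    [1, 1, 1, 1.0],
    [1, 0, 0, 0.5],
    [0, 1, 0, 0.3333333333333333],
]

flat = flatten_features(rows)
print(len(flat))  # 3 positions * 4 features = 12
print(flat[:8])   # [1, 1, 1, 1.0, 1, 0, 0, 0.5]
```

With the notebook's generator in scope, `flatten_features(pse_knc('AGCGUAAC'))` would yield a list of 32 values (8 positions, 4 features each).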

In [31]:
def ncp(seq):
    # Flat variant: returns one list of 3 * len(seq) values instead of yielding triples.
    # Note: this redefines the generator version of `ncp` above, which `pse_knc` relies on.
    seq_length = len(seq)
    ncp_list = [None] * seq_length * 3
    for j in range(seq_length):
        if seq[j] == 'A':
            ncp_list[j * 3] = 1
            ncp_list[j * 3 + 1] = 1
            ncp_list[j * 3 + 2] = 1
        elif seq[j] == 'U':
            ncp_list[j * 3] = 0
            ncp_list[j * 3 + 1] = 0
            ncp_list[j * 3 + 2] = 1
        elif seq[j] == 'C':
            ncp_list[j * 3] = 0
            ncp_list[j * 3 + 1] = 1
            ncp_list[j * 3 + 2] = 0
        elif seq[j] == 'G':
            ncp_list[j * 3] = 1
            ncp_list[j * 3 + 1] = 0
            ncp_list[j * 3 + 2] = 0
    return ncp_list

In [32]:
ncp('AGCGUAAC')

[1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0]