# <center> **07. RNN FUNDAMENTALS** </center>



## <span style="color:red"> Outline </span>
1. **Learning sequences: recurrent applications**
2. **RNN fundamentals**
3. **RNN from intuitive example**




In [1]:
#@title 1. Mount Google Drive (if you are in Colab) { display-mode: "form" }
import os
from google.colab import drive
drive.mount('/content/drive', force_remount=True)
os.chdir('/content/drive/My Drive/')
print(os.getcwd())

Mounted at /content/drive
/content/drive/My Drive


In [2]:
#@title Load libraries { display-mode: "form" }
#@markdown It is important to switch the runtime to GPU here

import numpy as np
import pandas as pd
import sys
import matplotlib.pyplot as plt
from sklearn import datasets
import tensorflow as tf
print(tf.__version__)

2.17.1


# **1. Learning sequences: Recurrent applications**

DNNs and CNNs are feedforward models that follow a single line of **learning task**: they map input data of a fixed size to an output response (for example, predicted labels).

- But what about data such as time series, videos, songs, or written text?

- How can we formulate a model that predicts new data while taking **historical inputs** into account, handles inputs of undefined size, and even generates new data?

This kind of data is known as **sequences**, and it opens up many new applications that require a different kind of deep modeling and learning. The figure below shows some of the main applications in this area:

<img src="https://github.com/wDavid98/IA-docs/blob/main/data/RNN/sequence_models.png?raw=true" style="width:1500px;heigth:100px">

To tackle these problems, deep learning offers the family of **recurrent neural networks (RNNs)**, which are designed to handle sequential data. So,


### **Welcome to RNN models!**

- An online example of  melody generation can be found [here](https://ganharp.ctpt.co/)

- An online example of  text generation is [here](https://transformer.huggingface.co/doc/xlnet)

- Some generated music is [here](https://soundcloud.com/gaurav-sharma-2269224/music-sequence-1-1)


# **2. RNN fundamentals**

A main drawback of DNNs and CNNs is that they only observe the current input, not **historical** data. They don't exploit the **temporal correlation** of sequences.


<img src="https://github.com/wDavid98/IA-docs/blob/main/data/RNN/RNN_1.png?raw=true" style="width:200px;heigth:70px">

**Recurrent neural networks** are learning models that feed their own **activations** back in as a new input, emulating **MEMORY**.


<img src="https://github.com/wDavid98/IA-docs/blob/main/data/RNN/RNN_2.png?raw=true" style="width:200px;heigth:70px">

In this case,

$$a_t = \sigma(W_{aa}a_{t-1} + W_{ax}X_t + b_a)$$

and

$$y_t = \sigma(W_{ya}a_{t} + b_y)$$
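
As a minimal sketch of the two equations above, one recurrent step could look like this in NumPy. It assumes `tanh` for the activation update and `softmax` for the output as the concrete choices of $\sigma$. The sizes, the random weights, and the names `rnn_step` and `stable_softmax` are hypothetical and serve only as illustration.

```python
import numpy as np

def stable_softmax(z):
    """Softmax with the maximum subtracted first for numerical stability."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def rnn_step(x_t, a_prev, W_aa, W_ax, W_ya, b_a, b_y):
    """One recurrent step: returns the new activation a_t and the output y_t."""
    a_t = np.tanh(W_aa @ a_prev + W_ax @ x_t + b_a)   # memory update
    y_t = stable_softmax(W_ya @ a_t + b_y)            # prediction from the memory
    return a_t, y_t

# Hypothetical sizes, chosen only for illustration
n_in, n_hidden, n_out = 4, 3, 4
step_rng = np.random.default_rng(0)
W_aa = step_rng.normal(size=(n_hidden, n_hidden))
W_ax = step_rng.normal(size=(n_hidden, n_in))
W_ya = step_rng.normal(size=(n_out, n_hidden))
b_a = np.zeros(n_hidden)
b_y = np.zeros(n_out)

x_t = np.eye(n_in)[2]          # one-hot input at time t
a_prev = np.zeros(n_hidden)    # no memory yet at the first step
a_t, y_t = rnn_step(x_t, a_prev, W_aa, W_ax, W_ya, b_a, b_y)
print(a_t.shape, y_t.shape)
```

The returned `a_t` would be passed back in as `a_prev` at the next step, which is what gives the network its memory.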



**Go to the action!**


# **3. RNN from intuitive example**

We are going to run a model that generates dinosaur names from a set of training data. To do so, we will train an RNN that receives sequences of characters during training and then **generates dinosaur names letter by letter**.

### **3.1. Prepare the data**

The first step in any application, and especially here, is preparing the data. Here we need to define what a dictionary, a sample, and a training set are.

In [3]:
#@title **code** Initialize problem, read training **characters**
tf.random.set_seed(23)
np.random.seed(23)

dino_names = open('/content/drive/MyDrive/Academia40/Notebooks/data/dinos.txt','r').read()
dino_names = dino_names.lower()
print(dino_names)

aachenosaurus
aardonyx
abdallahsaurus
abelisaurus
abrictosaurus
abrosaurus
abydosaurus
acanthopholis
achelousaurus
acheroraptor
achillesaurus
achillobator
acristavus
acrocanthosaurus
acrotholus
actiosaurus
adamantisaurus
adasaurus
adelolophus
adeopapposaurus
aegyptosaurus
aeolosaurus
aepisaurus
aepyornithomimus
aerosteon
aetonyxafromimus
afrovenator
agathaumas
aggiosaurus
agilisaurus
agnosphitys
agrosaurus
agujaceratops
agustinia
ahshislepelta
airakoraptor
ajancingenia
ajkaceratops
alamosaurus
alaskacephale
albalophosaurus
albertaceratops
albertadromeus
albertavenator
albertonykus
albertosaurus
albinykus
albisaurus
alcovasaurus
alectrosaurus
aletopelta
algoasaurus
alioramus
aliwalia
allosaurus
almas
alnashetri
alocodon
altirhinus
altispinax
alvarezsaurus
alwalkeria
alxasaurus
amargasaurus
amargastegos
amargatitanis
amazonsaurus
ammosaurus
ampelosaurus
amphicoelias
amphicoelicaudia
amphisaurus
amtocephale
amtosaurus
amurosaurus
amygdalodon
anabisetia
anasazisaurus
anatosaurus
anatotitan

In [4]:
# Create a dictionary
# A set does not allow duplicates, so it keeps one copy of each character.
dictionary = list(set(dino_names))
dataset_size, dic_size = len(dino_names), len(dictionary)
print("characters for training: ", dataset_size, " dictionary: ",  dic_size)
print(dictionary)


characters for training:  19909  dictionary:  27
['a', 'h', 'm', 'i', 'f', 'z', 'j', 'b', '\n', 'e', 's', 'k', 'l', 'd', 'v', 'p', 'g', 'r', 'n', 'y', 'x', 'o', 'w', 'u', 'q', 't', 'c']


# **CHALLENGE**
Create a dictionary structure so that the model can work with integer indices associated with characters, and so that predictions can be converted from integers back to the corresponding characters:

- One Python dictionary with structure `{'char': int}`
- One Python dictionary with structure `{int: 'char'}`



In [5]:
for ind,car in enumerate(sorted(dictionary)):
  print(car,ind)


 0
a 1
b 2
c 3
d 4
e 5
f 6
g 7
h 8
i 9
j 10
k 11
l 12
m 13
n 14
o 15
p 16
q 17
r 18
s 19
t 20
u 21
v 22
w 23
x 24
y 25
z 26


In [6]:
#@title **code student**
# Convert characters to indices and vice versa
def index_to_charact(dictionary):
  car_a_ind = { car:ind for ind,car in enumerate(sorted(dictionary))}
  ind_a_car = { ind:car for ind,car in enumerate(sorted(dictionary))}
  return car_a_ind, ind_a_car

car_a_ind, ind_a_car = index_to_charact(dictionary)
print(car_a_ind)
print(ind_a_car)

{'\n': 0, 'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6, 'g': 7, 'h': 8, 'i': 9, 'j': 10, 'k': 11, 'l': 12, 'm': 13, 'n': 14, 'o': 15, 'p': 16, 'q': 17, 'r': 18, 's': 19, 't': 20, 'u': 21, 'v': 22, 'w': 23, 'x': 24, 'y': 25, 'z': 26}
{0: '\n', 1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e', 6: 'f', 7: 'g', 8: 'h', 9: 'i', 10: 'j', 11: 'k', 12: 'l', 13: 'm', 14: 'n', 15: 'o', 16: 'p', 17: 'q', 18: 'r', 19: 's', 20: 't', 21: 'u', 22: 'v', 23: 'w', 24: 'x', 25: 'y', 26: 'z'}



# **CHALLENGE**
Now we need to read and build the dataset. Read each training sample (each dinosaur name is one sample).

- Use `strip`
- Shuffle the dataset
- It should return a list of randomly arranged names.
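
A minimal sketch of the hints above, using a few hypothetical in-memory lines instead of the real file. The helper name `clean_and_shuffle` and the sample lines are assumptions made for illustration, and the standard-library `random` module stands in for whichever shuffling routine you prefer.

```python
import random

# Hypothetical raw lines, as they might come back from f.readlines()
raw_lines = ["Aachenosaurus\n", "  Aardonyx \n", "Abelisaurus\n", "Abrosaurus\n", "\n"]

def clean_and_shuffle(lines, seed=0):
    """Lower-case and strip each line, drop empty ones, and return a shuffled list."""
    names = [line.lower().strip() for line in lines]
    names = [name for name in names if name]
    local_rng = random.Random(seed)   # local generator: the global random state is untouched
    local_rng.shuffle(names)          # shuffles in place
    return names

shuffled_names = clean_and_shuffle(raw_lines)
print(shuffled_names)
```

Using a seeded local generator keeps the result reproducible without changing the random draws made elsewhere in the notebook.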



In [7]:
#@title **code:** Create a list of training examples and shuffle it randomly
def list_names_randomly_arranged():
  # Importing allowed libraries
  import numpy as np
  # Fix the random seed for reproducibility
  np.random.seed(0)

  with open("/content/drive/MyDrive/Academia40/Notebooks/data/dinos.txt") as f:
    sample = f.readlines()

  sample_set = []
  for x in sample:
    # strip() removes whitespace (including the trailing '\n') at both ends of the string
    # https://www.w3schools.com/python/ref_string_strip.asp
    sample_set.append(x.lower().strip())

  # Note: the list is returned in file order, as the output below shows;
  # calling np.random.shuffle(sample_set) here would shuffle it in place.
  return sample_set

sample_set = list_names_randomly_arranged()
print(len(sample_set), sample_set)

1536 ['aachenosaurus', 'aardonyx', 'abdallahsaurus', 'abelisaurus', 'abrictosaurus', 'abrosaurus', 'abydosaurus', 'acanthopholis', 'achelousaurus', 'acheroraptor', 'achillesaurus', 'achillobator', 'acristavus', 'acrocanthosaurus', 'acrotholus', 'actiosaurus', 'adamantisaurus', 'adasaurus', 'adelolophus', 'adeopapposaurus', 'aegyptosaurus', 'aeolosaurus', 'aepisaurus', 'aepyornithomimus', 'aerosteon', 'aetonyxafromimus', 'afrovenator', 'agathaumas', 'aggiosaurus', 'agilisaurus', 'agnosphitys', 'agrosaurus', 'agujaceratops', 'agustinia', 'ahshislepelta', 'airakoraptor', 'ajancingenia', 'ajkaceratops', 'alamosaurus', 'alaskacephale', 'albalophosaurus', 'albertaceratops', 'albertadromeus', 'albertavenator', 'albertonykus', 'albertosaurus', 'albinykus', 'albisaurus', 'alcovasaurus', 'alectrosaurus', 'aletopelta', 'algoasaurus', 'alioramus', 'aliwalia', 'allosaurus', 'almas', 'alnashetri', 'alocodon', 'altirhinus', 'altispinax', 'alvarezsaurus', 'alwalkeria', 'alxasaurus', 'amargasaurus', 'a

In [8]:
#@title **code** Hidden layer size (number of recurrent units)
n_a = 25

In [9]:
sample = sample_set[np.random.randint(0,len(sample_set))]
print("random sample: ", sample)
# convert each character into its integer index; None marks the empty first input
X = [None] + [car_a_ind[c] for c in sample]
print("X: ", X)

Y = X[1:] + [car_a_ind['\n']]
print('Y: ',Y)

x = np.zeros((len(X),1,dic_size))
print("x shape: ",x.shape)
onehot = tf.keras.utils.to_categorical(X[1:],dic_size).reshape(len(X)-1,1,dic_size)
print("onehot size: ",tf.keras.utils.to_categorical(X[1:],dic_size).reshape(len(X)-1,1,dic_size).shape)

# which letter is being encoded
ix = 0
print(f"Y[{ix}] = ",Y[ix]," one-hot-encoding: ",onehot[ix])

x[1:,:,:] = onehot
y = tf.keras.utils.to_categorical(Y,dic_size).reshape(len(X),dic_size)

random sample:  jinzhousaurus
X:  [None, 10, 9, 14, 26, 8, 15, 21, 19, 1, 21, 18, 21, 19]
Y:  [10, 9, 14, 26, 8, 15, 21, 19, 1, 21, 18, 21, 19, 0]
x shape:  (14, 1, 27)
onehot size:  (13, 1, 27)
Y[0] =  10  one-hot-encoding:  [[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0.]]


Now we need a function that generates samples for training; after that,
we will be ready to start building the RNN.

- What are the outputs?
- How does `yield` work?
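
As a hint for the second question, here is a small sketch of a generator that is unrelated to the dinosaur data. The name `countdown_generator` is hypothetical. A function containing `yield` returns a generator object; each call to `next` runs the body until the next `yield`, hands back that value, and pauses there.

```python
def countdown_generator(start):
    """Yield start, start-1, ..., 1, pausing after each value."""
    current = start
    while current > 0:
        yield current        # execution pauses here until the next next() call
        current -= 1

demo_gen = countdown_generator(3)
first = next(demo_gen)       # runs the body up to the first yield
second = next(demo_gen)      # resumes after the yield, loops, and yields again
remaining = list(demo_gen)   # consumes whatever is left
print(first, second, remaining)  # prints: 3 2 [1]
```

A generator built around `while True`, like `train_generator`, never runs out: every `next` call produces one more `((x, a), y)` training sample.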







In [10]:
def train_generator():
    while True:
        # take a random sample (one dinosaur name)
        sample = sample_set[np.random.randint(0,len(sample_set))]
        print("random sample: ", sample)
        # convert each character into its integer index
        X = [None] + [car_a_ind[c] for c in sample]

        # build "Y" by shifting "X" one position, so that Y[t] is the character
        # that follows X[t]; the last target is the end-of-line character
        Y = X[1:] + [car_a_ind['\n']]

        # represent "X" and "Y" as one-hot vectors; x[0] stays all zeros
        x = np.zeros((len(X),1,dic_size))
        onehot = tf.keras.utils.to_categorical(X[1:],dic_size).reshape(len(X)-1,1,dic_size)
        x[1:,:,:] = onehot
        y = tf.keras.utils.to_categorical(Y,dic_size).reshape(len(X),dic_size)

        # initial activation set to zeros
        a = np.zeros((len(X), n_a))

        yield (x, a), y

In [11]:
#@title Train generator of words
gen = train_generator()
next(gen)

random sample:  gongpoquansaurus


((array([[[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
           0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],
  
         [[0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.,
           0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],
  
         [[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.,
           0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],
  
         [[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.,
           0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],
  
         [[0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.,
           0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],
  
         [[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
           1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],
  
         [[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.,
           0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],
  
         [[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 

### **3.2. Build the RNN model**

Once we have all the components that define our sequential problem, we can proceed to configure the RNN. As with any deep model, we need to define inputs, outputs, and layers. In this case, we also need to define the `recurrent_cell`.

Recall how RNN models can be understood, and how the model can be unfolded in time to follow the prediction and sequence generation process.

<img src="https://github.com/wDavid98/IA-docs/blob/main/data/RNN/RNN_3.png?raw=true" style="width:150px;heigth:70px">
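
The unfolded view can be sketched in plain NumPy: the same weights are reused at every time step, and only the activation changes as it is carried forward. This is a sketch under stated assumptions, not the Keras implementation. The function name `unrolled_forward`, the sizes, and the random weights are hypothetical.

```python
import numpy as np

def unrolled_forward(inputs_onehot, W_aa, W_ax, W_ya, b_a, b_y):
    """Run a simple RNN over a whole sequence, reusing the same weights at every step.

    inputs_onehot: array of shape (T, n_in), one one-hot row per time step.
    Returns the per-step output probabilities (T, n_out) and the final activation.
    """
    hidden = np.zeros(W_aa.shape[0])          # a_0: empty memory
    probs_per_step = []
    for x_t in inputs_onehot:                 # one iteration per unfolded copy of the cell
        hidden = np.tanh(W_aa @ hidden + W_ax @ x_t + b_a)
        logits = W_ya @ hidden + b_y
        exp_logits = np.exp(logits - logits.max())
        probs_per_step.append(exp_logits / exp_logits.sum())
    return np.stack(probs_per_step), hidden

# Hypothetical sizes and random weights, only to exercise the function
toy_rng = np.random.default_rng(1)
n_in, n_hidden = 5, 4
weights = dict(
    W_aa=toy_rng.normal(size=(n_hidden, n_hidden)),
    W_ax=toy_rng.normal(size=(n_hidden, n_in)),
    W_ya=toy_rng.normal(size=(n_in, n_hidden)),
    b_a=np.zeros(n_hidden),
    b_y=np.zeros(n_in),
)
toy_sequence = np.eye(n_in)[[0, 3, 1]]        # three one-hot inputs
step_probs, last_hidden = unrolled_forward(toy_sequence, **weights)
print(step_probs.shape, last_hidden.shape)    # prints: (3, 5) (4,)
```

Each row of `step_probs` is a probability distribution over the next symbol given everything seen up to that step.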

In [12]:
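n_a = 25 # number of recurrent units
# one-hot input sequence: (time steps, dictionary size); None allows any length
input  = tf.keras.layers.Input(shape=(None,dic_size))
# initial activation a_0
a0 = tf.keras.layers.Input(shape=(n_a,))

# recurrent layer; return_state=True also returns the final activation
recurrent_cell = tf.keras.layers.SimpleRNN(n_a, activation='tanh', return_state = True)
# output layer: probability distribution over the dictionary
layer_out = tf.keras.layers.Dense(dic_size, activation='softmax')

output = []
hs, _ = recurrent_cell(input, initial_state=a0)
output.append(layer_out(hs))

model = tf.keras.models.Model([input,a0],output)
model.summary()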

Now we can test the model.

- Explain the code below. What is the process in an RNN test?
- **What about the results?**

In [13]:
x = np.zeros((1,1,dic_size,))
a = np.zeros((1, n_a))
print(x, a)

generated_name = ''
fin_linea = '\n'
car = -1
name_length =20

for i in range(name_length):
  a, _ = recurrent_cell(tf.keras.backend.constant(x), initial_state=tf.keras.backend.constant(a))
  y = layer_out(a)
  predict = tf.keras.backend.eval(y)
  ix = np.random.choice(list(range(dic_size)),p=predict.ravel())
  #print(ix)
  #print("prediccion: ", predict)
  #print(i, type(i))
  car = ind_a_car[ix]
  generated_name += car

  x = tf.keras.utils.to_categorical(ix,dic_size).reshape(1,1,dic_size)
  a = tf.keras.backend.eval(a)



  if car==fin_linea:
    generated_name += '\n'
    break

print("generated name: ", generated_name)

[[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0.]]] [[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0.]]
generated name:  spolrkwzjtnoyb




Recall the recurrent process used for testing, which is also used during learning.

<img src="https://github.com/wDavid98/IA-docs/blob/main/data/RNN/RNN_4.png?raw=true" style="width:150px;heigth:70px">
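
The feedback loop can also be illustrated without a network. In the sketch below a fixed, hypothetical table of next-symbol probabilities stands in for the trained model: each sampled symbol is fed back as the next input until the end-of-line symbol appears. The vocabulary, the table values, and the name `sample_sequence` are assumptions made for illustration.

```python
import numpy as np

# Hypothetical 3-symbol vocabulary; index 0 plays the role of the end-of-line '\n'
toy_vocab = {0: '\n', 1: 'a', 2: 'b'}
# Hypothetical next-symbol probabilities, one row per current symbol
toy_probs = np.array([
    [0.0, 0.5, 0.5],   # at the start: 'a' or 'b'
    [0.2, 0.3, 0.5],   # after 'a'
    [0.4, 0.5, 0.1],   # after 'b'
])

def sample_sequence(rng, max_len=10):
    """Sample symbols one by one, feeding each one back as the next input."""
    current = 0
    chars = []
    for _ in range(max_len):
        # draw the next index according to the row of the current symbol
        current = int(rng.choice(len(toy_vocab), p=toy_probs[current]))
        if current == 0:          # end-of-line: stop generating
            break
        chars.append(toy_vocab[current])
    return ''.join(chars)

sampling_rng = np.random.default_rng(23)
for _ in range(3):
    print(repr(sample_sequence(sampling_rng)))
```

Sampling from the distribution, rather than always taking the most likely symbol, is what lets repeated calls produce different sequences.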


### **3.3. Train and adjust the RNN model**

As with any deep learning model, we need to adjust the weights so that they represent the problem. A better adjustment means better performance on the task. RNNs also rely on gradient descent and back-propagation, so here we will build and train the model, and in the next lecture we will go deeper into how the training works.
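
Gradient descent needs a loss to minimize. For next-character prediction a common choice is categorical cross-entropy, which can be sketched in NumPy as follows. The function name and the toy arrays are hypothetical; the sketch averages over time steps and uses natural logarithms.

```python
import numpy as np

def mean_categorical_crossentropy(y_true_onehot, y_pred, eps=1e-12):
    """Mean over time steps of -sum_k y_true[k] * log(y_pred[k])."""
    y_pred = np.clip(y_pred, eps, 1.0)   # avoid log(0)
    per_step = -np.sum(y_true_onehot * np.log(y_pred), axis=-1)
    return float(np.mean(per_step))

# Hypothetical example: 2 time steps, 3-character dictionary
targets = np.array([[0., 1., 0.],
                    [0., 0., 1.]])
uniform_pred = np.full((2, 3), 1 / 3)            # a model that knows nothing
confident_pred = np.array([[0.1, 0.8, 0.1],
                           [0.1, 0.1, 0.8]])     # a model that favors the right character

loss_uniform = mean_categorical_crossentropy(targets, uniform_pred)
loss_confident = mean_categorical_crossentropy(targets, confident_pred)
print(loss_uniform, loss_confident)
```

A uniform guess over a dictionary of size K gives a loss of ln K, so with the 27 characters used here an untrained model should sit near ln 27 ≈ 3.30; training is working when the loss drops below that.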

We are ready! Now we are going to train our network.



In [15]:
opt = tf.keras.optimizers.SGD(learning_rate=0.0005)
model.compile(optimizer=opt, loss='categorical_crossentropy')

BATCH_SIZE = 80			# training samples (generator steps) per iteration
NITS = 20			# iterations

for j in range(NITS):
    hist = model.fit(train_generator(), steps_per_epoch=BATCH_SIZE, epochs=1, verbose=0)
    # training progress (with NITS = 20, only iteration 0 is reported)
    if j%100 == 0:
        print('\nIteration: %d, Error: %f' % (j, hist.history['loss'][0]) + '\n')

random sample:  stegosaurus
random sample:  sinosaurus
random sample:  kulindadromeus
random sample:  microhadrosaurus
random sample:  aeolosaurus
random sample:  proceratops
random sample:  unenlagia
random sample:  mosaiceratops
random sample:  majungatholus
random sample:  laosaurus
random sample:  erlikosaurus
random sample:  camposaurus
random sample:  sinopelta
random sample:  chialingosaurus
random sample:  hoplitosaurus
random sample:  hoplosaurus
random sample:  alvarezsaurus
random sample:  suuwassea
random sample:  texasetes
random sample:  fukuivenator
random sample:  stephanosaurus
random sample:  dystylosaurus
random sample:  notocolossus
random sample:  huayangosaurus
random sample:  gideonmantellia
random sample:  galvesaurus
random sample:  ornithomimus
random sample:  protecovasaurus
random sample:  caudocoelus
random sample:  acristavus
random sample:  dyoplosaurus
random sample:  barsboldia
random sample:  diamantinasaurus
random sample:  eodromaeus
random sample:  

In [16]:
#@title **code** generate
def gen_name(car_a_ind, dic_size,n_a, name_length=20):
  x = np.zeros((1,1,dic_size,))
  a = np.zeros((1, n_a))
  generated_name = ''

  fin_linea = '\n'
  car = -1
  for i in range(name_length):
    #Generate prediction
    a, _ = recurrent_cell(tf.keras.backend.constant(x), initial_state=tf.keras.backend.constant(a))
    y = layer_out(a)
    predict = tf.keras.backend.eval(y)

    # sample the next character index; more probable characters are drawn more often
    ix = np.random.choice(list(range(dic_size)),p=predict.ravel())

    # convert to character and add to name
    car = ind_a_car[ix]
    generated_name += car

    # feed back: the sampled character becomes the next input x_(t+1),
    # and the new activation becomes the next initial state
    x = tf.keras.utils.to_categorical(ix,dic_size).reshape(1,1,dic_size)
    a = tf.keras.backend.eval(a)

    if car==fin_linea:
      generated_name += '\n'
      break

  return generated_name

In [17]:
for i in range(20):
    print("name", i, ": ",  gen_name(car_a_ind, dic_size,n_a, name_length=50))

name 0 :  piriaiypruareedrauroupourasoycaenstmarusoalencrirt
name 1 :  aatorcpoanbaoauskronusyedonurruon


name 2 :  oacioplybaneorimoeocyugsasxyl


name 3 :  hoeltaueoraa


name 4 :  ragrurn


name 5 :  umpaapco


name 6 :  an


name 7 :  arleycaulilbakrubiotnure


name 8 :  aaloosos


name 9 :  ass


name 10 :  brhhvha


name 11 :  nsa


name 12 :  twilkjaaregaot


name 13 :  olohpcartkau


name 14 :  qgejsompjra


name 15 :  


name 16 :  zezthgxisa


name 17 :  aroiouribxy


name 18 :  t


name 19 :  s




# **CHALLENGE**

- Retrain the model with additional steps and observe the performance
- Change the optimizer, the learning rate, and the input_size


