<a href="https://colab.research.google.com/github/datasigntist/deeplearning/blob/master/Recurrent_Neural_Networks_Introduction_using_Character_Level_Generation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Recurrent Neural Networks Introduction**

**Author**: Vishwanathan Raman
**Email**: datasigntist@gmail.com


---


**Change History**


*   17-Oct-2019 -- Initial Creation


---

**Credits**
The code here is inspired by the Coursera deeplearning.ai assignments.


---

**Use Case Description**

This notebook trains a recurrent neural network (RNN) on dinosaur names using character-level generation. The vocabulary consists of the 26 lowercase letters plus the newline character, 27 symbols in total. The newline character marks the end of a name, so it is the point up to which the RNN is applied.

At regular intervals during training, dinosaur names are sampled from the RNN. Early iterations produce gibberish, but the samples progressively start to resemble plausible dinosaur names.

An RNN in its simplest form looks like a diagram from circuit theory. LSTM and GRU are extensions of the simple RNN.



---


**Other Learning Resources**

The following YouTube playlists cover concepts related to deep learning:


*   https://www.youtube.com/watch?v=yEfsDHymL0w&list=PLZnyIsit9AM7yeTZuBmezKNc6hFHUPImh
*   https://www.youtube.com/watch?v=YgpI2aROLlo&list=PLZnyIsit9AM7HBPn6m06ddzw_N9zGk--2
*   https://www.youtube.com/watch?v=186rxP6qfJA&list=PLZnyIsit9AM7VI4ylALdbeS93i-nonUzZ






## **Importing Libraries**

In [0]:
import numpy as np
import random

## **Simplistic RNN**

![alt text](https://github.com/datasigntist/imagesforNotebook/blob/master/RNN%20Cell.png?raw=true)

## **RNN Network**

An RNN network is a chain of RNN cells. A useful analogy is a train and its compartments: each cell is one compartment, and the hidden state is passed from one cell to the next.

![alt text](https://github.com/datasigntist/imagesforNotebook/blob/master/RNN%20Network.png?raw=true)
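
Written out, every cell in the chain applies the same equations. The sketch below restates what `rnn_step_forward` and the loss accumulation in `rnn_forward` compute in the helper functions of this notebook. The weight names follow the code; the angle-bracket time-step notation is an assumption chosen here for illustration.

```latex
a^{\langle t \rangle} = \tanh\left(W_{ax}\, x^{\langle t \rangle} + W_{aa}\, a^{\langle t-1 \rangle} + b\right)

\hat{y}^{\langle t \rangle} = \mathrm{softmax}\left(W_{ya}\, a^{\langle t \rangle} + b_y\right)

\mathcal{L} = -\sum_{t} \log \hat{y}^{\langle t \rangle}_{y^{\langle t \rangle}}
```

Here $x^{\langle t \rangle}$ is the one-hot input at step $t$, $a^{\langle t \rangle}$ is the hidden state, and $y^{\langle t \rangle}$ is the index of the correct next character.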

## **Importing and Exploring Dataset**

In [21]:
!wget "https://raw.githubusercontent.com/datasigntist/datasetsForTraining/master/dinos.txt"

--2019-10-28 06:55:55--  https://raw.githubusercontent.com/datasigntist/datasetsForTraining/master/dinos.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 19909 (19K) [text/plain]
Saving to: ‘dinos.txt.1’


2019-10-28 06:55:55 (1.94 MB/s) - ‘dinos.txt.1’ saved [19909/19909]



In [3]:
data = open("dinos.txt", 'r').read()
data= data.lower()
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)
print('There are %d total characters and %d unique characters in your data.' % (data_size, vocab_size))

There are 19909 total characters and 27 unique characters in your data.


In [4]:
dinoData = data.split("\n")
print("There are a total of ",len(dinoData)," dinosaur names")

There are a total of  1536  dinosaur names


In [0]:
char_to_ix = { ch:i for i,ch in enumerate(sorted(chars)) }
ix_to_char = { i:ch for i,ch in enumerate(sorted(chars)) }

In [31]:
print(char_to_ix)
print(ix_to_char)

{'\n': 0, 'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6, 'g': 7, 'h': 8, 'i': 9, 'j': 10, 'k': 11, 'l': 12, 'm': 13, 'n': 14, 'o': 15, 'p': 16, 'q': 17, 'r': 18, 's': 19, 't': 20, 'u': 21, 'v': 22, 'w': 23, 'x': 24, 'y': 25, 'z': 26}
{0: '\n', 1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e', 6: 'f', 7: 'g', 8: 'h', 9: 'i', 10: 'j', 11: 'k', 12: 'l', 13: 'm', 14: 'n', 15: 'o', 16: 'p', 17: 'q', 18: 'r', 19: 's', 20: 't', 21: 'u', 22: 'v', 23: 'w', 24: 'x', 25: 'y', 26: 'z'}


Let's take the simplest dinosaur name. As the code below shows, "Mei" is the shortest dinosaur name in the dataset. You can find more details about "Mei" at https://www.nature.com/news/2004/041011/full/news041011-7.html.

The goal is to build a model that can generate dinosaur names automatically. To do that, the model has to go over the existing names and learn their patterns: the letters in a dinosaur name follow a specific sequence and are not random. We therefore apply sequence modelling, sending each character through the recurrent neural network. Each character is the input at one time step, and it is fed into the network as its one-hot encoded vector.

The code and diagram below illustrate this. Each one-hot vector is all zeros except for a 1 at the index of its character. This step is repeated for every dinosaur name, and the weights are optimized across iterations.


In [60]:
# Compute the shortest length once instead of once per name
shortestLength = min(len(elem) for elem in dinoData)
dinoName = [elem for elem in dinoData if len(elem) == shortestLength]
dinoNameElements = list(dinoName[0])
for elem in dinoNameElements:
  dinoVector = np.zeros((27,1))
  dinoVector[char_to_ix[elem]] = 1
  print("character :",elem,"\n one hot vector : ",dinoVector.T.ravel())

character : m 
 one hot vector :  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0.]
character : e 
 one hot vector :  [0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0.]
character : i 
 one hot vector :  [0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0.]


![alt text](https://github.com/datasigntist/imagesforNotebook/blob/master/RNN%20Network%20Sample.png?raw=true)
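
The following is a minimal sketch of how one name turns into an input sequence and a target sequence, so that the network learns to predict the next character at every time step. The vocabulary is rebuilt here under the assumption that it is the newline character plus the 26 lowercase letters, and `encode_name` is a hypothetical helper introduced only for illustration.

```python
import string

# Assumed vocabulary: newline plus the 26 lowercase letters, sorted as in the notebook.
chars = sorted("\n" + string.ascii_lowercase)
char_to_ix = {ch: i for i, ch in enumerate(chars)}

def encode_name(name):
    """Return (X, Y) index lists for one name (hypothetical helper)."""
    # X starts with None, which stands for the all-zero input at the first time step.
    X = [None] + [char_to_ix[ch] for ch in name]
    # Y is X shifted one step to the left, ending with the newline index.
    Y = X[1:] + [char_to_ix["\n"]]
    return X, Y

X, Y = encode_name("mei")
print(X)  # [None, 13, 5, 9]
print(Y)  # [13, 5, 9, 0]
```

At each time step the target is simply the next character of the name, and the final target is the newline that ends the name.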

## **Helper Functions**

In [0]:
def softmax(x):
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis=0)

def print_sample(sample_ix, ix_to_char):
    txt = ''.join(ix_to_char[ix] for ix in sample_ix)
    txt = txt[0].upper() + txt[1:]  # capitalize first character 
    print ('%s' % (txt, ), end='')    

def get_initial_loss(vocab_size, seq_length):
    return -np.log(1.0/vocab_size)*seq_length

def smooth(loss, cur_loss):
    return loss * 0.999 + cur_loss * 0.001

def initialize_parameters(n_a, n_x, n_y):
    """
    Initialize parameters with small random values
    
    Returns:
    parameters -- python dictionary containing:
                        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
                        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
                        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
                        b --  Bias, numpy array of shape (n_a, 1)
                        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)
    """
    np.random.seed(1)
    Wax = np.random.randn(n_a, n_x)*0.01 # input to hidden
    Waa = np.random.randn(n_a, n_a)*0.01 # hidden to hidden
    Wya = np.random.randn(n_y, n_a)*0.01 # hidden to output
    b = np.zeros((n_a, 1)) # hidden bias
    by = np.zeros((n_y, 1)) # output bias
    
    parameters = {"Wax": Wax, "Waa": Waa, "Wya": Wya, "b": b,"by": by}
    
    return parameters

def rnn_step_forward(parameters, a_prev, x):
    
    Waa, Wax, Wya, by, b = parameters['Waa'], parameters['Wax'], parameters['Wya'], parameters['by'], parameters['b']
    a_next = np.tanh(np.dot(Wax, x) + np.dot(Waa, a_prev) + b) # hidden state
    p_t = softmax(np.dot(Wya, a_next) + by) # probabilities for the next character
    
    return a_next, p_t

def rnn_step_backward(dy, gradients, parameters, x, a, a_prev):
    
    gradients['dWya'] += np.dot(dy, a.T)
    gradients['dby'] += dy
    da = np.dot(parameters['Wya'].T, dy) + gradients['da_next'] # backprop into the hidden state a
    daraw = (1 - a * a) * da # backprop through tanh nonlinearity
    gradients['db'] += daraw
    gradients['dWax'] += np.dot(daraw, x.T)
    gradients['dWaa'] += np.dot(daraw, a_prev.T)
    gradients['da_next'] = np.dot(parameters['Waa'].T, daraw)
    return gradients

def update_parameters(parameters, gradients, lr):

    parameters['Wax'] += -lr * gradients['dWax']
    parameters['Waa'] += -lr * gradients['dWaa']
    parameters['Wya'] += -lr * gradients['dWya']
    parameters['b']  += -lr * gradients['db']
    parameters['by']  += -lr * gradients['dby']
    return parameters

def rnn_forward(X, Y, a0, parameters, vocab_size = 27):
    
    # Initialize x, a and y_hat as empty dictionaries
    x, a, y_hat = {}, {}, {}
    
    a[-1] = np.copy(a0)
    
    # initialize your loss to 0
    loss = 0
    
    for t in range(len(X)):
        
        # Set x[t] to be the one-hot vector representation of the t'th character in X.
        # if X[t] == None, we just have x[t]=0. This is used to set the input for the first timestep to the zero vector. 
        x[t] = np.zeros((vocab_size,1)) 
        if X[t] is not None:
            x[t][X[t]] = 1
        
        # Run one step forward of the RNN
        a[t], y_hat[t] = rnn_step_forward(parameters, a[t-1], x[t])
        
        # Add this time step's cross-entropy term, -log(probability of the correct character), to the loss.
        loss -= np.log(y_hat[t][Y[t],0])
        
    cache = (y_hat, a, x)
        
    return loss, cache

def rnn_backward(X, Y, parameters, cache):
    # Initialize gradients as an empty dictionary
    gradients = {}
    
    # Retrieve from cache and parameters
    (y_hat, a, x) = cache
    Waa, Wax, Wya, by, b = parameters['Waa'], parameters['Wax'], parameters['Wya'], parameters['by'], parameters['b']
    
    # each one should be initialized to zeros of the same dimension as its corresponding parameter
    gradients['dWax'], gradients['dWaa'], gradients['dWya'] = np.zeros_like(Wax), np.zeros_like(Waa), np.zeros_like(Wya)
    gradients['db'], gradients['dby'] = np.zeros_like(b), np.zeros_like(by)
    gradients['da_next'] = np.zeros_like(a[0])
    
    # Backpropagate through time
    for t in reversed(range(len(X))):
        dy = np.copy(y_hat[t])
        dy[Y[t]] -= 1
        gradients = rnn_step_backward(dy, gradients, parameters, x[t], a[t], a[t-1])
    
    return gradients, a
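
The lines `dy = np.copy(y_hat[t])` and `dy[Y[t]] -= 1` in `rnn_backward` rely on the fact that the gradient of the softmax cross-entropy loss with respect to the pre-softmax scores is the predicted distribution minus the one-hot target. The sketch below checks that identity numerically on random scores. It is a sanity check under stated assumptions, not part of the original notebook: `cross_entropy`, the random scores `z`, and the target index are all illustrative.

```python
import numpy as np

def softmax(z):
    # Subtract the maximum for numerical stability, as the notebook's helper does.
    e_z = np.exp(z - np.max(z))
    return e_z / e_z.sum(axis=0)

def cross_entropy(z, target):
    """Loss for a single time step given scores z and the correct index (hypothetical helper)."""
    return -np.log(softmax(z)[target, 0])

rng = np.random.default_rng(0)
z = rng.normal(size=(27, 1))   # illustrative pre-softmax scores
target = 13                    # illustrative correct character index

# Analytic gradient: predicted probabilities minus the one-hot target.
analytic = softmax(z)
analytic[target] -= 1

# Numerical gradient via central differences.
eps = 1e-6
numeric = np.zeros_like(z)
for i in range(z.shape[0]):
    z_plus, z_minus = z.copy(), z.copy()
    z_plus[i] += eps
    z_minus[i] -= eps
    numeric[i] = (cross_entropy(z_plus, target) - cross_entropy(z_minus, target)) / (2 * eps)

max_diff = np.max(np.abs(analytic - numeric))
print("largest difference between analytic and numerical gradient:", max_diff)
```

A very small difference indicates that the two gradients agree, which is why the backward pass can start from `y_hat` with 1 subtracted at the target index.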

In [0]:
def clip(gradients, maxValue):
    '''
    Clips the gradients' values between minimum and maximum.
    
    Arguments:
    gradients -- a dictionary containing the gradients "dWaa", "dWax", "dWya", "db", "dby"
    maxValue -- everything above this number is set to this number, and everything less than -maxValue is set to -maxValue
    
    Returns: 
    gradients -- a dictionary with the clipped gradients.
    '''
    
    dWaa, dWax, dWya, db, dby = gradients['dWaa'], gradients['dWax'], gradients['dWya'], gradients['db'], gradients['dby']
   
    # Clip in place to mitigate exploding gradients, looping over [dWax, dWaa, dWya, db, dby].
    for gradient in [dWax, dWaa, dWya, db, dby]:
        np.clip(gradient,-1.*maxValue,maxValue,gradient)
    
    gradients = {"dWaa": dWaa, "dWax": dWax, "dWya": dWya, "db": db, "dby": dby}
    
    return gradients
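
As a quick illustration of what element-wise clipping does, the sketch below applies the same `np.clip` call to a toy gradient dictionary. The helper name `clip_elementwise` and the toy values are assumptions made for this example only.

```python
import numpy as np

def clip_elementwise(gradients, max_value):
    """Clip every gradient array in place to [-max_value, max_value] (hypothetical helper)."""
    for grad in gradients.values():
        np.clip(grad, -max_value, max_value, out=grad)
    return gradients

# Toy gradients with a few entries outside [-5, 5].
toy_gradients = {
    "dWaa": np.array([[10.0, -0.5], [3.0, -12.0]]),
    "db": np.array([[7.0], [-7.0]]),
}

clipped = clip_elementwise(toy_gradients, 5)
print(clipped["dWaa"])  # entries 10.0 and -12.0 become 5.0 and -5.0
print(clipped["db"])    # entries 7.0 and -7.0 become 5.0 and -5.0
```

Values already inside the range are left untouched. Because each entry is clipped independently, this can change the direction of the gradient, unlike rescaling by the gradient norm.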

In [0]:
def sample(parameters, char_to_ix, seed):
    """
    Sample a sequence of characters according to a sequence of probability distributions output of the RNN

    Arguments:
    parameters -- python dictionary containing the parameters Waa, Wax, Wya, by, and b. 
    char_to_ix -- python dictionary mapping each character to an index.
    seed -- base value for NumPy's random seed, which makes the sampled names reproducible.

    Returns:
    indices -- a list of length n containing the indices of the sampled characters.
    """
    
    # Retrieve parameters and relevant shapes from "parameters" dictionary
    Waa, Wax, Wya, by, b = parameters['Waa'], parameters['Wax'], parameters['Wya'], parameters['by'], parameters['b']
    vocab_size = by.shape[0]
    n_a = Waa.shape[1]
    
    # Step 1: Start from an all-zero input vector x for the first time step.
    x = np.zeros((vocab_size, 1))
    # Step 1': Initialize a_prev as zeros.
    a_prev = np.zeros((n_a, 1))
    
    # List that will collect the indices of the generated characters.
    indices = []
    
    # Idx is a flag to detect a newline character, we initialize it to -1
    idx = -1 
    
    # Loop over time-steps t. At each time-step, sample a character from a probability distribution and append 
    # its index to "indices". We'll stop if we reach 50 characters (which should be very unlikely with a well 
    # trained model), which helps debugging and prevents entering an infinite loop. 
    counter = 0
    newline_character = char_to_ix['\n']
    
    while (idx != newline_character and counter != 50):
        
        # Step 2: Forward propagate x: hidden state a, output scores z, then probabilities y.
        a = np.tanh(np.dot(Wax,x)+np.dot(Waa,a_prev)+b)
        z = np.dot(Wya,a)+by
        y = softmax(z)
        
        # Reseed so that sampling is reproducible (inherited from the graded assignment).
        np.random.seed(counter+seed) 

        #print("Probability ",y.ravel()[np.argmax(y.ravel())])
        
        # Step 3: Sample the index of a character within the vocabulary from the probability distribution y
        idx = np.random.choice(list(range(vocab_size)), p = y.ravel())

        #print("idx ",idx)        

        # Append the index to "indices"
        indices.append(idx)
        
        # Step 4: Overwrite the input character as the one corresponding to the sampled index.
        x = np.zeros((vocab_size, 1))
        x[idx] = 1
        
        # Update "a_prev" to be "a"
        a_prev = a
        
        # Advance the seed and the step counter.
        seed += 1
        counter +=1
        

    if (counter == 50):
        indices.append(char_to_ix['\n'])
    
    return indices
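
The key step in `sample` is drawing the next character at random according to the softmax probabilities instead of always taking the most likely one. The sketch below contrasts the two on a toy distribution. It uses `np.random.default_rng` rather than the `np.random.seed` and `np.random.choice` pair used in the notebook, and the three-entry distribution is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy next-character distribution over a three-symbol vocabulary.
probs = np.array([0.1, 0.6, 0.3])

# Sampling: each index is drawn with the probability the model assigns to it.
draws = rng.choice(len(probs), size=10_000, p=probs)
freq = np.bincount(draws, minlength=len(probs)) / draws.size

# Greedy decoding: always pick the most likely index.
greedy = int(np.argmax(probs))

print("empirical frequencies:", freq)
print("greedy choice:", greedy)
```

With sampling, the empirical frequencies approach the model's probabilities, so repeated calls can generate different names. Greedy decoding would produce the same name every time.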

## **Modelling**

### **Step by Step execution of Optimization**

In [63]:
sampleDinoName = "mei"

index = 0

for loop in range(len(dinoData)):
  if (dinoData[loop] == sampleDinoName):
    index = loop
    break

print("Selecting this item for illustration ",dinoData[index])

# Represent it in the form of number

X = [None] + [char_to_ix[ch] for ch in dinoData[index]] 
Y = X[1:] + [char_to_ix["\n"]]

print(dinoData[index]," represented as training data ",X," and also Y ",Y)

# Perform one optimization step: Forward-prop -> Backward-prop -> Clip -> Update parameters
# Choose a learning rate of 0.01
# curr_loss, gradients, a_prev = optimize(X, Y, a_prev, parameters)
# Breaking down the forward propagation in RNN

# Initialize x, a and y_hat as empty dictionaries
x, a, y_hat = {}, {}, {}

#n_a -- number of units of the RNN cell
n_a = 50
dino_names = 2
vocab_size = 27

# Retrieve n_x and n_y from vocab_size
n_x, n_y = vocab_size, vocab_size

# Initialize parameters
parameters = initialize_parameters(n_a, n_x, n_y)
print("Waa ",parameters["Waa"].shape," Wax ",parameters["Wax"].shape," Wya ",parameters["Wya"].shape," b ",parameters["b"].shape," by ",parameters["by"].shape)

# Initialize the hidden state of the RNN
a_prev = np.zeros((n_a, 1))

a[-1] = np.copy(a_prev)

# initialize your loss to 0
loss = 0

print("Length of X ",len(X))

# Execute for the length of vector which is the number of timesteps
for t in range(len(X)):
    
    # Set x[t] to be the one-hot vector representation of the t'th character in X.
    # if X[t] == None, we just have x[t]=0. This is used to set the input for the first timestep to the zero vector. 
    x[t] = np.zeros((vocab_size,1)) 
    if X[t] is not None:
        x[t][X[t]] = 1

    print("t:",t," The input to RNN:",x[t].T)
    
    # Run one step forward of the RNN
    a[t], y_hat[t] = rnn_step_forward(parameters, a[t-1], x[t])
    
    print("y_hat[t] ",y_hat[t].T," Shape ",y_hat[t].shape," Y[t] ",Y[t]," y_hat[t][Y[t],0] ",y_hat[t][Y[t],0])
    # Add this time step's cross-entropy term, -log(probability of the correct character), to the loss.
    loss -= np.log(y_hat[t][Y[t],0])

    print("Loss ",loss)
    
    # The number of dinosaur names to print
    seed = 0
    for name in range(dino_names):
        
        # Sample indices and print them
        sampled_indices = sample(parameters, char_to_ix, seed)
        print_sample(sampled_indices, ix_to_char)
        
        seed += 1  # Increment the seed so each sampled name differs while staying reproducible.

    print('\n')

Selecting this item for illustration  mei
mei  represented as training data  [None, 13, 5, 9]  and also Y  [13, 5, 9, 0]
Waa  (50, 50)  Wax  (50, 27)  Wya  (27, 50)  b  (50, 1)  by  (27, 1)
Length of X  4
t: 0  The input to RNN: [[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0.]]
y_hat[t]  [[0.03703704 0.03703704 0.03703704 0.03703704 0.03703704 0.03703704
  0.03703704 0.03703704 0.03703704 0.03703704 0.03703704 0.03703704
  0.03703704 0.03703704 0.03703704 0.03703704 0.03703704 0.03703704
  0.03703704 0.03703704 0.03703704 0.03703704 0.03703704 0.03703704
  0.03703704 0.03703704 0.03703704]]  Shape  (27, 1)  Y[t]  13  y_hat[t][Y[t],0]  0.037037037037037035
Loss  3.295836866004329
Nkzxwtdmfqoeyhsqwasjkjvu
Kneb


t: 1  The input to RNN: [[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0.]]
y_hat[t]  [[0.03710036 0.03700388 0.03702218 0.03703304 0.03704387 0.03704862
  0.03701184 0.03703686 0.03700148 0.0370376  0.037036

In [0]:
def optimize(X, Y, a_prev, parameters, learning_rate = 0.01):
    """
    Execute one step of the optimization to train the model.
    
    Arguments:
    X -- list of integers, where each integer is a number that maps to a character in the vocabulary.
    Y -- list of integers, exactly the same as X but shifted one index to the left.
    a_prev -- previous hidden state.
    parameters -- python dictionary containing:
                        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
                        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
                        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
                        b --  Bias, numpy array of shape (n_a, 1)
                        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)
    learning_rate -- learning rate for the model.
    
    Returns:
    loss -- value of the loss function (cross-entropy)
    gradients -- python dictionary containing:
                        dWax -- Gradients of input-to-hidden weights, of shape (n_a, n_x)
                        dWaa -- Gradients of hidden-to-hidden weights, of shape (n_a, n_a)
                        dWya -- Gradients of hidden-to-output weights, of shape (n_y, n_a)
                        db -- Gradients of bias vector, of shape (n_a, 1)
                        dby -- Gradients of output bias vector, of shape (n_y, 1)
    a[len(X)-1] -- the last hidden state, of shape (n_a, 1)
    """
    # Forward propagate through time
    loss, cache = rnn_forward(X, Y, a_prev, parameters)
    
    # Backpropagate through time
    gradients, a = rnn_backward(X, Y, parameters, cache)
    
    # Clip the gradients element-wise to the range [-5, 5]
    gradients = clip(gradients, 5)
    
    # Update the parameters with one gradient descent step
    parameters = update_parameters(parameters, gradients, learning_rate)
    
    return loss, gradients, a[len(X)-1]

In [0]:
def model(data, ix_to_char, char_to_ix, num_iterations = 35000, n_a = 50, dino_names = 7, vocab_size = 27):
    """
    Trains the model and generates dinosaur names. 
    
    Arguments:
    data -- text corpus (not used inside this function; the names are read from dinos.txt)
    ix_to_char -- dictionary that maps the index to a character
    char_to_ix -- dictionary that maps a character to an index
    num_iterations -- number of iterations to train the model for
    n_a -- number of units of the RNN cell
    dino_names -- number of dinosaur names you want to sample at each iteration. 
    vocab_size -- number of unique characters found in the text, size of the vocabulary
    
    Returns:
    parameters -- learned parameters
    """
    
    # Retrieve n_x and n_y from vocab_size
    n_x, n_y = vocab_size, vocab_size
    
    # Initialize parameters
    parameters = initialize_parameters(n_a, n_x, n_y)
    print("Waa ",parameters["Waa"].shape," Wax ",parameters["Wax"].shape," Wya ",parameters["Wya"].shape," b ",parameters["b"].shape," by ",parameters["by"].shape)

    # Initialize loss (this is required because we want to smooth our loss, don't worry about it)
    loss = get_initial_loss(vocab_size, dino_names)
    print("Initial Loss ",loss)

    # Build list of all dinosaur names (training examples).
    with open("dinos.txt") as f:
        examples = f.readlines()
    examples = [x.lower().strip() for x in examples]
    
    # Shuffle list of all dinosaur names
    np.random.seed(0)
    np.random.shuffle(examples)
    
    # Initialize the hidden state of the RNN
    a_prev = np.zeros((n_a, 1))

    # Optimization loop
    for j in range(num_iterations):

        # Define one training example (X, Y): X holds the name's character indices, Y is X shifted left with a newline appended.
        index = j % len(examples)
        X = [None] + [char_to_ix[ch] for ch in examples[index]] 
        Y = X[1:] + [char_to_ix["\n"]]

        # Perform one optimization step: Forward-prop -> Backward-prop -> Clip -> Update parameters
        # Choose a learning rate of 0.01
        curr_loss, gradients, a_prev = optimize(X, Y, a_prev, parameters)

        # Smooth the loss with an exponential moving average so the reported values are less noisy.
        loss = smooth(loss, curr_loss)

        # Every 4000 iterations, sample dino_names names with sample() to check whether the model is learning.
        if j % 4000 == 0:
            
            print('Iteration: %d, Loss: %f' % (j, loss) + '\n')
            
            # The number of dinosaur names to print
            seed = 0
            for name in range(dino_names):
                
                # Sample indices and print them
                sampled_indices = sample(parameters, char_to_ix, seed)
                print_sample(sampled_indices, ix_to_char)
                
                seed += 1  # Increment the seed so each sampled name differs while staying reproducible.
      
            print('\n')
        
    return parameters        
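
The loss printed during training is not the raw loss of the latest name but a smoothed value. The sketch below reproduces the two helpers involved, `get_initial_loss` and `smooth`, to show where the starting value comes from and how slowly the smoothed loss reacts. The constant per-example loss of 10.0 is an assumption chosen purely for illustration.

```python
import numpy as np

def get_initial_loss(vocab_size, seq_length):
    # Loss of a uniform prediction over vocab_size characters for seq_length steps.
    return -np.log(1.0 / vocab_size) * seq_length

def smooth(loss, cur_loss):
    # Exponential moving average: 99.9% old value, 0.1% new value.
    return loss * 0.999 + cur_loss * 0.001

# model() calls get_initial_loss(vocab_size, dino_names) with 27 and 7.
initial_loss = float(get_initial_loss(27, 7))
print(round(initial_loss, 6))  # 23.070858

# Feed a constant (illustrative) per-example loss and watch the smoothed value drift towards it.
loss = initial_loss
for _ in range(1000):
    loss = smooth(loss, 10.0)
print(loss)
```

Since each update moves the smoothed loss only 0.1% of the way towards the current loss, the printed value lags well behind the model's actual per-example loss. The starting value is only a rough reference: it corresponds to a uniform prediction over a 7-step sequence.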

In [75]:
parameters = model(data, ix_to_char, char_to_ix, num_iterations=20000)

Waa  (50, 50)  Wax  (50, 27)  Wya  (27, 50)  b  (50, 1)  by  (27, 1)
Initial Loss  23.070858062030304
Iteration: 0, Loss: 23.087336

Nkzxwtdmfqoeyhsqwasjkjvu
Kneb
Kzxwtdmfqoeyhsqwasjkjvu
Neb
Zxwtdmfqoeyhsqwasjkjvu
Eb
Xwtdmfqoeyhsqwasjkjvu


Iteration: 4000, Loss: 25.901815

Mivrosaurus
Inee
Ivtroplisaurus
Mbaaisaurus
Wusichisaurus
Cabaselachus
Toraperlethosdarenitochusthiamamumamaon


Iteration: 8000, Loss: 24.070350

Onxusichepriuon
Kilabersaurus
Lutrodon
Omaaerosaurus
Xutrcheps
Edaksoje
Trodiktonus


Iteration: 12000, Loss: 23.291971

Onyxosaurus
Kica
Lustrepiosaurus
Olaagrraiansaurus
Yuspangosaurus
Eealosaurus
Trognesaurus


Iteration: 16000, Loss: 23.276015

Meutosaurus
Indabdosaurus
Itrsaurus
Macalosaurus
Yuspandon
Caahosaurus
Trodon


