# Recurrent Neural Network

**[Recurrent neural networks (RNNs)](https://en.wikipedia.org/wiki/Recurrent_neural_network)** are particularly useful for processing sequences of data, such as time series, speech, or text, because they maintain a state that captures information about earlier elements in the sequence.

### Characteristics

- Unlike standard feedforward neural networks, RNNs carry a hidden state (a **memory**) from one step to the next. The state summarizes what has been computed so far, so each prediction can depend on earlier elements of the sequence.
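
The memory described above can be sketched with a vanilla (Elman) recurrence in NumPy. This is a minimal illustration, not the model used later in this notebook; the function name `rnn_forward`, the weight names, and the sizes are all assumptions chosen for the example.

```python
import numpy as np

def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a sequence and return every hidden state."""
    h = np.zeros(W_hh.shape[0])        # initial state: nothing remembered yet
    states = []
    for x_t in inputs:                 # one step per sequence element
        # The new state mixes the current input with the previous state.
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
seq_len, input_size, hidden_size = 5, 3, 4   # illustrative sizes
inputs = rng.normal(size=(seq_len, input_size))
W_xh = rng.normal(scale=0.5, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.5, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

states = rnn_forward(inputs, W_xh, W_hh, b_h)
print(states.shape)  # (5, 4)
```

Because `h` is fed back into the next step, the last row of `states` depends on every earlier input, which is what distinguishes this from a feedforward layer applied to each element separately.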

### Known Limitations

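- The [Vanishing Gradient Problem](https://en.wikipedia.org/wiki/Vanishing_gradient_problem): gradients shrink as they are propagated back through many time steps, so long-range dependencies are hard to learn.
- The [Exploding Gradient Problem](https://www.educative.io/answers/what-is-exploding-gradient-problem): gradients can also grow without bound over many time steps, which makes training unstable.
- Each step depends on the previous hidden state, so computation is hard to parallelize across time steps. This makes RNNs computationally expensive and limits their scalability to large datasets.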
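
The two gradient problems above can be illustrated with a small NumPy experiment. It is a simplified sketch: the activation derivative is ignored, and the recurrent matrix is a scaled orthogonal matrix so that every singular value equals the chosen scale. The helper name `backprop_norms` is an assumption for this example.

```python
import numpy as np

def backprop_norms(W_hh, steps):
    """Track the norm of a gradient pushed back through `steps` linear recurrent steps."""
    size = W_hh.shape[0]
    grad = np.ones(size) / np.sqrt(size)   # unit-norm gradient at the last time step
    norms = []
    for _ in range(steps):
        grad = W_hh.T @ grad               # one step of backpropagation through time
        norms.append(np.linalg.norm(grad))
    return norms

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))   # orthogonal matrix: preserves norms

small = backprop_norms(0.5 * Q, steps=50)      # all singular values 0.5
large = backprop_norms(1.5 * Q, steps=50)      # all singular values 1.5

print(f"scale 0.5 after 50 steps: {small[-1]:.2e}")   # vanishes
print(f"scale 1.5 after 50 steps: {large[-1]:.2e}")   # explodes
```

In this idealized setting the norm after `T` steps is exactly `scale ** T`. Gradient clipping is a common remedy for the exploding case, and gated architectures such as LSTM or GRU are commonly used to reduce the vanishing case.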



### Import libraries and data

In [1]:
import numpy as np
import collections
from llm.config import DATA_DIR, IMAGES_DIR

### Data Preparation

In [2]:
from llm.core.rnn import prepare_data
from llm.core.functions import read_text_file

filepath = DATA_DIR.joinpath('Prometheus.txt')
chap_one = read_text_file(filepath)
X, y, vocab_size, max_length, tokenizer = prepare_data(chap_one)
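
The source of `prepare_data` is not shown in this notebook. The sketch below is a hypothetical stand-in that returns the same five values under one common assumption: the text is split into lines, each line is turned into growing word prefixes, and the last word of each prefix becomes the prediction target. The whitespace tokenization, the plain `dict` used in place of a tokenizer, and the name `prepare_data_sketch` are all assumptions.

```python
import numpy as np

def prepare_data_sketch(text):
    """Hypothetical stand-in for prepare_data: build next-word training pairs."""
    lines = [line.lower().split() for line in text.splitlines() if line.strip()]

    # Word -> integer index; index 0 is reserved for padding.
    word_index = {}
    for words in lines:
        for word in words:
            word_index.setdefault(word, len(word_index) + 1)
    vocab_size = len(word_index) + 1

    # Every prefix of length >= 2 becomes one example.
    sequences = []
    for words in lines:
        ids = [word_index[w] for w in words]
        for end in range(2, len(ids) + 1):
            sequences.append(ids[:end])

    # Left-pad with zeros so that all rows have the same length.
    max_length = max(len(seq) for seq in sequences)
    padded = np.zeros((len(sequences), max_length), dtype=int)
    for row, seq in enumerate(sequences):
        padded[row, max_length - len(seq):] = seq

    # Inputs are all but the last word; the target is the last word.
    X, y = padded[:, :-1], padded[:, -1]
    return X, y, vocab_size, max_length, word_index

X_demo, y_demo, vocab_demo, max_len_demo, index_demo = prepare_data_sketch(
    "the fire was stolen\nthe fire burned"
)
print(X_demo)
print(y_demo)
```

In this sketch `max_length` counts the target word as well, so the model input has `max_length - 1` columns. Whether the real function follows the same convention, or one-hot encodes `y`, cannot be confirmed from the notebook alone.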



### Create & Train the RNN Model

In [None]:
from llm.core.rnn import create_rnn_model

model = create_rnn_model(vocab_size)
model.fit(X, y, epochs=500, verbose=1)

Epoch 1/500
5/5 ━━━━━━━━━━━━━━━━━━━━ 2s 34ms/step - accuracy: 0.0191 - loss: 4.5745
Epoch 2/500
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 34ms/step - accuracy: 0.0905 - loss: 4.3779
Epoch 3/500
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 31ms/step - accuracy: 0.1601 - loss: 4.2617
Epoch 4/500
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 31ms/step - accuracy: 0.2681 - loss: 4.1460
Epoch 5/500
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 34ms/step - accuracy: 0.3247 - loss: 4.0483
Epoch 6/500
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 28ms/step - accuracy: 0.3952 - loss: 3.9452
Epoch 7/500
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 37ms/step - accuracy: 0.5215 - loss: 3.8336
Epoch 8/500
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 30ms/step - accuracy: 0.5391 - loss: 3.7494
Epoch 9/500
5/5 ━━━━━━━━━━━━━━━━━━━━ ...

### Generate Text

In [None]:
from llm.core.rnn import generate_text

seed_text = 'The next day, Prometheus'
num_words_to_generate = 20
generated_text = generate_text(model, tokenizer, seed_text, num_words_to_generate, max_length)
generated_text
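
The body of `generate_text` is not shown here. A typical approach, sketched below as an assumption, is a greedy loop: encode the text so far, pad it to the model's input length, pick the most probable next word, append it, and repeat. In this sketch `predict_next` is a hypothetical callable standing in for the trained model, and `word_index` is a plain `dict` standing in for the tokenizer.

```python
import numpy as np

def generate_text_sketch(predict_next, word_index, seed_text, num_words, max_length):
    """Hypothetical greedy decoding loop (assumes max_length >= 2)."""
    index_word = {idx: word for word, idx in word_index.items()}
    words = seed_text.lower().split()
    for _ in range(num_words):
        ids = [word_index[w] for w in words if w in word_index]  # skip unknown words
        ids = ids[-(max_length - 1):]                            # keep the most recent context
        padded = np.zeros(max_length - 1, dtype=int)             # left-pad with zeros
        if ids:
            padded[-len(ids):] = ids
        probs = predict_next(padded)
        next_id = int(np.argmax(probs))                          # greedy choice
        if next_id == 0:                                         # padding index: stop early
            break
        words.append(index_word[next_id])
    return " ".join(words)

# Toy stand-in for a trained model: always predicts the word after the last one, cyclically.
toy_index = {"the": 1, "fire": 2, "burned": 3}

def toy_predict(padded):
    probs = np.zeros(len(toy_index) + 1)
    probs[padded[-1] % len(toy_index) + 1] = 1.0
    return probs

result = generate_text_sketch(toy_predict, toy_index, "The", num_words=4, max_length=3)
print(result)  # the fire burned the fire
```

Greedy decoding always takes the single most likely word, so it tends to repeat itself. Sampling from `probs` instead of taking the `argmax` is a common alternative when more varied text is wanted.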

### Model Summary

In [None]:
model.summary()
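
`model.summary()` lists each layer with its output shape and parameter count. The architecture built by `create_rnn_model` is not shown, so the sketch below assumes a common layout (an embedding layer, one simple recurrent layer, and a dense softmax output over the vocabulary) and computes the parameter counts that such a summary would report. The layer sizes are illustrative assumptions, not values taken from this notebook.

```python
def rnn_parameter_counts(vocab_size, embedding_dim, hidden_units):
    """Parameter counts for an assumed Embedding -> SimpleRNN -> Dense(softmax) stack."""
    # One embedding vector per vocabulary entry.
    embedding = vocab_size * embedding_dim
    # Input weights, recurrent weights, and one bias per hidden unit.
    simple_rnn = hidden_units * (embedding_dim + hidden_units + 1)
    # One weight per hidden unit plus a bias, for each output word.
    dense = (hidden_units + 1) * vocab_size
    return {
        "embedding": embedding,
        "simple_rnn": simple_rnn,
        "dense": dense,
        "total": embedding + simple_rnn + dense,
    }

counts = rnn_parameter_counts(vocab_size=100, embedding_dim=10, hidden_units=32)
print(counts)
```

With these sizes the embedding and output layers dominate the total, because both scale with the vocabulary size while the recurrent layer does not.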

### Extract Embeddings

In [None]:
from llm.core.rnn import extract_embeddings
from llm.core.rnn import extract_sentiment_labels
from llm.core.visualization import map_labels_to_colors

embeddings, words = extract_embeddings(model, tokenizer)
category_labels = extract_sentiment_labels(words)

### Visualize Embeddings

In [None]:
from llm.core.embeddings import apply_mds
from llm.core.embeddings import apply_isomap
from llm.core.embeddings import apply_tsne
from llm.core.embeddings import apply_pca

from llm.core.visualization import plot_embeddings
import matplotlib.pyplot as plt

# Apply dimensionality reduction
embeddings_mds = apply_mds(embeddings)
embeddings_isomap = apply_isomap(embeddings)
embeddings_tsne = apply_tsne(embeddings)
embeddings_pca = apply_pca(embeddings)

# Plot the results
fig = plot_embeddings(
    embeddings_list=[embeddings_pca, embeddings_mds, embeddings_isomap, embeddings_tsne],
    words=words,
    titles=['PCA', 'MDS', 'Isomap', 't-SNE'],
    colors=category_labels
)

plt.savefig(IMAGES_DIR.joinpath('prometheo.jpeg'))
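
The `apply_*` helpers above come from the project's own `llm.core.embeddings` module and their implementations are not shown. As a rough idea of what the PCA step does, the sketch below projects embeddings onto their two directions of greatest variance using NumPy's SVD. The function name `pca_2d` and the random demo data are assumptions; the real helper may use a library implementation instead.

```python
import numpy as np

def pca_2d(embeddings):
    """Project row vectors onto their first two principal components."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(0)
demo_embeddings = rng.normal(size=(20, 8))   # 20 hypothetical words, 8 dimensions each
points = pca_2d(demo_embeddings)
print(points.shape)  # (20, 2)
```

PCA is linear and preserves global variance, whereas MDS, Isomap, and t-SNE are nonlinear and emphasize pairwise distances or local neighborhoods, which is why the four panels can look quite different for the same embeddings.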