# Problem 4: Text Generation with RNN, LSTM, and GRU

**Objective:** Compare the performance of three recurrent neural network architectures (`SimpleRNN`, `LSTM`, and `GRU`) on a text generation task.

**Approach:**
1.  Load and preprocess the text of three books from the Harry Potter series.
2.  Use a **Tokenizer** to build a word-level vocabulary.
3.  Train each model to predict the next word in a sequence.
4.  Generate text with each model to compare coherence and quality by inspection.

### 1. Importing the Libraries

We import all the libraries the project needs and enable mixed precision.

In [1]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, SimpleRNN, LSTM, GRU, Embedding
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras import mixed_precision

import os
import gc

print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
print("Num CPUs Available: ", len(tf.config.list_physical_devices('CPU')))
print("TensorFlow version:", tf.__version__)

mixed_precision.set_global_policy('mixed_float16')

Num GPUs Available:  1
Num CPUs Available:  1
TensorFlow version: 2.20.0


### 2. Loading and Preprocessing the Data

We load the texts, concatenate them into a single corpus, and use the `Tokenizer` to build a word-level vocabulary.

In [2]:
# Load and concatenate the texts
text = ""
for i in range(1, 4):
    filepath = os.path.join('dataset', f'harry_potter_{i}.txt')
    with open(filepath, 'r', encoding='utf-8') as f:
        text += f.read()

# --- Word-level tokenization ---
tokenizer = Tokenizer(oov_token="<unk>")
tokenizer.fit_on_texts([text])

word_to_int = tokenizer.word_index
int_to_word = {i: w for w, i in word_to_int.items()}
vocab_size = len(word_to_int) + 1
print(f"Tamanho do vocabulário: {vocab_size} palavras")

# Convert the whole text into a sequence of integers
full_sequence = tokenizer.texts_to_sequences([text])[0]

Tamanho do vocabulário: 16633 palavras
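
To make the word-to-integer mapping concrete, here is a small plain-Python sketch of what a word-level tokenizer does conceptually. It is not the Keras implementation: the helper names `build_vocab` and `encode`, the toy sentence, and the simplified splitting rule (lowercase, split on whitespace) are assumptions for illustration. Index 0 is left unused for padding and index 1 goes to the out-of-vocabulary token, which is why the notebook computes `vocab_size` as `len(word_to_int) + 1`.

```python
from collections import Counter

def build_vocab(text, oov_token="<unk>"):
    """Map each word to an integer id, most frequent first (hypothetical helper)."""
    counts = Counter(text.lower().split())
    # Index 0 is reserved for padding; index 1 goes to the OOV token.
    word_index = {oov_token: 1}
    for word, _ in counts.most_common():
        word_index[word] = len(word_index) + 1
    return word_index

def encode(text, word_index, oov_token="<unk>"):
    """Convert text to a list of integer ids, mapping unknown words to the OOV id."""
    oov_id = word_index[oov_token]
    return [word_index.get(w, oov_id) for w in text.lower().split()]

toy_text = "harry olhou para o castelo e harry sorriu"  # made-up sentence
word_index = build_vocab(toy_text)
vocab_size = len(word_index) + 1  # +1 accounts for the padding index 0

print(word_index)
print(encode("harry viu o castelo", word_index))  # "viu" is unseen, so it maps to the OOV id
```

The real `Tokenizer` also strips punctuation before splitting, so its vocabulary on the actual corpus will differ from what this simplified rule would produce.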


### 3. Creating the Training Sequences

We slide a fixed-length window over the long word sequence to produce `(input, output)` pairs for training.

In [3]:
seq_length = 50  # Use 50 words to predict the 51st
X_data = []
y_data = []

for i in range(seq_length, len(full_sequence)):
    in_seq = full_sequence[i-seq_length:i]
    out_word = full_sequence[i]
    X_data.append(in_seq)
    y_data.append(out_word)

n_patterns = len(X_data)
print(f"Total de sequências de treino: {n_patterns}")

# Prepare the data for the neural network
X = np.array(X_data)
y = np.array(y_data)

Total de sequências de treino: 291509
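
The windowing loop is easier to follow on a tiny input. The sketch below applies the same indexing as the cell above to a made-up list of token ids with a window of 3; the function name `make_windows` and the toy values are assumptions for illustration only.

```python
def make_windows(sequence, seq_length):
    """Return (inputs, targets): each input holds seq_length ids, the target is the next id."""
    inputs, targets = [], []
    for i in range(seq_length, len(sequence)):
        inputs.append(sequence[i - seq_length:i])
        targets.append(sequence[i])
    return inputs, targets

toy_sequence = [2, 3, 4, 5, 6, 7]  # made-up token ids
inputs, targets = make_windows(toy_sequence, seq_length=3)

for x, y in zip(inputs, targets):
    print(x, "->", y)

# A sequence of N tokens yields N - seq_length training pairs.
print(len(inputs), "pairs from", len(toy_sequence), "tokens")
```

By the same count, the 291509 sequences reported above with `seq_length = 50` correspond to a corpus of 291559 tokens.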


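### 4. Model Definition and Helper Functions

We define functions to build the models and to generate text, and set the training parameters. The `EarlyStopping` and `ReduceLROnPlateau` callbacks imported earlier are not passed to `fit` in the cells below, so every model trains for the full number of epochs.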

In [21]:
def create_model(recurrent_layer, vocab_size, seq_length):
    # seq_length stays in the signature for the existing calls; Keras 3 deprecates
    # Embedding(input_length=...) and infers the sequence length from the data.
    model = Sequential([
        Embedding(input_dim=vocab_size, output_dim=100),
        recurrent_layer(128, return_sequences=True),
        recurrent_layer(128),
        # Keep the softmax output in float32 for numerical stability under mixed_float16
        Dense(vocab_size, activation='softmax', dtype='float32')
    ])
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

def generate_text(model, tokenizer, seed_text, num_words_to_gen=40, temperature=0.7):
    # Relies on the globals seq_length and int_to_word defined in the earlier cells
    full_generated_text = seed_text.lower()
    print(f"Semente: \"{seed_text}\" | Temperatura: {temperature}")
    print("Texto gerado:")
    print("------------------")
    print(seed_text, end=' ')

    for _ in range(num_words_to_gen):
        # Encode the text so far and keep only the last seq_length tokens
        token_list = tokenizer.texts_to_sequences([full_generated_text])[0]
        token_list = pad_sequences([token_list], maxlen=seq_length, padding='pre')

        # Predict and apply the temperature
        prediction = model.predict(token_list, verbose=0)[0].astype('float32')
        prediction = np.log(prediction + 1e-7) / temperature  # 1e-7 avoids log(0)

        # Sample the next word from the rescaled log-probabilities
        index = tf.random.categorical(prediction[np.newaxis, :], num_samples=1)[0, 0].numpy()

        output_word = int_to_word.get(int(index), "<unk>")
        full_generated_text += " " + output_word
        print(output_word, end=' ')
    print("\n------------------\n")

# Training parameters
epochs = 170
seed = "harry potter olhou para o castelo e"
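
Since the three models differ only in the recurrent layer type, it can help to estimate how many trainable parameters each variant has. The sketch below uses the standard per-layer formulas with this notebook's sizes (vocabulary 16633, embedding dimension 100, 128 units, two stacked layers). The helper `recurrent_params` is hypothetical, and the GRU formula assumes Keras' default `reset_after=True`; `model.summary()` on a built model remains the authoritative count.

```python
def recurrent_params(kind, input_dim, units):
    """Trainable parameters of one recurrent layer (standard formulas)."""
    core = units * (input_dim + units)  # input kernel + recurrent kernel
    if kind == "SimpleRNN":
        return core + units             # one transformation, one bias vector
    if kind == "LSTM":
        return 4 * (core + units)       # four gates/transformations
    if kind == "GRU":
        # Assumes Keras' default reset_after=True, which keeps two bias vectors
        return 3 * (core + 2 * units)
    raise ValueError(f"unknown layer kind: {kind}")

vocab_size, embed_dim, units = 16633, 100, 128  # values from this notebook

embedding = vocab_size * embed_dim       # embedding matrix
dense = units * vocab_size + vocab_size  # output weights + biases

for kind in ("SimpleRNN", "LSTM", "GRU"):
    # First layer reads the embeddings, second layer reads the first layer's output
    recurrent = (recurrent_params(kind, embed_dim, units)
                 + recurrent_params(kind, units, units))
    total = embedding + recurrent + dense
    print(f"{kind:9s} recurrent={recurrent:7d} total={total}")
```

Under these assumptions the embedding and output layers account for most of the parameters, so the three models end up within a few percent of each other in total size.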

### 5. Training and Evaluating the Models

We train the models one after another. After each one we generate text, save the model, and free GPU memory to avoid running out of it.

In [5]:
# --- Model 1: SimpleRNN ---
print("### Treinando o Modelo SimpleRNN ###")
rnn_model = create_model(SimpleRNN, vocab_size, seq_length)
rnn_model.fit(X, y, epochs=epochs, batch_size=128, verbose=1, validation_split=0.1)

print("\n--- Geração com SimpleRNN ---")
generate_text(rnn_model, tokenizer, seed)

# Save the model
rnn_model.save('modelo_simplernn_harry_potter.keras')
print("✓ Modelo SimpleRNN salvo como 'modelo_simplernn_harry_potter.keras'")

del rnn_model
tf.keras.backend.clear_session()
gc.collect()

### Treinando o Modelo SimpleRNN ###
Epoch 1/170


I0000 00:00:1760272944.019274    7468 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 5465 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4060 Ti, pci bus id: 0000:07:00.0, compute capability: 8.9
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m53s[0m 11ms/step - accuracy: 0.0563 - loss: 6.7943 - val_accuracy: 0.0859 - val_loss: 6.1942
Epoch 2/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m17s[0m 8ms/step - accuracy: 0.1045 - loss: 5.9353 - val_accuracy: 0.1204 - val_loss: 5.7133
Epoch 3/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m17s[0m 8ms/step - accuracy: 0.1286 - loss: 5.5031 - val_accuracy: 0.1373 - val_loss: 5.5690
Epoch 4/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m17s[0m 8ms/step - accuracy: 0.1444 - loss: 5.2222 - val_accuracy: 0.1450 - val_loss: 5.4975
Epoch 5/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m17s[0m 8ms/step - accuracy: 0.1549 - loss: 5.0090 - val_accuracy: 0.1502 - val_loss: 5.4734
Epoch 6/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m17s[0m 8ms/step - accuracy: 0.1649 - loss: 4.8296 - val_accuracy: 0.1519 - val_loss: 5.4834
Epoch 7/170
[1

0

In [6]:
# --- Model 2: LSTM ---
print("\n### Treinando o Modelo LSTM ###")
lstm_model = create_model(LSTM, vocab_size, seq_length)
lstm_model.fit(X, y, epochs=epochs, batch_size=128, verbose=1, validation_split=0.1)

print("\n--- Geração com LSTM ---")
generate_text(lstm_model, tokenizer, seed)

# Save the model
lstm_model.save('modelo_lstm_harry_potter.keras')
print("Modelo LSTM salvo como 'modelo_lstm_harry_potter.keras'")

del lstm_model
tf.keras.backend.clear_session()
gc.collect()


### Treinando o Modelo LSTM ###
Epoch 1/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m18s[0m 8ms/step - accuracy: 0.0557 - loss: 6.7075 - val_accuracy: 0.0778 - val_loss: 6.3268
Epoch 2/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 7ms/step - accuracy: 0.0891 - loss: 6.1053 - val_accuracy: 0.1004 - val_loss: 6.0536
Epoch 3/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 8ms/step - accuracy: 0.1094 - loss: 5.7684 - val_accuracy: 0.1134 - val_loss: 5.9106
Epoch 4/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 7ms/step - accuracy: 0.1218 - loss: 5.5324 - val_accuracy: 0.1232 - val_loss: 5.8362
Epoch 5/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 7ms/step - accuracy: 0.1323 - loss: 5.3365 - val_accuracy: 0.1291 - val_loss: 5.8133
Epoch 6/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 7ms/step - accuracy: 0.1426 - loss: 5.1600 - val_accurac

0

In [7]:
# --- Model 3: GRU ---
print("\n### Treinando o Modelo GRU ###")
gru_model = create_model(GRU, vocab_size, seq_length)
gru_model.fit(X, y, epochs=epochs, batch_size=128, verbose=1, validation_split=0.1)

print("\n--- Geração com GRU ---")
generate_text(gru_model, tokenizer, seed)

# Save the model
gru_model.save('modelo_gru_harry_potter.keras')
print("Modelo GRU salvo como 'modelo_gru_harry_potter.keras'")

del gru_model
tf.keras.backend.clear_session()
gc.collect()


### Treinando o Modelo GRU ###
Epoch 1/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m16s[0m 7ms/step - accuracy: 0.0502 - loss: 6.9398 - val_accuracy: 0.0828 - val_loss: 6.3740
Epoch 2/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 7ms/step - accuracy: 0.1035 - loss: 5.9616 - val_accuracy: 0.1273 - val_loss: 5.7489
Epoch 3/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 7ms/step - accuracy: 0.1376 - loss: 5.4135 - val_accuracy: 0.1434 - val_loss: 5.6021
Epoch 4/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 7ms/step - accuracy: 0.1567 - loss: 5.0530 - val_accuracy: 0.1487 - val_loss: 5.5634
Epoch 5/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 7ms/step - accuracy: 0.1730 - loss: 4.7612 - val_accuracy: 0.1508 - val_loss: 5.5849
Epoch 6/170
[1m2050/2050[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m15s[0m 7ms/step - accuracy: 0.1879 - loss: 4.5089 - val_accuracy

0
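
In the logs above, validation loss stops improving within the first few epochs (for SimpleRNN it rises from 5.4734 at epoch 5 to 5.4834 at epoch 6) while training loss keeps falling. The plain-Python sketch below illustrates the kind of patience rule that early stopping applies to such a curve. It is an illustration, not Keras' implementation: the function name `early_stop_epoch` and the patience values are assumptions, and only the six SimpleRNN validation losses visible in the log are used.

```python
def early_stop_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), 1-indexed; stop_epoch is None if never triggered."""
    best_loss = float("inf")
    best_epoch = None
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New best validation loss: remember it and reset the counter
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return None, best_epoch

# Validation losses copied from the SimpleRNN log above (epochs 1 to 6)
rnn_val_loss = [6.1942, 5.7133, 5.5690, 5.4975, 5.4734, 5.4834]

print(early_stop_epoch(rnn_val_loss, patience=1))
print(early_stop_epoch(rnn_val_loss, patience=3))
```

In Keras, a comparable rule is available by passing the already imported `EarlyStopping` (for example with `monitor='val_loss'` and `restore_best_weights=True`) in the `callbacks` argument of `fit`; a suitable patience value is a choice this notebook does not specify.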

### 6. Loading the Saved Models and Generating New Text

Use this section to load the trained models and generate text from different seeds (starting phrases).

In [33]:
# Paths to the saved models (all three are loaded so their outputs can be compared)
lstm_path = 'modelo_lstm_harry_potter.keras'
simplernn_path = 'modelo_simplernn_harry_potter.keras'
gru_path = 'modelo_gru_harry_potter.keras'

loaded_lstm_model = tf.keras.models.load_model(lstm_path)
loaded_simplernn_model = tf.keras.models.load_model(simplernn_path)
loaded_gru_model = tf.keras.models.load_model(gru_path)

# Set your own starting phrase here
nova_semente = "Era uma vez um menino"
num_words_to_gen = 30
temperature = 0.7

# Generate text with the loaded models
# Adjust the temperature to control how adventurous the sampling is:
# - temperature=0.5 : more conservative and predictable
# - temperature=0.7 : balanced (default)
# - temperature=1.0 : more creative and risky
print("\n--- Geração com SimpleRNN Carregado ---")
generate_text(loaded_simplernn_model, tokenizer, nova_semente, num_words_to_gen=num_words_to_gen, temperature=temperature)

print("\n--- Geração com LSTM Carregado ---")
generate_text(loaded_lstm_model, tokenizer, nova_semente, num_words_to_gen=num_words_to_gen, temperature=temperature)

print("\n--- Geração com GRU Carregado ---")
generate_text(loaded_gru_model, tokenizer, nova_semente, num_words_to_gen=num_words_to_gen, temperature=temperature)


--- Geração com SimpleRNN Carregado ---
Semente: "Era uma vez um menino" | Temperatura: 0.7
Texto gerado:
------------------
Era uma vez um menino de lareira atirou os dois garotos viu os cabelos de gilderoy lockhart arre estava sentada na parede maciça formada louca às suas costas colin franziu certa dificuldade com a popular 
------------------


--- Geração com LSTM Carregado ---
Semente: "Era uma vez um menino" | Temperatura: 0.7
Texto gerado:
------------------
Era uma vez um menino de aspecto severo quanto lord de perguntas que estava crescendo ninguém podia perceber o que havia à cara livre e o truque afinal achou que não fizessem bruxarias como passar 
------------------


--- Geração com GRU Carregado ---
Semente: "Era uma vez um menino" | Temperatura: 0.7
Texto gerado:
------------------
Era uma vez um menino muito boa e cinco em cinco metros era uma brincadeira cinco minutos em casa será a sua mulher sangue ruim ultimamente e sempre resmungando velho ali fora da sra dursley
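
The effect of the `temperature` argument can be seen on a toy distribution. `generate_text` divides the log-probabilities by the temperature before sampling, which amounts to sampling from `softmax(log(p) / temperature)`. The sketch below computes that rescaled distribution in plain Python; the helper name `apply_temperature` and the three-word probabilities are made up for illustration.

```python
import math

def apply_temperature(probs, temperature):
    """Rescale a probability distribution: softmax(log(p) / temperature)."""
    scaled = [math.log(p + 1e-7) / temperature for p in probs]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_probs = [0.6, 0.3, 0.1]  # made-up next-word probabilities

for t in (0.5, 0.7, 1.0):
    rescaled = apply_temperature(toy_probs, t)
    print(t, [round(p, 3) for p in rescaled])
```

Lower temperatures concentrate probability on the most likely word, which makes the output more predictable; a temperature of 1.0 leaves the model's distribution essentially unchanged.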