# Inference
To run inference with the trained models, we need an initial state, which here can be the beginning of the news article we want to generate. We then ask the model for the next character or word of the article and feed the result back in, iterating until the text reaches the desired length.
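
As a rough illustration of that loop, independent of TensorFlow, the sketch below generates text one token at a time. The names `softmax`, `generate`, and `toy_next_logits` are hypothetical and not part of this notebook; `toy_next_logits` is only a stand-in for a trained model that returns one score per vocabulary entry.

```python
import math
import random

def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def generate(seed, next_logits, vocab, length, rng):
    # Start from the seed and append one sampled token per step
    text = seed
    while len(text) < length:
        probs = softmax(next_logits(text))
        text += rng.choices(vocab, weights=probs, k=1)[0]
    return text

# Hypothetical stand-in for a trained model: favors 'b' after 'a', and 'a' otherwise
toy_vocab = ['a', 'b']

def toy_next_logits(text):
    return [0.0, 5.0] if text.endswith('a') else [5.0, 0.0]

sample = generate('a', toy_next_logits, toy_vocab, length=10, rng=random.Random(0))
print(sample)
```

The cells below follow the same pattern, with the Keras model playing the role of `toy_next_logits` and `tf.random.categorical` doing the sampling.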

In [1]:
import tensorflow as tf
import numpy as np
import os
import time

In [2]:
# character vocabulary
vocab = [' ', '!', '"', '#', '$', '&', "'", '(', ')', '*', ',', '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '=', '?', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '\\', '_', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

In [3]:
# For the character-level model
ids_from_chars = tf.keras.layers.StringLookup(
    vocabulary=list(vocab), mask_token=None)
chars_from_ids = tf.keras.layers.StringLookup(
    vocabulary=ids_from_chars.get_vocabulary(), invert=True, mask_token=None)
def text_from_ids(ids):
  return tf.strings.reduce_join(chars_from_ids(ids), axis=-1)
# Length of the vocabulary in StringLookup Layer
vocab_size = len(ids_from_chars.get_vocabulary())
vocab_size
# The embedding dimension
embedding_dim = 256

# Number of RNN units
rnn_units = 1024

In [4]:
class MyModel(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, rnn_units):
        super(MyModel, self).__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.gru = tf.keras.layers.GRU(rnn_units,
                                       return_sequences=True,
                                       return_state=True)
        self.dense = tf.keras.layers.Dense(vocab_size)

    def call(self, inputs, states=None, return_state=False, training=False):
        x = self.embedding(inputs, training=training)
        if states is None:
            states = self.gru.get_initial_state(x)
        x, states = self.gru(x, initial_state=states, training=training)
        x = self.dense(x, training=training)

        if return_state:
            return x, states
        else:
            return x
    def load_model(self, filepath):
        # Restore the weights from a checkpoint (tf.keras.models has no load_weights function)
        return self.load_weights(filepath)

In [17]:
model = MyModel(
    vocab_size=vocab_size,
    embedding_dim=embedding_dim,
    rnn_units=rnn_units)

In [6]:
# Call the model's build() method with the input shape
model.build(tf.TensorShape([None, None]))

# Now model.summary() can be called
model.summary()



Model: "my_model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       multiple                  21248     
                                                                 
 gru (GRU)                   multiple                  3938304   
                                                                 
 dense (Dense)               multiple                  85075     
                                                                 
Total params: 4044627 (15.43 MB)
Trainable params: 4044627 (15.43 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


In [26]:
import gdown
gdown.download_folder("https://drive.google.com/drive/folders/1xeRI1UE0ZZIrmrxgs9x_U3p7bSmSSpF1?usp=sharing", quiet=True)

['/content/AA2-Models_and_vocab/character-article-model.data-00000-of-00001',
 '/content/AA2-Models_and_vocab/character-article-model.index',
 '/content/AA2-Models_and_vocab/vocab.txt',
 '/content/AA2-Models_and_vocab/word-article-model.data-00000-of-00001',
 '/content/AA2-Models_and_vocab/word-article-model.index']

In [21]:
model.load_weights('AA2-Models_and_vocab/character-article-model')

<tensorflow.python.checkpoint.checkpoint.CheckpointLoadStatus at 0x7a6f02884dc0>

In [23]:
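current_body = 'News announced that new hedge funds could go  '
seq_length = 100
char_vocab = ids_from_chars.get_vocabulary()
for i in range(200):
  # Encode the text generated so far as character ids
  inputs = [ids_from_chars(tf.strings.unicode_split(current_body, 'UTF-8'))]
  # Keep the last seq_length ids, left-padding shorter inputs with 0
  inputs = tf.keras.preprocessing.sequence.pad_sequences(inputs, maxlen=seq_length, padding='pre')
  inputs = tf.convert_to_tensor(inputs)
  predictions = model(inputs)  # shape: (1, seq_length, vocab_size)
  # Keep only the logits of the last time step, shape: (1, vocab_size)
  predictions = predictions[:, -1, :]
  predicted_id = tf.random.categorical(predictions, num_samples=1)[0, 0].numpy()
  next_char = char_vocab[predicted_id]
  current_body += next_char
  # Stop at a newline (never triggers here, since '\n' is not in the vocabulary)
  if next_char == '\n':
    break
print(current_body)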

News announced that new hedge funds could go  a place to lift three games -- whose growing part a suicide car bomber in Buidlin's focused Taiwan's governing update to Mac OS initial future USL preparations were ushering its new iPod math between 


The output of the character-level model makes sense at the n-gram level at best. It easily loses long-range context and appears to hallucinate. Still, each word seems to make sense at least together with its neighbors.
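
For reference, the windowing step in the loop above can be sketched in plain Python. `pad_pre` is a hypothetical helper, not part of Keras; it is meant to mirror what `pad_sequences` does with `padding='pre'` and its default truncation, which keeps the last `maxlen` ids. Under that reading, the model never sees more than the last `seq_length` characters of the text.

```python
def pad_pre(ids, maxlen, value=0):
    # Keep only the last maxlen ids (truncate from the front)
    ids = list(ids)[-maxlen:] if maxlen > 0 else []
    # Left-pad with `value` until the sequence has exactly maxlen entries
    return [value] * (maxlen - len(ids)) + ids

short_window = pad_pre([1, 2, 3], maxlen=5)
long_window = pad_pre(range(1, 8), maxlen=5)
print(short_window, long_window)
```

Note that with `mask_token=None`, index 0 of the `StringLookup` vocabulary is `[UNK]`, so a short seed appears to be padded with `[UNK]` ids rather than with a dedicated padding token.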

# Word-by-word inference

In [27]:
# Load the vocabulary list from a text file
with open('AA2-Models_and_vocab/vocab.txt', 'r') as f:
    vocab = [line.strip() for line in f]

print("Vocabulary loaded from 'vocab.txt'.")

Vocabulary loaded from 'vocab.txt'.


In [28]:
# For the word-level model (the *_chars names are kept from the character-level cells)
ids_from_chars = tf.keras.layers.StringLookup(
    vocabulary=list(vocab), mask_token=None)
chars_from_ids = tf.keras.layers.StringLookup(
    vocabulary=ids_from_chars.get_vocabulary(), invert=True, mask_token=None)
def text_from_ids(ids):
  return tf.strings.reduce_join(chars_from_ids(ids), axis=-1)
# Length of the vocabulary in StringLookup Layer
vocab_size = len(ids_from_chars.get_vocabulary())
vocab_size
# The embedding dimension
embedding_dim = 256

# Number of RNN units
rnn_units = 512

In [29]:
model = MyModel(
    vocab_size=vocab_size,
    embedding_dim=embedding_dim,
    rnn_units=rnn_units)
# Call the model's build() method with the input shape
model.build(tf.TensorShape([None, None]))

# Now model.summary() can be called
model.summary()

Model: "my_model_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_2 (Embedding)     multiple                  1920256   
                                                                 
 gru_2 (GRU)                 multiple                  1182720   
                                                                 
 dense_2 (Dense)             multiple                  3848013   
                                                                 
Total params: 6950989 (26.52 MB)
Trainable params: 6950989 (26.52 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


In [30]:
model.load_weights('AA2-Models_and_vocab/word-article-model')

<tensorflow.python.checkpoint.checkpoint.CheckpointLoadStatus at 0x7a6f05fa26e0>

In [33]:
import re
current_body = 'New beauty product just launched in '
word_vocab = ids_from_chars.get_vocabulary()
for i in range(20):
  # Convert to lowercase
  article = current_body.lower()
  # Remove special characters, keeping only lowercase letters, digits and whitespace
  article = re.sub(r'[^a-z0-9\s]', '', article)
  words = re.findall(r'\b\w+\b', article)
  inputs = [ids_from_chars(word) for word in words]
  # Pad the sequences (seq_length is reused from the character-level cell)
  inputs = tf.keras.preprocessing.sequence.pad_sequences([inputs], maxlen=seq_length, padding='pre')
  inputs = tf.convert_to_tensor(inputs)

  predictions = model(inputs)  # shape: (1, seq_length, vocab_size)
  # Keep only the logits of the last time step, shape: (1, vocab_size)
  predictions = predictions[:, -1, :]
  predicted_id = tf.random.categorical(predictions, num_samples=1)[0, 0].numpy()
  next_word = word_vocab[predicted_id]
  current_body += ' '
  current_body += next_word
  if next_word == '\n':
    break
print(current_body)

New beauty product just launched in  hedge equipment on wednesday as investors bet an services group that was increased growth in new york [UNK] has cost


The output makes somewhat more sense, since the model appears to retain longer-range context. This may be because the cells now store words rather than characters, so less memory is lost. Even so, the result is not good. In addition, because the vocabulary is limited to the first 7,500 words, [UNK] tokens can appear. This could be remedied by taking the second most likely token whenever [UNK] is the most likely one.
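
A minimal sketch of that remedy, in plain Python on a list of logits: the `[UNK]` score is set to negative infinity so it can never be chosen, which makes the next best token win. `UNK_ID`, `mask_unk`, and `greedy_without_unk` are hypothetical names; the sketch assumes `[UNK]` sits at index 0, as it does in a `StringLookup` vocabulary built with `mask_token=None`.

```python
import math

UNK_ID = 0  # assumption: [UNK] is the first entry of the vocabulary

def mask_unk(logits, unk_id=UNK_ID):
    # Copy the logits and make [UNK] impossible to select
    masked = list(logits)
    masked[unk_id] = -math.inf
    return masked

def greedy_without_unk(logits, unk_id=UNK_ID):
    # Index of the most likely token once [UNK] has been ruled out
    masked = mask_unk(logits, unk_id)
    return max(range(len(masked)), key=masked.__getitem__)

# [UNK] has the highest score here, so the runner-up (index 2) is chosen
choice = greedy_without_unk([3.0, 1.0, 2.0])
print(choice)
```

The same masking could presumably be applied to the logits before sampling with `tf.random.categorical`, so that `[UNK]` is excluded from the draw instead of only from the greedy choice.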

Although the results may not be as expected, there is still a lot of room to keep training these models; the limiting factor is the memory of the environment.