## Import libraries

In [None]:
# Install the pinned version before importing it, so the import picks it up
!pip install tensorflow==2.15.1
import tensorflow as tf
import numpy as np
import os
import time



Use the GPU by default if one is available.

In [None]:
# Configure TensorFlow to use the GPU by default
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Let TensorFlow allocate GPU memory on demand
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        # Report how many physical and logical GPUs are visible
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before the GPUs are initialized
        print(e)

## Text preprocessing

In [None]:
path_to_file = tf.keras.utils.get_file('shakespeare.txt', 'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')
text = open(path_to_file, 'rb').read().decode(encoding='utf-8')

Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt


Let's look at the first characters of the text.

In [None]:
# Take a look at the first 250 characters in text
print(text[:250])

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.



Let's see how many unique characters we have.

In [None]:
# The unique characters in the file
vocab = sorted(set(text))
print(f'{len(vocab)} unique characters')

65 unique characters


# Character-level model

## Text vectorization

Before training, we need to convert the text to a numerical representation.

We create a `tf.keras.layers.StringLookup` layer, which converts tokens into character IDs:

In [None]:
ids_from_chars = tf.keras.layers.StringLookup(
    vocabulary=list(vocab), mask_token=None)

Since the goal of this lab is to generate text, it is also important to invert this representation and recover human-readable text from the IDs. For this we can use `tf.keras.layers.StringLookup(..., invert=True)`.

In [None]:
chars_from_ids = tf.keras.layers.StringLookup(
    vocabulary=ids_from_chars.get_vocabulary(), invert=True, mask_token=None)

Finally, `tf.strings.reduce_join` joins the characters back into text.

In [None]:
def text_from_ids(ids):
  return tf.strings.reduce_join(chars_from_ids(ids), axis=-1)
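
The round trip through the two lookup layers can be pictured with ordinary dictionaries. The following is only a plain-Python sketch, not the Keras implementation: the helper names `ids_from_text` and `text_from_id_list` are hypothetical, and the toy vocabulary is built from a made-up string. It assumes, as the outputs in this notebook suggest, that index 0 is reserved for the out-of-vocabulary token `[UNK]`.

```python
# Plain-Python stand-in for the two StringLookup layers (illustration only).
vocab = sorted(set("hello world"))
tokens = ["[UNK]"] + vocab  # index 0 is reserved for unknown characters

char_to_id = {ch: i for i, ch in enumerate(tokens)}


def ids_from_text(text):
    """Map each character to its ID, falling back to 0 ([UNK])."""
    return [char_to_id.get(ch, 0) for ch in text]


def text_from_id_list(ids):
    """Inverse mapping: turn a list of IDs back into a single string."""
    return "".join(tokens[i] for i in ids)


ids = ids_from_text("hello")
print(ids)
print(text_from_id_list(ids))
print(len(tokens))  # vocabulary size including the [UNK] slot
```

The extra `[UNK]` slot is why the lookup layer's vocabulary is one entry larger than the number of unique characters in the text.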

## Creating the training examples

We split the text into example sequences. Each input sequence contains `seq_length` characters from the text.

For each input sequence, the corresponding targets contain the same length of text, except shifted one character to the right.

To do this, we first use `tf.data.Dataset.from_tensor_slices` to convert the text vector into a stream of character indices.

In [None]:
all_ids = ids_from_chars(tf.strings.unicode_split(text, 'UTF-8'))
all_ids

<tf.Tensor: shape=(1115394,), dtype=int64, numpy=array([19, 48, 57, ..., 46,  9,  1])>

In [None]:
ids_dataset = tf.data.Dataset.from_tensor_slices(all_ids)

In [None]:
for ids in ids_dataset.take(10):
    print(chars_from_ids(ids).numpy().decode('utf-8'))

F
i
r
s
t
 
C
i
t
i


In [None]:
seq_length = 100

The `batch` method lets us easily convert these individual characters into sequences of the desired size.

In [None]:
sequences = ids_dataset.batch(seq_length+1, drop_remainder=True)

for seq in sequences.take(1):
  print(chars_from_ids(seq))

tf.Tensor(
[b'F' b'i' b'r' b's' b't' b' ' b'C' b'i' b't' b'i' b'z' b'e' b'n' b':'
 b'\n' b'B' b'e' b'f' b'o' b'r' b'e' b' ' b'w' b'e' b' ' b'p' b'r' b'o'
 b'c' b'e' b'e' b'd' b' ' b'a' b'n' b'y' b' ' b'f' b'u' b'r' b't' b'h'
 b'e' b'r' b',' b' ' b'h' b'e' b'a' b'r' b' ' b'm' b'e' b' ' b's' b'p'
 b'e' b'a' b'k' b'.' b'\n' b'\n' b'A' b'l' b'l' b':' b'\n' b'S' b'p' b'e'
 b'a' b'k' b',' b' ' b's' b'p' b'e' b'a' b'k' b'.' b'\n' b'\n' b'F' b'i'
 b'r' b's' b't' b' ' b'C' b'i' b't' b'i' b'z' b'e' b'n' b':' b'\n' b'Y'
 b'o' b'u' b' '], shape=(101,), dtype=string)


It is easier to see what this is doing if we join the tokens back into text:

In [None]:
for seq in sequences.take(5):
  print(text_from_ids(seq).numpy())

b'First Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou '
b'are all resolved rather to die than to famish?\n\nAll:\nResolved. resolved.\n\nFirst Citizen:\nFirst, you k'
b"now Caius Marcius is chief enemy to the people.\n\nAll:\nWe know't, we know't.\n\nFirst Citizen:\nLet us ki"
b"ll him, and we'll have corn at our own price.\nIs't a verdict?\n\nAll:\nNo more talking on't; let it be d"
b'one: away, away!\n\nSecond Citizen:\nOne word, good citizens.\n\nFirst Citizen:\nWe are accounted poor citi'
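
What `batch(seq_length+1, drop_remainder=True)` does to the stream of IDs can be sketched in plain Python. This is an illustration under the assumption that batching simply cuts consecutive, non-overlapping chunks and discards a short tail; `batch_drop_remainder` is a hypothetical helper, not a TensorFlow function.

```python
def batch_drop_remainder(items, size):
    """Split items into consecutive chunks of `size`, discarding a short tail."""
    n_full = len(items) // size
    return [items[i * size:(i + 1) * size] for i in range(n_full)]


seq_length = 4
chunks = batch_drop_remainder(list("First Citizen"), seq_length + 1)
print(["".join(chunk) for chunk in chunks])

# With the sizes used in this notebook: 1115394 characters, chunks of 100 + 1.
total_chars = 1115394
num_sequences = total_chars // (100 + 1)
print(num_sequences)
```

Each chunk holds one character more than `seq_length` so that the input and the target, both of length `seq_length`, can be cut from the same chunk.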


For training we need a dataset of `(input, label)` pairs, where `input` and `label` are both sequences. At each timestep the input is the current character and the label is the next character.

Here is a function that takes a sequence as input, then duplicates and shifts it to align the input and the label for each timestep:

In [None]:
def split_input_target(sequence):
    input_text = sequence[:-1]
    target_text = sequence[1:]
    return input_text, target_text

In [None]:
dataset = sequences.map(split_input_target)

In [None]:
for input_example, target_example in dataset.take(1):
    print("Input :", text_from_ids(input_example).numpy())
    print("Target:", text_from_ids(target_example).numpy())

Input : b'First Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou'
Target: b'irst Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou '


### Training batches

We used `tf.data` to split the text into manageable sequences. Before feeding this data into the model, we still need to shuffle it and pack it into batches.

In [None]:
# Batch size
BATCH_SIZE = 64

# Buffer size to shuffle the dataset
# (TF data is designed to work with possibly infinite sequences,
# so it doesn't attempt to shuffle the entire sequence in memory. Instead,
# it maintains a buffer in which it shuffles elements).
BUFFER_SIZE = 10000

dataset = (
    dataset
    .shuffle(BUFFER_SIZE)
    .batch(BATCH_SIZE, drop_remainder=True)
    .prefetch(tf.data.experimental.AUTOTUNE))

dataset

<_PrefetchDataset element_spec=(TensorSpec(shape=(64, 100), dtype=tf.int64, name=None), TensorSpec(shape=(64, 100), dtype=tf.int64, name=None))>
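
The comment about the shuffle buffer can be made concrete with a small plain-Python sketch. It assumes the scheme described there (keep a fixed-size buffer, emit a random element from it, refill the slot with the next incoming element); `buffered_shuffle` is a hypothetical helper written for illustration and is not how `tf.data` is implemented internally.

```python
import random


def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in a locally shuffled order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling the buffer: nothing is emitted yet.
            buffer.append(item)
            continue
        # Buffer is full: emit a random element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: drain the remaining elements in random order.
    rng.shuffle(buffer)
    yield from buffer


shuffled = list(buffered_shuffle(range(20), buffer_size=5))
print(shuffled)
```

With this scheme an element can only move forward by a bounded amount, so a buffer much smaller than the dataset gives only a local shuffle. Here the buffer of 10000 is applied to the 101-character sequences, of which there are only slightly more than 10000, so the shuffle is close to a full one.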

## Building the model

In this section we define the model as a subclass of `keras.Model`.

The model has three layers:

* `tf.keras.layers.Embedding`: the input layer. A trainable lookup table that maps each character ID to a vector with `embedding_dim` dimensions.
* `tf.keras.layers.GRU`: a recurrent GRU layer of size `units=rnn_units` (an LSTM layer could also be used here).
* `tf.keras.layers.Dense`: the output layer, with `vocab_size` outputs. It produces one logit for each character in the vocabulary. These are unnormalized log-probabilities of each character according to the model; a softmax turns them into probabilities.

In [None]:
# Length of the vocabulary in StringLookup Layer
vocab_size = len(ids_from_chars.get_vocabulary())

# The embedding dimension
embedding_dim = 128

# Number of RNN units
rnn_units = 512

In [None]:
class MyModel(tf.keras.Model):
  def __init__(self, vocab_size, embedding_dim, rnn_units):
    super().__init__()
    self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
    self.gru = tf.keras.layers.GRU(rnn_units,
                                   return_sequences=True,
                                   return_state=True)
    self.dense = tf.keras.layers.Dense(vocab_size)

  def call(self, inputs, states=None, return_state=False, training=False):
    x = inputs
    x = self.embedding(x)
    if states is None:
      states = self.gru.get_initial_state(x)
    x, states = self.gru(x, initial_state=states, training=training)
    x = self.dense(x, training=training)

    if return_state:
      return x, states
    else:
      return x

In [None]:
model = MyModel(
    vocab_size=vocab_size,
    embedding_dim=embedding_dim,
    rnn_units=rnn_units)

For each character, the model looks up its embedding, runs the GRU one timestep with the embedding as input, and applies the dense layer to produce the logits from which the probabilities of the next character are obtained.

## Trying the model

We run the model to check that it behaves as expected.

First we check the shape of the output:

In [None]:
for input_example_batch, target_example_batch in dataset.take(1):
    example_batch_predictions = model(input_example_batch)
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")

(64, 100, 66) # (batch_size, sequence_length, vocab_size)


In the example above the input sequence length is 100, but the model can be run on inputs of any length:

In [None]:
model.summary()

Model: "my_model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       multiple                  8448      
                                                                 
 gru (GRU)                   multiple                  986112    
                                                                 
 dense (Dense)               multiple                  33858     
                                                                 
Total params: 1028418 (3.92 MB)
Trainable params: 1028418 (3.92 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
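
The parameter counts in the summary can be reproduced by hand. The sketch below assumes the Keras GRU default `reset_after=True`, under which each of the three weight blocks (update gate, reset gate, candidate state) carries an input kernel, a recurrent kernel and two bias vectors; the sizes are the ones defined earlier in this notebook.

```python
vocab_size, embedding_dim, rnn_units = 66, 128, 512

# Embedding: one vector of size embedding_dim per vocabulary entry.
embedding_params = vocab_size * embedding_dim

# GRU: 3 blocks, each with an input kernel, a recurrent kernel and
# two bias vectors (assumes reset_after=True).
gru_params = 3 * (embedding_dim * rnn_units + rnn_units * rnn_units + 2 * rnn_units)

# Dense: weight matrix plus one bias per output logit.
dense_params = rnn_units * vocab_size + vocab_size

total_params = embedding_params + gru_params + dense_params
print(embedding_params, gru_params, dense_params, total_params)
```

These values match the `Param #` column above, which is a quick way to confirm that the layers have the sizes we intended.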


To get actual predictions from the model, we need to sample from the output distribution to obtain actual character indices. This distribution is defined by the logits over the character vocabulary.

Note: it is important to sample from this distribution, because taking the argmax of the distribution can easily get the model stuck in a loop.
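
The difference between taking the argmax and sampling can be seen on a toy example. This is a plain-Python sketch with made-up logits; `softmax` and `sample_categorical` are hypothetical helpers that mimic, under that assumption, what sampling from a categorical distribution defined by logits amounts to.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_categorical(logits, rng):
    """Draw one index with probability proportional to softmax(logits)."""
    probs = softmax(logits)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.5, 0.1]  # made-up scores for a 4-symbol vocabulary
rng = random.Random(0)

greedy = max(range(len(logits)), key=logits.__getitem__)
samples = [sample_categorical(logits, rng) for _ in range(1000)]

print("argmax always picks:", greedy)
print("sampled counts:", [samples.count(i) for i in range(len(logits))])
```

The argmax returns the same index every time, while sampling still lets less likely symbols appear occasionally, which is what keeps generation from repeating itself.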

Taking the first example in the batch:

In [None]:
sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
sampled_indices = tf.squeeze(sampled_indices, axis=-1).numpy()

This gives us, for each timestep, a prediction of the next character index:

In [None]:
sampled_indices

array([53, 55, 30, 25, 63, 47, 46, 32, 51, 55, 30, 40, 35,  5, 35, 13, 27,
        0, 11, 49, 63, 42, 32, 41,  9, 49, 53, 16, 51, 27, 15, 46, 49, 31,
       18,  4, 12, 61, 12, 30, 12, 33, 54, 23,  3, 39, 26, 46, 15, 46, 14,
       40, 15, 26, 12, 64,  9, 13, 39, 16, 32, 13,  2, 13, 46, 17, 36, 15,
       36,  9, 27, 50, 11, 42, 24,  7, 43, 58, 15, 19, 37, 48, 11, 29, 34,
       47, 40, 46, 36, 19, 26, 53, 33, 16,  4, 29, 14,  5, 55, 11])

Finally, we decode them to see the text predicted by this untrained model:

In [None]:
print("Input:\n", text_from_ids(input_example_batch[0]).numpy())
print()
print("Next Char Predictions:\n", text_from_ids(sampled_indices).numpy())

Input:
 b",\nStirr'd up by God, thus boldly for his king:\nMy Lord of Hereford here, whom you call king,\nIs a fo"

Next Char Predictions:
 b'npQLxhgSlpQaV&V?N[UNK]:jxcSb.jnClNBgjRE$;v;Q;ToJ!ZMgBgAaBM;y.?ZCS? ?gDWBW.Nk:cK,dsBFXi:PUhagWFMnTC$PA&p:'


As we can see, the predictions of the untrained model are poor.

## Training the model

The problem can be treated as a standard classification problem: given the previous RNN state and the input at this timestep, predict the class of the next character.

### Attach an optimizer and a loss function

The standard `tf.keras.losses.sparse_categorical_crossentropy` loss function works in this case because it is applied across the last dimension of the predictions.

Because the model returns logits, the `from_logits` flag must be set.

In [None]:
loss = tf.losses.SparseCategoricalCrossentropy(from_logits=True)

In [None]:
example_batch_mean_loss = loss(target_example_batch, example_batch_predictions)
print("Prediction shape: ", example_batch_predictions.shape, " # (batch_size, sequence_length, vocab_size)")
print("Mean loss:        ", example_batch_mean_loss)

Prediction shape:  (64, 100, 66)  # (batch_size, sequence_length, vocab_size)
Mean loss:         tf.Tensor(4.190164, shape=(), dtype=float32)


A newly initialized model should not be too sure of itself: the output logits should all have similar magnitudes. To confirm this, we can check that the exponential of the mean loss is approximately equal to the vocabulary size. A much higher loss means the model is sure of its wrong answers and is badly initialized:

In [None]:
tf.exp(example_batch_mean_loss).numpy()

66.03362

As we can see, the exponential of the loss is close to the vocabulary size (66: the 65 unique characters plus the `[UNK]` token).
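
The reason this check works can be shown with a short calculation: if a model assigns the same probability to every one of the 66 vocabulary entries, its cross-entropy is the natural log of 66, and exponentiating that gives back 66. The sketch below is plain Python and only assumes the vocabulary size reported earlier.

```python
import math

vocab_size = 66

# Cross-entropy of a uniform prediction: -log(1 / vocab_size) = log(vocab_size).
uniform_loss = -math.log(1.0 / vocab_size)

print(uniform_loss)            # close to the mean loss measured above
print(math.exp(uniform_loss))  # recovers the vocabulary size
```

A freshly initialized model is not exactly uniform, which is why the measured value differs slightly from the uniform one.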

We compile the model with `tf.keras.Model.compile`, specifying the optimizer and the loss function:

In [None]:
model.compile(optimizer='adam', loss=loss)

### Model checkpoints

We use the `tf.keras.callbacks.ModelCheckpoint` callback so that checkpoints of the model are saved during training.

In [None]:
# Directory where the checkpoints will be saved
checkpoint_dir = './training_checkpoints'
# Name of the checkpoint files
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}.weights.h5")

checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)

## Running the training

In [None]:
EPOCHS = 30

In [None]:
history = model.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback])

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


## Text generation

The simplest way to generate text with this model is to run it in a loop and keep track of the model's internal state as we go.

To generate text, the model's output is fed back into its input.

Each time we call the model, we pass in some text and an internal state. The model returns a prediction for the next character and its new state. We pass the prediction and the state back in to continue generating text.

The following class makes a single-step prediction:

In [None]:
class OneStep(tf.keras.Model):
  def __init__(self, model, chars_from_ids, ids_from_chars, temperature=1.0):
    super().__init__()
    self.temperature = temperature
    self.model = model
    self.chars_from_ids = chars_from_ids
    self.ids_from_chars = ids_from_chars

    # Create a mask to prevent "[UNK]" from being generated.
    skip_ids = self.ids_from_chars(['[UNK]'])[:, None]
    sparse_mask = tf.SparseTensor(
        # Put a -inf at each bad index.
        values=[-float('inf')]*len(skip_ids),
        indices=skip_ids,
        # Match the shape to the vocabulary
        dense_shape=[len(ids_from_chars.get_vocabulary())])
    self.prediction_mask = tf.sparse.to_dense(sparse_mask)

  @tf.function
  def generate_one_step(self, inputs, states=None):
    # Convert strings to token IDs.
    input_chars = tf.strings.unicode_split(inputs, 'UTF-8')
    input_ids = self.ids_from_chars(input_chars).to_tensor()

    # Run the model.
    # predicted_logits.shape is [batch, char, next_char_logits]
    predicted_logits, states = self.model(inputs=input_ids, states=states,
                                          return_state=True)
    # Only use the last prediction.
    predicted_logits = predicted_logits[:, -1, :]
    predicted_logits = predicted_logits/self.temperature
    # Apply the prediction mask: prevent "[UNK]" from being generated.
    predicted_logits = predicted_logits + self.prediction_mask

    # Sample the output logits to generate token IDs.
    predicted_ids = tf.random.categorical(predicted_logits, num_samples=1)
    predicted_ids = tf.squeeze(predicted_ids, axis=-1)

    # Convert from token ids to characters
    predicted_chars = self.chars_from_ids(predicted_ids)

    # Return the characters and model state.
    return predicted_chars, states
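
The prediction mask in `OneStep` relies on a simple property: adding negative infinity to a logit drives its probability to exactly zero after the softmax, so that index can never be sampled. A plain-Python sketch with made-up logits, where index 0 plays the role of `[UNK]`:

```python
import math


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [0.3, 1.2, -0.5]      # made-up scores; index 0 stands in for [UNK]
mask = [-math.inf, 0.0, 0.0]   # -inf at the index to forbid, 0 elsewhere

masked_logits = [l + m for l, m in zip(logits, mask)]
probs = softmax(masked_logits)
print(probs)
```

The remaining probabilities are renormalized among the allowed indices, so the mask removes `[UNK]` without otherwise changing their relative weights.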

In [None]:
one_step_model = OneStep(model, chars_from_ids, ids_from_chars)

We run it in a loop to generate text. Looking at the generated text, we see that the model knows when to capitalize, makes paragraphs, and imitates a Shakespeare-like writing vocabulary. We tried 10, 20 and 30 epochs and saw that the sentences become more coherent as the number of epochs increases.

In [None]:
start = time.time()
states = None
next_char = tf.constant(['First Citizen'])
result = [next_char]

for n in range(300):
  next_char, states = one_step_model.generate_one_step(next_char, states=states)
  result.append(next_char)

result = tf.strings.join(result)
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n' + '_'*80)
print('\nRun time:', end - start)

First Citizen:
But how fares your son's, being something uptorsty?
Clarence! My drightly your conceit; it
returns the marice of thy tears, which waiting from the peapes company
Ressenged on our most dignifies demands;
And strike after rume of him that thou art
As Paese: you have recooted me for my most shall be  

________________________________________________________________________________

Run time: 0.804802417755127


We generate four more samples to compare.

In [None]:
start = time.time()
states = None
next_char = tf.constant(['First Citizen'])
result = [next_char]

for n in range(300):
  next_char, states = one_step_model.generate_one_step(next_char, states=states)
  result.append(next_char)

result = tf.strings.join(result)
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n' + '_'*80)
print('\nRun time:', end - start)

First Citizen:
For ever-hole a thousand officers: at my hands
While vesing cut of monstrous misant, be so; O, then
I know no hope.

PRINCE EDWARD:
All shall thy want of sto'l hollow forsook in him?
Teach this case I have heard a slagest man
Than blood which no pill'd your forsy; set A brace and borough!

SICINIU 

________________________________________________________________________________

Run time: 1.3600642681121826


In [None]:
start = time.time()
states = None
next_char = tf.constant(['First Citizen'])
result = [next_char]

for n in range(400):
  next_char, states = one_step_model.generate_one_step(next_char, states=states)
  result.append(next_char)

result = tf.strings.join(result)
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n' + '_'*80)
print('\nRun time:', end - start)

First Citizen:
We'll Prospe in the marketh and his lands with down:
Then shall I near again, too late 'sil
him not: I will attend me, dost thou did make us the
farvot: with a lengy of the title of all discretits,
Because honourable strew thy blood: they are traitor,
The commonts of the two?

BOHNA:
'Tis gone we swear it, put in the chair of your followers:
Nay, patience, hear our love,
Shortly I throw my fathe 

________________________________________________________________________________

Run time: 0.679373025894165


In [None]:
start = time.time()
states = None
next_char = tf.constant(['JULIET'])
result = [next_char]

for n in range(500):
  next_char, states = one_step_model.generate_one_step(next_char, states=states)
  result.append(next_char)

result = tf.strings.join(result)
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n' + '_'*80)
print('\nRun time:', end - start)

JULIET:
O BoyoD Worth thou palth-may.
Know, Mard;--O, what nurisb-rew meez,
Put moliniat,
God and Lond guilton me a duel at,
Sying' peer;
You'll dear so founger3 PRyISCLue'lls.

PROSPERO:
Appime
of Romes jointenhian.

Amblibro, believe not inctriasute turror,
Haves and bjop kispstancledey caquque when
achinabit
Nating-eaticoble. Balkfolds bys;
For plugapup? therebike
Losis Buhberie!
To have I voly Tybalt. Grorn low:
The Earl of Mercupe widow? Dare our retort?
Arm'Od looket Minen:
Therefoldwamed, up co 

________________________________________________________________________________

Run time: 1.1021220684051514


In [None]:
start = time.time()
states = None
next_char = tf.constant(['First Citizen'])
result = [next_char]

for n in range(800):
  next_char, states = one_step_model.generate_one_step(next_char, states=states)
  result.append(next_char)

result = tf.strings.join(result)
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n' + '_'*80)
print('\nRun time:', end - start)

First Citizen:
I have a happiness, I will dram her eye,
Came to your king'd and wedding chestion
And metries lirth than you upon by.

DUKE OF AUMERLE:
My lord, your queen poherty, very seove, I would pardon me;
No, I'll not will keep and think our faithful self-door;
You shall be done in Carisaunt flatker me in myself?

HORTENSIO:
Madam, my lord; not to know his sister
Servant was to be so; then be mad.

MONTAGUE:
Worth this, Sir, mumber not! I will grant? upon
him!

Shepherd:
Look to the Capitol?

LIONT:
Ristlef Gloucester, to her faults with mindred
Alas, I know you are a sort such butters of the issue.

KING HENRY VI:
First son: and friar, being slender we have seen.

Pedant:
O, what I'll not be.

JULIET:
Beseech you, like a falsehood of the death.
Now, Trubtiness, sir? how sweet best tofful soul is 

________________________________________________________________________________

Run time: 2.423008680343628


As we vary the length of the generated sequence, the model still produces a dialogue in the verse-and-prose format we want. If we change the seed word, it keeps producing coherent text in that same format.

That model used a temperature of 1; now we try different temperatures to see what text it returns.

In [None]:
one_step_model = OneStep(model, chars_from_ids, ids_from_chars, temperature = 0.5)

In [None]:
start = time.time()
states = None
next_char = tf.constant(['First Citizen'])
result = [next_char]

for n in range(1000):
  next_char, states = one_step_model.generate_one_step(next_char, states=states)
  result.append(next_char)

result = tf.strings.join(result)
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n' + '_'*80)
print('\nRun time:', end - start)

First Citizen:
He'la,
do, dam''d by
Rikely apides,
Blufs too toun is butied: winterous Eunher bid le.

ATH:
And yet me pows; I told you what: what's flautisp;
Besids' you, dracI.
: it custom aburue yel? why igly should!

TRANIO:
What teked was; l'dein,
A
meeting natide, agrwerateden jaw.
Who wills, 'tisg!
Say,' faces you. At just,
Weg-done: noino, telk not.--
ANNELO:
O bravid, virtuzo! Juliegh!'
OH No?

ISABELLA:
comfene and hapits mut, weswifegefol; nod brings afted
off:
Let's hief man try upperGit Vellow:
Didmfur!
'Twill mirdlery I; no embtality, a
blieh--'onqu'k, juck miso--tuills zaulty precipe:
Briargs and when Kame.

Second SuT'RLVAK:
His.
First Sebain a JulomityCUTY:

BENCOUrse:
Pittlus
Make not 'Widl-
Bitionfly,'
Curr he, him fray:
Quest Laiches; fhom Plince bl comon'st losp, 'both 'Ssalt:
I' ta'tty thousands. Was Af-same;
West I'Bqueatily wenh amVO Dukibs;
Yea, impudey, of what nam-shoulder'sing runifp, noke
sycalate
Till juddes upon, I;' mis-shanero to't;
like lie,
fohs, is i

We do not see much change with the temperature set to 0.5, so let's try something even lower.

In [None]:
one_step_model = OneStep(model, chars_from_ids, ids_from_chars, temperature = 0.1)

In [None]:
start = time.time()
states = None
next_char = tf.constant(['First Citizen'])
result = [next_char]

for n in range(1000):
  next_char, states = one_step_model.generate_one_step(next_char, states=states)
  result.append(next_char)

result = tf.strings.join(result)
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n' + '_'*80)
print('\nRun time:', end - start)

First Citizen:
We are too soon, and speak the sea, having no more
Than the sun short a poor prince, and still strike at all.

BUCKINGHAM:
What is the marriage of the county slay;
For the law shall not be all the seasons.

First Servingman:
What news abroad?

BRUTUS:
We are all unpossess it to the crown, the senate bods,
That we may command a feast of the world:
And when he says it is to be a soul, and therefore,
I will be set of his looks are speeding soul!
And what the duke is spite of all the duke.

GLOUCESTER:
And therefore for a heavy sound,
And therefore leave us to the crown,
And with the state and prince as thou art deceived;
And therefore let us so he did.

DUKE OF YORK:
What is the county serve?

Second Servingman:
What news abroad?

BRUTUS:
We are all undertake to see the county of the county
Part of the seasing manner of my soul is worth
Than when they say, and there a man doth so,
That we may come to the crown, and so we proclaim
What you have spent an enemy,
And so I come,

Again we see no change in the format or the coherence of the text. A low temperature pushes the high probabilities toward 1 and the low ones toward 0, so the characters that were already the most likely to be selected remain the ones selected most often.

In [None]:
one_step_model = OneStep(model, chars_from_ids, ids_from_chars, temperature = 2)

In [None]:
start = time.time()
states = None
next_char = tf.constant(['First Citizen'])
result = [next_char]

for n in range(1000):
  next_char, states = one_step_model.generate_one_step(next_char, states=states)
  result.append(next_char)

result = tf.strings.join(result)
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n' + '_'*80)
print('\nRun time:', end - start)

First Citizen-muld welch;'cuaig noth makes the worf. Bravoody Inount
Mass'd 'tway, nuish'd with Warfignagely.
Tofw'd her-heards my body'mess awa! she Farlal
Phtimbs your favour, whights ip true;
Strulls. OF SyO$M:
Icunot;
Of clowns, that, Too is not bolXness; Ased togot you;
Murder, to etwear him rie.

BeyectiP!

yORA Molsocarward:
Faithlo, a my, my Rore
Tastiv up: Prmioy!
Riss liftly he troublives: God, I'll stoe hit
di' taiis.' Hast Kemprust
there, remember, you adorion-delity of furilap-ul:
Hark-plas thy clatour cessides!

OffilUM:
Gar; liothe within; to quest's refiad; nursh: as, no, Chringnaus
Juaok to keep cemiladous! You. ceacied namn,-which will ve
finger-if you forgive: and quittle if
Must elaw' notionets', as shring joy
And fruice sha$low'd for, refatemple, suffismasquita.
Petruumit swas, rup? Esquatwer'd,
Impaset our pincwive: castles ich as
you mosh, by tioted joir agpactide. Give unfisedranch Divoss,
Throw your minip withal. Never!
Ye, insted well: let thy majorumpshy. How

Here the output is not coherent: the words make no sense, although the structure of the text is preserved. This is exactly the opposite of the previous case: all the probabilities are "flattened", so more unlikely characters have a chance of being chosen as the next character. That is why the text is wilder and makes less sense.
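
The effect of the temperature on the sampling distribution can be checked numerically. The sketch below is plain Python with made-up logits; `softmax_with_temperature` is a hypothetical helper that divides the logits by the temperature before the softmax, as `generate_one_step` does before sampling.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature (shifted by the max for stability)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5, 0.1]  # made-up scores for a 4-symbol vocabulary

for t in (0.1, 0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At low temperature almost all the mass sits on the top-scoring symbol, and at high temperature the distribution moves toward uniform, which matches the behaviour observed in the generated samples.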

# Word-level model

Now we do the same thing, but word by word.

## Text vectorization

Let's see how many words the vocabulary has.

In [None]:
# Split the text into words
words = tf.strings.split(text)  # Basic whitespace tokenization
vocab = sorted(set(words.numpy()))  # Unique word vocabulary
print(f'{len(vocab)} unique words')

25670 unique words


Since there are so many words, we strip the punctuation attached to them.

In [None]:
import string
# Remove punctuation from each word
translator = str.maketrans('', '', string.punctuation)
clean_vocab = [word.decode('utf-8').translate(translator) for word in vocab]
unique_vocab = list(dict.fromkeys(clean_vocab))
# Convert the list to a tensor
vocab = tf.constant(unique_vocab)
vocab = sorted(set(vocab.numpy()))
print(f'{len(vocab)} unique words')

14746 unique words


In [None]:
# Use StringLookup to map words to IDs
ids_from_words = tf.keras.layers.StringLookup(
    vocabulary=list(vocab), mask_token=None)

In [None]:
# Create the inverse layer to map IDs back to words
words_from_ids = tf.keras.layers.StringLookup(
    vocabulary=ids_from_words.get_vocabulary(), invert=True, mask_token=None)

# Helper to rebuild text from IDs
def text_from_words(word_ids):
    words = words_from_ids(word_ids)  # Convert IDs to words
    return tf.strings.reduce_join(words, separator=" ", axis=-1)  # Join words with spaces


## Training examples

In [None]:
all_ids = ids_from_words(words)
all_ids

<tf.Tensor: shape=(202651,), dtype=int64, numpy=array([  975,     0,   275, ..., 13264,  3503,     0])>

In [None]:
ids_dataset = tf.data.Dataset.from_tensor_slices(all_ids)

In [None]:
for ids in ids_dataset.take(20):
    print(words_from_ids(ids).numpy().decode('utf-8'))

First
[UNK]
Before
we
proceed
any
[UNK]
hear
me
[UNK]
[UNK]
[UNK]
[UNK]
First
[UNK]
You
are
all
resolved
rather
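
Many tokens above come back as `[UNK]` because punctuation was stripped from the vocabulary but not from the tokens in `words`, so a token such as `Citizen:` has no match in the cleaned vocabulary. The plain-Python sketch below reproduces the mismatch on a short excerpt and shows one possible remedy, applying the same cleaning to the tokens before the lookup. `clean_tokens` is a hypothetical helper, and this is not what the notebook does in the cells that follow.

```python
import string

translator = str.maketrans('', '', string.punctuation)


def clean_tokens(raw_text):
    """Split on whitespace, strip punctuation, and drop tokens that become empty."""
    cleaned = (tok.translate(translator) for tok in raw_text.split())
    return [tok for tok in cleaned if tok]


sample = "First Citizen:\nBefore we proceed any further, hear me speak."

raw_tokens = sample.split()
cleaned_tokens = clean_tokens(sample)
vocab = set(cleaned_tokens)  # vocabulary built from cleaned words, as above

missing_raw = [tok for tok in raw_tokens if tok not in vocab]
missing_clean = [tok for tok in cleaned_tokens if tok not in vocab]

print("unknown when looking up raw tokens:    ", missing_raw)
print("unknown when looking up cleaned tokens:", missing_clean)
```

Inside the TensorFlow pipeline a similar effect could probably be obtained with `tf.strings.regex_replace` on `words` before calling `ids_from_words`. Note that stripping punctuation also alters contractions such as `know't`, so this is a trade-off rather than a free fix.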


In [None]:
seq_length = 100

In [None]:
sequences = ids_dataset.batch(seq_length+1, drop_remainder=True)

for seq in sequences.take(1):
  print(words_from_ids(seq))

tf.Tensor(
[b'First' b'[UNK]' b'Before' b'we' b'proceed' b'any' b'[UNK]' b'hear'
 b'me' b'[UNK]' b'[UNK]' b'[UNK]' b'[UNK]' b'First' b'[UNK]' b'You' b'are'
 b'all' b'resolved' b'rather' b'to' b'die' b'than' b'to' b'[UNK]' b'[UNK]'
 b'[UNK]' b'[UNK]' b'First' b'[UNK]' b'[UNK]' b'you' b'know' b'Caius'
 b'Marcius' b'is' b'chief' b'enemy' b'to' b'the' b'[UNK]' b'[UNK]' b'We'
 b'[UNK]' b'we' b'[UNK]' b'First' b'[UNK]' b'Let' b'us' b'kill' b'[UNK]'
 b'and' b'[UNK]' b'have' b'corn' b'at' b'our' b'own' b'[UNK]' b'[UNK]'
 b'a' b'[UNK]' b'[UNK]' b'No' b'more' b'talking' b'[UNK]' b'let' b'it'
 b'be' b'[UNK]' b'[UNK]' b'[UNK]' b'Second' b'[UNK]' b'One' b'[UNK]'
 b'good' b'[UNK]' b'First' b'[UNK]' b'We' b'are' b'accounted' b'poor'
 b'[UNK]' b'the' b'patricians' b'[UNK]' b'What' b'authority' b'surfeits'
 b'on' b'would' b'relieve' b'[UNK]' b'if' b'they' b'would' b'yield'], shape=(101,), dtype=string)


In [None]:
# Build sequences from the dataset
sequences = ids_dataset.batch(seq_length + 1, drop_remainder=True)

# Print the first sequence as words
for seq in sequences.take(1):  # Take the first sequence
    words = words_from_ids(seq)  # Convert IDs to words
    words_decoded = [word.numpy().decode('utf-8') for word in words]  # Decode each word
    print("Sequence:", words_decoded)

Sequence: ['First', '[UNK]', 'Before', 'we', 'proceed', 'any', '[UNK]', 'hear', 'me', '[UNK]', '[UNK]', '[UNK]', '[UNK]', 'First', '[UNK]', 'You', 'are', 'all', 'resolved', 'rather', 'to', 'die', 'than', 'to', '[UNK]', '[UNK]', '[UNK]', '[UNK]', 'First', '[UNK]', '[UNK]', 'you', 'know', 'Caius', 'Marcius', 'is', 'chief', 'enemy', 'to', 'the', '[UNK]', '[UNK]', 'We', '[UNK]', 'we', '[UNK]', 'First', '[UNK]', 'Let', 'us', 'kill', '[UNK]', 'and', '[UNK]', 'have', 'corn', 'at', 'our', 'own', '[UNK]', '[UNK]', 'a', '[UNK]', '[UNK]', 'No', 'more', 'talking', '[UNK]', 'let', 'it', 'be', '[UNK]', '[UNK]', '[UNK]', 'Second', '[UNK]', 'One', '[UNK]', 'good', '[UNK]', 'First', '[UNK]', 'We', 'are', 'accounted', 'poor', '[UNK]', 'the', 'patricians', '[UNK]', 'What', 'authority', 'surfeits', 'on', 'would', 'relieve', '[UNK]', 'if', 'they', 'would', 'yield']


In [None]:
def split_input_target(sequence):
    input_text = sequence[:-1]
    target_text = sequence[1:]
    return input_text, target_text

In [None]:
# Split the text into words
texto = "First Citizen: Before we proceed any further, hear me speak."
words = tf.strings.split([texto])  # TensorFlow also splits on whitespace

# Convert to a Python list (optional) to use it with the function
words = words.numpy().tolist()[0]  # Convert the result to a list of words

# Apply the split function
input_words, target_words = split_input_target(words)

print("Input:", input_words)
print("Target:", target_words)

Input: [b'First', b'Citizen:', b'Before', b'we', b'proceed', b'any', b'further,', b'hear', b'me']
Target: [b'Citizen:', b'Before', b'we', b'proceed', b'any', b'further,', b'hear', b'me', b'speak.']


In [None]:
dataset = sequences.map(split_input_target)

In [None]:
# Batch size
BATCH_SIZE = 64

# Buffer size to shuffle the dataset
# (TF data is designed to work with possibly infinite sequences,
# so it doesn't attempt to shuffle the entire sequence in memory. Instead,
# it maintains a buffer in which it shuffles elements).
BUFFER_SIZE = 10000

dataset = (
    dataset
    .shuffle(BUFFER_SIZE)
    .batch(BATCH_SIZE, drop_remainder=True)
    .prefetch(tf.data.AUTOTUNE))

dataset

<_PrefetchDataset element_spec=(TensorSpec(shape=(64, 100), dtype=tf.int64, name=None), TensorSpec(shape=(64, 100), dtype=tf.int64, name=None))>

## Building the model

In [None]:
# Length of the vocabulary in StringLookup Layer
vocab_size = len(ids_from_words.get_vocabulary())

# The embedding dimension
embedding_dim = 128

# Number of RNN units
rnn_units = 512

In [None]:
class MyModel(tf.keras.Model):
  def __init__(self, vocab_size, embedding_dim, rnn_units):
    super().__init__()
    self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
    self.gru = tf.keras.layers.GRU(rnn_units,
                                   return_sequences=True,
                                   return_state=True)
    self.dense = tf.keras.layers.Dense(vocab_size)

  def call(self, inputs, states=None, return_state=False, training=False):
    x = inputs
    x = self.embedding(x, training=training)
    if states is None:
      states = self.gru.get_initial_state(x)
    x, states = self.gru(x, initial_state=states, training=training)
    x = self.dense(x, training=training)

    if return_state:
      return x, states
    else:
      return x

In [None]:
model = MyModel(
    vocab_size=vocab_size,
    embedding_dim=embedding_dim,
    rnn_units=rnn_units)

## Testing the model

In [None]:
for input_example_batch, target_example_batch in dataset.take(1):
    example_batch_predictions = model(input_example_batch)
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")

(64, 100, 14747) # (batch_size, sequence_length, vocab_size)


In [None]:
model.summary()

Model: "my_model_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_2 (Embedding)     multiple                  1887616   
                                                                 
 gru_2 (GRU)                 multiple                  986112    
                                                                 
 dense_2 (Dense)             multiple                  7565211   
                                                                 
Total params: 10438939 (39.82 MB)
Trainable params: 10438939 (39.82 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
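The parameter counts in the summary can be reproduced by hand. The sketch below is plain-Python arithmetic, assuming the Keras GRU default `reset_after=True` (two bias vectors per gate) and the values used above (`vocab_size = 14747`, `embedding_dim = 128`, `rnn_units = 512`).

```python
# Values taken from the cells above.
vocab_size = 14747
embedding_dim = 128
rnn_units = 512

# Embedding: one vector of size embedding_dim per vocabulary entry.
embedding_params = vocab_size * embedding_dim

# GRU: 3 gates, each with an input kernel, a recurrent kernel and
# (assuming reset_after=True, the Keras default) two bias vectors.
gru_params = 3 * (rnn_units * (embedding_dim + rnn_units) + 2 * rnn_units)

# Dense: weight matrix plus one bias per output logit.
dense_params = rnn_units * vocab_size + vocab_size

total_params = embedding_params + gru_params + dense_params
print(embedding_params, gru_params, dense_params, total_params)
```

With a word-level vocabulary this large, the output layer alone holds over 70% of the parameters, which is part of why the word-level model is expensive to train.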


In [None]:
sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
sampled_indices = tf.squeeze(sampled_indices, axis=-1).numpy()

In [None]:
sampled_indices

array([ 9808,  7037, 11473, 12282,  5594,  3055, 11688,  2688, 12922,
       11048, 14590,  8550, 14147, 13827, 11825,    69, 12300,  1956,
       10463,  8096,  5062,  3308,  8009,  2411,  8319,  1924,  5612,
       11432,  9522,  7273,  1537,  6654,  4500,  7541, 12838,  9958,
       14681,  9954,  2372,  6587,  5748,  4919,  3042,  4628, 14700,
        8333,  5932, 14086,  1578,  2311, 11069,    45,  3898,  6640,
       13551,  1896,  4952,  8199,  2921,  1854, 13310, 14181,   254,
        8001,  6490, 13057,  5285, 11256,  4760, 14583, 11169,   851,
        3287, 10463, 10741, 11343,  1635,  6666,  9787, 12118, 12895,
        8017,   954, 11411,  2335,  7037,  1192,  1906, 13348, 14107,
       10803, 12662,   393, 13285, 11585,  9538,  9128,  5001,  7238,
       11825])

In [None]:
print("Input:\n", text_from_words(input_example_batch[0]).numpy())
print()
print("Next Char Predictions:\n", text_from_words(sampled_indices).numpy())

Input:
 b'[UNK] [UNK] [UNK] and the third in your [UNK] the very butcher of a silk [UNK] a [UNK] a [UNK] a gentleman of the very first [UNK] of the first and second [UNK] [UNK] the immortal [UNK] the punto [UNK] the [UNK] [UNK] The [UNK] [UNK] The pox of such [UNK] [UNK] affecting [UNK] these new tuners of [UNK] [UNK] [UNK] a very good [UNK] a very tall [UNK] a very good [UNK] [UNK] is not this a lamentable [UNK] [UNK] that we should be thus afflicted with these strange [UNK] these [UNK] these [UNK] who stand so much on'

Next Char Predictions:
 b'nurse forswear roared smothered degenerate acceptance scarrd Turph supreme rapiers wonders knit voluptuously unmask sell Affection sob Plantagenets plaster image consists although howsoeer Sound intended Peruse delight riding mounted gentleI Lewis fault careful guards such opportunity wrongst opes Slandering fall dial commons abstinence chariot yest interior distilled viewd Love She rates Accursed benched fasts triumph Paul complots indite Wi

We test the prediction with the untrained model and see that it returns incoherent, unformatted text.

## Training the model

In [None]:
loss = tf.losses.SparseCategoricalCrossentropy(from_logits=True)

In [None]:
example_batch_mean_loss = loss(target_example_batch, example_batch_predictions)
print("Prediction shape: ", example_batch_predictions.shape, " # (batch_size, sequence_length, vocab_size)")
print("Mean loss:        ", example_batch_mean_loss)

Prediction shape:  (64, 100, 14747)  # (batch_size, sequence_length, vocab_size)
Mean loss:         tf.Tensor(9.5987215, shape=(), dtype=float32)
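As a sanity check, an untrained model should spread probability roughly uniformly over the vocabulary, so its cross-entropy should be close to ln(vocab_size). A minimal sketch, using only the standard library and the numbers reported above:

```python
import math

vocab_size = 14747          # from the prediction shape above
observed_loss = 9.5987215   # mean loss reported above

# Cross-entropy of a uniform distribution over the vocabulary.
expected_loss = math.log(vocab_size)

print("Expected loss for uniform predictions:", expected_loss)
print("exp(observed loss):", math.exp(observed_loss))
```

If the exponential of the initial loss were much larger than the vocabulary size, that would suggest the model starts out confidently wrong, which usually points to a bad initialization.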


In [None]:
model.compile(optimizer='adam', loss=loss)

In [None]:
# Directory where the checkpoints will be saved
checkpoint_dir = './training_checkpoints'
# Name of the checkpoint files
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)

## Running the training

In [None]:
EPOCHS = 20

In [None]:
history = model.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback])

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


## Text generation

In [None]:
class OneStep(tf.keras.Model):
  def __init__(self, model, words_from_ids, ids_from_words, temperature=1.0):
    super().__init__()
    self.temperature = temperature
    self.model = model
    self.words_from_ids = words_from_ids
    self.ids_from_words = ids_from_words

    # Create a mask to prevent "[UNK]" from being generated.
    skip_ids = self.ids_from_words(['[UNK]'])[:, None]
    sparse_mask = tf.SparseTensor(
        # Put a -inf at each bad index.
        values=[-float('inf')]*len(skip_ids),
        indices=skip_ids,
        # Match the shape to the vocabulary
        dense_shape=[len(ids_from_words.get_vocabulary())])
    self.prediction_mask = tf.sparse.to_dense(sparse_mask)

  @tf.function
  def generate_one_step(self, inputs, states=None):
    # Convert strings to token IDs.
    # Split each input string on whitespace so that every word gets its own ID.
    input_words = tf.strings.split(inputs)
    input_ids = self.ids_from_words(input_words).to_tensor()

    # Reshape input_ids to 3D
    #input_ids = tf.squeeze(input_ids, axis=0)  # Remove the extra dimension

    # Run the model.
    # predicted_logits.shape is [batch, word, next_word_logits]
    predicted_logits, states = self.model(inputs=input_ids, states=states,
                                          return_state=True)
    # Only use the last prediction.
    predicted_logits = predicted_logits[:, -1, :]
    predicted_logits = predicted_logits/self.temperature
    # Apply the prediction mask: prevent "[UNK]" from being generated.
    predicted_logits = predicted_logits + self.prediction_mask

    # Sample the output logits to generate token IDs.
    predicted_ids = tf.random.categorical(predicted_logits, num_samples=1)
    predicted_ids = tf.squeeze(predicted_ids, axis=-1)

    # Convert from token IDs to words.
    predicted_words = self.words_from_ids(predicted_ids)

    # Return the words and the model state.
    return predicted_words, states

In [None]:
one_step_model = OneStep(model, words_from_ids, ids_from_words)
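
The prediction mask inside `OneStep` works because adding negative infinity to a logit drives its probability to zero after the softmax. A small plain-Python sketch of the idea, with a hypothetical four-word vocabulary in which index 0 plays the role of `[UNK]`:

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability; exp(-inf) evaluates to 0.0.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [3.0, 1.0, 0.5, 0.2]        # hypothetical logits; index 0 is "[UNK]"
mask = [-math.inf, 0.0, 0.0, 0.0]    # -inf at the banned index, 0 elsewhere

masked_logits = [z + m for z, m in zip(logits, mask)]
probs = softmax(masked_logits)
print(probs)
```

Even though `[UNK]` has the largest raw logit in this made-up example, its probability after masking is exactly zero, so sampling can never select it.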

In [None]:
start = time.time()
states = None
next_word = tf.constant(['First Citizen:\nBefore we proceed any further, hear me speak.'])
#next_word = tf.strings.split(next_word)
# Convert to a list (optional) to work with the function
#words = words.numpy().tolist()[0]  # Convert the result to a list of words
result = [next_word]

for n in range(100):
  next_word, states = one_step_model.generate_one_step(next_word, states=states)
  result.append(next_word)

result = tf.strings.join(result, separator=' ')
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n')
print('\nRun time:', end - start)

First Citizen:
Before we proceed any further, hear me speak. Do I that both then was his my to her horn to a the didst so thy but I and as the yet rise to this our coverture I be make little quoth for so art popular The the play meal to to a carry forgot these man things I were a it make To in his Sixth the Resign to fell to prosperously the mortal the time for I me wit went to as be yet thy thee For besides thee would He of sight in varlets with masterless in I With to thy such while that unseen more 



Run time: 1.7145397663116455


We can see that the model produces more or less coherent text, but it does not break it into paragraphs the way the character-level model does.

We generate four more samples for comparison.

In [None]:
start = time.time()
states = None
next_word = tf.constant(['First Citizen:\nBefore we proceed any further, hear me speak.'])
#next_word = tf.strings.split(next_word)
# Convert to a list (optional) to work with the function
#words = words.numpy().tolist()[0]  # Convert the result to a list of words
result = [next_word]

for n in range(100):
  next_word, states = one_step_model.generate_one_step(next_word, states=states)
  result.append(next_word)

result = tf.strings.join(result, separator=' ')
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n')
print('\nRun time:', end - start)

First Citizen:
Before we proceed any further, hear me speak. such encourage the most kings lasting no my were her his blush That of the Were you said must for the soft of the For once no sweeter instructs of have the it of from hide my to the vows bare veins not not I to even was a since thou not finger many preserve fought to Must he long so generally If fetch means of good my sentence work any that to thunder of by then hope us I play clear in those cheapest is gravity have one and I been bridegroom have one Than Verona virtues foot plainly 



Run time: 0.4776022434234619


This run shows that, with the same sequence length and the same input text, the model returns a different text each time, because the next word is sampled rather than chosen deterministically.
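
The variation comes from sampling: `generate_one_step` draws the next word with `tf.random.categorical` instead of always taking the most likely one. The sketch below illustrates the same effect with Python's `random` module rather than TensorFlow; the vocabulary and probabilities are made up for illustration.

```python
import random

words = ["the", "king", "love", "death"]   # hypothetical vocabulary
probs = [0.5, 0.2, 0.2, 0.1]               # hypothetical next-word probabilities

def sample_words(seed, n=10):
    # A dedicated generator makes each run reproducible for a given seed.
    rng = random.Random(seed)
    return rng.choices(words, weights=probs, k=n)

print(sample_words(seed=1))
print(sample_words(seed=2))   # a different seed generally gives a different sequence
print(sample_words(seed=1))   # the same seed reproduces the first sequence
```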

In [None]:
start = time.time()
states = None
next_word = tf.constant(['First Citizen:\nBefore we proceed any further, hear me speak.'])
#next_word = tf.strings.split(next_word)
# Convert to a list (optional) to work with the function
#words = words.numpy().tolist()[0]  # Convert the result to a list of words
result = [next_word]

for n in range(200):
  next_word, states = one_step_model.generate_one_step(next_word, states=states)
  result.append(next_word)

result = tf.strings.join(result, separator=' ')
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n')
print('\nRun time:', end - start)

First Citizen:
Before we proceed any further, hear me speak. In thyself shall do this careful and will thy one day we have the If make of such trust the are Lay As have it hath do me bearing Especially then drave no Travelling thee mortal That is your upon Our that hearing opposite Commit me your helmet hast what sees the time instantly not would beseech win honourable our heart in hopes no By not dead the this Shall the true first not be my lend that For thy Brave hinder Marcius Strange blow was breathed shall me from thou a in Hero to it from the That from authority me to a pentecost word father balm wise twenty to with a now I sort should protection so angel and a our Rome of a Christian O are Hath I to it him low cannot the my sweet more and come to damned gifts to on an use done in a the tardy deadly Upon Grumio as those me for their if wish Their Good the holy which not thy your house whispering and treble deceived and to the although my I prophet me hath leisure I and before a D

Here the text starts to drift somewhat from the meaning of the opening sentence.

In [None]:
start = time.time()
states = None
next_word = tf.constant(['VOLUMNIA:\nAy, worthy Menenius; and with most prosperous approbation.'])
#next_word = tf.strings.split(next_word)
# Convert to a list (optional) to work with the function
#words = words.numpy().tolist()[0]  # Convert the result to a list of words
result = [next_word]

for n in range(200):
  next_word, states = one_step_model.generate_one_step(next_word, states=states)
  result.append(next_word)

result = tf.strings.join(result, separator=' ')
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n')
print('\nRun time:', end - start)

VOLUMNIA:
Ay, worthy Menenius; and with most prosperous approbation. lady you lips purchased proclaim not would been the LADY That summer him Therefore the to the paucas the own long is up their power and what I and wilt make with absolute To I from manner and I have not am the stand Welcome to I in the toes from sent RICHARD Roger lord edge prepared towards shalt eternity will seen sun can but the poor joint beard to melancholy proper to break the the never injuries for a DUKE head small slay the give are the wert to royal fair and me familiar to life to ever nor the dreadful five hadst all If Here of the It Will being Bear am County a king will never of the dreadful shalt that she have I have Speak lord greet seest pluck here shall say be that Will do thee that so be praises from a bewitchment is the ask one They think reason and Fear our nothing one for to my Christian has there sluggard thanks access learn were save puissant one Prove an be Shall to a here am not and talked and now

Here we change the input text, and the generated text still keeps its coherence.

In [None]:
start = time.time()
states = None
next_word = tf.constant(['VOLUMNIA:\nAy, worthy Menenius; and with most prosperous approbation.'])
#next_word = tf.strings.split(next_word)
# Convert to a list (optional) to work with the function
#words = words.numpy().tolist()[0]  # Convert the result to a list of words
result = [next_word]

for n in range(50):
  next_word, states = one_step_model.generate_one_step(next_word, states=states)
  result.append(next_word)

result = tf.strings.join(result, separator=' ')
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n')
print('\nRun time:', end - start)

VOLUMNIA:
Ay, worthy Menenius; and with most prosperous approbation. father that not not be the bought the business by a Contract dreamt it makes is his That as you will our sudden comes size of the means to sweetly bent may beyond a profound my brawling Seduced the better your the old DUKE my one look through a Shall 



Run time: 0.25760865211486816


In this run the sequence is shorter but considerably more coherent.

We try different temperatures to see how they affect the predictions.

In [None]:
one_step_model = OneStep(model, words_from_ids, ids_from_words, temperature=0.4)

In [None]:
start = time.time()
states = None
next_word = tf.constant(['First Citizen:\nBefore we proceed any further, hear me speak.'])
#next_word = tf.strings.split(next_word)
result = [next_word]

for n in range(100):
  next_word, states = one_step_model.generate_one_step(next_word, states=states)
  result.append(next_word)

result = tf.strings.join(result, separator=' ')
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n')
print('\nRun time:', end - start)

First Citizen:
Before we proceed any further, hear me speak. I of the the in his I hath a the the a a the day of the the the my the the the the the as too the my the the and is the But be the I of the the the the the the the the I with the the to the the a the the I of a in the as you to the with thy the the the my not the the and a true and the a the the that the the the the the in a the the the art the the he to a 



Run time: 2.234001874923706


When we lower the temperature, the model gives a lot of weight to the word "the", which is presumably the most probable next word.

In [None]:
one_step_model = OneStep(model, words_from_ids, ids_from_words, temperature=2)

In [None]:
start = time.time()
states = None
next_word = tf.constant(['First Citizen:\nBefore we proceed any further, hear me speak.'])
#next_word = tf.strings.split(next_word)
result = [next_word]

for n in range(100):
  next_word, states = one_step_model.generate_one_step(next_word, states=states)
  result.append(next_word)

result = tf.strings.join(result, separator=' ')
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n')
print('\nRun time:', end - start)

First Citizen:
Before we proceed any further, hear me speak. forbid amend his bed if private crowning clouded worth foul thin thronging iron Neglected think intelligencing pox virtuous assign brief Looks escape am hilt unkindness chew daylight decree led luckless most foolish his happy Mark lips wind Gentlemen amongst stink King bleeding any so required senses will done cities bears place minds Than requisite correction paint not Ireland Digressing gall reeking together Canst well Mercutio sweet before For doubt what effuse due pines rivals bite side Write crept medlar dive toys affords ashamed Reproach foot eagle or unseen estimation bountiful from masters fed Bid supposed woo aspired frighted whilst hop 



Run time: 3.0675671100616455


As in the character-level model, when we raise the temperature the model starts using less probable words and the text becomes somewhat more "complex".
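
The effect of temperature can be seen directly on the probabilities. `OneStep` divides the logits by the temperature before sampling; the sketch below applies the same division to a hypothetical three-word logit vector, using plain Python instead of TensorFlow.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    # Divide the logits by the temperature, then apply a stable softmax.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical logits; index 0 is the most likely word

for t in (0.4, 1.0, 2.0):  # the three temperatures tried in this notebook
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

A temperature below 1 sharpens the distribution around the top word, which matches the runs dominated by "the"; a temperature above 1 flattens it, so rarer words are drawn more often.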

We now look at different sequence lengths.

In [None]:
one_step_model = OneStep(model, words_from_ids, ids_from_words)

In [None]:
start = time.time()
states = None
next_word = tf.constant(['First Citizen:\nBefore we proceed any further, hear me speak.'])
#next_word = tf.strings.split(next_word)
result = [next_word]

for n in range(10):
  next_word, states = one_step_model.generate_one_step(next_word, states=states)
  result.append(next_word)

result = tf.strings.join(result, separator=' ')
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n')
print('\nRun time:', end - start)

First Citizen:
Before we proceed any further, hear me speak. Nor degenerate neither me his Will the measure of such 



Run time: 1.277644157409668


With a sequence length of 10, the returned text makes reasonable sense.

In [None]:
start = time.time()
states = None
next_word = tf.constant(['First Citizen:\nBefore we proceed any further, hear me speak.'])
#next_word = tf.strings.split(next_word)
result = [next_word]

for n in range(500):
  next_word, states = one_step_model.generate_one_step(next_word, states=states)
  result.append(next_word)

result = tf.strings.join(result, separator=' ')
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n')
print('\nRun time:', end - start)

First Citizen:
Before we proceed any further, hear me speak. Signior a I him Have was of order Duke shall Of the thou bosom have to mine Bona me gods this villages Will in isle to yet noble state so master angle for Mistress Before you out it cause to And they haunts he twenty cannot be the you one the sound so lend weep change profanation EDWARD to no That they to gods fat might bark you Richmond am be this Within not Methinks a that I the is fit and pitied deceit it but not not left or they she jealousies call is if lay honour us will now leave doth her sparing speaks our where to my a so an such make heavy must save the you these uncle the The this thunder to in me with the man more sceptres this ever not Nor a dearest His own It your sight and compounded LADY the wind and a state that what a mischiefs do be parts thine thing am you speed of pardon EDWARD the roused of your tear On the His your joy Of keep counsel not greet thy LADY kingdom of all our comest my Taste is the Your and

If we ask for a sequence that is too long, the model ends up returning text that no longer follows the thread of the beginning.

# Final conclusions

Our first conclusion is that the models take a long time to run, even on the Colab GPU. As a result, we could not explore many variants, such as training for more than 30 epochs in the character-level case or more than 20 in the word-level case, or increasing the sequence length beyond 100.

Comparing the two models, the character-level model with temperature = 1 and 30 epochs is more coherent than the word-level model with temperature = 1 and 20 epochs. It is also able to imitate the verse and prose of the period while respecting the dialogue structure. For this particular text, whose language is very complex and which is written in prose, the character-level model works better.

Regarding temperature, in both cases we see that when it is below 1 the text is more conservative, because more weight goes to the words or characters most likely to come next. When it is above 1 the text is more "far-fetched": less probable words start to appear or, in the character-level case, words that do not exist.

Finally, regarding sequence length: with a text this complex we could not see much change as the length increases, but as it grows the output starts to drift away from the context of the first sentence we provide.