# Character Generation ✍

[Reference](https://colab.research.google.com/drive/1ysEKrw_LE2jMndo1snrZUh5w87LQsCxk#forceEdit=true&sandboxMode=true&scrollTo=Fo3WY-e86zX2)

### 🎭Shakespeare play generator:

In [19]:
import keras
import tensorflow as tf
import os
import numpy as np

## Data:

### Loading it from the web:

In [20]:
path_to_file = tf.keras.utils.get_file('shakespeare.txt', 'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')

### Loading it locally:

In [21]:
# from google.colab import files
# path_to_file = list(files.upload().keys())[0]

### Reading the data

In [22]:
text = open(path_to_file, 'rb').read().decode(encoding='utf-8')

In [43]:
print(text[:250])

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.



## Vectorizing the text

### Encoding
Here the encoding follows the natural sort order of the characters (e.g. 'a' < 'b' < 'c'), so each character's id is its position in the sorted vocabulary.

In [23]:
vocab = sorted(set(text))
idx2char = np.array(vocab)

char2idx = {u:i for i, u in enumerate(vocab)}

def text_to_int(text):
  return np.array([char2idx[c] for c in text])

text_as_int = text_to_int(text)
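
To make the mapping concrete, here is a minimal sketch of the same encoding on a toy string instead of the full corpus. The names `toy_text`, `toy_vocab`, and `toy_char2idx` are illustrative and not part of the notebook.

```python
# Toy version of the encoding above; all names here are illustrative.
toy_text = "hello"

# Sorted unique characters define the vocabulary: ['e', 'h', 'l', 'o']
toy_vocab = sorted(set(toy_text))

# Each character's id is its position in the sorted vocabulary
toy_char2idx = {ch: i for i, ch in enumerate(toy_vocab)}

# Encode, then decode again to check the round trip
toy_ids = [toy_char2idx[ch] for ch in toy_text]
toy_decoded = "".join(toy_vocab[i] for i in toy_ids)

print(toy_ids)      # [1, 0, 2, 2, 3]
print(toy_decoded)  # hello
```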

In [65]:
ids_from_chars = tf.keras.layers.StringLookup(vocabulary=list(vocab), mask_token=None)

### Decoding

In [66]:
chars_from_ids = tf.keras.layers.StringLookup(vocabulary=ids_from_chars.get_vocabulary(), invert=True, mask_token=None)

In [24]:
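def int_to_text(ints):
  # Accept either a tf.Tensor or a NumPy array of ids
  try:
    ints = ints.numpy()
  except AttributeError:  # not a tensor, so it is already array-like
    pass
  return ''.join(idx2char[ints])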

text == int_to_text(text_to_int(text))

True

In [67]:
def text_from_ids(ids):
  return tf.strings.reduce_join(chars_from_ids(ids), axis=-1)

In [71]:
all_ids = ids_from_chars(tf.strings.unicode_split(text, 'UTF-8'))
all_ids

<tf.Tensor: shape=(1115394,), dtype=int64, numpy=array([19, 48, 57, ..., 46,  9,  1], dtype=int64)>
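
Note that these ids do not match `text_as_int`: with these settings, `StringLookup` keeps index 0 for an out-of-vocabulary token, so every character id is shifted up by one (the text starts with 'F', which is 18 in `char2idx` but 19 here). The sketch below imitates that behaviour in plain Python on a toy string. `OOV_TOKEN`, `lookup_vocab`, and `lookup_id` are assumptions made for illustration, not the layer's implementation.

```python
OOV_TOKEN = "[UNK]"  # illustrative name for the reserved slot at index 0
toy_text = "hello"
toy_vocab = sorted(set(toy_text))  # ['e', 'h', 'l', 'o']

# Plain encoding: position in the sorted vocabulary
plain_ids = [toy_vocab.index(ch) for ch in toy_text]

# Lookup-style encoding: the reserved token comes first, shifting every id by one
lookup_vocab = [OOV_TOKEN] + toy_vocab

def lookup_id(ch):
    # Characters outside the vocabulary fall back to index 0
    return lookup_vocab.index(ch) if ch in lookup_vocab else 0

lookup_ids = [lookup_id(ch) for ch in toy_text]

print(plain_ids)       # [1, 0, 2, 2, 3]
print(lookup_ids)      # [2, 1, 3, 3, 4]
print(lookup_id("z"))  # 0
```

Mixing the two encodings (for example, encoding with `char2idx` and decoding with `chars_from_ids`) would shift every character by one vocabulary position, so it is safer to pick one pair of helpers and use it throughout.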

## Creating Training Examples

Remember our task is to feed the model a sequence and have it return to us the next character. This means we need to split our text data from above into many shorter sequences that we can pass to the model as training examples. 

The training examples we will prepare use a *seq_length* sequence as input and a *seq_length* sequence as output, where the output is the input shifted one character to the right. For example:

```input: Hell | output: ello```

Our first step will be to create a stream of characters from our text data.

In [25]:
seq_length = 100  # length of sequence for a training example
examples_per_epoch = len(text)//(seq_length+1)

char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)
sequences = char_dataset.batch(seq_length+1, drop_remainder=True)

In [26]:
def split_input_target(chunk):  # for the example: hello
    input_text = chunk[:-1]  # hell
    target_text = chunk[1:]  # ello
    return input_text, target_text  # hell, ello

dataset = sequences.map(split_input_target)  # we use map to apply the above function to every entry
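
The chunking and splitting steps can be sketched in plain Python, without `tf.data`, to see exactly which characters end up in each pair. The helper `make_examples` is a hypothetical name used only for this illustration.

```python
def make_examples(ids, seq_length):
    """Cut ids into chunks of seq_length + 1 and split each into (input, target)."""
    chunk_size = seq_length + 1
    n_chunks = len(ids) // chunk_size  # the leftover tail is dropped, like drop_remainder=True
    examples = []
    for k in range(n_chunks):
        chunk = ids[k * chunk_size:(k + 1) * chunk_size]
        examples.append((chunk[:-1], chunk[1:]))  # target is the input shifted by one
    return examples

toy_examples = make_examples("Hello world", seq_length=4)
print(toy_examples)  # [('Hell', 'ello'), (' wor', 'worl')]
```

Each chunk holds `seq_length + 1` characters because the input and the target overlap in all but one position.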

In [27]:
# Buffer size to shuffle the dataset
# (TF data is designed to work with possibly infinite sequences,
# so it doesn't attempt to shuffle the entire sequence in memory. Instead,
# it maintains a buffer in which it shuffles elements).
BUFFER_SIZE = 10000
BATCH_SIZE = 64

data = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
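
As a rough mental model of the shuffle buffer described in the comment above, the sketch below keeps at most `buffer_size` elements in memory and emits a random one each time a new element arrives. It is a simplification written for this note (`buffered_shuffle` is a hypothetical helper), not `tf.data`'s actual code. The last lines work out the dataset sizes for this notebook from the corpus length reported earlier.

```python
import random

def buffered_shuffle(items, buffer_size, seed=0):
    """Simplified model of a shuffle buffer (not tf.data's implementation)."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)       # fill the buffer first
            continue
        idx = rng.randrange(buffer_size)
        yield buffer[idx]             # emit a random buffered element...
        buffer[idx] = item            # ...and store the new one in its slot
    rng.shuffle(buffer)               # input exhausted: drain what is left
    yield from buffer

shuffled = list(buffered_shuffle(range(20), buffer_size=5))

# Sizes for this notebook: 1,115,394 characters, chunks of 101, batches of 64
n_examples = 1115394 // (100 + 1)  # 11043 training examples
n_batches = n_examples // 64       # 172 full batches per epoch
```

Because `BUFFER_SIZE` (10000) is slightly smaller than the 11043 examples, the result is close to, but not exactly, a uniform shuffle of the whole dataset.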

### Example of the input/target split

In [49]:
for input_example_batch, target_example_batch in dataset.take(1):
  print()
  print(input_example_batch.shape, target_example_batch.shape)
  
  print()  
  print("INPUT")
  print(int_to_text(input_example_batch))
  print("\nOUTPUT")
  print(int_to_text(target_example_batch))


(100,) (100,)

INPUT
First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You

OUTPUT
irst Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You 


## Building the model

In [28]:
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
  model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim,
                              batch_input_shape=[batch_size, None]),
    tf.keras.layers.LSTM(rnn_units,
                        return_sequences=True,
                        stateful=True,
                        recurrent_initializer='glorot_uniform'),
    tf.keras.layers.Dense(vocab_size)
  ])
  return model

VOCAB_SIZE = len(vocab)  # number of unique characters in the text
EMBEDDING_DIM = 256
RNN_UNITS = 1024

model = build_model(VOCAB_SIZE,EMBEDDING_DIM, RNN_UNITS, BATCH_SIZE)
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_1 (Embedding)     (64, None, 256)           16640     
                                                                 
 lstm_1 (LSTM)               (64, None, 1024)          5246976   
                                                                 
 dense_1 (Dense)             (64, None, 65)            66625     
                                                                 
Total params: 5330241 (20.33 MB)
Trainable params: 5330241 (20.33 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
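
The parameter counts in the summary can be reproduced by hand, which is a useful check that the layer sizes are what we expect. This is a small arithmetic sketch using the standard LSTM layout of four gates, each with input weights, recurrent weights, and a bias; the variable names are illustrative.

```python
vocab_size, embedding_dim, rnn_units = 65, 256, 1024

# Embedding: one vector of size embedding_dim per character
embedding_params = vocab_size * embedding_dim

# LSTM: 4 gates, each with input weights, recurrent weights and a bias
lstm_params = 4 * (embedding_dim + rnn_units + 1) * rnn_units

# Dense: weight matrix plus one bias per output character
dense_params = rnn_units * vocab_size + vocab_size

total_params = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total_params)
# 16640 5246976 66625 5330241
```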


## Loss Function

Now we are going to create our own loss function for this problem. This is because our model outputs a (64, sequence_length, 65) shaped tensor of logits: unnormalized scores over the 65 characters at each timestep for every sequence in the batch. The final `Dense` layer has no softmax, so the loss must be told to apply it (`from_logits=True`).

In [29]:
def loss(labels, logits):
  return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)
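
For a single timestep, sparse categorical cross-entropy from logits is the negative log of the softmax probability assigned to the correct character. The plain-Python sketch below (with a hypothetical helper `sparse_ce_from_logits`) shows that computation. It also gives a sanity check: a model that scores all 65 characters equally should have a loss of ln(65), roughly 4.17, which is about what to expect before training.

```python
import math

def sparse_ce_from_logits(label, logits):
    """Cross-entropy for one timestep: -log(softmax(logits)[label])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

# Uniform logits over 65 characters: every character has probability 1/65
uniform_loss = sparse_ce_from_logits(3, [0.0] * 65)

# A confident, correct prediction gives a much smaller loss
confident_logits = [0.0] * 65
confident_logits[3] = 10.0
confident_loss = sparse_ce_from_logits(3, confident_logits)
```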

## Compiling the model

In [30]:
model.compile(optimizer='adam', loss=loss)

## Train model

In [55]:
for input_example_batch, target_example_batch in data.take(1):
  print(input_example_batch)
  example_batch_predictions = model(input_example_batch)  # ask our model for a prediction on our first batch of training data (64 entries)

# pred
pred = example_batch_predictions[0]

# To pick the predicted character we sample from the output distribution (choosing each id according to its probability)
sampled_indices = tf.random.categorical(pred, num_samples=1)

# now we can reshape that array and convert the integers to characters to see the predicted text
sampled_indices = np.reshape(sampled_indices, (1, -1))[0]
predicted_chars = int_to_text(sampled_indices)

predicted_chars  # and this is what the model predicted for training sequence 1

tf.Tensor(
[[53  1 44 ... 43  1 21]
 [ 1 47 58 ... 39 57  1]
 [63  1 58 ... 39 58 47]
 ...
 [15 47 58 ...  1 39 52]
 [52 47 58 ... 47 58 46]
 [40 56 53 ...  1 47 44]], shape=(64, 100), dtype=int32)


"YV.&e'l-W\nHFRSq;RkVw,a'VEJSF-oRIypfWFQNV!tKrtQHOzOGABm.FFjvHRzjhDqOFIQ\nx!r$f,Mlizs-YJC'Gngk$t?Yx$je "

Before training, we set the model up to save checkpoints as it trains, so that we can load it later and resume training.

In [53]:
# Directory where the checkpoints will be saved
checkpoint_dir = './training_checkpoints'

# Name of the checkpoint files
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)
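
The `{epoch}` placeholder in the prefix is filled in with the epoch number each time a checkpoint is saved. A minimal sketch of the resulting names, using ordinary string formatting (the list `names` is illustrative):

```python
import os

checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

# Substitute a few epoch numbers into the placeholder
names = [checkpoint_prefix.format(epoch=e) for e in (1, 2, 50)]
print(names[-1])  # ./training_checkpoints/ckpt_50 on POSIX systems
```

All of these files live in the same directory, which is the path that `tf.train.latest_checkpoint` expects.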

In [15]:
history = model.fit(data, epochs=50, callbacks=[checkpoint_callback])

### Loading the model from checkpoints

In [54]:
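# tf.train.latest_checkpoint expects the checkpoint directory, not a file prefix.
# Passing "./training_checkpoints/ckpt_50" made it return None, and load_weights(None)
# raised: AttributeError: 'NoneType' object has no attribute 'endswith'
model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
# model.build(tf.TensorShape([1, None]))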

## Generating text

In [62]:
def generate_text(model, start_string):
  
  # Evaluation step (generating text using the learned model)

  # Number of characters to generate
  num_generate = 800

  # Converting our start string to numbers (vectorizing)
  input_eval = [char2idx[s] for s in start_string]
  input_eval = tf.expand_dims(input_eval, 0)

  # Empty string to store our results
  text_generated = []

  # Low temperatures result in more predictable text.
  # Higher temperatures result in more surprising text.
  # Experiment to find the best setting.
  temperature = 1.0

  # Here batch size == 1
  model.reset_states()
  for i in range(num_generate):
      predictions = model(input_eval)
      # remove the batch dimension
    
      predictions = tf.squeeze(predictions, 0)

      # using a categorical distribution to predict the character returned by the model
      predictions = predictions / temperature
      predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()

      # We pass the predicted character as the next input to the model
      # along with the previous hidden state
      input_eval = tf.expand_dims([predicted_id], 0)

      text_generated.append(idx2char[predicted_id])

  return (start_string + ''.join(text_generated))

start_string = "ROMEO: Im fan of Boca Juniors, I love to play football and I like to eat"
# print(generate_text(model, start_string))
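
The effect of `temperature` is easier to see outside the model. Dividing the logits by the temperature before the softmax sharpens the distribution when the temperature is below 1 and flattens it when it is above 1. The sketch below uses plain Python and made-up logits; `softmax_with_temperature` and `sample_id` are hypothetical helpers, standing in for what `tf.random.categorical` does on the scaled logits.

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities after scaling by 1 / temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_id(logits, temperature=1.0, rng=random):
    """Draw one id at random, weighted by the tempered probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

toy_logits = [2.0, 1.0, 0.0]  # made-up scores for three characters
cold = softmax_with_temperature(toy_logits, 0.5)    # top choice gets about 0.87
neutral = softmax_with_temperature(toy_logits, 1.0)  # top choice gets about 0.67
hot = softmax_with_temperature(toy_logits, 2.0)     # top choice gets about 0.51

next_id = sample_id(toy_logits, temperature=0.5, rng=random.Random(0))
```

Sampling, rather than always taking the most likely character, keeps the generated text from falling into a repeating loop.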

In [64]:
text_generated = []
num_generate = 800

# Converting our start string to numbers (vectorizing)
input_eval = [char2idx[s] for s in start_string]
input_eval = tf.expand_dims(input_eval, 0)

print(input_eval.shape)

predictions = model(input_eval)
pred = predictions[0]

sampled_indices = tf.random.categorical(pred, num_samples=1)
sampled_indices = np.reshape(sampled_indices, (1, -1))[0]
predicted_chars = int_to_text(sampled_indices)

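temperature = 1.0
# drop the batch dimension: tf.random.categorical expects a 2-D [steps, vocab] tensor
predictions = tf.squeeze(predictions, 0) / temperature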

predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()
input_eval = tf.expand_dims([predicted_id], 0)

text_generated.append(idx2char[predicted_id])

(1, 72)


ValueError: Exception encountered when calling layer 'sequential_1' (type Sequential).

Input 0 of layer "lstm_1" is incompatible with the layer: expected shape=(64, None, 256), found shape=(1, 72, 256)

Call arguments received by layer 'sequential_1' (type Sequential):
  • inputs=tf.Tensor(shape=(1, 72), dtype=int32)
  • training=None
  • mask=None

The call fails because the model was built with `batch_size=64`, and a stateful LSTM fixes the batch dimension, so it rejects the single start string of shape (1, 72). To generate text, rebuild the model with `build_model(VOCAB_SIZE, EMBEDDING_DIM, RNN_UNITS, batch_size=1)` and load the trained weights into it before calling it on one sequence.