# Generation

## Setup

In [None]:
from google.colab import drive

drive.mount('/content/drive/')
%cd '/content/drive/My Drive/Deep Comedy/src'

Drive already mounted at /content/drive/; to attempt to forcibly remount, call drive.mount("/content/drive/", force_remount=True).
/content/drive/My Drive/Deep Comedy/src


In [None]:
import logging
import re
import time

import numpy as np
import tensorflow as tf

from masking import create_masks
from transformer import Transformer
from learning_rate_scheduler import CustomSchedule
from syllabification import syllabify, get_tokenizers

In [None]:
logging.getLogger('tensorflow').setLevel(logging.ERROR)  # suppress warnings

## Prepare the dataset

### Download and collect

Download the syllabified Divine Comedy text from [[1]](#asperti).

In [None]:
url = 'https://raw.githubusercontent.com/asperti/Dante/main'

names = ['inferno_syllnew.txt', 'purgatorio_syllnew.txt', 'paradiso_syllnew.txt']

paths = [tf.keras.utils.get_file(name, origin=f'{url}/{name}') for name in names]

Downloading data from https://raw.githubusercontent.com/asperti/Dante/main/inferno_syllnew.txt
Downloading data from https://raw.githubusercontent.com/asperti/Dante/main/purgatorio_syllnew.txt
Downloading data from https://raw.githubusercontent.com/asperti/Dante/main/paradiso_syllnew.txt


In [None]:
def cleanup(verse):
    verse = re.sub(r'[0-9]+', '', verse)  # remove verse numeration
    verse = re.sub(r'[!"(),\-.:;?«»—‘“”]+', '', verse)  # remove punctuation ("special" characters)
    verse = verse.strip()
    return f'<v>{verse}|</v>'

> `<v>` and `</v>` are the **_begin-of-verse_** and **_end-of-verse_** tokens, respectively.
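
As a quick check of what `cleanup` produces, the sketch below applies a function that behaves like the one defined above to two made-up raw lines. The layout of the raw lines (a leading `|`, syllables separated by `|`, optional punctuation) is an assumption inferred from the cleaned verses shown later, not copied from the downloaded files.

```python
import re

def cleanup(verse):
    verse = re.sub(r'[0-9]+', '', verse)  # remove verse numeration
    verse = re.sub(r'[!"(),\-.:;?«»—‘“”]+', '', verse)  # remove punctuation
    verse = verse.strip()
    return f'<v>{verse}|</v>'

# hypothetical raw lines, written by hand for illustration
raw_lines = [
    '|Nel |mez|zo |del |cam|min |di |no|stra |vi|ta\n',
    '|mi |ri|tro|vai |per |u|na |sel|va o|scu|ra,\n',
]

cleaned = [cleanup(line) for line in raw_lines]
for verse in cleaned:
    print(verse)
```

Note that `strip()` only removes surrounding whitespace, so the leading `|` survives and ends up right after `<v>`.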

In [None]:
def collect_verses(path):
    with open(path) as f:
        # remove blank lines
        return [cleanup(line) for line in f if line != '\n']

In [None]:
verses = [verse for path in paths for verse in collect_verses(path)]

In [None]:
verse_inside_canto_count = 0

for i in range(len(verses)):
    if '•' in verses[i]:
        # current verse is a header:
        # put a </t> at the end of the previous verse,
        # which is the last verse of a canto,
        # and as such it will be considered as part
        # of a "special" single-verse tercet.
        if i != 0: 
            verses[i-1] = f'{verses[i-1]}|</t>'
        verse_inside_canto_count = 0  # reset when at the beginning of a new canto
    else:
        if verse_inside_canto_count % 3 == 0:
            # first verse of a tercet
            verses[i] = f'<t>|{verses[i]}'
        if verse_inside_canto_count % 3 == 2:
            # last verse of a tercet
            verses[i] = f'{verses[i]}|</t>'
        verse_inside_canto_count += 1

> `<t>` and `</t>` are the **_begin-of-tercet_** and **_end-of-tercet_** tokens, respectively.

In [None]:
# last verse of the whole Comedy has not been taken care of,
# since there is no header after it
verses[-1] = f'{verses[-1]}|</t>'

# remove headers
verses = [verse for verse in verses if '•' not in verse]
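
To see the tercet marking in isolation, here is a minimal sketch that wraps the same logic in a function and runs it on toy data. The function name `mark_tercets` and the placeholder verses are hypothetical. As in the Comedy, each toy canto has 3n+1 verses, so its last verse ends up in a single-verse tercet.

```python
def mark_tercets(lines, header_marker='•'):
    lines = list(lines)  # work on a copy
    count = 0            # position of the current verse inside its canto
    for i in range(len(lines)):
        if header_marker in lines[i]:
            # a header closes the previous canto
            if i != 0:
                lines[i - 1] = f'{lines[i - 1]}|</t>'
            count = 0
        else:
            if count % 3 == 0:
                lines[i] = f'<t>|{lines[i]}'   # first verse of a tercet
            if count % 3 == 2:
                lines[i] = f'{lines[i]}|</t>'  # last verse of a tercet
            count += 1
    # no header follows the very last verse, so close it explicitly
    lines[-1] = f'{lines[-1]}|</t>'
    return [line for line in lines if header_marker not in line]

toy = ['• canto 1', 'a', 'b', 'c', 'd', '• canto 2', 'e', 'f', 'g', 'h']
marked = mark_tercets(toy)
print(marked)
```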

In [None]:
verses[:6]

['<t>|<v>|Nel |mez|zo |del |cam|min |di |no|stra |vi|ta|</v>',
 '<v>|mi |ri|tro|vai |per |u|na |sel|va o|scu|ra|</v>',
 '<v>|ché |la |di|rit|ta |via |e|ra |smar|ri|ta|</v>|</t>',
 '<t>|<v>|Ahi |quan|to a |dir |qual |e|ra è |co|sa |du|ra|</v>',
 '<v>|e|sta |sel|va |sel|vag|gia e |a|spra e |for|te|</v>',
 '<v>|che |nel |pen|sier |ri|no|va |la |pa|u|ra|</v>|</t>']

In [None]:
verses[132:139]

['<t>|<v>|che |tu |mi |me|ni |là |do|v’ or |di|ce|sti|</v>',
 '<v>|sì |ch’ io |veg|gia |la |por|ta |di |san |Pie|tro|</v>',
 '<v>|e |co|lor |cui |tu |fai |co|tan|to |me|sti|</v>|</t>',
 '<t>|<v>|Al|lor |si |mos|se e |io |li |ten|ni |die|tro|</v>|</t>',
 '<t>|<v>|Lo |gior|no |se |n’ an|da|va e |l’ ae|re |bru|no|</v>',
 '<v>|to|glie|va |li a|ni|mai |che |so|no in |ter|ra|</v>',
 '<v>|da |le |fa|ti|che |lo|ro e |io |sol |u|no|</v>|</t>']

> _**Allor si mosse, e io li tenni dietro**_ 
is the last verse of the first Canto of the Inferno, and, according to our notation, it constitutes a tercet all by itself.

In [None]:
verses[274:281]

['<t>|<v>|Or |va |ch’ un |sol |vo|le|re è |d’ am|be|due|</v>',
 '<v>|tu |du|ca |tu |se|gno|re e |tu |ma|e|stro|</v>',
 '<v>|Co|sì |li |dis|si e |poi |che |mos|so |fue|</v>|</t>',
 '<t>|<v>|in|trai |per |lo |cam|mi|no al|to e |sil|ve|stro|</v>|</t>',
 '<t>|<v>|Per |me |si |va |ne |la |cit|tà |do|len|te|</v>',
 '<v>|per |me |si |va |ne |l’ et|ter|no |do|lo|re|</v>',
 '<v>|per |me |si |va |tra |la |per|du|ta |gen|te|</v>|</t>']

> _**intrai per lo cammino alto e silvestro**_ 
is the last verse of the second Canto of the Inferno, and again, according to our notation, it constitutes a tercet all by itself.

In [None]:
verses[-7:]

['<t>|<v>|ma |non |e|ran |da |ciò |le |pro|prie |pen|ne|</v>',
 '<v>|se |non |che |la |mia |men|te |fu |per|cos|sa|</v>',
 '<v>|da |un |ful|go|re in |che |sua |vo|glia |ven|ne|</v>|</t>',
 '<t>|<v>|A |l’ al|ta |fan|ta|sia |qui |man|cò |pos|sa|</v>',
 '<v>|ma |già |vol|ge|va il |mio |di|sio |e ’l |vel|le|</v>',
 '<v>|sì |co|me |ro|ta |ch’ i|gual|men|te è |mos|sa|</v>|</t>',
 '<t>|<v>|l’ a|mor |che |mo|ve il |so|le e |l’ al|tre |stel|le|</v>|</t>']

> The same holds for _**l’amor che move il sole e l’altre stelle**_, which is the last verse of the last Canto of the Paradiso, and, as such, the last verse of the whole Comedy.

In [None]:
NUM_VERSES = len(verses)

The dataset is a collection of `(input, target)` pairs, where:

* `input` is a sequence of three consecutive verses from the Divine Comedy.
* `target` is the verse that immediately follows the `input` verses.
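
The pairing can be pictured as a window of three verses sliding over the text one verse at a time. A minimal sketch on placeholder strings (the names `toy_verses` and `pairs` are illustrative only):

```python
toy_verses = ['v0', 'v1', 'v2', 'v3', 'v4']

# each window of three verses is paired with the verse that follows it
pairs = [
    ('|'.join(toy_verses[i:i + 3]), toy_verses[i + 3])
    for i in range(len(toy_verses) - 3)
]

for inp, tar in pairs:
    print(f'{inp}  ->  {tar}')
```

With N verses this yields N-3 pairs. Since the window moves by one verse, it is not aligned to tercet boundaries, so the `<t>` and `</t>` tokens can appear at any of the three verse positions in an input.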

In [None]:
input = []
target = []
for i in range(NUM_VERSES-3):
    input.append(verses[i] + '|' + verses[i+1] + '|' + verses[i+2])
    target.append(verses[i+3])

In [None]:
input[-1]

'<t>|<v>|A |l’ al|ta |fan|ta|sia |qui |man|cò |pos|sa|</v>|<v>|ma |già |vol|ge|va il |mio |di|sio |e ’l |vel|le|</v>|<v>|sì |co|me |ro|ta |ch’ i|gual|men|te è |mos|sa|</v>|</t>'

In [None]:
target[-1]

'<t>|<v>|l’ a|mor |che |mo|ve il |so|le e |l’ al|tre |stel|le|</v>|</t>'

In [None]:
FIRST_TERCET = input[0]

### Tokenize

The dataset will be tokenized by syllables.

In [None]:
tokenizer = tf.keras.preprocessing.text.Tokenizer(filters='', split='|')
tokenizer.fit_on_texts(verses)

tensor_input = tokenizer.texts_to_sequences(input)
tensor_input = tf.keras.preprocessing.sequence.pad_sequences(tensor_input, padding='post').astype(np.int64)

tensor_target = tokenizer.texts_to_sequences(target)
tensor_target = tf.keras.preprocessing.sequence.pad_sequences(tensor_target, padding='post').astype(np.int64)  

> Notice that the `Tokenizer` instance has the `lower` argument set to `True` (by default), therefore the text will be converted to lowercase.
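
The following is a rough, dependency-free approximation of the tokenize-and-pad step, not the Keras implementation. It assumes that ids are assigned by descending token frequency starting from 1, and that 0 is reserved for padding; `fit_vocab` and `encode` are hypothetical helper names.

```python
from collections import Counter

def fit_vocab(texts, sep='|'):
    """Map each lowercased token to an id; more frequent tokens get smaller ids."""
    counts = Counter()
    for text in texts:
        counts.update(tok for tok in text.lower().split(sep) if tok)
    # ids start at 1 because 0 is kept for padding (assumption)
    return {tok: i for i, (tok, _) in enumerate(counts.most_common(), start=1)}

def encode(texts, vocab, sep='|'):
    """Turn texts into id sequences, right-padded with 0 to a common length."""
    seqs = [[vocab[tok] for tok in text.lower().split(sep) if tok] for text in texts]
    width = max(len(seq) for seq in seqs)
    return [seq + [0] * (width - len(seq)) for seq in seqs]

toy = ['<v>|Nel |mez|zo|</v>', '<v>|mi |ri|</v>']
vocab = fit_vocab(toy)
encoded = encode(toy, vocab)
print(encoded)
```

Because the text is lowercased before splitting, `Nel ` and `nel ` share one id, and a syllable keeps its trailing space as part of the token.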

In [None]:
tensor_input[-1]

array([   3,    1,   70,   92,   48,  307,   18,  348,  182,  181, 1495,
        318,   40,    2,    1,   36,  136,   86,  139,  810,  125,    9,
        691,  361,  346,   54,    2,    1,   22,    7,   28,   58,   48,
       1873, 1109,   71, 1558,  325,   40,    2,    4,    0,    0,    0])

In [None]:
tokenizer.sequences_to_texts([tensor_input[-1]])

['<t> <v> a  l’ al ta  fan ta sia  qui  man cò  pos sa </v> <v> ma  già  vol ge va il  mio  di sio  e ’l  vel le </v> <v> sì  co me  ro ta  ch’ i gual men te è  mos sa </v> </t>']

Let's make the output of `sequences_to_texts` a bit more readable:

In [None]:
def sequences_to_texts(sequences):
    texts = []
    for sequence in sequences:
        text = ''
        for i in sequence:
            if i != 0 and i != tokenizer.word_index['<t>'] and i != tokenizer.word_index['<v>']:
                if i == tokenizer.word_index['</t>']:
                    text += '\n'
                elif i == tokenizer.word_index['</v>']:
                    text = text[:-1]
                    text += '\n'
                else:
                    text += tokenizer.index_word[i] + '|'
        texts.append(text)
    return texts

In [None]:
print(sequences_to_texts([tensor_input[-1]])[0])

a |l’ al|ta |fan|ta|sia |qui |man|cò |pos|sa
ma |già |vol|ge|va il |mio |di|sio |e ’l |vel|le
sì |co|me |ro|ta |ch’ i|gual|men|te è |mos|sa




In [None]:
tensor_target[-1]

array([   3,    1,  148,  308,    5,   50, 1233,   21,  526,   92,  171,
        383,   54,    2,    4])

In [None]:
print(sequences_to_texts([tensor_target[-1]])[0])

l’ a|mor |che |mo|ve il |so|le e |l’ al|tre |stel|le




In [None]:
VOCAB_SIZE = len(tokenizer.word_index)  # number of tokens (= syllables and special tokens) in the vocabulary
INPUT_SEQ_SIZE = len(tensor_input[0])  # length of a padded input sequence (= three consecutive verses)
TARGET_SEQ_SIZE = len(tensor_target[0])  # length of a padded target sequence (= the next verse)

### Create the dataset

In [None]:
BATCH_SIZE = 64
BUFFER_SIZE = len(input)

dataset = tf.data.Dataset.from_tensor_slices((tensor_input, tensor_target)).shuffle(BUFFER_SIZE)
dataset = dataset.batch(BATCH_SIZE, drop_remainder=True)

## Define the model

### Set the hyperparameters


In [None]:
NUM_LAYERS = 2
D_MODEL = 128
DFF = 256
NUM_HEADS = 2
DROPOUT_RATE = 0.1

> The values used in the base model of the original transformer [[2]](#attention) are:
```
NUM_LAYERS=6
D_MODEL=512
DFF=2048
NUM_HEADS=8
DROPOUT_RATE=0.1
```

In [None]:
generator = Transformer(
    num_layers=NUM_LAYERS,
    d_model=D_MODEL,
    num_heads=NUM_HEADS,
    dff=DFF,
    input_vocab_size=VOCAB_SIZE+1,
    target_vocab_size=VOCAB_SIZE+1,
    pe_input=1000,
    pe_target=1000,
    rate=DROPOUT_RATE)

### Choose the optimizer


In [None]:
optimizer = tf.keras.optimizers.Adam(CustomSchedule(D_MODEL), 
                                     beta_1=0.9, beta_2=0.98, epsilon=1e-9)

### Choose the metrics


In [None]:
loss_object = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction='none')

> Since the target sequences are padded, it is important to apply a padding mask when calculating loss and accuracy.

In [None]:
def loss_function(real, pred):
  mask = tf.math.logical_not(tf.math.equal(real, 0))
  loss_ = loss_object(real, pred)

  mask = tf.cast(mask, dtype=loss_.dtype)
  loss_ *= mask

  return tf.reduce_sum(loss_)/tf.reduce_sum(mask)


def accuracy_function(real, pred):
  accuracies = tf.equal(real, tf.argmax(pred, axis=2))

  mask = tf.math.logical_not(tf.math.equal(real, 0))
  accuracies = tf.math.logical_and(mask, accuracies)

  accuracies = tf.cast(accuracies, dtype=tf.float32)
  mask = tf.cast(mask, dtype=tf.float32)
  return tf.reduce_sum(accuracies)/tf.reduce_sum(mask)
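
To make the effect of the padding mask concrete, here is a minimal pure-Python sketch of a masked cross-entropy over one sequence. It is an illustration of the idea, not a drop-in replacement for `loss_function`; the name `masked_cross_entropy` and the toy values are hypothetical.

```python
import math

def masked_cross_entropy(targets, logits, pad_id=0):
    """Mean cross-entropy over the positions whose target is not padding."""
    total, count = 0.0, 0
    for target, row in zip(targets, logits):
        if target == pad_id:
            continue  # padded position: contributes neither loss nor weight
        peak = max(row)
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_norm - row[target]  # -log softmax(row)[target]
        count += 1
    return total / count

targets = [1, 2, 0]  # the last position is padding
logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [9.0, 0.0, 0.0]]
loss = masked_cross_entropy(targets, logits)
print(loss)
```

The two real positions have uniform logits over three classes, so each contributes log 3, and the padded position is ignored whatever its logits are.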

In [None]:
train_loss = tf.keras.metrics.Mean(name='train_loss')
train_accuracy = tf.keras.metrics.Mean(name='train_accuracy')

## Train

In the following, the target is divided into `tar_inp` and `tar_real`:
* `tar_inp` is passed as an input to the decoder.
* `tar_real` is that same input shifted by 1: at each location in `tar_inp`, `tar_real` contains the next token that should be predicted.

The transformer is an auto-regressive model: it makes predictions one token at a time, and uses its output so far to decide what to predict next.

During training we use **_teacher-forcing_**, i.e. passing the true output to the next time step regardless of what the model predicts at the current time step.

As the transformer predicts each token, self-attention allows it to look at the previous tokens in the input sequence to better predict the next token.

To prevent the model from peeking at the expected output, the model uses a look-ahead mask.
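
A small sketch of the shift and of a look-ahead mask, in plain Python. The token ids are made up, and the mask convention (1 marks a position to hide) is an assumption; the `create_masks` function imported from `masking` may use a different layout.

```python
tar = [3, 1, 148, 308, 2, 4]  # hypothetical token ids for one target verse

tar_inp = tar[:-1]   # what the decoder receives
tar_real = tar[1:]   # what it should predict at each position

def look_ahead_mask(size):
    # row i describes what position i may attend to:
    # 0 = visible (positions <= i), 1 = hidden (positions > i)
    return [[1 if j > i else 0 for j in range(size)] for i in range(size)]

mask = look_ahead_mask(len(tar_inp))
for row in mask:
    print(row)
```

With this mask, the prediction at position `i` can only depend on `tar_inp[0..i]`, which is exactly the information available at generation time.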

In [None]:
EPOCHS = 80

In [None]:
train_step_signature = [
    tf.TensorSpec(shape=(BATCH_SIZE, INPUT_SEQ_SIZE), dtype=tf.int64),
    tf.TensorSpec(shape=(BATCH_SIZE, TARGET_SEQ_SIZE), dtype=tf.int64)
]

In [None]:
@tf.function(input_signature=train_step_signature)
def train_step(inp, tar):
  tar_inp = tar[:, :-1]
  tar_real = tar[:, 1:]

  enc_padding_mask, combined_mask, dec_padding_mask = create_masks(inp, tar_inp)

  with tf.GradientTape() as tape:
    predictions, _ = generator(inp, tar_inp,
                                 True,
                                 enc_padding_mask,
                                 combined_mask,
                                 dec_padding_mask)
    loss = loss_function(tar_real, predictions)

  gradients = tape.gradient(loss, generator.trainable_variables)
  optimizer.apply_gradients(zip(gradients, generator.trainable_variables))

  train_loss(loss)
  train_accuracy(accuracy_function(tar_real, predictions))

In [None]:
'''for epoch in range(EPOCHS):
  start = time.time()

  train_loss.reset_states()
  train_accuracy.reset_states()

  for (batch, (inp, tar)) in enumerate(dataset):
    train_step(inp, tar)

    if batch % 50 == 0:
      print(f'Epoch {epoch + 1} Batch {batch} Loss {train_loss.result():.4f} Accuracy {train_accuracy.result():.4f}')

  print(f'\n\t---Results Epoch {epoch + 1}---')
  print(f'Loss {train_loss.result():.4f} Accuracy {train_accuracy.result():.4f}')
    
  print(f'Time taken for 1 epoch: {time.time() - start:.2f} secs\n')'''

In [None]:
%cd '/content/drive/My Drive/Deep Comedy'

/content/drive/My Drive/Deep Comedy


In [None]:
#generator.save_weights('generation_weights/')

In [None]:
generator.load_weights('generation_weights/')

<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f2183b40550>

## Generate

The generation process unfolds through the following steps:

1. The first tercet of the Divine Comedy is given as input to the encoder.
2. The decoder input is initialized to the begin-of-tercet (`<t>`) token.
3. Calculate the padding masks and the look-ahead mask.
4. The model predicts the next token for each position in the decoder input. Most of these predictions are redundant: only the prediction for the last position is used.
5. Concatenate the predicted token to the decoder input and pass it to the decoder again. Repeat steps 3 to 5 until the end-of-verse (`</v>`) token is predicted.
6. Once a whole verse has been generated, remove the `|` symbols from the verse and give it to the syllabifier:
    * If the syllabifier output matches the verse as it was generated, count the syllables:
        * If there are 11 of them, go to step 7.
        * Otherwise, reset the decoder input to its first token (either `<t>` or `<v>`) and go back to step 3.
    * Otherwise, reset the decoder input to its first token (either `<t>` or `<v>`) and go back to step 3.
7. Remove the first verse from the encoder input.
8. Append the newly generated verse to the encoder input.
9. Reset the decoder input to `<t>` if a tercet has just been completed, or to `<v>` otherwise, and go back to step 3.
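
The control flow of these steps can be sketched with the model and the syllabifier replaced by plain callables. Everything here is a stand-in: `generate_verses`, `sample_verse` and `is_valid` are hypothetical names, and verses are ordinary strings rather than token tensors.

```python
def generate_verses(first_tercet, sample_verse, is_valid, num_verses):
    window = list(first_tercet)  # encoder input: the three most recent verses
    generated = []
    while len(generated) < num_verses:
        candidate = sample_verse(window)  # steps 3-5: decode one whole verse
        if not is_valid(candidate):       # step 6: syllabifier and length checks
            continue                      # discard the verse and sample again
        window = window[1:] + [candidate]  # steps 7-8: slide the window
        generated.append(candidate)
    return generated

# deterministic stubs so that the flow can be followed by hand
proposals = iter(['too long', 'v3', 'v4'])
poem = generate_verses(
    first_tercet=['v0', 'v1', 'v2'],
    sample_verse=lambda window: next(proposals),
    is_valid=lambda verse: verse != 'too long',
    num_verses=2,
)
print(poem)
```

Note that the loop has no retry limit: if the checks never pass, it does not terminate.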

### Load the syllabifier

In [None]:
syllabifier = Transformer(
    num_layers=2,
    d_model=128,
    num_heads=2,
    dff=256,
    input_vocab_size=80+1,
    target_vocab_size=81+1,
    pe_input=1000,
    pe_target=1000,
    rate=0.1)

In [None]:
syllabifier.load_weights('syllabification_weights/')

<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f21837c7d10>

In [None]:
tokenizer_nosyll, tokenizer_syll = get_tokenizers()

Downloading data from https://raw.githubusercontent.com/asperti/Dante/main/inferno.txt
Downloading data from https://raw.githubusercontent.com/asperti/Dante/main/purgatorio.txt
Downloading data from https://raw.githubusercontent.com/asperti/Dante/main/paradiso.txt


### Define the generation process

In [None]:
def predict(encoder_input, decoder_input, k=10):
    enc_padding_mask, combined_mask, dec_padding_mask = create_masks(encoder_input, decoder_input)

    # predictions.shape == (batch_size, seq_len, vocab_size)
    predictions, attention_weights = generator(encoder_input,
                                               decoder_input,
                                               False,
                                               enc_padding_mask,
                                               combined_mask,
                                               dec_padding_mask)

    # select the prediction for the last token along the seq_len dimension
    predictions = predictions[:, -1:, :]  # (1, 1, vocab_size)
    
    # top-k sampling
    predictions = tf.squeeze(predictions)  # (vocab_size,)
    logits, indices = tf.math.top_k(predictions, k)
    probs = tf.keras.activations.softmax(tf.expand_dims(logits, 0))[0]

    indices = np.asarray(indices).astype('int32')
    probs = np.asarray(probs).astype('float32')
    predicted_id = np.random.choice(indices, p=probs)

    return tf.constant(predicted_id, dtype=tf.int64, shape=(1,1))

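> The next token is chosen by **_top-k sampling_** (with `k=10` by default): only the `k` highest-scoring tokens are kept, their logits are turned into probabilities with a softmax, and one of them is drawn at random according to those probabilities.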
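
For reference, top-k sampling can be written without TensorFlow or NumPy. The sketch below follows the same recipe as `predict` but uses only the standard library; `top_k_sample` and the toy logits are hypothetical.

```python
import math
import random

def top_k_sample(logits, k=10, rng=random):
    # indices of the k largest logits, best first
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # unnormalized softmax over the kept logits (shifted by the max for stability)
    peak = logits[top[0]]
    weights = [math.exp(logits[i] - peak) for i in top]
    # random.choices normalizes the weights internally
    return rng.choices(top, weights=weights, k=1)[0]

toy_logits = [0.1, 5.0, 4.0, -3.0]
rng = random.Random(0)
samples = [top_k_sample(toy_logits, k=2, rng=rng) for _ in range(200)]
print(sorted(set(samples)))
```

With `k=1` this reduces to greedy decoding, while larger values of `k` trade some fidelity for variety in the generated verses.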

In [None]:
def has_issues(decoder_input):
    verse = sequences_to_texts(decoder_input.numpy())[0].replace('\n','')
    input_syll = f'<{verse.replace("|","")}>'
    
    output_syll = syllabify(
        input_syll, 
        syllabifier, 
        tokenizer_nosyll, 
        tokenizer_syll)[0].replace('<|','').replace('>','')

    if verse != output_syll:
        print('\nFOUND CONFLICT:')
        print(f'{"Generator output:":20s} {verse}')
        print(f'{"Syllabifier output:":20s} {output_syll}')
        return True

    num_syll = len(verse.split('|'))
    if num_syll != 11:
        print('\nFOUND WRONG NUMBER OF SYLLABLES:')
        print(f'Generator output = Syllabifier output: {verse}')
        print(f'Number of syllables: {num_syll}')
        return True
    
    return False
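
The two string manipulations used by `has_issues` are easy to check by hand. In this sketch the helper names are hypothetical; the first verse is the opening line of the Comedy in lowercase, and the second is one of the verses rejected in the run shown further down.

```python
def count_syllables(verse):
    # syllables are separated by '|', so n separators mean n + 1 syllables
    return len(verse.split('|'))

def to_syllabifier_input(verse):
    # drop the syllable marks and wrap the plain text in '<' and '>',
    # as done in has_issues
    return f'<{verse.replace("|", "")}>'

good = 'nel |mez|zo |del |cam|min |di |no|stra |vi|ta'
bad = 'pa|re|va |già |nel |cam|bian|do |par|ran |trat|te'

print(count_syllables(good), count_syllables(bad))
print(to_syllabifier_input(good))
```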

In [None]:
def generate(k=10):
    encoder_input = tokenizer.texts_to_sequences([FIRST_TERCET])
    encoder_input = tf.convert_to_tensor(encoder_input, dtype=tf.int64)
    
    SOT = tf.constant(tokenizer.word_index['<t>'], dtype=tf.int64)
    EOT = tf.constant(tokenizer.word_index['</t>'], dtype=tf.int64)
    SOV = tf.constant(tokenizer.word_index['<v>'], dtype=tf.int64)
    EOV = tf.constant(tokenizer.word_index['</v>'], dtype=tf.int64)
    
    decoder_input = tf.convert_to_tensor([SOT], dtype=tf.int64)
    decoder_input = tf.expand_dims(decoder_input, axis=0)
    initial_decoder_input = decoder_input
    final_decoder_input = decoder_input
    
    verse_count = 0
    strings = []
    
    while verse_count < 33:
        # concatenate the predicted_id to the output,
        # which is then given to the decoder as its own input
        predicted_id = predict(encoder_input, decoder_input, k)
        decoder_input = tf.concat([decoder_input, predicted_id], axis=-1)
        
        if predicted_id == EOV:

            if has_issues(decoder_input):
                # the verse either disagrees with the syllabifier output or does not have 11 syllables,
                # therefore discard the verse and generate a new one
                decoder_input = initial_decoder_input
                continue

            if (verse_count + 1) % 3 == 0:
                # 3 verses have been generated, i.e. we are at the end of a tercet
                eot = tf.convert_to_tensor([EOT], dtype=tf.int64)
                eot = tf.expand_dims(eot, axis=0)
                decoder_input = tf.concat([decoder_input, eot], axis=-1)

            # remove first verse from encoder input
            encoder_input = encoder_input.numpy()
            i = np.where(encoder_input[0] == EOV.numpy())[0][0]
            encoder_input = encoder_input[:, i+1:]
            encoder_input = tf.convert_to_tensor(encoder_input, dtype=tf.int64)
            
            # concatenate newly generated verse to encoder input
            encoder_input = tf.concat([encoder_input, decoder_input], axis=-1)

            # save generated verse for final display
            final_decoder_input = tf.concat([final_decoder_input, decoder_input], axis=-1)

            # restart decoder input from either SOV or SOT token
            if decoder_input[0,-1] == EOT.numpy(): 
                decoder_input = tf.convert_to_tensor([SOT], dtype=tf.int64)
            else:
                decoder_input = tf.convert_to_tensor([SOV], dtype=tf.int64)
            decoder_input = tf.expand_dims(decoder_input, axis=0)
            initial_decoder_input = decoder_input
            
            verse_count += 1
        
    return sequences_to_texts(final_decoder_input.numpy())[0]

### Display the results

In [None]:
generated_verses = generate()


FOUND WRONG NUMBER OF SYLLABLES:
Generator output = Syllabifier output: pa|re|va |già |nel |cam|bian|do |par|ran |trat|te
Number of syllables: 12

FOUND WRONG NUMBER OF SYLLABLES:
Generator output = Syllabifier output: col |ca|ri|me e|ra|no |la|vi ’l |boc|ca
Number of syllables: 10

FOUND WRONG NUMBER OF SYLLABLES:
Generator output = Syllabifier output: fu|ron |cre|a|te a |bel|la |don|na |via
Number of syllables: 10


In [None]:
print(generated_verses)

ahi |quan|to a |dir |qual |e|ra è |co|sa |du|ra
e|sta |sel|va |sel|vag|gia e |a|spra e |for|te
che |nel |pen|sier |ri|no|va |la |pa|u|ra

tan|t’ è |l’ or|ri|mi|ra |di |sé |ri|mor|te
per |che |del |po|sto |mor|tal |quan|do |pon|ti
che |per |lo |de|mo|lar |ne’ |par|ve|re o|te

o|gne |vir|tù |for|vien |che |s’ io |ri|dir|ti
dis|s’ io |a |me |tal|vol|ta |sì |com’ |a|mo
se |non |eb|be |po|suo|ta |li |sì |for|ti

vi|di |per |lo |no|stro a|mor |dis|se |co|mo
io |nol |di|scon|dea |sì |ch’ al|tra |fï|a|te
lo |ciel |che |pian|ger |più |sta|van |di |fo|mo

lo |mio |ma|e|stro |son |sì |su|so il |sguar|te
in |cac|cia|ti e |che ’l |so|lo è |sì |par|te|re
co|min|ciò |tut|ti |li al|tri |che ’l |ciel |fron|te

lo |du|ca |mio |si |vol|te in |ti|ca |fi|re
lo |fie|re |de |la |mor|te |dis|se |scoc|ca
l’ ar|ti|co |del |cam|mem|bru|to e |che |fo|re

e |nes|sun |ri|pa|re in|te|bru|na |bar|ca
li |ri|chiu|se il |fum|mo e |quin|di |ri|pa |vò
do|ve |can|ta|re |do|ve |la |com|par|ca

co|me |la |vo|ce |vo|lon|tier 

In [None]:
print(generated_verses.replace('|',''))

ahi quanto a dir qual era è cosa dura
esta selva selvaggia e aspra e forte
che nel pensier rinova la paura

tant’ è l’ orrimira di sé rimorte
per che del posto mortal quando ponti
che per lo demolar ne’ parvere ote

ogne virtù forvien che s’ io ridirti
diss’ io a me talvolta sì com’ amo
se non ebbe posuota li sì forti

vidi per lo nostro amor disse como
io nol discondea sì ch’ altra fïate
lo ciel che pianger più stavan di fomo

lo mio maestro son sì suso il sguarte
in cacciati e che ’l solo è sì partere
cominciò tutti li altri che ’l ciel fronte

lo duca mio si volte in tica fire
lo fiere de la morte disse scocca
l’ artico del cammembruto e che fore

e nessun ripare intebruna barca
li richiuse il fummo e quindi ripa vò
dove cantare dove la comparca

come la voce volontier conovo
ma ciascuna bella donna cinbile
ti faccia tutto ’l mio che ti distevo

e quella donna che mi scala insele
dinanzi che l’ altra roccia son quella
ne le gambe con le marcheruscelle

ma prima equalità v’ apparse b

## References
<a name="asperti">[1]</a> [`Dante` repository at prof. Asperti's GitHub page](https://github.com/asperti/Dante)
<br>
<a name="attention">[2]</a> [Attention Is All You Need, Vaswani et al., 2017](https://arxiv.org/abs/1706.03762)