### Lib download

The library used here, [textgenrnn](https://github.com/minimaxir/textgenrnn), can be installed with pip, as follows:

In [1]:
!pip3 install textgenrnn



### Importing lib and testing sample

In [2]:
from textgenrnn import textgenrnn

textgen = textgenrnn()
textgen.generate_samples(prefix="I don't like ")

Using TensorFlow backend.


####################
Temperature: 0.2
####################
I don't like this before you can see the top of the first time in the same card and he was a serious story?

I don't like the state of the state of the same time in the first time in the same card show and all of the states of the state of the state of the state of the store to the state of the star on the state of the same series of the state of the world in the same to start and the subreddit is on the

I don't like this on the same time to start a stranger to the state of the story of the world is a life and when I was a good country to the student in the state in the first time in the sidewalk because they don't have a stream on the story of the state of the streets and the most picture of the 

####################
Temperature: 0.5
####################
I don't like blowing a parador on the united track for the first time. I didn't know what to do what the destroyers?

I don't like the position of a trailer for my boyfriend
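
The samples above are grouped by temperature, and the low-temperature ones are noticeably more repetitive. As a rough illustration of why, the sketch below rescales a toy probability distribution in plain Python. The function name `apply_temperature` and the toy values are assumptions made for this example; they are not part of textgenrnn's API, and the library's own sampling code may differ in detail.

```python
import math


def apply_temperature(probs, temperature):
    """Rescale a probability distribution by a sampling temperature.

    Temperatures below 1.0 sharpen the distribution (the likeliest token
    dominates); a temperature of 1.0 leaves it unchanged.
    Assumes every probability is strictly positive.
    """
    # Move to log space and divide by the temperature.
    scaled = [math.log(p) / temperature for p in probs]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    # Renormalize so the result sums to 1 again.
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical next-token probabilities, for illustration only.
toy_probs = [0.6, 0.3, 0.1]
for t in (0.2, 0.5, 1.0):
    print(t, [round(p, 3) for p in apply_temperature(toy_probs, t)])
```

At a temperature of 0.2 almost all of the probability mass moves to the likeliest option, which is consistent with the repeated "the state of the" loops in the first group of samples.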

### Defining a new model

The following model is trained on the book *Dom Casmurro* by Machado de Assis. The book was found as a .txt file at this [link](https://archive.org/stream/DomCasmurro/Dom%20Casmurro_djvu.txt). It was necessary to remove blank lines, a task we did automatically using Notepad++. We also changed the EOL character to match the Linux convention.
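
The same cleanup could also be scripted instead of done by hand. The snippet below is a minimal sketch of that idea, not the procedure actually used for this notebook (that was Notepad++); the function name `clean_text` and the sample string are made up for illustration.

```python
def clean_text(raw: str) -> str:
    """Normalize line endings to LF and drop blank lines."""
    # Convert Windows (CRLF) and old Mac (CR) line endings to Unix (LF).
    unified = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Keep only lines that contain something other than whitespace.
    kept = [line for line in unified.split("\n") if line.strip()]
    # End the text with a single newline unless nothing was kept.
    return "\n".join(kept) + ("\n" if kept else "")


# Hypothetical input with CRLF endings and blank lines.
sample = "Capitulo I\r\n\r\nDo titulo\r\n   \r\nUma noite destas\r\n"
print(repr(clean_text(sample)))
```

To apply it to a file, one option is to read the raw text with `open(path, newline="")`, so that Python does not translate the line endings on the way in, and to write the result back with `newline="\n"`.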

In [3]:
file_name = "datasets/dom_casmurro.txt"

model_1 = textgenrnn(name="models/dom_casmurro")

model_cfg = {
    'word_level': True,          # set to True to train a word-level model (requires more data and a smaller max_length)
    'rnn_size': 128,             # number of LSTM cells in each layer (128/256 recommended)
    'rnn_layers': 3,             # number of LSTM layers (>2 recommended)
    'rnn_bidirectional': True,   # consider text both forwards and backwards; can give a training boost
    'max_length': 10,            # number of tokens to consider before predicting the next (20-40 for characters, 5-10 for words recommended)
    'max_words': 2000,           # maximum number of words to model; the rest are ignored (word-level model only). Note: not passed to train_from_file below
}

train_cfg = {
    'line_delimited': False,   # set to True if each text has its own line in the source file. Note: not passed to train_from_file below
    'num_epochs': 15,          # set higher to train the model for longer
    'gen_epochs': 5,           # generate sample text from the model after this many epochs
    'train_size': 0.8,         # proportion of input data to train on; setting < 1.0 keeps the model from memorizing the data perfectly
    'dropout': 0.2,            # ignore a random proportion of source tokens each epoch, helping the model generalize
    'validation': True,        # if train_size < 1.0, test on the holdout dataset; makes overall training slower
    'is_csv': False            # set to True if the file is a CSV exported from Excel/BigQuery/pandas
}

model_1.train_from_file(
    file_path=file_name,
    new_model=True,
    num_epochs=train_cfg['num_epochs'],
    gen_epochs=train_cfg['gen_epochs'],
    batch_size=1024,
    train_size=train_cfg['train_size'],
    dropout=train_cfg['dropout'],
    validation=train_cfg['validation'],
    is_csv=train_cfg['is_csv'],
    rnn_layers=model_cfg['rnn_layers'],
    rnn_size=model_cfg['rnn_size'],
    rnn_bidirectional=model_cfg['rnn_bidirectional'],
    max_length=model_cfg['max_length'],
    dim_embeddings=100,
    word_level=model_cfg['word_level'],
)

5,756 texts collected.
Training new model w/ 3-layer, 128-cell Bidirectional LSTMs
Training on 70,406 word sequences.
Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
####################
Temperature: 0.2
####################
nao me lembra .

nao , e que eu nao sei que o que

nao , bentinho , e que eu nao sabia ,

####################
Temperature: 0.5
####################
nao , bentinho ; mas eu ihe deu ; a

vez que era o padre cabral , e o que nao me fez lembrar a

que nao , como se alguem , o que eu

####################
Temperature: 1.0
####################
com alguma vez a bulha , a dotes que tal

padre sobre uma bela edigao . as devotas sao nada ,

olhos no quarto ? protonotario . mas de repente , onde me assustou

Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
####################
Temperature: 0.2
####################
nao , bentinho ; eu nao posso ser padre .

nao , bentinho ; eu nao me atrevo nem peraltas na vizinhanga , nem

nao , bentinho , mas eu nao me atrevo 

### Testing text generation with our model

Let's test text generation using the names of some of the protagonists as prefixes.

In [4]:
print('[INFO] Generating text with "Capitu " as prefix...')
model_1.generate(5, prefix="Capitu")

print('\n[INFO] Generating text with "Bentinho" as prefix...')
model_1.generate(5, prefix="Bentinho")

print('\n[INFO] Generating text with "Escobar" as prefix...')
model_1.generate(5, prefix="Escobar")

[INFO] Generating text with "Capitu " as prefix...
capitu , bentinho ?

capitu nao me lembra bem .

capitu , acharam - se a primeira vez que nao

capitu .

capitu !


[INFO] Generating text with "Bentinho" as prefix...
bentinho ?

bentinho , e certo que nao restabelecemos logo a vida .

bentinho ?

bentinho , mas eu nao queria ouvir o que a

bentinho , mas o momento da hora e o trem da


[INFO] Generating text with "Escobar" as prefix...
escobar , e um gesto , mas o furor na sala , sem ver se

escobar , e o gesto foi sempre a missa .

escobar , como nos , e a morte , esperando que a pessoa que me levou o cao ao resto , me

escobar , e o gesto nao era seguro .

escobar , e eu nao desci os novos . o que e via o

