## Text Generation with a Recurrent Neural Network (RNN)

Acknowledgement: Max Woolf https://github.com/minimaxir/textgenrnn

#### Introduction

Text generation remains a challenging problem, even for large data science teams, so we'll explore a common and accessible way to approach it, starting at a fairly basic level. The approach we will attempt in this notebook is:

* Character- and word-based RNNs using the library textgenrnn

textgenrnn is a Python module built on top of Keras/TensorFlow that can easily generate text using a pretrained recurrent neural network. It is a project by Max Woolf (https://github.com/minimaxir/textgenrnn).

#### Train a new model
You can train a new model using any modern RNN architecture you want by:
* calling train_new_model if supplying texts, or adding a new_model=True parameter if training from a file. If you do, the model will save a config file and a vocab file in addition to the weights, and those must also be loaded into a textgenrnn instance.

The config parameters available are:
* word_level: Whether to train the model at the word level (default: False)
* rnn_layers: Number of recurrent LSTM layers in the model (default: 2)
* rnn_size: Number of cells in each LSTM layer (default: 128)
* rnn_bidirectional: Whether to use Bidirectional LSTMs, which account for sequences both forwards and backwards. Recommended if the input text follows a specific schema. (default: False)
* max_length: Maximum number of previous characters/words to use before predicting the next token. This value should be reduced for word-level models (default: 40)
* max_words: Maximum number of words (by frequency) to consider for training (default: 10000)
* dim_embeddings: Dimensionality of the character/word embeddings (default: 100)
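
To make `word_level`, `max_length`, and `max_words` more concrete, here is a minimal sketch in plain Python of how a text could be split into tokens, capped to a frequency-ranked vocabulary, and turned into (context, next token) training pairs. It is an illustration only and is not textgenrnn's actual preprocessing; the helper names `tokenize`, `build_vocab`, and `make_training_pairs` are hypothetical.

```python
from collections import Counter


def tokenize(text, word_level=False):
    """Split text into whitespace-separated words, or into single characters."""
    return text.split() if word_level else list(text)


def build_vocab(tokens, max_words=10000):
    """Keep only the max_words most frequent tokens."""
    return [tok for tok, _ in Counter(tokens).most_common(max_words)]


def make_training_pairs(tokens, max_length=40):
    """Pair each token with up to max_length tokens that precede it."""
    pairs = []
    for i in range(1, len(tokens)):
        context = tokens[max(0, i - max_length):i]
        pairs.append((context, tokens[i]))
    return pairs


text = "we did the worksheet and we did the project"

word_tokens = tokenize(text, word_level=True)
char_tokens = tokenize(text, word_level=False)
print(len(word_tokens), "word tokens vs", len(char_tokens), "character tokens")

# A small max_words keeps only the most frequent words.
print(build_vocab(word_tokens, max_words=3))

# A small max_length limits how much context precedes each predicted token.
for context, target in make_training_pairs(word_tokens, max_length=3)[:4]:
    print(context, "->", target)
```

The same sentence yields far fewer tokens at the word level than at the character level, which is one way to see why a `max_length` of 40 covers much more text for a word-level model.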



In [1]:
from textgenrnn import textgenrnn

Using TensorFlow backend.


In [4]:
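## The input file format is simply one document per line.
## When preparing the file, wrap each document in opening and closing quotes so that it is preprocessed accurately.
## In the output, the temperature value (0 to 1) refers to the level of creativity:
## lower values give more predictable text, higher values give more varied text.
##
## Note: the config options (word_level, rnn_size, ...) are described above for new models (new_model=True).
## The output below reports "character sequences", which suggests word_level=True was not applied in this run.

## At the end of training, the model is saved to the file textgenrnn_weights.hdf5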

textgen = textgenrnn()
textgen.train_from_file('./data/reflections.txt', max_length=40, word_level=True, rnn_size=64,  num_epochs=4, dim_embeddings=100, rnn_bidirectional=False)


345 texts collected.
Training on 49,613 character sequences.
Epoch 1/4
####################
Temperature: 0.2
####################
eeee to do the project and the worksheet and the worksheet to do the worksheet and we were and we did the worksheet and discuss to do the team to do and the worksheet to do the startup and the worksheet is able to completed to the worksheet and the team and the worksheet is between the worksheet an

eeeeors we were and the worksheet to do the worksheet and some to be the project to completed to do the worksheet and we did the completion of the worksheet. We all the worksheet and we were are able to completed the worksheet to do and the worksheet and the worksheet and do the project and we were

telte of the worksheet to do the worksheet and the worksheet and we were all of the worksheet and do the worksheet and the worksheet and we did the team and the worksheet and we were answers and the team team and the worksheet and the constraints that we were and the 

worksheet the resources were sure what and additional to each other we can be the project and help for the project along the project and completed with the project that we will properly progress that we were able to complete the team worksheet that I can also preparing and I have to provided to the

"I will be able to be finished it in selections and we all are all of the project and project charter when they have to achieve the final project and which was the project and splitting the team worksheet that I will not decide the project and the simple between completed the project and started to

"I have any deadline to the team worksheet and they is a completed the team worksheet that we will not get different answers within the team worksheet will go within the project are the first insights of the worksheet with the raises have a before the worksheet from a team worksheet to build a subm

####################
Temperature: 1.0
####################
Scodquesch Where i answered O. Huntase
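
The `Temperature` headers in the training output mark samples drawn at different temperatures. A common way to implement temperature (assumed here for illustration; this notebook does not show textgenrnn's own sampling code) is to rescale the model's predicted next-token probabilities before sampling. The function names `apply_temperature` and `sample_index`, and the example probabilities, are hypothetical.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Rescale a distribution to p_i ** (1 / T), renormalized to sum to 1.

    Assumes every probability is strictly positive.
    """
    logits = [math.log(p) / temperature for p in probs]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def sample_index(probs, temperature, rng=random):
    """Draw one token index from the temperature-adjusted distribution."""
    adjusted = apply_temperature(probs, temperature)
    return rng.choices(range(len(adjusted)), weights=adjusted, k=1)[0]


# Made-up next-token probabilities for a 4-token vocabulary.
next_token_probs = [0.5, 0.3, 0.15, 0.05]

for t in (0.2, 0.5, 1.0):
    print(t, [round(p, 3) for p in apply_temperature(next_token_probs, t)])

print("sampled index:", sample_index(next_token_probs, 0.2, random.Random(0)))
```

Under this scheme, a low temperature concentrates almost all probability on the most likely token, while a temperature of 1.0 leaves the distribution unchanged. That is consistent with the samples above: at 0.2 the text loops over a few frequent words, and at 1.0 it is more varied but less coherent.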

* Now the fun part: generating some random text

In [5]:
# generate 1 text document
textgen.generate(1)

the project and very ended the project and application the triple cost and decided the project are the person and I respected to complete the project and we shared before the project and team members of the project is able to do their project are the team to do the project and already that I can



* Generate 1 text document starting with "My team mates decide to use Google for"

In [6]:
textgen.generate(1, prefix="My team mates decide to use Google for")

My team mates decide to use Google for the project and do the worksheet from the worksheet together and informed the worksheet time as an experience of I can he to make the worksheet as the project is a sologe and this week . I have to learn to promote the answer on the project and the worksheet"
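
To see how the same prefix behaves at different temperatures, a small helper like the one below could be used. This is a sketch under assumptions: it presumes that `generate` accepts `temperature` and `return_as_list` keyword arguments, which should be checked against the installed textgenrnn version. The helper `generate_at_temperatures` and the stand-in class `EchoModel` are hypothetical; the stand-in only exists so the sketch runs without TensorFlow or a trained model.

```python
def generate_at_temperatures(model, prefix, temperatures=(0.2, 0.5, 1.0)):
    """Generate one text per temperature, all starting from the same prefix."""
    results = {}
    for t in temperatures:
        # Assumed call shape: generate(n, prefix=..., temperature=..., return_as_list=...)
        texts = model.generate(1, prefix=prefix, temperature=t, return_as_list=True)
        results[t] = texts[0]
    return results


class EchoModel:
    """Hypothetical stand-in that mimics the assumed generate() call shape."""

    def generate(self, n=1, prefix=None, temperature=0.5, return_as_list=False):
        texts = [f"{prefix} ... (temperature={temperature})" for _ in range(n)]
        if return_as_list:
            return texts
        for text in texts:
            print(text)


samples = generate_at_temperatures(
    EchoModel(), "My team mates decide to use Google for"
)
for temperature, text in samples.items():
    print(temperature, "->", text)
```

With the model trained in this notebook, `textgen` would be passed in place of `EchoModel()`.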



## Exercise A

As with any training of ML/DL models, a lot of work goes into tuning the parameters and applying some intuition about what might yield acceptable results.
For the task of using an RNN/LSTM to generate text, the following parameters should affect the performance of the model:

- Whether the input is word-level or character-level
- The dimensionality of the embeddings. We admit that we don't know whether the module textgenrnn uses word2vec or some other variation
- Whether you are training forward only, or using bidirectional layers


#### Your task: 
Change the input parameters of the function train_from_file to try out different values of the hyperparameters:
- word_level=True, word_level=False
- dim_embeddings=100, dim_embeddings=50, dim_embeddings=150
- rnn_bidirectional=False, rnn_bidirectional=True

Then, rerun the model and regenerate the text.
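
One way to keep track of the combinations listed in the task is to enumerate them up front. The sketch below uses only the standard library; the names `search_space` and `iter_configs` are hypothetical, and the commented-out training call is an assumption about how each combination could be passed on (including `new_model=True`, which the notes above describe for training a new model from a file).

```python
from itertools import product

# The hyperparameter values suggested in the task.
search_space = {
    "word_level": [True, False],
    "dim_embeddings": [50, 100, 150],
    "rnn_bidirectional": [False, True],
}


def iter_configs(space):
    """Yield one dict per combination of hyperparameter values."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


configs = list(iter_configs(search_space))
print(len(configs), "combinations")

for config in configs[:3]:
    print(config)
    # With textgenrnn installed, each combination could be tried like this (assumption):
    # model = textgenrnn()
    # model.train_from_file('./data/reflections.txt', new_model=True, max_length=40,
    #                       rnn_size=64, num_epochs=4, **config)
    # model.generate(1, prefix="My team mates decide to use Google for")
```

Training all combinations takes time, so it may be more practical to change one hyperparameter at a time and compare the generated text against the baseline run.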

In [10]:
textgen_new = textgenrnn()
## Hyperparameters: choose one value for each; the alternatives are listed in the comments
word_level = False         # alternative: True
dim_embeddings = 50        # alternatives: 100, 150
rnn_bidirectional = True   # alternative: False

textgen_new.train_from_file('./data/reflections.txt', max_length=40, word_level=word_level, rnn_size=64,  num_epochs=4, dim_embeddings=dim_embeddings, rnn_bidirectional=rnn_bidirectional)
textgen_new.generate(1, prefix="My team mates decide to use Google for")

345 texts collected.
Training on 49,613 character sequences.
Epoch 1/4
####################
Temperature: 0.2
####################
communicating the team and then the project and discussion and all the project and as the team and then answers and then is the project is a project and all of the project and the project is a project and all of the project and then is the project and discussion to a project and discussion and star

communicating the team and started to ensure that the team was answers and all of the project and discussion to communicating the team and than we did it to communicate the team and then the team is the project and all the team and the project contributed to communicating the project and did the te

contributed to plan the team and started to the project and school and discussion of the project and project and started to contribute the project and all the team we had the project and answers in the project and all of the team and all a communication of the project

at the project based on the worksheet. The project contributed a give project and stakeholder and ask the worksheet contributed by the project. I will do the team member and we can increase the different team worksheet and project to be able to finish the teamwork in the project on the task grid by

Teamwork will be able to each other and the amount of the team assormans for the project. I also get a charter and therefore that we will do the team worksheet on the change. We managed to do the project to do the questions were an explain of us to do the worksheet together and what we can plan to 

an areas with the worksheet with the questions or work where to complete the project. The answers of the team worksheet. Some questions that we can project a time that my teammates we have successful parts with the questions that we submitted to do the team worksheet. We completed the worksheet tha

####################
Temperature: 1.0
####################


"My team answer planstolve so an end