This project shows how to host a handful of pre-trained text-generating models on a Node server. The GitHub repository with information on these models and how they were trained can be found here.
Three different approaches were used in creating these text-generating models.
The first approach is based on the LSTM text-generation model presented in this TensorFlow.js example, which generates one character at a time. I will call this the char-based model. Because these text-generating models were intended for a younger audience, we found it favorable to apply some spelling correction and some word filtering to the generated text. (TODO: improve the word filtering.)
The second approach uses a very similar LSTM model which generates entire words at a time instead of character by character. I will call this the word-based model. With the word-based model, we do not need to run a spell check on the output, since the model only chooses from words it has seen before, ideally spelled correctly in the corpus. Additionally, we hoped that this model would be faster than the char-based model. Although it was, the speed increase was not as dramatic as we would have liked.
The third and final approach strays away from LSTM models and applies an Unsmoothed Maximum Likelihood Character Level Language Model. I will call this the max-likelihood model. In short, this model looks at the entire corpus and tracks the probability that a certain character follows a given phrase. For more detail, I recommend looking at the notebook linked above. This model is incredibly fast and will never make a spelling mistake, though syntactical and grammatical errors remain a problem.
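To give a feel for the idea, here is a minimal sketch of an unsmoothed maximum-likelihood character-level model in plain JavaScript. This is illustrative only — the project's actual implementation lives in the notebook mentioned above, and all function names here are invented for the example:

```javascript
// Count, for every context of `order` characters, how often each
// following character appears in the corpus.
function trainCharLM(corpus, order) {
  const counts = new Map();
  const padded = '~'.repeat(order) + corpus; // pad so every char has a context
  for (let i = order; i < padded.length; i++) {
    const context = padded.slice(i - order, i);
    const next = padded[i];
    if (!counts.has(context)) counts.set(context, new Map());
    const dist = counts.get(context);
    dist.set(next, (dist.get(next) || 0) + 1);
  }
  return counts;
}

// Sample the next character from the empirical distribution for a context.
function sampleNext(counts, context) {
  const dist = counts.get(context);
  if (!dist) return null; // unseen context: no smoothing, so we give up
  const total = [...dist.values()].reduce((a, b) => a + b, 0);
  let r = Math.random() * total;
  for (const [ch, n] of dist) {
    r -= n;
    if (r <= 0) return ch;
  }
  return null;
}

// Generate `length` more characters, starting from the seed text.
function generate(counts, order, seed, length) {
  let text = seed;
  for (let i = 0; i < length; i++) {
    const next = sampleNext(counts, text.slice(-order));
    if (next == null) break;
    text += next;
  }
  return text;
}
```

Because every next-character probability is just a count lookup, generation is extremely fast, and every emitted character was seen in the corpus — which is why spelling mistakes are impossible while grammar mistakes are not.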
By default, this project runs instances of the char-based models trained on a number of corpora in the index.js file. If you would like to run a node server which generates text using the word-based model or the max-likelihood model, run `node word-based-index.js` or `node max-likelihood-index.js` instead. The instructions for querying the models are given in the section below.
Before you can run a node server with either the char-based or word-based models, install the tfjs-node package with `npm install @tensorflow/tfjs-node`. To run the project, run the command `node index.js`.
Before you can run a node server with the max-likelihood model, install the python-shell package with `npm install python-shell`.
When querying the node server, you can provide various parameters, such as the model you want to use, the seed text, and the length you would like the generated text to be (in characters). The only required parameter is the seed text, which for the char-based models must be at least 40 characters long.
You can specify the model you would like to load by adding `model=` followed by the name of the model. The names of the models currently included can be found in the variable `modelFileNames` in the index.js file.
For the char-based and word-based models, the structure of the model names is `<name>_<num epochs>`.
Corpus | Char-Based | Word-Based | Max-Likelihood |
---|---|---|---|
alice-in-wonderland.txt | `aliceInWonderland_0` `aliceInWonderland_1` `aliceInWonderland_5` `aliceInWonderland_20` | (WIP) `aliceInWonderland_0` `aliceInWonderland_25` `aliceInWonderland_100` `aliceInWonderland_500` | `aliceInWonderland` |
drseuss.txt | `drSeuss_0` `drSeuss_1` `drSeuss_5` `drSeuss_20` | (WIP) `drSeuss_0` `drSeuss_25` `drSeuss_100` `drSeuss_500` | `drSeuss` |
harry-potter-1.txt | `harryPotter_0` `harryPotter_1` `harryPotter_5` `harryPotter_20` | (WIP) `harryPotter_0` `harryPotter_25` `harryPotter_100` `harryPotter_500` | `harryPotter` |
nancy-drew.txt | `nancy_0` `nancy_1` `nancy_5` `nancy_20` | (WIP) `nancy_0` `nancy_25` `nancy_100` `nancy_500` | `nancy` |
narnia-1.txt | `narnia_1_0` `narnia_1_1` `narnia_1_5` `narnia_1_20` | `narnia_0` `narnia_25` `narnia_100` `narnia_500` | `narnia` |
tomsawyer.txt | `tomSawyer_0` `tomSawyer_1` `tomSawyer_5` `tomSawyer_20` | (WIP) `tomSawyer_0` `tomSawyer_25` `tomSawyer_100` `tomSawyer_500` | `tomSawyer` |
wizard-of-oz.txt | `wizardOfOz_0` `wizardOfOz_1` `wizardOfOz_5` `wizardOfOz_20` | (WIP) `wizardOfOz_0` `wizardOfOz_25` `wizardOfOz_100` `wizardOfOz_500` | `wizardOfOz` |
In addition, the max-likelihood model has the models `hamlet`, `hungerGames`, and `shakespeare`, which you can use to generate text.
When querying the node server, the only required input is the seed text, `inputText=`. When using the char-based model, the seed text needs to be at least 40 characters long. For the word-based and max-likelihood models, the seed text only needs to be at least one character long.
You can also specify how long you want the resulting string to be. Simply add `outputLength=` followed by a number (characters for the LSTM models, words for the max-likelihood model). The default values are:
Model Architecture | Default Output Length |
---|---|
Char-Based | 40 characters |
Word-Based | 40 characters |
Max-Likelihood | 40 words |
The spellcheck setting applies only to the char-based model. By default, the model's output is run through a spell checker, which takes the first suggestion for each correction. You can deactivate the spellchecker by setting `spellcheck=0` in the query.
`path-to-the-node-server:1234/?inputText=hello%20there%20how%20are%20you%20today%20because%20I%20am%20doing%20pretty%20well&model=drSeuss_20&outputLength=10`
To simply test if the server is online and responding properly, the server checks to see if `inputText=test`. If it does, it will return "The test returned correctly" for all of the node servers.
`path-to-the-node-server:1234/?inputText=test&model=narnia_100`
Overview from Original Source
This example illustrates how to use TensorFlow.js to train an LSTM model to generate random text based on the patterns in a text corpus such as Nietzsche's writing or the source code of TensorFlow.js itself.
The LSTM model operates at the character level. It takes a tensor of shape `[numExamples, sampleLen, charSetSize]` as the input. The input is a one-hot encoding of sequences of `sampleLen` characters. The characters belong to a set of `charSetSize` unique characters. With the input, the model outputs a tensor of shape `[numExamples, charSetSize]`, which represents the model's predicted probabilities of the character that follows the input sequence. The application then draws a random sample based on the predicted probabilities to get the next character. Once the next character is obtained, its one-hot encoding is concatenated with the previous input sequence to form the input for the next time step. This process is repeated in order to generate a character sequence of a given length. The randomness (diversity) is controlled by a temperature parameter.
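The encoding and sampling steps described above can be sketched in plain JavaScript. The actual example works on TensorFlow.js tensors; the function names here are invented for illustration:

```javascript
// One-hot encode a string over a fixed character set.
function oneHot(text, charSet) {
  return [...text].map((ch) => charSet.map((c) => (c === ch ? 1 : 0)));
}

// Re-weight the model's predicted probabilities by a temperature and
// draw a random character index. A low temperature sharpens the
// distribution (conservative output); a high temperature flattens it
// (more diverse output).
function sampleWithTemperature(probs, temperature) {
  const logits = probs.map((p) => Math.log(p) / temperature);
  const maxLogit = Math.max(...logits);
  const exps = logits.map((l) => Math.exp(l - maxLogit)); // stable softmax
  const total = exps.reduce((a, b) => a + b, 0);
  let r = Math.random() * total;
  for (let i = 0; i < exps.length; i++) {
    r -= exps[i];
    if (r <= 0) return i;
  }
  return exps.length - 1;
}
```

The sampled index is looked up in the character set, appended to the text, one-hot encoded, and fed back in as part of the next input window.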
The UI allows creation of models consisting of a single LSTM layer or multiple, stacked LSTM layers.
This example also illustrates how to save a trained model in the browser's IndexedDB using TensorFlow.js's model saving API, so that the result of the training may persist across browser sessions. Once a previously-trained model is loaded from the IndexedDB, it can be used in text generation and/or further training.
This example is inspired by the LSTM text generation example from Keras: https://github.com/keras-team/keras/blob/master/examples/lstm_text_generation.py
`yarn && node index.js`