jessvb/lstm_web_server

Load an LSTM Model and Generate Text

This project shows how to host a handful of pre-trained text-generating models on a Node server. The GitHub repository with information on these models and how they were trained can be found here.

About the Models Used

Three different approaches were used in creating these text-generating models.

The first approach is based on the LSTM text-generation model presented in this TensorFlow.js example, which generates one character at a time. I will call this the char-based model. Because these text-generating models were intended for a younger audience, we found it favorable to apply some spelling correction and word filtering to the generated text. (TODO: improve the word filtering.)

The second approach uses a very similar LSTM model that generates one word at a time instead of one character at a time. I will call this the word-based model. With the word-based model, we do not need to run a spell check on the output, since the model only chooses from words it has seen before, ideally spelled correctly in the corpus. Additionally, we hoped this model would be faster than the char-based model. It was, but the speed increase was not as dramatic as we would have liked.

The third and final approach strays away from LSTM models entirely and applies an Unsmoothed Maximum Likelihood Character-Level Language Model. I will call this the max-likelihood model. In short, this model scans the entire corpus and tracks the probability that a certain character follows a given phrase. For more detail, I recommend looking at the notebook linked above. This model is incredibly fast and never makes a spelling mistake, though syntactical and grammatical errors remain a problem.
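The core idea of the max-likelihood model can be sketched in a few lines. This is an illustrative JavaScript sketch, not the project's actual implementation (which runs as a Python script via python-shell); the function names `train` and `generate` are made up for the example.

```javascript
// Train: count which character follows each length-`order` context in the corpus.
function train(corpus, order) {
  const counts = new Map();
  for (let i = 0; i < corpus.length - order; i++) {
    const ctx = corpus.slice(i, i + order);
    const next = corpus[i + order];
    if (!counts.has(ctx)) counts.set(ctx, new Map());
    const m = counts.get(ctx);
    m.set(next, (m.get(next) || 0) + 1);
  }
  return counts;
}

// Generate: repeatedly sample the next character in proportion to its count.
function generate(counts, seed, length, order) {
  let out = seed;
  for (let i = 0; i < length; i++) {
    const m = counts.get(out.slice(-order));
    if (!m) break; // unseen context: an unsmoothed model has no estimate
    let total = 0;
    for (const c of m.values()) total += c;
    let r = Math.random() * total;
    for (const [ch, c] of m) {
      r -= c;
      if (r <= 0) { out += ch; break; }
    }
  }
  return out;
}
```

Because every generated character is drawn from characters actually seen after that context in the corpus, misspellings are impossible, which is exactly the property described above.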

Running the Node Server

By default, this project runs instances of the char-based models trained on a number of corpora in the index.js file. If you would like to run a node server that generates text using the word-based model or the max-likelihood model, run node word-based-index.js or node max-likelihood-index.js instead. The instructions for querying the models are given in the section below.

Before you can run a node server with either the char-based or word-based models, install the tfjs-node package with npm install @tensorflow/tfjs-node. To run the project, run the command node index.js.

Before you can run a node server with the max-likelihood model, install the python-shell package with npm install python-shell.

Querying the Node Server

When querying the node server, you can provide various parameters, such as the model you want to use, the seed text, and the length (in characters) you would like the generated text to be. The only required parameter is the seed text, which must be at least 40 characters long.
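Server-side, handling these parameters might look like the following sketch. The parameter names come from this README, but the function name, the default model, and the validation logic are assumptions for illustration, not taken from index.js.

```javascript
// Illustrative sketch of parsing the query parameters described above.
function parseQuery(requestUrl) {
  const q = new URL(requestUrl, 'http://localhost').searchParams;
  const inputText = q.get('inputText');
  if (!inputText) throw new Error('inputText is required');
  if (inputText.length < 40) {
    throw new Error('seed text must be at least 40 characters long');
  }
  return {
    inputText,
    model: q.get('model') || 'drSeuss_20',            // assumed default model
    outputLength: parseInt(q.get('outputLength'), 10) || 40,
    spellcheck: q.get('spellcheck') !== '0',          // on unless spellcheck=0
  };
}
```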

Model Name

You can specify the model you would like to load by adding model= followed by the name of the model. The names of the currently included models can be found in the modelFileNames variable in the index.js file.

For the char-based and word-based models, the model names follow the structure <name>_<num epochs>.

| Corpus | Char-Based | Word-Based | Max-Likelihood |
| --- | --- | --- | --- |
| alice-in-wonderland.txt | aliceInWonderland_0<br>aliceInWonderland_1<br>aliceInWonderland_5<br>aliceInWonderland_20<br>(WIP) | aliceInWonderland_0<br>aliceinWonderland_25<br>aliceinWonderland_100<br>aliceinWonderland_500 | aliceInWonderland |
| drseuss.txt | drSeuss_0<br>drSeuss_1<br>drSeuss_5<br>drSeuss_20<br>(WIP) | drSeuss_0<br>drSeuss_25<br>drSeuss_100<br>drSeuss_500 | drSeuss |
| harry-potter-1.txt | harryPotter_0<br>harryPotter_1<br>harryPotter_5<br>harryPotter_20<br>(WIP) | harryPotter_0<br>harryPotter_25<br>harryPotter_100<br>harryPotter_500 | harryPotter |
| nancy-drew.txt | nancy_0<br>nancy_1<br>nancy_5<br>nancy_20<br>(WIP) | nancy_0<br>nancy_25<br>nancy_100<br>nancy_500 | nancy |
| narnia-1.txt | narnia_1_0<br>narnia_1_1<br>narnia_1_5<br>narnia_1_20 | narnia_0<br>narnia_25<br>narnia_100<br>narnia_500 | narnia |
| tomsawyer.txt | tomSawyer_0<br>tomSawyer_1<br>tomSawyer_5<br>tomSawyer_20<br>(WIP) | tomSawyer_0<br>tomSawyer_25<br>tomSawyer_100<br>tomSawyer_500 | tomSawyer |
| wizard-of-oz.txt | wizardOfOz_0<br>wizardOfOz_1<br>wizardOfOz_5<br>wizardOfOz_20<br>(WIP) | wizardOfOz_0<br>wizardOfOz_25<br>wizardOfOz_100<br>wizardOfOz_500 | wizardOfOz |

In addition, the max-likelihood model has the models hamlet, hungerGames, and shakespeare which you can use to generate text.

Seed Text

When querying the node server, the only required input is the seed text, inputText=. When using the char-based model, the seed text must be at least 40 characters long. For the word-based and max-likelihood models, the seed text only needs to be at least one character long.

Output Text Length

You can also specify the number of characters which you want the resulting string to be. Simply add outputLength= followed by the number of characters. The default values are:

| Model Architecture | Default Output Length |
| --- | --- |
| Char-Based | 40 characters |
| Word-Based | 40 characters |
| Max-Likelihood | 40 words |

Spell Check [Default=1]

This setting applies only to the char-based model. By default, the model's output is run through a spell checker, which replaces each misspelled word with its first suggested correction. You can deactivate the spell checker by setting spellcheck=0 in the query.

Example URL

path-to-the-node-server:1234/?inputText=hello%20there%20how%20are%20you%20today%20because%20I%20am%20doing%20pretty%20well&model=drSeuss_20&outputLength=10

To simply test whether the server is online and responding properly, every node server checks whether inputText=test. If it does, the server returns "The test returned correctly".

path-to-the-node-server:1234/?inputText=test&model=narnia_100
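Query URLs like the two examples above can also be assembled programmatically. The helper below is hypothetical (not part of this repository); note that Node's built-in URLSearchParams encodes spaces as + rather than %20, and servers decode both the same way.

```javascript
// Hypothetical helper for building query URLs like the examples above.
function buildQueryUrl(host, seed, model, outputLength) {
  const params = new URLSearchParams({ inputText: seed });
  if (model) params.set('model', model);
  if (outputLength) params.set('outputLength', String(outputLength));
  return `${host}/?${params.toString()}`;
}
```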

Overview from Original Source

This example illustrates how to use TensorFlow.js to train an LSTM model to generate random text based on the patterns in a text corpus such as Nietzsche's writing or the source code of TensorFlow.js itself.

The LSTM model operates at the character level. It takes a tensor of shape [numExamples, sampleLen, charSetSize] as the input. The input is a one-hot encoding of sequences of sampleLen characters. The characters belong to a set of charSetSize unique characters. With the input, the model outputs a tensor of shape [numExamples, charSetSize], which represents the model's predicted probabilities of the character that follows the input sequence. The application then draws a random sample based on the predicted probabilities to get the next character. Once the next character is obtained, its one-hot encoding is concatenated with the previous input sequence to form the input for the next time step. This process is repeated in order to generate a character sequence of a given length. The randomness (diversity) is controlled by a temperature parameter.
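The temperature-controlled sampling step at the end of that loop can be sketched in plain JavaScript (without the tfjs tensor ops the actual example uses); sampleWithTemperature is an illustrative name, not the example's function.

```javascript
// Draw a random index from a probability distribution, reshaped by temperature.
// Lower temperature -> sharper, more conservative choices; higher -> more random.
function sampleWithTemperature(probs, temperature) {
  // Rescale log-probabilities by the temperature, then renormalize via softmax.
  const logits = probs.map(p => Math.log(p) / temperature);
  const maxLogit = Math.max(...logits);          // subtract max for numerical stability
  const exps = logits.map(l => Math.exp(l - maxLogit));
  const sum = exps.reduce((a, b) => a + b, 0);
  const scaled = exps.map(e => e / sum);

  // Sample an index according to the rescaled distribution.
  let r = Math.random();
  for (let i = 0; i < scaled.length; i++) {
    r -= scaled[i];
    if (r <= 0) return i;
  }
  return scaled.length - 1;
}
```

At temperature 1 the model's distribution is used as-is; as the temperature approaches 0, sampling collapses toward always picking the most probable character.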

The UI allows creation of models consisting of a single LSTM layer or multiple, stacked LSTM layers.

This example also illustrates how to save a trained model in the browser's IndexedDB using TensorFlow.js's model saving API, so that the result of the training may persist across browser sessions. Once a previously-trained model is loaded from the IndexedDB, it can be used in text generation and/or further training.

This example is inspired by the LSTM text generation example from Keras: https://github.com/keras-team/keras/blob/master/examples/lstm_text_generation.py

Usage

yarn && node index.js
