A basic (and heavily commented) word-level RNN text generator created for pedagogical purposes.
The notebook implements a simple RNN language model that predicts the next word given the previous words.
- Python 3.6 or higher
- pytorch
- textblob
- joblib
- sklearn
Running the first two code cells sets up the needed classes and downloads the required data. From there, you can skip down and just load the pretrained model, or you can play with the options and train your own. Options to play with (see the sketch after this list):
- Replace the texts in the `data/texts` folder with texts of your own.
- The size of the recurrent layer.
- The number of recurrent layers.
- The type of recurrent layer (LSTM, GRU, RNN).
- The batch size.
- Different optimizers and learning rates.
- The length of the training sequences.
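For reference, here is a minimal sketch (not the notebook's actual code; the class name, argument names, and defaults are invented for illustration) of where these options might plug into a PyTorch model and training setup:

```python
import torch
import torch.nn as nn

class WordRNN(nn.Module):  # hypothetical class name
    def __init__(self, vocab_size, embed_dim=128, hidden_size=256,
                 num_layers=2, cell="LSTM"):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # The type of recurrent layer: LSTM, GRU, or plain RNN.
        rnn_cls = {"LSTM": nn.LSTM, "GRU": nn.GRU, "RNN": nn.RNN}[cell]
        # hidden_size = size of the recurrent layer, num_layers = how many are stacked.
        self.rnn = rnn_cls(embed_dim, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, x, hidden=None):
        out, hidden = self.rnn(self.embed(x), hidden)
        return self.fc(out), hidden

model = WordRNN(vocab_size=10_000, hidden_size=256, num_layers=2, cell="GRU")
# Different optimizers and learning rates are swapped in here.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```

The batch size and the length of the training sequences would live on the data-loading side rather than in the model itself.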
- There is no GPU support.
- I'm just predicting the most probable next word, which means the model can get stuck in a loop, repeating similar passages of text. A more complex approach would be to randomly sample the next word based on the predicted probabilities (see the sampling sketch after this list).
- Currently, each batch contains `batch_size` sequences of token ids. These sequences are randomly shuffled between batches, so we throw away the hidden state between batches because the ith sequence in batch 1 isn't contiguous with the ith sequence in batch 2. Ideally, we'd cut up the input text so that these were contiguous; then the hidden state could be retained between batches, and the network could better learn to handle older hidden states (currently, it only learns to deal with hidden states that have been used for 50 words, the length of the training sequences). Be careful here to detach the hidden states between batches (something like `hidden = hidden.detach()`, as in the second sketch below) or it will try to backprop all the way back to the first input.
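For the sampling point, a minimal sketch, assuming the model produces a 1-D tensor of vocabulary logits for the next word (the temperature argument is an optional extra, not something the notebook necessarily has):

```python
import torch
import torch.nn.functional as F

def sample_next_word(logits, temperature=1.0):
    """Sample a word id from the predicted distribution instead of taking argmax."""
    probs = F.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()
```

`torch.argmax` over the logits reproduces the current greedy behaviour; `torch.multinomial` breaks repetition loops at the cost of occasionally picking less probable words.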
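And for the hidden-state point, a sketch of the detach pattern between batches. The function and variable names are illustrative, and it assumes a model, a loss criterion, an optimizer, and an iterable of contiguous `(inputs, targets)` batches already exist (LSTMs return the hidden state as an `(h, c)` tuple, GRU/RNN as a single tensor):

```python
def train_epoch(model, batches, criterion, optimizer):
    hidden = None
    for inputs, targets in batches:  # each batch: (batch_size, seq_len) of token ids
        if hidden is not None:
            # Keep the values but drop the autograd history, so backward() stops here
            # instead of unrolling through every earlier batch.
            if isinstance(hidden, tuple):
                hidden = tuple(h.detach() for h in hidden)
            else:
                hidden = hidden.detach()
        logits, hidden = model(inputs, hidden)
        loss = criterion(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```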