Implementations of vanilla RNN and LSTM from scratch.

These are barebones implementations of a vanilla RNN in numpy and an LSTM in tensorflow, written to highlight the process of backpropagation through time (BPTT) [1]. Both are then demonstrated on character-based and word-based text generation. The BPTT of the RNN follows manually derived derivatives, while the LSTM relies on the auto-differentiation capability of tensorflow. The BPTT of the RNN can be truncated; the BPTT of the LSTM cannot, due to the limitation of tensorflow's auto-differentiation in looping control.
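
For reference, here is a minimal sketch of the vanilla-RNN forward pass and per-output truncated BPTT in numpy. The layer sizes, initialisation, and truncation depth are assumptions for illustration, not the notebook's exact code:

```python
# Minimal vanilla RNN with manually derived, truncated BPTT (sketch, sizes assumed).
import numpy as np

hidden, vocab, truncate = 64, 27, 4   # assumed hidden size, vocabulary size, truncation depth

rng = np.random.default_rng(0)
Wxh = rng.normal(scale=0.01, size=(hidden, vocab))
Whh = rng.normal(scale=0.01, size=(hidden, hidden))
Why = rng.normal(scale=0.01, size=(vocab, hidden))
bh, by = np.zeros((hidden, 1)), np.zeros((vocab, 1))

def forward_backward(inputs, targets, h_prev):
    """inputs/targets: lists of character indices; returns loss, gradients, last hidden state."""
    xs, hs, ps, loss = {}, {-1: h_prev}, {}, 0.0
    # forward pass through the whole sequence
    for t, ix in enumerate(inputs):
        xs[t] = np.zeros((vocab, 1)); xs[t][ix] = 1
        hs[t] = np.tanh(Wxh @ xs[t] + Whh @ hs[t - 1] + bh)
        ys = Why @ hs[t] + by
        ps[t] = np.exp(ys - ys.max()); ps[t] /= ps[t].sum()   # softmax
        loss += -np.log(ps[t][targets[t], 0])                 # cross-entropy
    # backward pass: BPTT truncated to `truncate` steps per output
    dWxh, dWhh, dWhy = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros_like(Why)
    dbh, dby = np.zeros_like(bh), np.zeros_like(by)
    for t in reversed(range(len(inputs))):
        dy = ps[t].copy(); dy[targets[t]] -= 1                # d loss / d logits
        dWhy += dy @ hs[t].T; dby += dy
        dh = Why.T @ dy
        # propagate at most `truncate` steps back in time from this output
        for k in range(t, max(t - truncate, -1), -1):
            draw = (1 - hs[k] ** 2) * dh                      # back through tanh
            dWxh += draw @ xs[k].T; dWhh += draw @ hs[k - 1].T; dbh += draw
            dh = Whh.T @ draw
    return loss, (dWxh, dWhh, dWhy, dbh, dby), hs[len(inputs) - 1]
```

The inner loop is what makes the backpropagation "truncated": each output's gradient is carried back at most `truncate` time steps instead of all the way to the start of the sequence.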

Example character-based texts generated by LSTM:

he will best accusal hears: art not? nor doth distriam: he radre sweet true.

leave on me; you shall I think frown.

The network does learn basic spelling and to end a sentence with a full stop, but the sentences don't make much sense. A lot more training and experimentation is needed to produce better results. Lesson learned: implementing and training recurrent networks is much trickier than feed-forward networks.
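
As an illustration of how such text is typically produced, the following is a minimal sampling sketch: at each step the softmax output of the trained network is sampled and the drawn character is fed back in. The helper names (`step_fn`, `char_to_ix`, `ix_to_char`) are hypothetical, not part of the notebooks:

```python
# Autoregressive character sampling from a trained RNN/LSTM (sketch, helper names assumed).
import numpy as np

def sample_text(step_fn, char_to_ix, ix_to_char, seed_char, length, rng=np.random.default_rng()):
    """step_fn(ix, state) -> (probs, state): one step of the trained network."""
    ix, state, out = char_to_ix[seed_char], None, [seed_char]
    for _ in range(length):
        probs, state = step_fn(ix, state)      # probs: 1-D softmax over the vocabulary
        ix = rng.choice(len(probs), p=probs)   # draw the next character index
        out.append(ix_to_char[ix])
    return "".join(out)
```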

Resources for conceptual explanations:

  1. http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/
  2. http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Usage

  1. "Basic RNN.ipynb" => walk-through implementation of vanilla RNN in numpy, user-defined truncated BPTT
  2. "Basic LSTM.ipynn" => walk-through implementation of LSTM in tensorflow, only full BPTT implemented due to limitation of tensorflow
