tensor-poet: a Tensorflow char-rnn implementation



This is a Tensorflow implementation along the lines of Andrej Karpathy's char-rnn, as described in 'The Unreasonable Effectiveness of Recurrent Neural Networks'.


These Jupyter notebooks for Tensorflow 1.x and 2.x train multi-layer LSTMs on a library of texts and then generate new text from the trained model. Passages in the generated text that also occur in the training texts are color-highlighted and linked to their original source. This visualizes how closely the generated text follows the originals.

Run notebook in Google Colab

Some features

  • tensor_poet uses the Tensorflow 1.x API with nested LSTMs via dynamic_rnn.
  • eager_poet uses the Tensorflow 2 API (beta at the time of this writing).
  • Generates samples periodically, including source-markup.
  • Saves model training data periodically, so that training can be restarted.
  • Tensorboard support
  • Support for dialog with the generative model


  • 2020-03-18: TPU training on colab now works.
  • 2020-02-11: TF 2.1 on colab now runs on TPU. The trick was to move the embedding layer to the CPU. Unfortunately, the result is very slow.
  • 2019-11-20: TF 2.0 gpu nightly: no visible progress on TPU support in colab so far; it still crashes, and Tensorboard is currently broken with the nightly too. TF 1 version: make sure tf 1.x is selected in colab.
  • 2019-08-26: TPU/colab now at least initializes the TPU hardware, but Keras fit() still crashes.
  • 2019-06-15: TPU tests with Tensorflow 2 beta: allocation of TPUs works, but training fails with a recursion error.
  • 2019-05-16: First (unfinished) test version for Tensorflow 2 alpha.
  • 2019-05-16: Last tensorflow 1.x version, tested with 1.13.
  • 2018-10-01: Adapted for tensorflow 1.11, support for Google Colab.
  • 2018-05-13: Retested with tensorflow 1.8.
  • 2018-03-02: Adapted for tensorflow 1.6 and the upcoming change to tf.nn.softmax_cross_entropy_with_logits_v2.
  • 2017-07-31: Tested against tensorflow 1.3rc1: worked ok; for the first time the tf api did not change.
  • 2017-05-19: Adapted for tensorflow 1.2rc0: batch_size can't be given as a tensor and used as a scalar in tf apis.
  • 2017-04-12: Adapted for tensorflow 1.1 changes: the definition of multi-layer LSTMs changed.

Sample model

A sample model (8 layers of LSTMs with 256 neurons each) was trained for 20h on four texts from Project Gutenberg: Pride and Prejudice by Jane Austen, Wuthering Heights by Emily Brontë, The Voyage Out by Virginia Woolf and Emma by Jane Austen.

Intermediate results after 20h of training on an NVIDIA GTX 980 Ti:

Epoch: 462.50, iter: 225000, cross-entropy: 0.378, accuracy: 0.88851


The highlighting marks passages of at least 20 characters that are verbatim copies from one of the source texts.
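The idea behind this highlighting can be sketched in a few lines of plain Python. This is a minimal illustration and not the notebook's actual code: the function name `find_verbatim_spans`, its arguments, and the greedy matching strategy are assumptions made for this example.

```python
def find_verbatim_spans(generated, sources, min_len=20):
    """Return (start, end, source_name) spans of `generated` that occur
    verbatim in one of the source texts and are at least `min_len` long.

    `sources` maps a (hypothetical) source name to its full text.
    """
    spans = []
    pos = 0
    while pos + min_len <= len(generated):
        best_end, best_name = pos, None
        for name, text in sources.items():
            end = pos + min_len
            if generated[pos:end] not in text:
                continue
            # Extend the match greedily while it is still a substring.
            while end < len(generated) and generated[pos:end + 1] in text:
                end += 1
            if end > best_end:
                best_end, best_name = end, name
        if best_name is None:
            pos += 1  # no source contains a min_len passage starting here
        else:
            spans.append((pos, best_end, best_name))
            pos = best_end  # continue after the highlighted passage
    return spans


sources = {"example": "the quick brown fox jumps over the lazy dog"}
generated = "xx quick brown fox jumps over yy"
for start, end, name in find_verbatim_spans(generated, sources):
    print(name, repr(generated[start:end]))
```

The substring search is quadratic and only meant for small experiments; on book-length sources an index over the source texts would be the more practical choice.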


  • Based on the efficient implementation of LSTMs in Tensorflow 1.x
  • A single model is used for training and text-generation, since dynamic_rnns became flexible enough for this
  • Tensorflow 1.x has nice performance improvements for deeply nested LSTMs both on CPU and GPU (the code runs completely on GPU, if one is available). Even a laptop without GPU starts generating discernible text within a few minutes.
  • Deeply nested LSTMs (e.g. 10 layers) are supported.
  • Multiple source text files can be given for training. After text generation, color-highlighting shows where the generated text is identical to a passage in one of the sources. This visualizes how freely or how closely the generated text follows the original training material.
  • Support for different temperatures during text generation
  • Tensorboard support
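The temperature setting mentioned in the list above can be illustrated with a small standalone sketch. It assumes the usual formulation, in which the model's output logits are divided by the temperature before the softmax; the function names are made up for this example and do not come from the notebooks.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw logits into probabilities, scaled by a temperature.

    Low temperatures sharpen the distribution (more conservative text),
    high temperatures flatten it (more surprising text).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def sample_index(logits, temperature=1.0, rng=random):
    """Draw one character index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.0]  # hypothetical model output for three characters
for t in (0.2, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```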


  • Tensorflow 1.x API
  • Python 3
  • Jupyter Notebook


Shown are the training labels (y:) and the prediction by the model (yp:)

Epoch: 0.00, iter: 0, cross-entropy: 4.085, accuracy: 0.07202
   y:  doing them neither | good nor harm: but he seeks their hate with
  yp: zziiipppppppppppppppprrrrrpp               nn
Epoch: 0.37, iter: 100, cross-entropy: 2.862, accuracy: 0.24243
   y: erused the note. | Hark you, sir: I'll have them very fairly bound
  yp: a      the ae    |  | AI  e    aan  a    aeee ahe  aeee aars   aeu

At the beginning of the training, the model basically guesses spaces, 'a' and 'e'. After a few thousand iterations, things start to improve:

Epoch: 27.54, iter: 5000, cross-entropy: 1.067, accuracy: 0.66178
   y:  like a babe. |  | BAPTISTA: | Well mayst thou woo, and happy be thy speed! | But be thou arm'd for some
  yp: htive a clce  |  | PRPTISTA: | Ihll,hay t thou tio  and wevly trethe fteacy |  | ut wy theu srt'd aor hume
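The two numbers in each log line can be related to the y/yp strings with a short sketch. It assumes that accuracy is the fraction of positions where the predicted character equals the label, and that cross-entropy is the mean negative natural log of the probability assigned to the correct character; the helper names are illustrative and not taken from the notebooks.

```python
import math


def char_accuracy(y, yp):
    """Fraction of positions at which the prediction matches the label."""
    if len(y) != len(yp):
        raise ValueError("label and prediction must have the same length")
    if not y:
        return 0.0
    return sum(a == b for a, b in zip(y, yp)) / len(y)


def mean_cross_entropy(true_char_probs):
    """Mean of -ln(p), where p is the probability given to the correct character."""
    return -sum(math.log(p) for p in true_char_probs) / len(true_char_probs)


# A fragment of the sample above: 7 of 9 characters are predicted correctly.
print(char_accuracy("BAPTISTA:", "PRPTISTA:"))
```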

Then the model generates samples, with highlighting that marks references to the original training text:


This improves over time.

Parameter changes

To generate higher quality text, adjust the params dict:

params = {
  "vocab_size": len(textlib.i2c),
  "neurons": 128,
  "layers": 2,
  "learning_rate": 1.e-3,
  "steps": 64,
}

Increasing neurons to 512, layers to 5 and steps to 100 will yield significantly higher quality output.
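As a rough illustration of what this change means for model size, the sketch below builds the larger params dict and estimates the number of LSTM weights using the standard LSTM parameter formula. Two things are assumptions: `vocab_size` is a hypothetical stand-in for `len(textlib.i2c)`, and every layer's input is taken to be as wide as `neurons`, which ignores the embedding and output layers.

```python
# Hypothetical stand-in for len(textlib.i2c); the real value depends on the texts.
vocab_size = 100

params = {
    "vocab_size": vocab_size,
    "neurons": 512,
    "layers": 5,
    "learning_rate": 1.e-3,
    "steps": 100,
}


def lstm_param_count(neurons, layers):
    """Approximate LSTM weight count: four gates per layer, each with
    input weights, recurrent weights and a bias. Assumes the input to
    every layer has `neurons` features (a simplification)."""
    per_layer = 4 * ((neurons + neurons) * neurons + neurons)
    return per_layer * layers


small = lstm_param_count(128, 2)
large = lstm_param_count(params["neurons"], params["layers"])
print(small, large, round(large / small, 1))
```

Under these assumptions the larger configuration has roughly 40 times as many LSTM weights, so expect correspondingly longer training times.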

You can add multiple text sources by including additional file references in:

textlib = TextLibrary([  # add additional texts, to train concurrently on multiple srcs:

Upon text generation, the original passages from the different sources are marked with different highlighting.

If your generated text becomes a single highlighted quote, then your network is overfitting (or plagiarizing the original). In our case, plagiarizing can be addressed by reducing the net's capacity (fewer neurons), or by adding more text.


Tensorflow 2 sources






