# Character-level text generation with LSTM

**Source Citation*** https://keras.io/examples/generative/lstm_character_level_text_generation/<br>
**Author:** [fchollet](https://twitter.com/fchollet)<br>
**Date created:** 2015/06/15<br>
**Last modified:** 2020/04/30<br>
**Description:** Generate text from Nietzsche's writings with a character-level LSTM.

## Introduction

This example demonstrates how to use a LSTM model to generate
text character-by-character.

At least 20 epochs are required before the generated text
starts sounding locally coherent.

It is recommended to run this script on GPU, as recurrent
networks are quite computationally intensive.

If you try this script on new data, make sure your corpus
has at least ~100k characters. ~1M is better.


## Setup


In [1]:
from tensorflow import keras
from tensorflow.keras import layers

import numpy as np
import random
import io


## Prepare the data


In [2]:
path = keras.utils.get_file(
    "input.txt", origin="https://github.com/karpathy/char-rnn/tree/master/data/tinyshakespeare/input.txt"
)
with io.open(path, encoding="utf-8") as f:
    text = f.read().lower()
text = text.replace("\n", " ")  # We remove newlines chars for nicer display
print("Corpus length:", len(text))

chars = sorted(list(set(text)))
print("Total chars:", len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

# cut the text in semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i : i + maxlen])
    next_chars.append(text[i + maxlen])
print("Number of sequences:", len(sentences))

x = np.zeros((len(sentences), maxlen, len(chars)), dtype=np.bool)
y = np.zeros((len(sentences), len(chars)), dtype=np.bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1



Corpus length: 101675
Total chars: 68
Number of sequences: 33879


## Build the model: a single LSTM layer


In [3]:
model = keras.Sequential(
    [
        keras.Input(shape=(maxlen, len(chars))),
        layers.LSTM(128),
        layers.Dense(len(chars), activation="softmax"),
    ]
)
optimizer = keras.optimizers.RMSprop(learning_rate=0.01)
model.compile(loss="categorical_crossentropy", optimizer=optimizer)


## Prepare the text sampling function


In [4]:

def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype("float64")
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)



## Train the model


In [5]:
epochs = 40
batch_size = 128

for epoch in range(epochs):
    model.fit(x, y, batch_size=batch_size, epochs=1)
    print()
    print("Generating text after epoch: %d" % epoch)

    start_index = random.randint(0, len(text) - maxlen - 1)
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print("...Diversity:", diversity)

        generated = ""
        sentence = text[start_index : start_index + maxlen]
        print('...Generating with seed: "' + sentence + '"')

        for i in range(400):
            x_pred = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x_pred[0, t, char_indices[char]] = 1.0
            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]
            sentence = sentence[1:] + next_char
            generated += next_char

        print("...Generated: ", generated)
        print()



Generating text after epoch: 0
...Diversity: 0.2
...Generating with seed: "t;originating_url&quot;:&quot;https://gi"
...Generated:  thtry/char-rrar-runt-rulinen data-link-sulinen d-lone d-lo-serine der-inon-rinen" dith="16" vide="16" hrick="16" viath="boxter" data-huble=">                                                                                                                                                                                                                                                                   

...Diversity: 0.5
...Generating with seed: "t;originating_url&quot;:&quot;https://gi"
...Generated:  thrign-ssrarchub.com/srarirtht/char-rank-run" class="kdet" data-sule=" mrick="hvight="b.5 0 0 0171.75.75 0 011.5 0 011.00 1.75 0 001.15 1 0 01.01 0 0 01.75 0 13.01.5 2.75 2 0111.16 1.75.75 0 011 0 10 1.5.0 1.001.18 0 11.0 1.772 0 0011.0 0 11 1.5 1.01 0 011.5 0 01.17 1.74.75 0 011.04-1.75.75 0 011.75 0 01.01.11 1.75.75.75.75.75 0 0011.01 1.75.75 0 011.01-1.5 0 01


Generating text after epoch: 4
...Diversity: 0.2
...Generating with seed: "m_create_first_classroom&quot;}, {&quot;"
...Generated:  experimentids&quot;: [], &quot;id&quot;: &quot;18252125125&quot;, &quot;key&quot;: &quot;18225710185&quot;, &quot;key&quot;: &quot;182825125&quot;, &quot;key&quot;: &quot;182221015375&quot;, &quot;key&quot;: &quot;18252212136525&quot;, &quot;key&quot;: &quot;182575125325&quot;, &quot;key&quot;: &quot;18222125&quot;, &quot;key&quot;: &quot;18282125&quot;, &quot;key&quot;: &quot;182221195325&quot;, 

...Diversity: 0.5
...Generating with seed: "m_create_first_classroom&quot;}, {&quot;"
...Generated:  experimentids&quot;: [], &quot;id&quot;: &quot;192541345337425&quot;, &quot;key&quot;: &quot;1822212150&quot;, &quot;key&quot;: &quot;18182555125&quot;, &quot;key&quot;: &quot;18257119253495&quot;, &quot;key&quot;: &quot;18222502125&quot;, &quot;key&quot;: &quot;18254670117525&quot;, &quot;key&quot;: &quot;182701085995325&quot;, &quot;key&quot;: &quot;182575575&


Generating text after epoch: 8
...Diversity: 0.2
...Generating with seed: "                  id="context-commitish-"
...Generated:  scrink-0 js-jump-to-scrone flex-shrink-0 js-jump-to-scropo d-boderlage">                                                                                                                                                                                                                                                                                                                                        

...Diversity: 0.5
...Generating with seed: "                  id="context-commitish-"
...Generated:  scrink-0 js-js-selected-navigation-item flex-shriner mo-ulecthen no-grarca js-jump-to-octicon-shrinc-starta js-js-repoopoopt-bottom js-jump-to-sugged-ult-sedrian-idet flex-shrins-stron" data-selected-links=" data-hydeflex-self-contrinin                                            </li>                                                                               


Generating text after epoch: 12
...Diversity: 0.2
...Generating with seed: "3">&rarr;</span></a></li>               "
...Generated:                                                                                                                                                                                                                                                                                                                                                                                                                  

...Diversity: 0.5
...Generating with seed: "3">&rarr;</span></a></li>               "
...Generated:                      <li class="edge-item-fix"><a href="/contribuses" class="bump-link-symbol float-right text-normal text-gray-light pr-3">&rarr;</span></a></li>                                                                                                                                                                                                          


Generating text after epoch: 16
...Diversity: 0.2
...Generating with seed: "             why github?                "
...Generated:                                                                                                                                                                                                                                                                                                                                                                                                                  

...Diversity: 0.5
...Generating with seed: "             why github?                "
...Generated:                         </span>     <span data-pjax-transient="true" <span class="deflex-selected-navigation-item dropdown-folue "             alication="t0p6" class="octicon octicon-shary px-1 text-gray-light text-brop col-to-lestran" aria-label="project mlight-search"                                                                                              


Generating text after epoch: 20
...Diversity: 0.2
...Generating with seed: "ls-overlay details-reset width-full">   "
...Generated:                                    </div>                                       <li class="edge-item-fixlatem" data-scondem="ingiteplication.js" data-src="https://github.com/karpathy/char-rnn" href="/karpathy/char-rnn" href="/karpathy/char-rnn" href="/karpathy/char-rnn" href="/karpathy/char-rnn" href="/karpathy/char-rnn" href="/karpathy/char-rnn" href="/karpathy/char-rnn" href="/karpathy/char-rnn" 

...Diversity: 0.5
...Generating with seed: "ls-overlay details-reset width-full">   "
...Generated:                              <heta name="octorymast" stroke-orndef="                      <span data-content="sharch" data-src="https://github.com" aria-label="repository" aria-label="tope" class="d-flex flex-items-center position-relative flex-shrink-0 js-jump-to-badgers" data-scclick="{&quot;enen>                          </a>            </div> </div> </heverl


Generating text after epoch: 24
...Diversity: 0.2
...Generating with seed: "              <a role="menuitem" class=""
...Generated:  d-flex flex-items-center position-relative overflow" data-ga-click="footer, relesetor, text:serectorylink-symenu-input flex-shrink-0 mr-1 mr-0 mr-0 mt-0 lh-condenter-blocke-indit-prijent-botfoum" data-ga-click="foly to storts /svgerlect <span class="bump-link-symbol float-right text-normal text-gray-light pr-3">&rarr;</span></a></li>                               <li class="d-flex">     <div class

...Diversity: 0.5
...Generating with seed: "              <a role="menuitem" class=""
...Generated:  d-flex flex-items-center position-relative overflow" data-selected-links=" data-ga-click="folig, <span class="bump-link--hover" data-ga-click="(logged out) header, go to container" role="option">   <a href="hodden" class="d-flex flex-items-center text-wransf-me"underole-item hx_underlinenav-item hx_underlinenav-item no-wrap js-responsive-underlinenav-item" data


Generating text after epoch: 28
...Diversity: 0.2
...Generating with seed: "ght pr-3">&rarr;</span></a></li>        "
...Generated:                                                                                                                                                                                                                                                                                                                                                                                                                  

...Diversity: 0.5
...Generating with seed: "ght pr-3">&rarr;</span></a></li>        "
...Generated:                                                                </li>                                                                                                                                                                                      <li class="d-flexmed-selected d-none d-lg-indens-col-ralg-block no-underline f5 bump-link--hav>                   


Generating text after epoch: 32
...Diversity: 0.2
...Generating with seed: "te="list"           aria-controls="jump-"
...Generated:  to-row-coumes" data-src="https://github.githubassets.com/assets/chunk-emptiopg-security/seare"><path fill-rule="evenodd" d="m1.5 1.5 0 000 1.5zm1 tar.eithibsitem class="details-overlay details-overlay details-overlay details-overlay details-overlay details-overlay details-overlay details-overlay details-overlay details-overlay details-overlay details-overlay details-overlay details-overlay details

...Diversity: 0.5
...Generating with seed: "te="list"           aria-controls="jump-"
...Generated:  to-suggestions-demav d-none d-blocks-menu-data-target"><path></svg>                               </div>                      <div class="d-flex flex-items-center pt-3 px-lg-0 text-gray-light text-normal text-gray-light ml-1 f6 d-block link--secondary no-underline f5 bump-link--hover" data-ga-click="(logged out) header, go to man for box-.js"></span>           


Generating text after epoch: 36
...Diversity: 0.2
...Generating with seed: "ex-items-center ">                 <deta"
...Generated:  ils crespan><a copositor-p5p1 8ixl"><upate pros class="octicon octicon-shary px-2 px-lg-4 pot;rop-dear-flex flex-items-selected-item" data-selected-litks="repository">                                     <hetk data-menule="ropensuments fork flex-shrink-0 js-jump-to-badge-seveitem" data-hydro-click-hmac="d-sclick-defalights prome to com1leane oction">                           <svg height="16" view

...Diversity: 0.5
...Generating with seed: "ex-items-center ">                 <deta"
...Generated:  ils crespan><span>     <div class="box header renu">         <span href="https://github.com/karpathy/char-rnn/input.txt&quot;,&quot;ntentication.click&quot;,&quot;panitinggshined" href="https://github.com/karpathy/char-rnn/ing/startshttm/serrar-search.js" data-src="octorylinge-site-veriation" href="/karpathy/char-rnn" href-selector="oftrment" type="brons-width-