# Star Wars Generated Text

This notebook builds an LSTM model, trains it on the scripts of the
original Star Wars trilogy, and then generates new text one character
at a time in the style of those scripts.

This notebook adapts code from François Chollet's notebook, found at:
https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/8.1-text-generation-with-lstm.ipynb

## Preparing the data

In [1]:
import keras
import numpy as np

# Read the whole corpus and lowercase it to keep the character set small
with open("/home/trp22/CS/344/cs344/Project/OriginalTrilogy_script.txt") as f:
    text = f.read().lower()
print('Corpus length:', len(text))

Using TensorFlow backend.


Corpus length: 494395


In [2]:
# Length of extracted character sequences
maxlen = 60

# We sample a new sequence every `step` characters
step = 3

# This holds our extracted sequences
sentences = []

# This holds the targets (the follow-up characters)
next_chars = []

for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('Number of sequences:', len(sentences))

# Sorted list of unique characters in the corpus
chars = sorted(set(text))
print('Unique characters:', len(chars))
# Dictionary mapping each unique character to its index in `chars`
char_indices = {char: i for i, char in enumerate(chars)}

# Next, one-hot encode the characters into binary arrays.
print('Vectorization...')
# Use the builtin `bool`; the `np.bool` alias is not available in every NumPy version
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

Number of sequences: 164779
Unique characters: 60
Vectorization...
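
The one-hot encoding in the cell above can be checked on a small example. The sketch below is only an illustration: `toy_chars`, `toy_indices`, `one_hot`, and `decode` are hypothetical names that do not appear in the notebook, and the three-letter alphabet stands in for the 60-character one.

```python
import numpy as np

# Hypothetical toy alphabet standing in for the corpus's 60 characters
toy_chars = sorted(set("abca"))
toy_indices = {char: i for i, char in enumerate(toy_chars)}

def one_hot(sequence, indices):
    """Encode a string as a (len(sequence), len(indices)) boolean array."""
    encoded = np.zeros((len(sequence), len(indices)), dtype=bool)
    for t, char in enumerate(sequence):
        encoded[t, indices[char]] = True
    return encoded

def decode(encoded, chars):
    """Invert one_hot by taking the index of the single True in each row."""
    return ''.join(chars[i] for i in encoded.argmax(axis=1))

encoded = one_hot("cab", toy_indices)
print(encoded.astype(int))
print(decode(encoded, toy_chars))
```

Each row contains exactly one `True`, so the array `x` above has shape `(number of sequences, maxlen, number of unique characters)` and each row of `y` marks the single character that follows its sequence.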


Of the configurations tried, a `maxlen` of 60 and a `step` of 3
worked best for this text.
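
To make the roles of `maxlen` and `step` concrete, here is a minimal sketch that applies the same slicing to a short string. The helper names `make_windows` and `expected_count` and the toy sentence are assumptions made for illustration; they are not part of the notebook.

```python
import math

def make_windows(text, maxlen, step):
    """Slice `text` into overlapping windows and the character after each one."""
    sequences, targets = [], []
    for i in range(0, len(text) - maxlen, step):
        sequences.append(text[i: i + maxlen])
        targets.append(text[i + maxlen])
    return sequences, targets

def expected_count(n_chars, maxlen, step):
    """Number of windows the loop above produces for a text of n_chars characters."""
    return max(0, math.ceil((n_chars - maxlen) / step))

toy_text = "use the force, luke"  # hypothetical toy corpus
seqs, targets = make_windows(toy_text, maxlen=5, step=3)
for seq, target in zip(seqs, targets):
    print(repr(seq), '->', repr(target))

# Same formula applied to the real corpus length, maxlen, and step
print(expected_count(494395, 60, 3))
```

A smaller `step` yields more, heavily overlapping training sequences, while a larger `maxlen` gives the model more context per prediction. With the notebook's settings the formula gives the same count as the `Number of sequences` line printed above.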

## Building the network

In [3]:
from keras import layers

model = keras.models.Sequential()
model.add(layers.LSTM(128, input_shape=(maxlen, len(chars))))
model.add(layers.Dense(len(chars), activation='softmax'))

In [4]:
optimizer = keras.optimizers.RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

## Training the model

In [5]:
def sample(preds, temperature=1.0):
    """Receive a probability distribution from the model, reweight it
    according to the given temperature, and return the index of the
    next character."""
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)
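
The effect of the temperature is easier to see without the random draw. The sketch below repeats only the reweighting step of `sample()` on a made-up three-way distribution; `reweight` and `toy_preds` are hypothetical names introduced for this illustration.

```python
import numpy as np

def reweight(preds, temperature):
    """Apply the same temperature reweighting as sample(), without sampling."""
    preds = np.asarray(preds, dtype='float64')
    logits = np.log(preds) / temperature
    exp_preds = np.exp(logits)
    return exp_preds / np.sum(exp_preds)

toy_preds = [0.5, 0.3, 0.2]  # hypothetical model output over three characters
for temperature in [0.2, 0.5, 1.0, 2.0]:
    print(temperature, np.round(reweight(toy_preds, temperature), 3))
```

A temperature of 1.0 leaves the distribution unchanged. Values below 1.0 concentrate probability on the most likely character, which makes the generated text more repetitive, and values above 1.0 flatten the distribution, which makes it more varied and more error-prone.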

In [6]:
import random
import sys

for epoch in range(1, 61):
    print('epoch', epoch)
    # Fit the model for 1 epoch on the available training data
    model.fit(x, y,
              batch_size=128,
              epochs=1)

    # Generate sample text only every 5 epochs
    if epoch % 5 == 0:
        # Select a text seed at random
        start_index = random.randint(0, len(text) - maxlen - 1)
        generated_text = text[start_index: start_index + maxlen]
        print('--- Generating with seed: "' + generated_text + '"')

        for temperature in [0.2, 0.5]:
            print('------ temperature:', temperature)
            sys.stdout.write(generated_text)

            # We generate 600 characters
            for i in range(600):
                sampled = np.zeros((1, maxlen, len(chars)))
                for t, char in enumerate(generated_text):
                    sampled[0, t, char_indices[char]] = 1.

                preds = model.predict(sampled, verbose=0)[0]
                next_index = sample(preds, temperature)
                next_char = chars[next_index]

                generated_text += next_char
                generated_text = generated_text[1:]

                sys.stdout.write(next_char)
                sys.stdout.flush()
            print()


epoch 1
Epoch 1/1
epoch 2
Epoch 1/1
epoch 3
Epoch 1/1
epoch 4
Epoch 1/1
epoch 5
Epoch 1/1
--- Generating with seed: "nto the bottomless shaft.

the emperor's body spins helpless"
------ temperature: 0.2
nto the bottomless shaft.

the emperor's body spins helpless ship ship and the ship ship something the ship ship ship ship ship ship and ship ship ship and a second through the death star and starts to the ship ship ship.

the ship ship ship ship and the ship ship ship and start to the ship ship and ship and his ship.

leia
(over comlink)
what is ship and stand through the ship ship ship is a see the ship ship and starts to the ship is a second the ship and the ship ship something to his hand a second through the ship ship starts to the bar begin to the and the rebel ship and the troopers to the ship and ship and the ship ship ship ship.

leia
(into th
------ temperature: 0.5
he ship and ship and the ship ship ship ship.

leia
(into the droids and the from the went of 
the farted ships 

  
