Accompanying code for blog post: Recurring Neural Networks and Star Trek.
Data
Data source: http://www.st-minutiae.com/resources/scripts/#thenextgeneration
Once the zip file has been downloaded and unzipped, combine all scripts into a single file:
python combine_scripts.py
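For readers without the script to hand, the combining step can be sketched roughly as below. This is a minimal illustration, not the contents of combine_scripts.py: the function name `combine_scripts`, the `*.txt` pattern, and the UTF-8 handling are all assumptions.

```python
from pathlib import Path


def combine_scripts(script_dir, output_path, pattern="*.txt"):
    """Concatenate every file matching `pattern` in `script_dir` into one file.

    Hypothetical helper: the names and the "*.txt" pattern are assumptions.
    Returns the number of files that were combined.
    """
    # Sort so the combined file has a stable, reproducible order.
    paths = sorted(Path(script_dir).glob(pattern))
    with open(output_path, "w", encoding="utf-8") as out:
        for path in paths:
            # errors="replace" keeps going if a script is not valid UTF-8.
            text = path.read_text(encoding="utf-8", errors="replace")
            out.write(text)
            if not text.endswith("\n"):
                out.write("\n")  # keep consecutive scripts on separate lines
    return len(paths)
```

Write the output file outside the scripts directory (or with a different extension) so that a second run does not pick up the combined file as input.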
Markov Chain Models
To find character or n-gram pairs in the training text:
python markov_train.py
Flags:
- -f, Path to file containing training data
- -l, Number of characters per n-gram, e.g. 1 = single characters, 2 = bi-grams, etc.
Outputs a dictionary of pairs to a JSON file, with:
key = character n-gram
value = list of n-grams
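The key/value layout above can be illustrated with a short sketch. It assumes the value list holds the n-grams seen immediately after each key, and that the window slides one character at a time; markov_train.py may do this differently. `build_transitions` and `save_transitions` are hypothetical names.

```python
import json
from collections import defaultdict


def build_transitions(text, n=1):
    """Map each character n-gram to the list of n-grams that follow it.

    Assumption: the window slides by one character, and repeated followers
    are kept so that frequent transitions are chosen more often later.
    """
    transitions = defaultdict(list)
    for i in range(len(text) - 2 * n + 1):
        key = text[i:i + n]
        follower = text[i + n:i + 2 * n]
        transitions[key].append(follower)
    return dict(transitions)


def save_transitions(transitions, path):
    """Write the transition dictionary to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(transitions, f)
```

Keeping duplicates in each list means a uniform random pick from the list already follows the observed transition frequencies.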
For generating new text:
python markov_generate.py
Flags:
- -f, Path to file containing transition frequencies
- -s, Optional seed for getting started
- -l, Output length, number of characters
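A rough sketch of how generation from such a dictionary might work, given an optional seed and an output length in characters. This is an illustration under assumptions, not the code in markov_generate.py: the function name `generate`, the restart-on-dead-end rule, and the seed handling are all hypothetical.

```python
import random


def generate(transitions, length, seed=None, rng=None):
    """Generate `length` characters by walking the n-gram transition table.

    Assumptions: all keys share the same n-gram length, and a dead end
    (an n-gram with no recorded followers) restarts from a random key.
    """
    if not transitions:
        raise ValueError("transition table is empty")
    rng = rng or random.Random()
    keys = sorted(transitions)
    n = len(keys[0])
    if seed:
        output = seed
        current = seed[-n:]  # continue from the last n-gram of the seed
    else:
        current = rng.choice(keys)
        output = current
    while len(output) < length:
        followers = transitions.get(current)
        # Pick a recorded follower, or restart if there is none.
        current = rng.choice(followers) if followers else rng.choice(keys)
        output += current
    return output[:length]
```

For example, `generate({"a": ["b"], "b": ["a"]}, 6, seed="a")` returns `"ababab"`, since each n-gram has exactly one possible follower.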
RNN Models
To get the torch-rnn library (built on Torch):
git clone https://github.com/jcjohnson/torch-rnn
Instructions for use, along with the available options, are described in detail on the GitHub page.
Training Params Used:
Attempt 1:
th train.lua -input_h5 data/star_trek.h5 -input_json data/star_trek.json \
  -model_type rnn -num_layers 2 -rnn_size 128 \
  -seq_length 100 -lr_decay_every 10 \
  -lr_decay_factor 0.95
Attempt 2:
th train.lua -input_h5 data/star_trek.h5 -input_json data/star_trek.json \
  -model_type lstm -num_layers 2 -rnn_size 128 \
  -seq_length 100 -lr_decay_every 10 \
  -lr_decay_factor 0.95
Attempt 3:
th train.lua -input_h5 data/star_trek.h5 -input_json data/star_trek.json -model_type lstm -num_layers 3
Attempt 4:
th train.lua -input_h5 data/star_trek.h5 -input_json data/star_trek.json \
  -seq_length 200 -rnn_size 256 \
  -model_type lstm -num_layers 3
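As a rough guide to what -lr_decay_every 10 and -lr_decay_factor 0.95 in Attempts 1 and 2 mean: the learning rate is multiplied by the factor once every 10 epochs. The sketch below assumes a starting learning rate of 2e-3 (believed to be the torch-rnn default; check the GitHub page), and the exact epoch at which train.lua applies each decay may differ by one. `learning_rate_at` is a hypothetical helper, not part of torch-rnn.

```python
def learning_rate_at(epoch, base_lr=2e-3, decay_every=10, decay_factor=0.95):
    """Step-decay schedule: multiply the rate by `decay_factor` every
    `decay_every` epochs. base_lr=2e-3 is an assumed starting value."""
    num_decays = epoch // decay_every
    return base_lr * decay_factor ** num_decays


if __name__ == "__main__":
    for epoch in (0, 10, 20, 50):
        print(epoch, learning_rate_at(epoch))
```

With these settings the decay is gentle: after 50 epochs the rate has only been multiplied by 0.95 five times.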
Output log files are in Logs/
IPython notebook for visualizing training curves: Learning_vis.ipynb
Text Generation
Examples of generated text are in Results/, based on the following flags:
File Name | Temperature | Seed |
---|---|---|
Sample.txt | 1 | None |
Sample2.txt | 0 | None |
Sample3.txt | 0.5 | None |
Sample4.txt | 0.7 | None |
Sample5.txt | 0.8 | None |
Sample6.txt | 0.25 | None |
Sample7.txt | 1 | "Captain" |
Sample8.txt | 0.8 | "1 INT. MAIN BRIDGE" |
Sample10.txt | 0.7 | "Paramount" |
Sample11.txt | 0.1 | None |
Sample13.txt | 0.75 | None |
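To give a feel for what the temperature column controls, here is a minimal sketch of temperature-scaled sampling over a model's output scores. It is a generic illustration, not torch-rnn's sampling code. Treating a temperature of 0 as "always pick the most likely character" is an assumption made here to avoid dividing by zero; how the Sample2.txt run handled it is not stated. `apply_temperature` and `sample_index` are hypothetical names.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Assumption: temperature 0 means greedy (argmax) selection.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, temperature, rng=random):
    """Draw one index according to the temperature-adjusted probabilities."""
    probs = apply_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Lower temperatures concentrate probability on the most likely characters, giving safer but more repetitive text; a temperature of 1 samples from the model's unmodified distribution.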