
CS224N Project

Project Name

For CS224N: Natural Language Processing with Deep Learning, taught by Prof. C. Manning and R. Socher

Python 2.7 with TensorFlow 0.12.1; it also relies on other modules, including NumPy 1.12.0 and pickle. It is advised to run on macOS or Linux.

Team: Haihong (@Leedehai), W.Z.Zhou, Q.W.Fu

Project mentor: Prof. C. Manning

March 2017 at Stanford University

Model(s):

Sequence-to-sequence model, built on a single-layered or stacked RNN. The numerical optimizer can be chosen among RMSprop, Adam, and vanilla gradient descent (a sketch of the choice follows).
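As a rough sketch, selecting among the three optimizers might look like the following (the function name make_optimizer and the string keys are assumptions; the three tf.train optimizer classes are the TF 0.12 API):

```python
import tensorflow as tf

def make_optimizer(name, learning_rate):
    """Return one of the three optimizers the model can choose among."""
    if name == "rmsprop":
        return tf.train.RMSPropOptimizer(learning_rate)
    elif name == "adam":
        return tf.train.AdamOptimizer(learning_rate)
    else:  # vanilla gradient descent
        return tf.train.GradientDescentOptimizer(learning_rate)

# train_op = make_optimizer("adam", 1e-3).minimize(loss)  # loss: the model's training loss
```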

Vanilla LSTM Translation Model (V)

Nothing much to say...

Vanilla LSTM Translation Model with Attention (AV)

Still nothing much to say...

Something else ...

Files:

It needs two directories: (1) the code directory code, where all the source files listed below reside; (2) the data directory data. Files generated during training and testing are placed in the directory RUN_V. These directories are on the same level.

It writes these files into the aforementioned directories: (1) checkpoint files: you don't need to modify these; among them there is a file called "checkpoint", which serves as an index for TensorFlow to keep track of all the checkpoint files. (2) log.txt: records the entries printed on screen. (3) reduced_log.txt: a reduced version of log.txt; its format is: epoch No. total_epochs step No. total_steps train_loss. (4) test_results.txt: the predicted tokens for each input. (5) test_log.txt: the total loss and the loss on each sequence in the testing phase.

run.py

  • Run the model's training.

  • You may modify the argument settings for artificial data (for debugging) and real data (for real work) here.

  • Debugging: (1) first-time training: python run.py -t (before executing this command you need to delete the log file directory RUN_.. in the parent directory); (2) continuing training: python run.py -t -c (it requires the existence of a previous training checkpoint file, which holds the hyperparameters and checkpoints).

  • Real work: drop the -t flag from the commands listed above. (A sketch of the flag parsing follows this list.)
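How run.py wires up these flags is a detail of its source; a hypothetical minimal version of the parsing could look like:

```python
import argparse

parser = argparse.ArgumentParser(description="Train the seq2seq model.")
parser.add_argument("-t", action="store_true",
                    help="use artificial data (for debugging) instead of real data")
parser.add_argument("-c", action="store_true",
                    help="continue training from an existing checkpoint")
args = parser.parse_args()  # e.g. `python run.py -t -c` sets both flags
```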

test.py

  • Test the model.

  • Testing: python test.py (it requires the existence of a previous training checkpoint file, which holds the hyperparameters).

  • The test results and log are written to the files test_results.txt and test_log.txt.

display_translations.py

  • Translate word indices to actual words in human language, according to the dictionary.

  • It reads in the text file test_results.txt generated by test.py. Before running it, make sure the path is right and that the text file does not contain any header line (if it does, the program will complain). A sketch of the index-to-word step follows.
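For illustration, a minimal sketch of the index-to-word mapping (the dictionary file name and its pickle format are assumptions):

```python
import pickle

with open("dictionary.pkl", "rb") as f:  # hypothetical path to the index-to-word dict
    index_to_word = pickle.load(f)

with open("test_results.txt") as f:      # must not contain a header line
    for line in f:
        indices = [int(tok) for tok in line.split()]
        print(" ".join(index_to_word[i] for i in indices))
```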

Trainer.py

  • Train the model. It can either train a model for the first time or continue training a previously trained model whose performance is not yet good enough.

  • You can run python Trainer.py to run the unit test. This test does not feed actual data to the model, though; it just calls upon Dataloader.py and feeds in random numbers, so the loss value is quite high (see the sketch after this list).

  • You can also use the -v flag to turn on verbosity.

  • You can use -h to see how other model arguments are set.
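The unit test's random-number feeding amounts to a smoke test; a minimal sketch of the idea (the placeholder shapes and the stand-in loss are assumptions):

```python
import numpy as np
import tensorflow as tf

batch_size, seq_len, dim = 4, 10, 50
inputs = tf.placeholder(tf.float32, [batch_size, seq_len, dim])
targets = tf.placeholder(tf.float32, [batch_size, seq_len, dim])
loss = tf.reduce_mean(tf.square(inputs - targets))  # stand-in for the model's loss

with tf.Session() as sess:
    feed = {inputs: np.random.rand(batch_size, seq_len, dim),
            targets: np.random.rand(batch_size, seq_len, dim)}
    # random data, so the loss value is meaningless -- the point is the graph runs
    print(sess.run(loss, feed_dict=feed))
```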

Tester.py

  • Test the model.

  • It reloads the learned model parameters and reads in the test set to perform testing (a sketch of the reloading follows).
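Reloading the learned parameters presumably relies on TensorFlow's checkpoint machinery; a minimal sketch (the directory name and the dummy variable are assumptions):

```python
import tensorflow as tf

w = tf.Variable(tf.zeros([10]), name="w")  # dummy; the real graph is the full model

saver = tf.train.Saver()
with tf.Session() as sess:
    ckpt = tf.train.get_checkpoint_state("../RUN_V")  # reads the "checkpoint" index file
    if ckpt and ckpt.model_checkpoint_path:
        saver.restore(sess, ckpt.model_checkpoint_path)  # reload learned parameters
```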

VanillaLSTMTransModel.py

  • Standard LSTM Seq2Seq model.

  • You can run python VanillaLSTMTransModel.py -t to run the unit test. This test does not feed actual data to the model; it merely constructs the graph using arguments given in the unit test code. If the last line printed says "This module ... is functioning." then things are good.

  • You can also use the -v flag to turn on verbosity.

  • You can use -h to see how other model arguments are set.

RNNChain.py

  • RNN chain, i.e. a series of homogeneous RNN cells linked one after another. All links are unidirectional. The RNN can be single- or multi-layered.

  • Construct an RNN chain instance: rc = RNNChain(cell, name, scope), where name and scope are optional, but it is advised to specify the scope.

  • Run the RNN chain: outputs, final_state = rc.run(inputs, chain_length, initial_state, cell_input_size, feed_previous, loop_func, verbose). See the code for more API specifications; it is a little bit intricate and contains some hacks. A usage sketch follows this list.

  • You can run python RNNChain.py to run the unit test. In the unit test, three RNNChain instances are created in different scopes. If the last line printed says "This module ... is functioning." then things are good.
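A usage sketch of the API above (the cell size, the name and scope strings, and the run arguments' values are assumptions; tf.nn.rnn_cell.BasicLSTMCell is the TF 0.12 cell class):

```python
import tensorflow as tf
from RNNChain import RNNChain

cell = tf.nn.rnn_cell.BasicLSTMCell(128)          # a homogeneous cell for the chain
rc = RNNChain(cell, name="encoder", scope="enc")  # specifying the scope is advised
# outputs, final_state = rc.run(inputs, chain_length, initial_state,
#                               cell_input_size, feed_previous, loop_func, verbose)
```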

Utility modules

Dataloader.py

  • Load data, including embedding datasets (we used GloVe) and the text corpus.

  • After instantiating a data loader object with proper parameters (see the function description in its constructor code), you may call its next_batch() method to acquire a tuple of 3D tensors (x, y), where x is the input batch and y is the target batch. You can get the total number of batches by calling its get_num_batches() method. A usage sketch follows.
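A usage sketch based on the two methods above (the constructor's parameters are elided; see its description in the code):

```python
from Dataloader import Dataloader

loader = Dataloader()  # pass the proper parameters; see the constructor's description
for _ in range(loader.get_num_batches()):
    x, y = loader.next_batch()  # x: 3D input batch, y: 3D target batch
    # ... feed (x, y) to the model's training step ...
```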

verbose_print.py

  • Conditional printing. Its API is vprint(verbosity, s, color), where verbosity is a boolean, s is a string (or a variable/literal that can be converted to a string), and color is a string specifying the color in which you want s printed in the terminal.

  • Colors: "GRAY" (alias "GREY"), "RED" ("r"), "GREEN" ("g"), "YELLOW", "BLUE" ("b"), "MAGENTA" ("m", alias "MAG"), "CYAN" ("c"), "WHITE" ("w").

  • This module is not essential but I do like colors and so does virtually everybody else :)

  • NOTE: this module is only guaranteed to work as intended in terminals on macOS or Linux, since it uses escape-code prefixes and suffixes, as in \033[31m Shall I compare thee to a summer’s day? \033[m, to set the color (in this case, red). A minimal sketch follows.
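A minimal sketch of the idea (the color table here is truncated; the real module supports all the colors listed above):

```python
ANSI = {"RED": "\033[31m", "GREEN": "\033[32m", "BLUE": "\033[34m"}

def vprint(verbosity, s, color=None):
    """Print s only when verbosity is True, optionally in an ANSI color."""
    if not verbosity:
        return
    if color in ANSI:
        print(ANSI[color] + str(s) + "\033[m")  # the suffix resets the color
    else:
        print(str(s))

vprint(True, "Shall I compare thee to a summer's day?", "RED")
```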
