Reformatted doc

neulab · May 29, 2017 · dc42983
1 parent f7897bb
commit dc42983
Showing 6 changed files with 209 additions and 43 deletions.
diff --git a/doc/source/api_doc.rst b/doc/source/api_doc.rst
@@ -0,0 +1,4 @@
+API Doc
+=======
+
+TODO
diff --git a/doc/source/experiment_config_files.rst b/doc/source/experiment_config_files.rst
@@ -0,0 +1,109 @@
+Experiment Configuration File Format
+------------------------------------
+
+Configuration files are in `YAML dictionary format <https://docs.ansible.com/ansible/YAMLSyntax.html>`_.
+
+Top-level entries in the file correspond to individual experiments to run. Each
+such entry must have four subsections: ``experiment``, ``train``, ``decode``,
+and ``evaluate``. Options for each subsection are listed below.
+
+There can be a special top-level entry named ``defaults``; if it is
+present, parameters defined in it will act as defaults for other experiments
+in the configuration file.
+
+If any string option includes ``<EXP>``, that placeholder is replaced by the name of the experiment.
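A minimal sketch of how the ``defaults`` entry and the ``<EXP>`` placeholder described above could be resolved, assuming the YAML file has already been parsed into a Python dict and that defaults are merged option by option within each subsection. The function name ``resolve_experiments``, the experiment name ``small_batch``, and the file paths are hypothetical and chosen for illustration; this is not xnmt's actual loading code.

```python
import copy

# The four subsections named in the text above.
SECTIONS = ("experiment", "train", "decode", "evaluate")


def resolve_experiments(config):
  """Apply ``defaults`` and substitute ``<EXP>`` for every experiment.

  :param config: dict parsed from the YAML file (top-level keys are experiment names)
  :returns: dict mapping each experiment name to its resolved options
  """
  defaults = config.get("defaults", {})
  resolved = {}
  for exp_name, exp in config.items():
    if exp_name == "defaults":
      continue
    merged = {}
    for section in SECTIONS:
      # Start from the defaults, then let the experiment override them.
      options = copy.deepcopy(defaults.get(section, {}))
      options.update(exp.get(section, {}))
      # Replace the <EXP> placeholder in every string option.
      for key, value in options.items():
        if isinstance(value, str):
          options[key] = value.replace("<EXP>", exp_name)
      merged[section] = options
    resolved[exp_name] = merged
  return resolved


# Hypothetical configuration, written as the dict a YAML parser would return.
config = {
  "defaults": {
    "experiment": {"model_file": "<EXP>.mod", "run_for_epochs": 2},
    "train": {"batch_size": 32},
  },
  "small_batch": {
    "experiment": {"eval_metrics": "bleu"},
    "train": {"batch_size": 16, "train_source": "train.src"},
    "decode": {"source_file": "test.src"},
    "evaluate": {"ref_file": "test.trg"},
  },
}

resolved = resolve_experiments(config)
print(resolved["small_batch"]["experiment"]["model_file"])  # small_batch.mod
print(resolved["small_batch"]["train"]["batch_size"])       # 16
```

Here ``small_batch`` inherits ``model_file`` and ``run_for_epochs`` from ``defaults`` while overriding ``batch_size``.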
+
+Option Tables
+=============
+
+**experiment**
+
++--------------------+-----------------------------------------------------------------+------+-----------+
+| Name               | Description                                                     | Type | Default   |
++====================+=================================================================+======+===========+
+| model_file         | Location to write the model file                                | str  | <EXP>.mod |
++--------------------+-----------------------------------------------------------------+------+-----------+
+| hyp_file           | Location to write decoded output for evaluation                 | str  | <EXP>.hyp |
++--------------------+-----------------------------------------------------------------+------+-----------+
+| out_file           | A file for writing stdout logging output                        | str  | <EXP>.out |
++--------------------+-----------------------------------------------------------------+------+-----------+
+| err_file           | A file for writing stderr logging output                        | str  | <EXP>.err |
++--------------------+-----------------------------------------------------------------+------+-----------+
+| eval_metrics       | Comma-separated list of evaluation metrics (bleu/wer/cer)       | str  | bleu      |
++--------------------+-----------------------------------------------------------------+------+-----------+
+| **run_for_epochs** | How many epochs to run each test for                            | int  |           |
++--------------------+-----------------------------------------------------------------+------+-----------+
+| eval_every         | Evaluation period in iters, or 0 for never evaluating.          | int  | 0         |
++--------------------+-----------------------------------------------------------------+------+-----------+
+
+**decode**
+
++--------------------+-----------------------------------------------------------------+------+-----------+
+| Name               | Description                                                     | Type | Default   |
++====================+=================================================================+======+===========+
+| **source_file**    | path of input source file to be translated                      | str  |           |
++--------------------+-----------------------------------------------------------------+------+-----------+
+| input_format       | format of input data: text/contvec                              | str  | text      |
++--------------------+-----------------------------------------------------------------+------+-----------+
+| post_process       | post-processing of translation outputs: none/join-char/join-bpe | str  | none      |
++--------------------+-----------------------------------------------------------------+------+-----------+
+
+**evaluate**
+
++--------------------+-----------------------------------------------------------------+------+-----------+
+| Name               | Description                                                     | Type | Default   |
++====================+=================================================================+======+===========+
+| **ref_file**       | path of the reference file                                      | str  |           |
++--------------------+-----------------------------------------------------------------+------+-----------+
+
+**train**
+
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| Name                  | Description                                                                          | Type | Default   |
++=======================+======================================================================================+======+===========+
+| eval_every            |                                                                                      | int  | 1000      |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| batch_size            |                                                                                      | int  | 32        |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| batch_strategy        |                                                                                      | str  | src       |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| **train_source**      |                                                                                      | str  |           |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| **train_target**      |                                                                                      | str  |           |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| **dev_source**        |                                                                                      | str  |           |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| **dev_target**        |                                                                                      | str  |           |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| pretrained_model_file | Path of pre-trained model file                                                       | str  |           |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| input_format          | Format of input data: text/contvec                                                   | str  | text      |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| default_layer_dim     | Default size to use for layers if not otherwise overridden                           | int  | 512       |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| input_word_embed_dim  |                                                                                      | int  |           |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| output_word_embed_dim |                                                                                      | int  |           |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| output_state_dim      |                                                                                      | int  |           |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| output_mlp_hidden_dim |                                                                                      | int  |           |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| attender_hidden_dim   |                                                                                      | int  |           |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| encoder_hidden_dim    |                                                                                      | int  |           |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| trainer               |                                                                                      | str  | sgd       |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| eval_metrics          |                                                                                      | str  | bleu      |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| encoder_layers        |                                                                                      | int  | 2         |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| decoder_layers        |                                                                                      | int  | 2         |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| encoder_type          |                                                                                      | str  | BiLSTM    |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| decoder_type          |                                                                                      | str  | LSTM      |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
+| residual_to_output    | If using residual networks, whether to add a residual connection to the output layer | bool | True      |
++-----------------------+--------------------------------------------------------------------------------------+------+-----------+
diff --git a/doc/source/getting_started.rst b/doc/source/getting_started.rst
@@ -0,0 +1,27 @@
+Getting Started
+===============
+
+Prerequisites
+-------------
+
+Before running ``xnmt`` you must install the Python bindings for
+`DyNet <http://github.com/clab/dynet>`_.
+
+Training/testing a Model
+------------------------
+
+If you want to try to run a simple experiment, you can do so using sample
+configurations in the ``examples`` directory. For example, if you want to try
+the default configuration file, which trains an attentional encoder-decoder model,
+you can do so by running::
+
+    python xnmt/xnmt_run_experiments.py examples/standard.yaml
+
+The various examples that you can use are:
+
+- ``examples/standard.yaml``: A standard neural MT model
+- ``examples/speech.yaml``: An example of speech-to-text translation
+- ``examples/debug.yaml``: A simple debugging configuration that should run super-fast
+
+See ``experiments.md`` for more details about writing experiment configuration files
+that allow you to specify the various 
diff --git a/doc/source/index.rst b/doc/source/index.rst
@@ -9,52 +9,14 @@ eXtensible Neural Machine Translation
 This is a repository for the extensible neural machine translation toolkit ``xnmt``.
 It is coded in Python based on `DyNet <http://github.com/clab/dynet>`_.
 
-Usage Directions
-----------------
-
-If you want to try to run an experiment, you can do so using sample configurations in the ``examples``
-directory. For example, if you wnat to try the default configuration file,
-which trains an attentional encoder-decoder model, you can do so by running
-
-    python xnmt/xnmt_run_experiments.py examples/standard.yaml
-
-There are other examples here:
-
-- ``examples/standard.yaml``: A standard neural MT model
-- ``examples/debug.yaml``: A simple debugging configuration that should run super-fast
-- ``examples/speech.yaml``: An example of speech-to-text translation
-
-See ``experiments.md`` for more details about writing experiment configuration files.
-
-Programming Style
------------------
-
-The over-arching goal of ``xnmt`` is that it be easy to use for research. When implementing a new
-method, it should require only minimal changes (e.g. ideally the changes will be limited to a
-single file, over-riding an existing class). Obviously this ideal will not be realizable all the
-time, but when designing new functionality, try to think of this goal.
-
-There are also a minimal of coding style conventions:
-
-- Follow Python conventions, and be Python2/3 compatible.
-- Functions should be snake case.
-- Indentation should be two whitespace characters.
-
-We will aim to write unit tests to make sure things don't break, but these are not implemented yet.
-
-In variable names, common words should be abbreviated as:
-
-- source -> src
-- target -> trg
-- sentence -> sent
-- hypothesis -> hyp
-
-
 .. toctree::
    :maxdepth: 2
-   :caption: Contents:
-
 
+   getting_started
+   experiment_config_files
+   translator_structure
+   api_doc
+   programming_style
 
 Indices and tables
 ==================

diff --git a/doc/source/programming_style.rst b/doc/source/programming_style.rst
@@ -0,0 +1,42 @@
+
+Programming Style
+=================
+
+Philosophy
+----------
+
+The over-arching goal of ``xnmt`` is that it be easy to use for research. When implementing a new
+method, it should require only minimal changes (e.g. ideally the changes will be limited to a
+single file, over-riding an existing class). Obviously this ideal will not be realizable all the
+time, but when designing new functionality, try to think of this goal. If there are tradeoffs,
+the following is the order of priority (of course getting all is great!):
+
+1. Code Correctness
+2. Extensibility and Readability
+3. Accuracy and Effectiveness of the Models
+4. Efficiency
+
+Coding Conventions
+------------------
+
+There is also a minimal set of coding style conventions:
+
+- Follow Python conventions, and be Python2/3 compatible.
+- Functions should be snake case.
+- Indentation should be two whitespace characters.
+- Docstrings should be made in reST format (e.g. ``:param param_name:``, ``:returns:`` etc.)
+
+We will aim to write unit tests to make sure things don't break, but these are not implemented yet.
+
+In variable names, common words should be abbreviated as:
+
+- source -> src
+- target -> trg
+- sentence -> sent
+- hypothesis -> hyp
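A short illustration of the conventions listed above: a snake-case function name, two-space indentation, a reST docstring, and the ``src``/``trg``/``sent`` abbreviations. The function ``count_longer_sents`` is hypothetical and exists only to show the style; it is not part of ``xnmt``.

```python
def count_longer_sents(src_sents, trg_sents):
  """Count sentence pairs whose target is longer than its source.

  :param src_sents: list of source sentences, each a list of tokens
  :param trg_sents: list of target sentences, each a list of tokens
  :returns: number of pairs where the target has more tokens than the source
  """
  num_longer = 0
  for src_sent, trg_sent in zip(src_sents, trg_sents):
    if len(trg_sent) > len(src_sent):
      num_longer += 1
  return num_longer


# Only the second pair has a longer target side.
print(count_longer_sents([["a", "b"], ["c"]], [["x"], ["y", "z"]]))
```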
+
+Contributing
+------------
+
+Go ahead and send a pull request! If you're not sure whether something will be useful and
+want to ask beforehand, feel free to open an issue on GitHub.
diff --git a/doc/source/translator_structure.rst b/doc/source/translator_structure.rst
@@ -0,0 +1,22 @@
+Translator Structure
+====================
+
+If you want to dig into using ``xnmt`` for your research, it is necessary to understand
+the overall structure. It consists of five major components:
+
+1. **Embedder:** This converts input symbols into continuous-space vectors. Usually this
+   is done by looking up the word in a lookup table, but it could be done
+   any other way.
+2. **Encoder:** Takes the embedded input and encodes it, for example using a bi-directional
+   LSTM to calculate context-sensitive embeddings.
+3. **Attender:** This is the "attention" module, which takes the encoded input and decoder
+   state, then calculates attention.
+4. **Decoder:** This calculates a probability distribution over the words in the output,
+   either to calculate a loss function during training, or to generate outputs
+   at test time.
+5. **SearchStrategy:** This takes the probabilities calculated by the decoder and actually
+   generates outputs at test time.
+
+There are a bunch of auxiliary classes as well to handle saving/loading of the inputs,
+etc. However, if you're interested in using ``xnmt`` to develop a new method, most of your
+work will probably go into one or a couple of the classes listed above.
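The way the five components hand data to each other can be sketched with toy classes. Everything below is an assumption made for illustration: the class and method names, the averaging "encoder" (a stand-in for a bi-directional LSTM), the dot-product attention, and the greedy search are not ``xnmt``'s actual API or models.

```python
import math


def softmax(scores):
  """Normalize a list of scores into probabilities."""
  max_score = max(scores)
  exps = [math.exp(score - max_score) for score in scores]
  total = sum(exps)
  return [x / total for x in exps]


class Embedder(object):
  """Maps each input symbol to a vector via a lookup table."""
  def __init__(self, table):
    self.table = table

  def embed(self, src_sent):
    return [self.table[word] for word in src_sent]


class Encoder(object):
  """Toy encoder: adds the sentence average to each embedding as context."""
  def encode(self, embeddings):
    dim = len(embeddings[0])
    mean = [sum(vec[i] for vec in embeddings) / len(embeddings) for i in range(dim)]
    return [[v + m for v, m in zip(vec, mean)] for vec in embeddings]


class Attender(object):
  """Dot-product attention over the encoded input, given the decoder state."""
  def attend(self, encodings, dec_state):
    scores = [sum(e * s for e, s in zip(enc, dec_state)) for enc in encodings]
    weights = softmax(scores)
    dim = len(encodings[0])
    return [sum(w * enc[i] for w, enc in zip(weights, encodings)) for i in range(dim)]


class Decoder(object):
  """Turns a context vector into a probability distribution over output words."""
  def __init__(self, vocab, word_vecs):
    self.vocab = vocab
    self.word_vecs = word_vecs

  def word_probs(self, context):
    scores = [sum(c * w for c, w in zip(context, vec)) for vec in self.word_vecs]
    return dict(zip(self.vocab, softmax(scores)))


class GreedySearch(object):
  """Search strategy that always picks the most probable next word."""
  def generate(self, attender, decoder, encodings, max_len):
    dec_state = [0.0] * len(encodings[0])
    hyp = []
    for _ in range(max_len):
      context = attender.attend(encodings, dec_state)
      probs = decoder.word_probs(context)
      word = max(probs, key=probs.get)
      if word == "</s>":
        break
      hyp.append(word)
      dec_state = context  # toy state update: reuse the last context vector
    return hyp


# Made-up two-word vocabulary and vectors, only to exercise the pipeline.
embedder = Embedder({"hello": [1.0, 0.0], "world": [0.0, 1.0]})
decoder = Decoder(["bonjour", "monde", "</s>"], [[2.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
encodings = Encoder().encode(embedder.embed(["hello", "world"]))
hyp = GreedySearch().generate(Attender(), decoder, encodings, max_len=3)
print(hyp)
```

Because each stage only depends on the output of the previous one, replacing a single class (for example, a different search strategy) leaves the rest of the pipeline unchanged, which is the kind of localized change the text describes.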