Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 6 changed files with 209 additions and 43 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
API Doc | ||
======= | ||
|
||
TODO |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,109 @@ | ||
Experiment Configuration File Format | ||
------------------------------------ | ||
|
||
Configuration files are in `YAML dictionary format <https://docs.ansible.com/ansible/YAMLSyntax.html>`_. | ||
|
||
Top-level entries in the file correspond to individual experiments to run. Each | ||
such entry must have four subsections: ``experiment``, ``train``, ``decode``, | ||
and ``evaluate``. Options for each subsection are listed below. | ||
|
||
There can be a special top-level entry named ``defaults``; if it is | ||
present, parameters defined in it will act as defaults for other experiments | ||
in the configuration file. | ||
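As a rough illustration of how a ``defaults`` entry might be applied, the sketch below merges it into every other experiment. The helper name ``apply_defaults`` and the per-subsection merge are assumptions made for this example, not necessarily how ``xnmt`` resolves defaults.

```python
import copy

def apply_defaults(config):
  """Merge the ``defaults`` entry into every other experiment (hypothetical helper).

  :param config: dict parsed from the YAML configuration file
  :returns: dict mapping each experiment name to its fully populated settings
  """
  defaults = config.get("defaults", {})
  merged = {}
  for exp_name, exp_conf in config.items():
    if exp_name == "defaults":
      continue
    # Start from a copy of the defaults so experiments do not share state.
    exp_merged = copy.deepcopy(defaults)
    for section, options in exp_conf.items():
      # Options set by the experiment override those inherited from defaults.
      exp_merged.setdefault(section, {}).update(options)
    merged[exp_name] = exp_merged
  return merged

config = {
  "defaults": {"train": {"batch_size": 32, "trainer": "sgd"}},
  "my_exp": {"train": {"batch_size": 64}},
}
resolved = apply_defaults(config)
print(resolved["my_exp"]["train"])
```

Here ``my_exp`` keeps its own ``batch_size`` and inherits ``trainer`` from ``defaults``.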
|
||
If any string option includes ``<EXP>``, that placeholder is replaced by the name of the experiment. | ||
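The placeholder substitution can be pictured with a minimal sketch like the one below. The function ``substitute_exp`` is hypothetical and only illustrates the behaviour described above; it is not taken from the ``xnmt`` source.

```python
def substitute_exp(options, exp_name):
  """Replace "<EXP>" in every string option (hypothetical helper).

  :param options: dict of option names to values, possibly nested
  :param exp_name: name of the experiment
  :returns: new dict with the placeholder filled in
  """
  resolved = {}
  for key, value in options.items():
    if isinstance(value, dict):
      # Recurse into nested subsections.
      resolved[key] = substitute_exp(value, exp_name)
    elif isinstance(value, str):
      resolved[key] = value.replace("<EXP>", exp_name)
    else:
      # Non-string options (ints, bools) are passed through unchanged.
      resolved[key] = value
  return resolved

experiment = {"model_file": "<EXP>.mod", "hyp_file": "out/<EXP>.hyp", "eval_every": 0}
print(substitute_exp(experiment, "standard"))
```

With the experiment name ``standard``, the default ``<EXP>.mod`` becomes ``standard.mod``.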
|
||
Option Tables | ||
============= | ||
|
||
**experiment** | ||
|
||
+--------------------+-----------------------------------------------------------------+------+-----------+ | ||
| Name | Description | Type | Default | | ||
+====================+=================================================================+======+===========+ | ||
| model_file | Location to write the model file | str | <EXP>.mod | | ||
+--------------------+-----------------------------------------------------------------+------+-----------+ | ||
| hyp_file | Location to write decoded output for evaluation | str | <EXP>.hyp | | ||
+--------------------+-----------------------------------------------------------------+------+-----------+ | ||
| out_file | A file for writing stdout logging output | str | <EXP>.out | | ||
+--------------------+-----------------------------------------------------------------+------+-----------+ | ||
| err_file           | A file for writing stderr logging output                        | str  | <EXP>.err | | ||
+--------------------+-----------------------------------------------------------------+------+-----------+ | ||
| eval_metrics | Comma-separated list of evaluation metrics (bleu/wer/cer) | str | bleu | | ||
+--------------------+-----------------------------------------------------------------+------+-----------+ | ||
| **run_for_epochs** | How many epochs to run each test for | int | | | ||
+--------------------+-----------------------------------------------------------------+------+-----------+ | ||
| eval_every | Evaluation period in iters, or 0 for never evaluating. | int | 0 | | ||
+--------------------+-----------------------------------------------------------------+------+-----------+ | ||
|
||
**decode** | ||
|
||
+--------------------+-----------------------------------------------------------------+------+-----------+ | ||
| Name | Description | Type | Default | | ||
+====================+=================================================================+======+===========+ | ||
| **source_file** | path of input source file to be translated | str | | | ||
+--------------------+-----------------------------------------------------------------+------+-----------+ | ||
| input_format | format of input data: text/contvec | str | text | | ||
+--------------------+-----------------------------------------------------------------+------+-----------+ | ||
| post_process | post-processing of translation outputs: none/join-char/join-bpe | str | none | | ||
+--------------------+-----------------------------------------------------------------+------+-----------+ | ||
|
||
**evaluate** | ||
|
||
+--------------------+-----------------------------------------------------------------+------+-----------+ | ||
| Name | Description | Type | Default | | ||
+====================+=================================================================+======+===========+ | ||
| **ref_file** | path of the reference file | str | | | ||
+--------------------+-----------------------------------------------------------------+------+-----------+ | ||
|
||
**train** | ||
|
||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| Name                  | Description                                                     | Type | Default   | | ||
+=======================+=================================================================+======+===========+ | ||
| eval_every            |                                                                 | int  | 1000      | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| batch_size            |                                                                 | int  | 32        | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| batch_strategy        |                                                                 | str  | src       | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| **train_source**      |                                                                 | str  |           | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| **train_target**      |                                                                 | str  |           | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| **dev_source**        |                                                                 | str  |           | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| **dev_target**        |                                                                 | str  |           | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| pretrained_model_file | Path of pre-trained model file                                  | str  |           | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| input_format          | Format of input data: text/contvec                              | str  | text      | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| default_layer_dim     | Default size to use for layers if not otherwise overridden      | int  | 512       | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| input_word_embed_dim  |                                                                 | int  |           | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| output_word_embed_dim |                                                                 | int  |           | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| output_state_dim      |                                                                 | int  |           | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| output_mlp_hidden_dim |                                                                 | int  |           | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| attender_hidden_dim   |                                                                 | int  |           | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| encoder_hidden_dim    |                                                                 | int  |           | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| trainer               |                                                                 | str  | sgd       | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| eval_metrics          |                                                                 | str  | bleu      | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| encoder_layers        |                                                                 | int  | 2         | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| decoder_layers        |                                                                 | int  | 2         | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| encoder_type          |                                                                 | str  | BiLSTM    | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| decoder_type          |                                                                 | str  | LSTM      | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ | ||
| residual_to_output    | If using residual networks, whether to add a residual           | bool | True      | | ||
|                       | connection to the output layer                                  |      |           | | ||
+-----------------------+-----------------------------------------------------------------+------+-----------+ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
Getting Started | ||
=============== | ||
|
||
Prerequisites | ||
------------- | ||
|
||
Before running ``xnmt`` you must install the Python bindings for | ||
`DyNet <http://github.com/clab/dynet>`_. | ||
|
||
Training/testing a Model | ||
------------------------ | ||
|
||
To run a simple experiment, use one of the sample | ||
configurations in the ``examples`` directory. For example, to try | ||
the default configuration file, which trains an attentional encoder-decoder model, | ||
run:: | ||
|
||
python xnmt/xnmt_run_experiments.py examples/standard.yaml | ||
|
||
The various examples that you can use are: | ||
|
||
- ``examples/standard.yaml``: A standard neural MT model | ||
- ``examples/speech.yaml``: An example of speech-to-text translation | ||
- ``examples/debug.yaml``: A simple debugging configuration that should run super-fast | ||
|
||
See ``experiments.md`` for more details about writing experiment configuration files | ||
that allow you to specify the various |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
|
||
Programming Style | ||
================= | ||
|
||
Philosophy | ||
---------- | ||
|
||
The overarching goal of ``xnmt`` is that it be easy to use for research. Implementing a new | ||
method should require only minimal changes (ideally limited to a | ||
single file that overrides an existing class). This ideal will not always be achievable, | ||
but keep it in mind when designing new functionality. If there are tradeoffs, | ||
the following is the order of priority (of course getting all is great!): | ||
|
||
1. Code Correctness | ||
2. Extensibility and Readability | ||
3. Accuracy and Effectiveness of the Models | ||
4. Efficiency | ||
|
||
Coding Conventions | ||
------------------ | ||
|
||
There is also a minimal set of coding style conventions: | ||
|
||
- Follow Python conventions, and be Python2/3 compatible. | ||
- Function names should be in snake_case. | ||
- Indentation should be two whitespace characters. | ||
- Docstrings should be made in reST format (e.g. ``:param param_name:``, ``:returns:`` etc.) | ||
|
||
We will aim to write unit tests to make sure things don't break, but these are not implemented yet. | ||
|
||
In variable names, common words should be abbreviated as: | ||
|
||
- source -> src | ||
- target -> trg | ||
- sentence -> sent | ||
- hypothesis -> hyp | ||
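A short function written to these conventions might look like the sketch below. The function ``count_shared_words`` is invented purely for illustration and is not part of ``xnmt``; it shows a snake_case name, two-space indentation, a reST docstring, and the ``src``/``trg``/``sent``/``hyp`` abbreviations.

```python
def count_shared_words(src_sent, trg_sent):
  """Count the distinct words that appear in both sentences.

  :param src_sent: source sentence as a list of words
  :param trg_sent: target sentence as a list of words
  :returns: number of distinct words shared by the two sentences
  """
  return len(set(src_sent) & set(trg_sent))

hyp = ["the", "cat", "sat"]
print(count_shared_words(["the", "cat"], hyp))
```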
|
||
Contributing | ||
------------ | ||
|
||
Go ahead and send a pull request! If you're not sure whether something will be useful and | ||
want to ask beforehand, feel free to open an issue on GitHub. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
Translator Structure | ||
==================== | ||
|
||
If you want to dig into using ``xnmt`` for your research, it is necessary to understand | ||
the overall structure. It consists of 5 major components: | ||
|
||
1. **Embedder:** This converts input symbols into continuous-space vectors. Usually this | ||
is done by looking up the word in a lookup table, but it could be done | ||
any other way. | ||
2. **Encoder:** Takes the embedded input and encodes it, for example using a bi-directional | ||
LSTM to calculate context-sensitive embeddings. | ||
3. **Attender:** This is the "attention" module, which takes the encoded input and decoder | ||
state, then calculates attention. | ||
4. **Decoder:** This calculates a probability distribution over the words in the output, | ||
either to calculate a loss function during training, or to generate outputs | ||
at test time. | ||
5. **SearchStrategy:** This takes the probabilities calculated by the decoder and actually | ||
generates outputs at test time. | ||
|
||
There are a bunch of auxiliary classes as well to handle saving/loading of the inputs, | ||
etc. However, if you're interested in using ``xnmt`` to develop a new method, most of your | ||
work will probably go into one or a couple of the classes listed above. |
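To make the division of labour between the five components listed above more concrete, here is a toy, pure-Python sketch of how they could fit together. All class and method names below are assumptions made for this example and do not reflect the actual ``xnmt`` classes, which are built on DyNet; the encoder here is a simple neighbour average standing in for a bi-directional LSTM, and the weights are untrained, so the generated output is not meaningful.

```python
import math

def softmax(scores):
  """Turn a list of scores into probabilities that sum to one."""
  top = max(scores)
  exps = [math.exp(s - top) for s in scores]
  total = sum(exps)
  return [e / total for e in exps]

class Embedder(object):
  """Looks up a vector for each input symbol in a lookup table."""
  def __init__(self, table):
    self.table = table
  def embed_sent(self, sent):
    return [self.table[word] for word in sent]

class Encoder(object):
  """Toy stand-in for a BiLSTM: averages each vector with its neighbours."""
  def encode(self, embeddings):
    encoded = []
    for i in range(len(embeddings)):
      window = embeddings[max(0, i - 1):i + 2]
      encoded.append([sum(dim) / len(window) for dim in zip(*window)])
    return encoded

class Attender(object):
  """Dot-product attention over the encoded input, given the decoder state."""
  def calc_context(self, encoded, dec_state):
    weights = softmax([sum(e * d for e, d in zip(enc, dec_state)) for enc in encoded])
    return [sum(w * enc[k] for w, enc in zip(weights, encoded))
            for k in range(len(dec_state))]

class Decoder(object):
  """Computes a probability distribution over the output words."""
  def __init__(self, out_table):
    self.out_table = out_table
  def word_probs(self, context):
    words = sorted(self.out_table)
    scores = [sum(c * v for c, v in zip(context, self.out_table[w])) for w in words]
    return dict(zip(words, softmax(scores)))

class GreedySearch(object):
  """Search strategy that picks the most probable word at every step."""
  def generate(self, embedder, encoder, attender, decoder, src_sent, max_len=5):
    encoded = encoder.encode(embedder.embed_sent(src_sent))
    dec_state = [1.0] * len(encoded[0])
    hyp = []
    for _ in range(max_len):
      probs = decoder.word_probs(attender.calc_context(encoded, dec_state))
      word = max(probs, key=probs.get)
      if word == "</s>":
        break
      hyp.append(word)
      # Use the chosen word's vector as the next decoder state.
      dec_state = decoder.out_table[word]
    return hyp

src_table = {"ein": [1.0, 0.0], "hund": [0.0, 1.0]}
trg_table = {"a": [2.0, 0.0], "dog": [0.0, 2.0], "</s>": [0.5, 0.5]}
hyp = GreedySearch().generate(
  Embedder(src_table), Encoder(), Attender(), Decoder(trg_table), ["ein", "hund"])
print(hyp)
```

Under this layout, trying a new attention method would mean replacing only the ``Attender`` class, which is the kind of single-class change the design aims for.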