Generalized translator and inference (#429)
* decoder interface and type annotations

* add inference base class; rename SimpleInference to SequenceInference; some refactoring to outputs and SequenceInference

* add sequence labeler and classifier

* move output processing from generator models to inference

* rename SequenceInference to AutoRegressiveInference (see the inference migration sketch after this list)

* improve doc for inference

* fix some serialization issues

* remove some model-specific code

* cleanup: unused inference src_mask, inconsistent input reader read_sent

* rename ClassifierInference to IndependentOutputInference

* update generate, use independent inference for seq labeler

* fix warning

* refactor MLE loss to be agnostic of translator internals, rename to AutoRegressiveMLELoss

* fix single-quote docstring warning

* fix further warnings

* some renaming plus doc updates for translator

* refactor MLP class

* un-comment sentencepiece import

* avoid code duplication between calc_loss and generate methods

* share code between inference classes

* IndependentOutputInference supports forced decoding etc

* small cleanup

* support forced decoding and fix batch loss for sequence labeler

* forced decoding for classifier

* rename transducer.__call__() to transduce() to simplify multiple inheritance

* more principled use of batcher in inference

* batch decoding for sequence classifier

* DefaultTranslator: fix masking for (looped) batch decoding

* made parameters of generate() clearer + other minor doc fixes

* Added type annotation for transducers

* clean up output interface

* some fixes related to reporting

* fix unit tests

* Separated softmax and projection (#440); see the decoder migration sketch after this list

* Started separating out softmax

* Started fixing tests

* Fixed more tests

* Fixed remainder of running tests

* Fixed the rest of tests

* Added AuxNonLinear

* Updated examples (many were already broken?)

* Fixed recipes

* Removed MLP class

* Added some doc

* fix problem when calling a super constructor that is wrapped in serialized_init

* Added some doc

* fix / clean up sequence labeler

* fix using scorer

* document how to run test configs directly

* fix examples

* update api doc

* Removed extraneous yaml file

* Update to doc

* attempt to fix travis

* represent command line args as normal dictionary

* temporarily disable travis cache

* undo previous commit

* fix serialization problem

* downgrade pyyaml
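
For existing configuration files, the decoder changes above (including the softmax/projection split of #440) amount to roughly the following migration. This is a minimal before/after sketch distilled from the example diffs below; dimension values are placeholders:

    # before this commit
    decoder: !MlpSoftmaxDecoder
      rnn_layer: !UniLSTMSeqTransducer
        layers: 1
      mlp_layer: !MLP            # hidden layer, output projection and softmax in one component
        hidden_dim: 512

    # after this commit
    decoder: !AutoRegressiveDecoder
      rnn: !UniLSTMSeqTransducer
        layers: 1
      transform: !AuxNonLinear   # replaces the old MLP's hidden layer
        output_dim: 512
      scorer: !Softmax           # separate output projection and softmax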
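Likewise, !SimpleInference is now !AutoRegressiveInference, and the batcher used at decoding time is configured explicitly on the inference object. A minimal sketch, following the updated speech example below:

    inference: !AutoRegressiveInference   # formerly !SimpleInference
      post_process: join-char
      batcher: !InOrderBatcher            # decode in corpus order, one sentence per batch
        batch_size: 1
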
msperber committed Jun 27, 2018
1 parent 0d094f5 commit e3c7656
Showing 110 changed files with 2,094 additions and 1,420 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -4,6 +4,7 @@ __pycache__
 *.pyc
 *.so
 .idea
+.mypy_cache
 examples/output
 examples/logs
 examples/models
28 changes: 21 additions & 7 deletions docs/api_doc.rst
@@ -83,18 +83,32 @@ Bridge
    :members:
    :show-inheritance:
 
-Linear
+Transform
+~~~~~~~~~
+.. automodule:: xnmt.transform
+   :members:
+   :show-inheritance:
+
+Scorer
 ~~~~~~
-.. automodule:: xnmt.linear
+.. automodule:: xnmt.scorer
    :members:
    :show-inheritance:
 
+SequenceLabeler
+~~~~~~~~~~~~~~~
+.. automodule:: xnmt.seq_labeler
+   :members:
+   :show-inheritance:
+
-Multi-layer Perceptron
-~~~~~~~~~~~~~~~~~~~~~~
-.. automodule:: xnmt.mlp
+Classifier
+~~~~~~~~~~
+.. automodule:: xnmt.classifier
    :members:
    :show-inheritance:
+
+
 
 Loss
 ----

@@ -154,8 +168,8 @@ ParamInitializer
 Inference
 ---------
 
-SimpleInference
-~~~~~~~~~~~~~~~
+AutoRegressiveInference
+~~~~~~~~~~~~~~~~~~~~~~~
 
 .. automodule:: xnmt.inference
    :members:
6 changes: 3 additions & 3 deletions docs/experiment_config_files.rst
@@ -82,11 +82,11 @@ This specifies the model architecture. An typical example looks like this
       input_dim: 512
     trg_embedder: !SimpleWordEmbedder
       emb_dim: 512
-    decoder: !MlpSoftmaxDecoder
+    decoder: !AutoRegressiveDecoder
       rnn_layer: !UniLSTMSeqTransducer
         layers: 1
-      mlp_layer: !MLP
-        hidden_dim: 512
+      transform: !NonLinear
+        output_dim: 512
       bridge: !CopyBridge {}
 
 The top level entry is typically DefaultTranslator, which implements a standard
8 changes: 4 additions & 4 deletions examples/01_standard.yaml
@@ -25,11 +25,11 @@ standard: !Experiment # 'standard' is the name given to the experiment
       input_dim: 512
     trg_embedder: !SimpleWordEmbedder
       emb_dim: 512
-    decoder: !MlpSoftmaxDecoder
-      rnn_layer: !UniLSTMSeqTransducer
+    decoder: !AutoRegressiveDecoder
+      rnn: !UniLSTMSeqTransducer
         layers: 1
-      mlp_layer: !MLP
-        hidden_dim: 512
+      transform: !AuxNonLinear
+        output_dim: 512
         activation: 'tanh'
       bridge: !CopyBridge {}
   # training parameters
2 changes: 1 addition & 1 deletion examples/03_multiple_exp.yaml
@@ -38,4 +38,4 @@ exp2_no_dropout: !Experiment
     dropout: 0.0
   model: *my_model
   train: *my_train
-  evaluate: *my_eval
\ No newline at end of file
+  evaluate: *my_eval
10 changes: 5 additions & 5 deletions examples/05_preproc.yaml
@@ -91,13 +91,13 @@ preproc: !Experiment
       input_dim: 512
     trg_embedder: !SimpleWordEmbedder
       emb_dim: 512
-    decoder: !MlpSoftmaxDecoder
-      rnn_layer: !UniLSTMSeqTransducer
+    decoder: !AutoRegressiveDecoder
+      rnn: !UniLSTMSeqTransducer
         layers: 1
-      mlp_layer: !MLP
-        hidden_dim: 512
+      transform: !AuxNonLinear
+        output_dim: 512
       bridge: !NoBridge {}
-    inference: !SimpleInference
+    inference: !AutoRegressiveInference
       post_process: join-piece
   train: !SimpleTrainingRegimen
     run_for_epochs: 20
10 changes: 5 additions & 5 deletions examples/07_load_finetune.yaml
@@ -24,14 +24,14 @@ exp1-pretrain-model: !Experiment
       input_dim: 64
     trg_embedder: !SimpleWordEmbedder
       emb_dim: 64
-    decoder: !MlpSoftmaxDecoder
-      rnn_layer: !UniLSTMSeqTransducer
+    decoder: !AutoRegressiveDecoder
+      rnn: !UniLSTMSeqTransducer
         layers: 1
-      mlp_layer: !MLP
-        hidden_dim: 64
+      transform: !AuxNonLinear
+        output_dim: 64
       input_feeding: True
       bridge: !CopyBridge {}
-    inference: !SimpleInference {}
+    inference: !AutoRegressiveInference {}
   train: !SimpleTrainingRegimen
     run_for_epochs: 2
     src_file: examples/data/head.ja
12 changes: 6 additions & 6 deletions examples/08_load_eval_beam.yaml
@@ -25,14 +25,14 @@ exp1-train-model: !Experiment
       input_dim: 64
     trg_embedder: !SimpleWordEmbedder
       emb_dim: 64
-    decoder: !MlpSoftmaxDecoder
-      rnn_layer: !UniLSTMSeqTransducer
+    decoder: !AutoRegressiveDecoder
+      rnn: !UniLSTMSeqTransducer
         layers: 1
-      mlp_layer: !MLP
-        hidden_dim: 64
+      transform: !AuxNonLinear
+        output_dim: 64
       input_feeding: True
       bridge: !CopyBridge {}
-    inference: !SimpleInference
+    inference: !AutoRegressiveInference
       search_strategy: !BeamSearch
         beam_size: 5
         len_norm: !PolynomialNormalization

@@ -56,7 +56,7 @@ exp1-train-model: !Experiment
     hyp_file: examples/output/{EXP}.test_hyp
 
 exp2-eval-model: !LoadSerialized
-  filename: examples/output/exp1-pretrain-model.mod
+  filename: examples/output/exp1-train-model.mod
   overwrite: # list of [path, value] pairs. Value can be scalar or an arbitrary object
   - path: train # skip the training loop
     val: null
24 changes: 14 additions & 10 deletions examples/09_programmatic.py
@@ -19,16 +19,17 @@
 import numpy as np
 
 from xnmt.attender import MlpAttender
-from xnmt.batcher import SrcBatcher
+from xnmt.batcher import SrcBatcher, InOrderBatcher
 from xnmt.bridge import CopyBridge
-from xnmt.decoder import MlpSoftmaxDecoder
+from xnmt.decoder import AutoRegressiveDecoder
 from xnmt.embedder import SimpleWordEmbedder
 from xnmt.eval_task import LossEvalTask, AccuracyEvalTask
 from xnmt.experiment import Experiment
-from xnmt.inference import SimpleInference
+from xnmt.inference import AutoRegressiveInference
 from xnmt.input_reader import PlainTextReader
 from xnmt.lstm import BiLSTMSeqTransducer, UniLSTMSeqTransducer
-from xnmt.mlp import MLP
+from xnmt.transform import AuxNonLinear
+from xnmt.scorer import Softmax
 from xnmt.optimizer import AdamTrainer
 from xnmt.param_collection import ParamManager
 from xnmt.persistence import save_to_file

@@ -57,7 +58,7 @@
 
 batcher = SrcBatcher(batch_size=64)
 
-inference = SimpleInference(batcher=batcher)
+inference = AutoRegressiveInference(batcher=InOrderBatcher(batch_size=1))
 
 layer_dim = 512
 

@@ -69,11 +70,14 @@
   encoder=BiLSTMSeqTransducer(input_dim=layer_dim, hidden_dim=layer_dim, layers=1),
   attender=MlpAttender(hidden_dim=layer_dim, state_dim=layer_dim, input_dim=layer_dim),
   trg_embedder=SimpleWordEmbedder(emb_dim=layer_dim, vocab_size=len(trg_vocab)),
-  decoder=MlpSoftmaxDecoder(input_dim=layer_dim,
-                            rnn_layer=UniLSTMSeqTransducer(input_dim=layer_dim, hidden_dim=layer_dim, decoder_input_dim=layer_dim, yaml_path="decoder"),
-                            mlp_layer=MLP(input_dim=layer_dim, hidden_dim=layer_dim, decoder_rnn_dim=layer_dim, yaml_path="decoder", vocab_size=len(trg_vocab)),
-                            trg_embed_dim=layer_dim,
-                            bridge=CopyBridge(dec_dim=layer_dim, dec_layers=1)),
+  decoder=AutoRegressiveDecoder(input_dim=layer_dim,
+                                rnn=UniLSTMSeqTransducer(input_dim=layer_dim, hidden_dim=layer_dim,
+                                                         decoder_input_dim=layer_dim, yaml_path="decoder"),
+                                transform=AuxNonLinear(input_dim=layer_dim, output_dim=layer_dim,
+                                                       aux_input_dim=layer_dim),
+                                scorer=Softmax(vocab_size=len(trg_vocab), input_dim=layer_dim),
+                                trg_embed_dim=layer_dim,
+                                bridge=CopyBridge(dec_dim=layer_dim, dec_layers=1)),
   inference=inference
 )
 
9 changes: 5 additions & 4 deletions examples/11_component_sharing.yaml
@@ -17,6 +17,7 @@
 exp1.pretrain: !Experiment
   exp_global: !ExpGlobal
     default_layer_dim: 32
+    model_file: 'examples/output/{EXP}.mod'
   model: !DefaultTranslator
     src_reader: !PlainTextReader
       vocab: !Vocab {vocab_file: examples/data/head.ja.vocab}

@@ -31,16 +32,16 @@ exp1.pretrain: !Experiment
     trg_embedder: !DenseWordEmbedder
       _xnmt_id: trg_emb # this id must be unique and is needed to create a reference-by-name below.
       emb_dim: 32
-    decoder: !MlpSoftmaxDecoder
-      rnn_layer: !UniLSTMSeqTransducer
+    decoder: !AutoRegressiveDecoder
+      rnn: !UniLSTMSeqTransducer
         layers: 1
-      mlp_layer: !MLP
+      scorer: !Softmax
         output_projector: !Ref { name: trg_emb }
         # alternatively, the same could be achieved like this,
         # in which case model.trg_embedder._xnmt_id is not required:
         # !Ref { path: model.trg_embedder }
       bridge: !CopyBridge {}
-    inference: !SimpleInference {}
+    inference: !AutoRegressiveInference {}
   train: !SimpleTrainingRegimen
     run_for_epochs: 2
     src_file: examples/data/head.ja
10 changes: 5 additions & 5 deletions examples/12_multi_task.yaml
@@ -43,12 +43,12 @@ exp1-multi_task: !Experiment
       trg_embedder: !SimpleWordEmbedder
         emb_dim: 64
         vocab: !Ref {name: trg_vocab}
-      decoder: !MlpSoftmaxDecoder
-        rnn_layer: !UniLSTMSeqTransducer
+      decoder: !AutoRegressiveDecoder
+        rnn: !UniLSTMSeqTransducer
           layers: 1
           hidden_dim: 64
         bridge: !CopyBridge {}
-        mlp_layer: !MLP
+        scorer: !Softmax
           vocab: !Ref {name: trg_vocab}
     dev_tasks:
     - !AccuracyEvalTask

@@ -80,9 +80,9 @@ exp1-multi_task: !Experiment
       trg_embedder: !SimpleWordEmbedder
         emb_dim: 64
         vocab: !Ref {name: trg_vocab}
-      decoder: !MlpSoftmaxDecoder
+      decoder: !AutoRegressiveDecoder
         bridge: !CopyBridge {}
-        mlp_layer: !MLP
+        scorer: !Softmax
           vocab: !Ref {name: trg_vocab}
     dev_tasks:
     - !AccuracyEvalTask
19 changes: 12 additions & 7 deletions examples/13_speech.yaml
@@ -33,11 +33,11 @@ speech: !Experiment
       input_dim: 64
     trg_embedder: !SimpleWordEmbedder
       emb_dim: 64
-    decoder: !MlpSoftmaxDecoder
-      rnn_layer: !UniLSTMSeqTransducer
+    decoder: !AutoRegressiveDecoder
+      rnn: !UniLSTMSeqTransducer
         layers: 1
-      mlp_layer: !MLP
-        hidden_dim: 64
+      transform: !AuxNonLinear
+        output_dim: 64
       bridge: !CopyBridge {}
     src_reader: !H5Reader
       transpose: True

@@ -61,14 +61,19 @@ speech: !Experiment
       src_file: examples/data/LDC94S13A.h5
       ref_file: examples/data/LDC94S13A.char
       hyp_file: examples/output/{EXP}.dev_hyp
-      inference: !SimpleInference
+      inference: !AutoRegressiveInference
         post_process: join-char
+        batcher: !InOrderBatcher
+          _xnmt_id: inference_batcher
+          pad_src_to_multiple: 4
+          batch_size: 1
+          src_pad_token: ~
   evaluate:
   - !AccuracyEvalTask
     eval_metrics: cer,wer
     src_file: examples/data/LDC94S13A.h5
     ref_file: examples/data/LDC94S13A.words
     hyp_file: examples/output/{EXP}.test_hyp
-    inference: !SimpleInference
+    inference: !AutoRegressiveInference
       post_process: join-char
-
+      batcher: !Ref { name: inference_batcher }
12 changes: 6 additions & 6 deletions examples/14_report.yaml
@@ -19,13 +19,13 @@ report: !Experiment
       input_dim: 256
     trg_embedder: !SimpleWordEmbedder
       emb_dim: 256
-    decoder: !MlpSoftmaxDecoder
-      rnn_layer: !UniLSTMSeqTransducer
+    decoder: !AutoRegressiveDecoder
+      rnn: !UniLSTMSeqTransducer
         layers: 1
-      mlp_layer: !MLP
-        hidden_dim: 256
+      transform: !AuxNonLinear
+        output_dim: 256
       bridge: !NoBridge {}
-    inference: !SimpleInference {}
+    inference: !AutoRegressiveInference {}
   train: !SimpleTrainingRegimen
     run_for_epochs: 1
     trainer: !AdamTrainer

@@ -42,6 +42,6 @@ report: !Experiment
     src_file: examples/data/head.ja
     ref_file: examples/data/head.en
     hyp_file: examples/output/{EXP}.test_hyp
-    inference: !SimpleInference
+    inference: !AutoRegressiveInference
       report_path: examples/output/{EXP}.report
       report_type: html, file
12 changes: 6 additions & 6 deletions examples/15_score.yaml
@@ -28,14 +28,14 @@ exp1-model: !Experiment
       input_dim: 64
     trg_embedder: !SimpleWordEmbedder
       emb_dim: 64
-    decoder: !MlpSoftmaxDecoder
-      rnn_layer: !UniLSTMSeqTransducer
+    decoder: !AutoRegressiveDecoder
+      rnn: !UniLSTMSeqTransducer
         layers: 1
-      mlp_layer: !MLP
-        hidden_dim: 64
+      transform: !AuxNonLinear
+        output_dim: 64
       input_feeding: True
       bridge: !CopyBridge {}
-    inference: !SimpleInference {}
+    inference: !AutoRegressiveInference {}
   train: !SimpleTrainingRegimen
     run_for_epochs: 2
     src_file: examples/data/head.ja

@@ -59,7 +59,7 @@ exp2-score: !LoadSerialized
   - path: train
     val: ~
   - path: model.inference
-    val: !SimpleInference
+    val: !AutoRegressiveInference
       mode: score
       ref_file: examples/data/head.nbest.en
       src_file: examples/data/head.ja
2 changes: 1 addition & 1 deletion examples/16_transformer.yaml
@@ -17,7 +17,7 @@ transformer: !Experiment
       vocab: !Vocab {vocab_file: examples/data/head.ja.vocab}
     trg_reader: !PlainTextReader
       vocab: !Vocab {vocab_file: examples/data/head.en.vocab}
-    inference: !SimpleInference {}
+    inference: !AutoRegressiveInference {}
   train: !SimpleTrainingRegimen
     run_for_epochs: 30
     batcher: !SentShuffleBatcher
9 changes: 5 additions & 4 deletions examples/17_ensembling.yaml
@@ -34,11 +34,12 @@ exp2-single: !Experiment
     trg_embedder: !DenseWordEmbedder
       _xnmt_id: dense_embed
       emb_dim: 64
-    decoder: !MlpSoftmaxDecoder
-      rnn_layer: !UniLSTMSeqTransducer
-        hidden_dim: 64
-      mlp_layer: !MLP
+    decoder: !AutoRegressiveDecoder
+      rnn: !UniLSTMSeqTransducer
+        hidden_dim: 64
+      transform: !AuxNonLinear
+        output_dim: 64
+      scorer: !Softmax
         output_projector: !Ref {name: dense_embed}
   train: *train
 
