subword regularization (#428)
* subword regularization

* make subword sample input_reader more general
cindyxinyiwang authored and neubig committed Jun 18, 2018
1 parent 838ebab commit 3fbf2b2
Showing 8 changed files with 6,107 additions and 29 deletions.
55 changes: 55 additions & 0 deletions examples/20_subword_sample.yaml
@@ -0,0 +1,55 @@
# Sampling subword units for subword regularization
subword_sample: !Experiment
  # global parameters shared throughout the experiment
  exp_global: !ExpGlobal
    # {EXP_DIR} is a placeholder for the directory in which the config file lies.
    # {EXP} is a placeholder for the experiment name (here: 'standard')
    model_file: '{EXP_DIR}/models/{EXP}.mod'
    log_file: '{EXP_DIR}/logs/{EXP}.log'
    default_layer_dim: 512
    dropout: 0.3
  # model architecture
  model: !DefaultTranslator
    src_reader: !SubwordSampleTextReader
      vocab: !Vocab {vocab_file: examples/data/head.ja.vocab}
      model_file: examples/data/big-ja.model
    trg_reader: !SubwordSampleTextReader
      vocab: !Vocab {vocab_file: examples/data/head.en.vocab}
      model_file: examples/data/big-en.model
    src_embedder: !SimpleWordEmbedder
      emb_dim: 512
    encoder: !BiLSTMSeqTransducer
      layers: 1
    attender: !MlpAttender
      hidden_dim: 512
      state_dim: 512
      input_dim: 512
    trg_embedder: !SimpleWordEmbedder
      emb_dim: 512
    decoder: !MlpSoftmaxDecoder
      rnn_layer: !UniLSTMSeqTransducer
        layers: 1
      mlp_layer: !MLP
        hidden_dim: 512
        activation: 'tanh'
      bridge: !CopyBridge {}
  # training parameters
  train: !SimpleTrainingRegimen
    batcher: !SrcBatcher
      batch_size: 32
    trainer: !AdamTrainer
      alpha: 0.001
    run_for_epochs: 2
    src_file: examples/data/head.ja
    trg_file: examples/data/head.en
    dev_tasks:
      - !LossEvalTask
        src_file: examples/data/head.ja
        ref_file: examples/data/head.en
  # final evaluation
  evaluate:
    - !AccuracyEvalTask
      eval_metrics: bleu
      src_file: examples/data/head.ja
      ref_file: examples/data/head.en
      hyp_file: examples/output/{EXP}.test_hyp
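
The config above wires `!SubwordSampleTextReader` into both the source and target side, so each sentence can be segmented differently every time it is read. As a rough illustration of the idea behind subword regularization, the sketch below samples one segmentation of a string from a unigram model with a smoothing exponent. It is not the reader added in this commit: the function name `sample_segmentation`, the `TOY_LOGPROBS` vocabulary, and the `alpha` parameter are hypothetical, and the toy log-probabilities stand in for whatever the `model_file` entries actually contain.

```python
import math
import random

# Hypothetical toy unigram vocabulary: piece -> log-probability.
# These values are made up for illustration and do not come from the commit.
TOY_LOGPROBS = {
    "h": -4.0, "e": -4.0, "l": -4.0, "o": -4.0,
    "he": -2.5, "ll": -2.5, "lo": -2.5,
    "hell": -2.0, "hello": -1.5,
}


def _logaddexp(a, b):
    """Numerically stable log(exp(a) + exp(b)) that tolerates -inf."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def sample_segmentation(text, logprobs, alpha=0.5, rng=random):
    """Sample one segmentation with probability proportional to P(seg) ** alpha.

    Uses forward filtering over prefix positions, then backward sampling.
    A small alpha flattens the distribution; a large alpha concentrates
    the samples on the most probable segmentation.
    """
    n = len(text)
    max_len = max(len(piece) for piece in logprobs)

    # forward[i]: log of the summed (alpha-scaled) scores of all
    # segmentations of text[:i].
    forward = [-math.inf] * (n + 1)
    forward[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece in logprobs and forward[start] > -math.inf:
                score = forward[start] + alpha * logprobs[piece]
                forward[end] = _logaddexp(forward[end], score)
    if forward[n] == -math.inf:
        raise ValueError("text cannot be segmented with this vocabulary")

    # Walk back from the end, choosing each final piece in proportion to
    # the total score of the segmentations that end with it.
    pieces = []
    end = n
    while end > 0:
        starts, weights = [], []
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece in logprobs and forward[start] > -math.inf:
                starts.append(start)
                weights.append(
                    math.exp(forward[start] + alpha * logprobs[piece] - forward[end])
                )
        start = rng.choices(starts, weights=weights, k=1)[0]
        pieces.append(text[start:end])
        end = start
    pieces.reverse()
    return pieces
```

Calling `sample_segmentation("hello", TOY_LOGPROBS, rng=random.Random(0))` repeatedly with the same generator yields varying splits of `"hello"` into pieces from the toy vocabulary; every sample concatenates back to the original string. Passing a seeded `random.Random` keeps runs reproducible.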
Binary file added examples/data/big-en.model
Binary file not shown.
