forked from awslabs/sockeye
Merge from upstream (original) repository #1
Merged
* Loading/saving/init auxiliary parameters of the models
* added chrf signature; removed local dependency
Source factors are enabled by passing --source-factors file1 [file2 ...] (-sf), where file1, etc. are token-parallel to the source (-s). This option can be passed either to sockeye.train or to the data preparation step, if data sharding is used. An analogous parameter, --validation-source-factors, passes factors for the validation data. The flag --source-factors-num-embed D1 [D2 ...] sets the factor embedding dimensions; these are concatenated with the source word embedding (--num-embed), which can continue to be tied to the target (--weight-tying --weight-tying-type=src_trg). At test time, the input sentence and its factors can be passed as multiple parallel files (--input and --input-factors) or through stdin with token-level annotations separated by |. Alternatively, a string-serialized JSON object can be sent to the CLI through stdin; it must have a top-level key 'text' and may optionally have a key 'factors' of type List[str].
* Cleanup of vocab functions
* Simplified vocab logic a bit; removed pickle functionality, since it has been deprecated for a long time
* Refactored so that the first factor corresponds to the source surface form (e.g. configs by default set num_factors to at least 1)
* Fixed a TODO; slightly reworded the changelog
* Reworked the inference interface; added a number of TranslatorInput factory functions (including JSON)
* Removed max_seq_len_{source,target} from ModelConfig
* Separated data statistics relevant for inference from data information relevant only for training
* Bumped major version to 1.17.0
* Do not throw exceptions while translating (#294)
* Removed bias parameters in Transformer attention layers, as they bring no benefit (#296)
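The two stdin input forms described above (token-level | annotations, and the JSON object with 'text' and optional 'factors') can be sketched in a few lines of Python; the sentence, factor values, and variable names here are illustrative, not taken from the repository:

```python
import json

# Hypothetical sentence with one token-parallel factor stream (e.g. POS tags).
tokens = "der kleine Hund".split()
pos_tags = "DET ADJ NOUN".split()  # must be token-parallel to the source

# Token-level annotation form for stdin: each factor joined to its token with '|'.
annotated = " ".join(f"{t}|{f}" for t, f in zip(tokens, pos_tags))

# JSON form for stdin: top-level 'text' key, optional 'factors' of type List[str],
# one whitespace-delimited string per factor stream.
request = json.dumps({"text": " ".join(tokens), "factors": [" ".join(pos_tags)]})
```

Either string would then be written to the CLI's stdin, one input per line.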
* Updated to MXNet 1.1.0. Changed Sequence{Mask,Last} operators to use the new axis argument to avoid a number of transposes.
* Added a test for matching changelog and __init__ version strings
* Optionally store the beam history at each time step.
* Beam histories are now stored as part of `inference.Translation()` and `inference.TranslatorOutput()`.
* The beam is stored only when the output handler is set to `beam_store`.
* Added and modified tests.
* Support for evaluating multiple hypotheses with sockeye-evaluate.
…asses. (#304)
* Simplified model constructors by pulling some logic out into load_models()
* Removed the train_iter dependency from the TrainingModel constructor
* Updated the embeddings inspection CLI to support weight tying
* Updated the WMT tutorial
* Fixed bug when loading external parameters for initialization
* Added fixed_param_names argument for freezing model parameters
* Added option --dry-run
sacrebleu 1.2.5
* Added wmt18/dev datasets (en-et and et-en)
* Fixed logic with --force
* Locale-independent installation
* Added "--echo both" (tab-delimited)
#333)
* Switched to mxnet smallest_k
* Added --use-mxnet-topk[=False]
* Small refactoring of the topk call
* Changed to branching on context
Modify the `np.argpartition` call to return an already sorted top-k to avoid the extra sort.
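A minimal NumPy sketch of that pattern (the function name is illustrative): `argpartition` selects the k smallest entries in linear time but leaves them unordered, so only those k entries need the follow-up sort.

```python
import numpy as np

def sorted_topk(scores: np.ndarray, k: int) -> np.ndarray:
    """Return the indices of the k smallest scores, sorted ascending by score."""
    # argpartition places the k smallest entries in the first k slots, unordered...
    part = np.argpartition(scores, k)[:k]
    # ...so a full sort is only needed over those k entries, not the whole array.
    return part[np.argsort(scores[part])]
```

This avoids the O(n log n) cost of sorting all candidates when only k are needed.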
[Sacrebleu] v1.2.7
- Fixed another locale issue (with --echo)
- Grudgingly enabled -tok none from the command line
…324) * Change default target vocab name in model folder to vocab.trg.0.json
* restrict-lexicon: allow specification of topk at runtime
* Changelog & version
* Entries in a loaded lexicon are now again sorted by target ids
* Added an inspection CLI for lexicons
* Changelog
* Typo fix in inference. * Updated wording.
…keys & values (#360)
* Removed the combined FC for transformer source attention; added a small conversion tool for existing params
* Fixed pylint
* Backwards compatibility
* Batched topk implementation: batch the topk operator across all sentences in a batch during decoding
* CPU decoding uses a numpy-based implementation; GPU decoding uses an mxnet-based one
* Addressed comments
* Formatting
* Partialised topk
* Removed pytest cache
* Offset only when batch decoding
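One way the numpy-based batched path could look, as a sketch under the assumption that scores are laid out with one row per batch item and that the selected indices must address a flattened hypothesis list (function name and layout are illustrative, not Sockeye's actual code):

```python
import numpy as np

def batched_topk(scores: np.ndarray, k: int):
    """scores: (batch, num_candidates). Returns per-row top-k values (smallest
    scores) and indices offset into the flattened (batch * num_candidates) space."""
    batch, n = scores.shape
    # Select k smallest per row without a full sort, then order just those k.
    part = np.argpartition(scores, k, axis=1)[:, :k]
    rows = np.arange(batch)[:, None]
    order = np.argsort(scores[rows, part], axis=1)
    idx = part[rows, order]
    # Offset row-local indices into the flattened space, as is needed when
    # batch decoding indexes a single flat list of hypotheses.
    return scores[rows, idx], idx + rows * n
```

A single batched call like this replaces one topk invocation per sentence.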
* Allow empty sequences in train/validation data and skip them
* Addressed comments
#363)
* Use Python 3's functools.lru_cache instead of explicit decoder shape caching
* Cleanup
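As an illustration of that change, a memoized helper (a hypothetical stand-in, not the actual Sockeye function) replaces a hand-managed cache dict for a pure shape computation:

```python
from functools import lru_cache

# Instead of maintaining an explicit {args: shape} dict, let lru_cache
# memoize the pure function. Repeated calls with the same arguments hit
# the cache rather than recomputing.
@lru_cache(maxsize=None)
def decoder_output_shape(batch_size: int, beam_size: int, hidden: int) -> tuple:
    return (batch_size * beam_size, hidden)
```

`decoder_output_shape.cache_info()` exposes hit/miss counts, which an explicit cache would have to track by hand.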
#364)
* Added an option to control training length via the number of samples processed (besides the number of updates/batches and the number of epochs)
* Addressed comments
…en cleaning up training (#368)
* Added LHUC support. See David Vilar, "Learning Hidden Unit Contribution for Adapting Neural Machine Translation Models", NAACL 2018.
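The LHUC rescaling from the cited paper multiplies each hidden unit by a learned per-unit amplitude in (0, 2), computed as 2·sigmoid(ρ); a minimal NumPy sketch (names illustrative):

```python
import numpy as np

def lhuc(hidden: np.ndarray, rho: np.ndarray) -> np.ndarray:
    """Learning Hidden Unit Contribution: rescale each hidden unit by a
    learned amplitude in (0, 2), computed elementwise as 2 * sigmoid(rho)."""
    amplitude = 2.0 / (1.0 + np.exp(-rho))  # 2 * sigmoid, per hidden unit
    return amplitude * hidden
```

At ρ = 0 the amplitude is exactly 1 (identity), so a freshly initialized LHUC layer leaves the pretrained model's behavior unchanged; adaptation then trains only ρ.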
* Word-based batching clarification
* Version + changelog
… checkpoint. This used a lot of disk space. (#379)
* Avoid recreating partial for topk with every request in inference
* PyPI markdown support.
- Each sentence now has an active beam size, which generalizes the former special case for t=1 and allows pruning.
- Length normalization is applied only to completed hypotheses.
- `--beam-prune` specifies a width that prunes all hypotheses outside the beam set by the best completed hypothesis' value.
- `--beam-search-stop first` allows stopping as soon as any hypothesis completes when the batch size is one (the default is the current behavior: wait until all items on the beam are stopped).
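The `--beam-prune` rule above can be sketched as follows, assuming lower scores are better (e.g. negative log-probabilities) and with illustrative names; this is not Sockeye's actual implementation:

```python
def prune(active_scores, best_finished_score, beam_prune):
    """Keep only active hypotheses whose score lies within `beam_prune`
    of the best completed hypothesis' score (lower score = better)."""
    return [s for s in active_scores if s <= best_finished_score + beam_prune]
```

Hypotheses that fall outside this width can never overtake the best completed one within the allowed margin, so dropping them shrinks the active beam without changing the search result under the pruning width.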
…384)
* Removed an overly restrictive check introduced with the train refactoring, re-allowing early stopping w.r.t. BLEU
* Changelog
…Metric (#376) * Simplified computation of smoothed cross entropy loss in CrossEntropyMetric