forked from awslabs/sockeye
Merge from upstream (original) repository #1
Merged
* Loading/saving/init auxiliary parameters of the models
* added chrf signature; removed local dependency
Source factors are enabled by passing --source-factors file1 [file2 ...] (-sf), where file1, etc. are token-parallel to the source (-s). This option can be passed either to sockeye.train or to the data preparation step, if data sharding is used. An analogous parameter, --validation-source-factors, passes factors for the validation data. The flag --source-factors-num-embed D1 [D2 ...] sets the factor embedding dimensions; these are concatenated with the source word embedding (--num-embed), which can continue to be tied to the target (--weight-tying --weight-tying-type=src_trg). At test time, the input sentence and its factors can be passed as multiple parallel files (--input and --input-factors) or through stdin with token-level annotations separated by |. Alternatively, a string-serialized JSON object can be sent to the CLI through stdin; it must have a top-level key 'text' and may optionally have a key 'factors' of type List[str].
* Cleanup of vocab functions
* Simplified vocab logic a bit; removed pickle functionality, since it has been deprecated for a long time
* Refactored so that the first factor corresponds to the source surface form (e.g. configs by default set num_factors to at least 1)
* Fixed a TODO; slightly reworded the changelog
* Reworked the inference interface; added a number of TranslatorInput factory functions (including JSON)
* Removed max_seq_len_{source,target} from ModelConfig
* Separated data statistics relevant for inference from data information relevant only for training
* Bumped major version to 1.17.0
* Do not throw exceptions while translating (#294)
* Removed bias parameters in Transformer attention layers, as they bring no benefit (#296)
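The two stdin input forms described above (token-level | annotations, and the JSON object with 'text' and optional 'factors') can be sketched in a few lines of Python; the sentence, factor values, and variable names here are illustrative, not taken from the repository:

```python
import json

# Hypothetical sentence with one token-parallel factor stream (e.g. POS tags).
tokens = "der kleine Hund".split()
pos_tags = "DET ADJ NOUN".split()  # must be token-parallel to the source

# Token-level annotation form for stdin: each factor joined to its token with '|'.
annotated = " ".join(f"{t}|{f}" for t, f in zip(tokens, pos_tags))

# JSON form for stdin: top-level 'text' key, optional 'factors' of type List[str],
# one whitespace-delimited string per factor stream.
request = json.dumps({"text": " ".join(tokens), "factors": [" ".join(pos_tags)]})
```

Either string would then be written to the CLI's stdin, one input per line.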
* Updated to MXNet 1.1.0. Changed Sequence{Mask,Last} operators to use the new axis argument to avoid a number of transposes.
* Added a test for matching changelog and __init__ version strings
* Optionally store the beam history at each time step.
* Beam histories are now stored as part of `inference.Translation()` and `inference.TranslatorOutput()`.
* The beam is stored only when the output handler is set to `beam_store`.
* Added and modified tests.
* Support for evaluating multiple hypotheses with sockeye-evaluate.
…asses. (#304)
* Simplified model constructors by pulling some logic out into load_models()
* Removed the train_iter dependency from the TrainingModel constructor
* Updated the embeddings inspection CLI to support weight tying
* Updated the WMT tutorial
* Fixed bug when loading external parameters for initialization
* Added fixed_param_names argument for freezing model parameters
* Added option --dry-run
sacrebleu 1.2.5
* Added wmt18/dev datasets (en-et and et-en)
* Fixed logic with --force
* Locale-independent installation
* Added "--echo both" (tab-delimited)
#333)
* Switched to mxnet smallest_k
* Added --use-mxnet-topk[=False]
* Small refactoring of the topk call
* Changed to branching on context
Modify the `np.argpartition` call to return an already sorted top-k to avoid the extra sort.
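A minimal NumPy sketch of that pattern (the function name is illustrative): `argpartition` selects the k smallest entries in linear time but leaves them unordered, so only those k entries need the follow-up sort.

```python
import numpy as np

def sorted_topk(scores: np.ndarray, k: int) -> np.ndarray:
    """Return the indices of the k smallest scores, sorted ascending by score."""
    # argpartition places the k smallest entries in the first k slots, unordered...
    part = np.argpartition(scores, k)[:k]
    # ...so a full sort is only needed over those k entries, not the whole array.
    return part[np.argsort(scores[part])]
```

This avoids the O(n log n) cost of sorting all candidates when only k are needed.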
[Sacrebleu] v1.2.7
- Fixed another locale issue (with --echo)
- Grudgingly enabled -tok none from the command line
…324) * Change default target vocab name in model folder to vocab.trg.0.json
* restrict-lexicon: allow specification of topk at runtime
* Changelog & version
* Entries in a loaded lexicon are now again sorted by target ids
* Added an inspection CLI for lexicons
* Changelog
* Typo fix in inference. * Updated wording.
…keys & values (#360)
* Removed the combined FC for transformer source attention; added a small conversion tool for existing params
* Fixed pylint
* Backwards compatibility
* Batched topk implementation: batch the topk operator across all sentences in a batch during decoding
* CPU decoding uses a numpy-based implementation; GPU decoding uses an mxnet-based one
* Addressed comments
* Formatting
* Partialised topk
* Removed pytest cache
* Offset only when batch decoding
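One way the numpy-based batched path could look, as a sketch under the assumption that scores are laid out with one row per batch item and that the selected indices must address a flattened hypothesis list (function name and layout are illustrative, not Sockeye's actual code):

```python
import numpy as np

def batched_topk(scores: np.ndarray, k: int):
    """scores: (batch, num_candidates). Returns per-row top-k values (smallest
    scores) and indices offset into the flattened (batch * num_candidates) space."""
    batch, n = scores.shape
    # Select k smallest per row without a full sort, then order just those k.
    part = np.argpartition(scores, k, axis=1)[:, :k]
    rows = np.arange(batch)[:, None]
    order = np.argsort(scores[rows, part], axis=1)
    idx = part[rows, order]
    # Offset row-local indices into the flattened space, as is needed when
    # batch decoding indexes a single flat list of hypotheses.
    return scores[rows, idx], idx + rows * n
```

A single batched call like this replaces one topk invocation per sentence.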
* Allow empty sequences in train/validation data and skip them
* Addressed comments
#363)
* Use Python 3's functools.lru_cache instead of explicit decoder shape caching
* Cleanup
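As an illustration of that change, a memoized helper (a hypothetical stand-in, not the actual Sockeye function) replaces a hand-managed cache dict for a pure shape computation:

```python
from functools import lru_cache

# Instead of maintaining an explicit {args: shape} dict, let lru_cache
# memoize the pure function. Repeated calls with the same arguments hit
# the cache rather than recomputing.
@lru_cache(maxsize=None)
def decoder_output_shape(batch_size: int, beam_size: int, hidden: int) -> tuple:
    return (batch_size * beam_size, hidden)
```

`decoder_output_shape.cache_info()` exposes hit/miss counts, which an explicit cache would have to track by hand.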
#364)
* Added an option to control training length via the number of samples processed (besides the number of updates/batches and the number of epochs)
* Addressed comments
…en cleaning up training (#368)
* Added LHUC support. See David Vilar, "Learning Hidden Unit Contribution for Adapting Neural Machine Translation Models", NAACL 2018.
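The LHUC rescaling from the cited paper multiplies each hidden unit by a learned per-unit amplitude in (0, 2), computed as 2·sigmoid(ρ); a minimal NumPy sketch (names illustrative):

```python
import numpy as np

def lhuc(hidden: np.ndarray, rho: np.ndarray) -> np.ndarray:
    """Learning Hidden Unit Contribution: rescale each hidden unit by a
    learned amplitude in (0, 2), computed elementwise as 2 * sigmoid(rho)."""
    amplitude = 2.0 / (1.0 + np.exp(-rho))  # 2 * sigmoid, per hidden unit
    return amplitude * hidden
```

At ρ = 0 the amplitude is exactly 1 (identity), so a freshly initialized LHUC layer leaves the pretrained model's behavior unchanged; adaptation then trains only ρ.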
* Word-based batching clarification
* Version + changelog
… checkpoint. This used a lot of disk space. (#379)
* Avoid recreating partial for topk with every request in inference
* PyPI markdown support.
- Each sentence now has an active beam size, which generalizes the former special case for t=1 and allows pruning.
- Length normalization is applied only to completed hypotheses.
- `--beam-prune` specifies a width that prunes all hypotheses outside the beam set by the best completed hypothesis' value.
- `--beam-search-stop first` allows stopping as soon as any hypothesis completes when the batch size is one (the default is the current behavior: wait until all items on the beam are stopped).
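The `--beam-prune` rule above can be sketched as follows, assuming lower scores are better (e.g. negative log-probabilities) and with illustrative names; this is not Sockeye's actual implementation:

```python
def prune(active_scores, best_finished_score, beam_prune):
    """Keep only active hypotheses whose score lies within `beam_prune`
    of the best completed hypothesis' score (lower score = better)."""
    return [s for s in active_scores if s <= best_finished_score + beam_prune]
```

Hypotheses that fall outside this width can never overtake the best completed one within the allowed margin, so dropping them shrinks the active beam without changing the search result under the pruning width.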
…384)
* Removed an overly restrictive check introduced with the train refactoring, re-allowing early stopping w.r.t. BLEU
* Changelog
…Metric (#376) * Simplified computation of smoothed cross entropy loss in CrossEntropyMetric