t2t_second

A set of scripts used for the experiments described in the paper "Morphological and Language-Agnostic Word Segmentation for NMT" (https://arxiv.org/abs/1806.05482).

The scripts were not intended for public use or reuse, so they are neither clear nor commented. Feel free to ask for an explanation or details.

Requirements:

- this fork of Tensor2Tensor: https://github.com/Gldkslfmsd/tensor2tensor
  (T2T 1.2.9 plus one simple modification of logging that does not affect the core or the translation performance)
- a GPU with 11 GB of memory
- etc.

Notable files:

t2t_usr/

T2T user directory with the definitions and implementations of the problems

t2t_usr/my_registrations.py

class TranslateDecsSubwords100kFbb100m: STE baseline
class TranslateDecsBpe: BPE run
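
Tensor2Tensor registers a problem class under a snake_case form of its class name by default, and that registered name is what the T2T command-line tools refer to. The sketch below approximates that conversion in plain Python so the names of the two classes above can be derived without installing T2T. The helper `camel_to_snake` is hypothetical: it is an assumption about how the library derives names, not a copy of its code, so check the result against the fork if in doubt.

```python
import re

# Hypothetical helper: approximates how Tensor2Tensor derives a default
# registry name from a class name (CamelCase -> snake_case). This is an
# assumption about the library's behavior, not the library's own code.
_FIRST_CAP = re.compile(r"(.)([A-Z][a-z0-9]+)")
_ALL_CAP = re.compile(r"([a-z0-9])([A-Z])")


def camel_to_snake(name):
    """Convert a CamelCase class name to snake_case."""
    # First pass: put an underscore before a capitalized word
    # that follows any other character.
    partial = _FIRST_CAP.sub(r"\1_\2", name)
    # Second pass: split any remaining lowercase/digit-to-uppercase
    # boundary, then lowercase everything.
    return _ALL_CAP.sub(r"\1_\2", partial).lower()


if __name__ == "__main__":
    for cls in ("TranslateDecsSubwords100kFbb100m", "TranslateDecsBpe"):
        print(cls, "->", camel_to_snake(cls))
```

Under this assumption, TranslateDecsBpe maps to translate_decs_bpe and TranslateDecsSubwords100kFbb100m maps to translate_decs_subwords100k_fbb100m.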

gen-train-dec.sh

Entry point that launches the data generator and the trainer

