First version.
Trained on 3.3M training examples.
Transformer architecture with a 9-layer encoder and a 9-layer decoder.
Evaluated on a multi-domain test set, outperforming Google Translate.
Experiments with style tagging and data appending.
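The style-tagging experiment is not detailed here; a common implementation prepends a control token to each source sentence so the model learns to condition its output register on the tag. A minimal sketch, assuming hypothetical tag names and helper functions not taken from the original notes:

```python
# Minimal sketch of style tagging for NMT training data.
# The tag vocabulary below is an assumption, not the project's actual tags.
STYLE_TAGS = {"formal": "<formal>", "informal": "<informal>"}

def tag_source(src: str, style: str) -> str:
    """Prepend the style token for `style` to the source sentence."""
    return f"{STYLE_TAGS[style]} {src}"

def build_corpus(pairs, style):
    """Apply one style tag to every (source, target) pair in a corpus."""
    return [(tag_source(s, t_style := style), tgt) for s, tgt in pairs]

# Example: tag a tiny parallel corpus as "formal".
corpus = build_corpus([("hello world", "hallo wereld")], "formal")
# corpus[0][0] == "<formal> hello world"
```

At inference time the same tag is prepended to the input, steering the decoder toward the requested style without any architecture change.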