MMT sentence re-segmentation #598

EtienneAb3d · 2021-10-22T10:23:57Z

While exploring the MMT code, I discovered that it re-segments the input into sentences and processes the segmented result as a batch.

First, it would be nice to document somewhere that, when no adaptation is needed, much higher performance can be obtained by sending multiple sentences together. They are still processed one by one (not as a whole), but are sent to the GPU as a mini-batch. The constraint is that the multi-sentence text must be re-segmentable (for example, with a period at the end of each sentence).
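To make the constraint concrete, here is a minimal client-side sketch of preparing such a multi-sentence text. It is not MMT code: `build_batch_text` is a hypothetical helper, and it simply assumes that ending each sentence with `.`, `!` or `?` is enough for the re-segmenter to recover the boundaries.

```python
def build_batch_text(sentences):
    """Join sentences into one text that should stay re-segmentable.

    Hypothetical helper (not part of MMT): it assumes sentence-final
    punctuation is what the re-segmenter relies on to find boundaries.
    """
    parts = []
    for sentence in sentences:
        sentence = sentence.strip()
        if not sentence:
            continue  # skip empty entries
        if sentence[-1] not in ".!?":
            sentence += "."  # add a terminal period when none is present
        parts.append(sentence)
    return " ".join(parts)


batch = build_batch_text(["Hello world", "How are you?", "See you soon."])
print(batch)  # Hello world. How are you? See you soon.
```

The resulting string would then be sent as a single translation request instead of three separate ones.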

Second, this may cause problems: depending on its content, MMT can split a single sentence into several parts and then not translate it as a single sentence. It would be nice to add an option to disable re-segmentation when the input sentence is known to be a single unit that must not be split.
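The kind of unwanted split I mean can be illustrated with a deliberately naive punctuation-based splitter. This is only a sketch and not MMT's actual segmentation logic; both `naive_resegment` and its `keep_whole` flag are hypothetical names standing in for the requested option.

```python
import re


def naive_resegment(text, keep_whole=False):
    """Split text after '.', '!' or '?' followed by whitespace.

    Illustrative only: this is NOT MMT's segmenter. `keep_whole` is a
    hypothetical flag showing what a "do not re-segment" option could do.
    """
    text = text.strip()
    if keep_whole:
        return [text]  # treat the input as one sentence
    return [part for part in re.split(r"(?<=[.!?])\s+", text) if part]


sentence = "The file is approx. 3 MB in size."
print(naive_resegment(sentence))
# ['The file is approx.', '3 MB in size.']
print(naive_resegment(sentence, keep_whole=True))
# ['The file is approx. 3 MB in size.']
```

With the flag set, the abbreviation no longer breaks the sentence in two, so it would be translated as a whole.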

EtienneAb3d mentioned this issue Oct 22, 2021

Mini-batch with adaptation? #599

Open
