Repository for the article:

  • Gutierrez-Vasques Ximena, Bentz Christian, Sozinova Olga and Samardzic Tanja. "From characters to words: the turning point of BPE merges". European Chapter of the Association for Computational Linguistics, Long Papers. 2021.

The folder Correlations/ contains a summary of the measures reported in the article.

The folder Detailed/ contains the scripts, the corpus, and detailed results for each language.

The folder Media/ contains the slides and a poster for the article.