This repo hosts the codebase for reproducing the results of the ACL paper "Neural Decipherment via Minimum-Cost Flow: from Ugaritic to Linear B". Some parts of the code are still being cleaned up. Stay tuned for updates.
Data for Linear B and Ugaritic are included in the
`uga-heb.no_spe.cog` is the entire Ugaritic-Hebrew data obtained from Ben Snyder.
`no_spe` stands for "no special symbols", since the original file contains special symbols that mark morphological segmentations and affixes.
`uga-heb.small.no_spe.cog` is the exact random subset of the Ugaritic data I used in the paper for training the model. It is around one tenth of the original file.
`linear_b-greek.cog` is the Linear B data used in the paper.
`notebooks/Linear_b_simplified.ipynb` is the notebook I used for preparing the Linear B data.
`linear_b-greek.names.cog` is the Linear B data that only includes names on the Greek side.
Note that you might need to install additional fonts to render the Linear B script properly on your computer.
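
To illustrate how a small file such as `uga-heb.small.no_spe.cog` relates to the full one, here is a minimal sketch of drawing a reproducible random subset of roughly one tenth of the rows. It is not necessarily the procedure used for the paper. It assumes the data is plain text with one entry per line and a single header line; the function name `sample_rows`, the seed, and the toy rows are all hypothetical.

```python
import random


def sample_rows(lines, fraction=0.1, seed=0, has_header=True):
    """Return the header (if any) plus a random subset of the remaining lines.

    Hypothetical helper: the one-entry-per-line layout and the header line
    are assumptions, not something documented in this README.
    """
    lines = list(lines)
    header = lines[:1] if has_header else []
    rows = lines[1:] if has_header else lines
    if not rows:
        return header
    # Keep at least one row, and about `fraction` of the rows otherwise.
    k = max(1, round(len(rows) * fraction))
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    # Sort the sampled indices so the subset keeps the original row order.
    chosen = sorted(rng.sample(range(len(rows)), k))
    return header + [rows[i] for i in chosen]


if __name__ == "__main__":
    # Toy stand-in for a data file: one header line plus 50 rows.
    toy = ["col_a\tcol_b"] + [f"a{i}\tb{i}" for i in range(50)]
    subset = sample_rows(toy)
    print(len(subset) - 1, "rows sampled out of", len(toy) - 1)
```

To apply this to a real file, read its lines into a list, pass them to `sample_rows`, and write the result back out. Fixing the seed is what makes the subset "exact" in the sense of being repeatable.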