This repository contains a mostly undocumented version of the code for the XRayEmb project, described in a 2021 preprint: Yuval Pinter, Amanda Stent, Mark Dredze, and Jacob Eisenstein. "Learning to Look Inside: Augmenting Token-Based Encoders with Character-Level Information."
This repository is not maintained regularly. If you wish to work with this code, let me know and I'll do my best to help.
Note that much of the code references the model by its previous working name, TokDetok.