MultiCochrane is a multilingual text simplification dataset for the medical domain in four languages: English, Spanish, French, and Farsi.
This repo will also contain Anno-Viewer, the annotation tool used to build MultiCochrane.
For details on the dataset, see the paper, "Multilingual Simplification of Medical Texts" (arXiv:2305.12532).
This repository is still being compiled. Please email sebaj@utexas.edu if you need access to additional data or code.
If you use MultiCochrane in your project, please cite our work:
@misc{joseph2023multilingual,
  title={Multilingual Simplification of Medical Texts},
  author={Sebastian Joseph and Kathryn Kazanas and Keziah Reina and Vishnesh J. Ramanathan and Wei Xu and Byron C. Wallace and Junyi Jessy Li},
  year={2023},
  eprint={2305.12532},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}