This package contains the datasets collected as part of the LabForSims2 project (15-IDFN-0008), funded under IDEFI-Numérique. It contains two directories:
- single - single-turn dataset containing question and response pairs taken separately
- dialogues - contextual QA dataset containing the complete end-to-end dialogues
If you use this corpus, please cite:
@inproceedings{laleye_french_2020,
address = {Marseille, France},
title = {A {French} {Medical} {Conversations} {Corpus} {Annotated} for a {Virtual} {Patient} {Dialogue} {System}},
url = {https://www.aclweb.org/anthology/2020.lrec-1.72},
abstract = {Data-driven approaches for creating virtual patient dialogue systems require the availability of large data specific to the language, domain and clinical cases studied. Based on the lack of dialogue corpora in French for medical education, we propose an annotated corpus of dialogues including medical consultation interactions between doctor and patient. In this work, we detail the building process of the proposed dialogue corpus, describe the annotation guidelines and also present the statistics of its contents. We then conducted a question categorization task to evaluate the benefits of the proposed corpus that is made publicly available.},
booktitle = {Proceedings of {The} 12th {Language} {Resources} and {Evaluation} {Conference}},
publisher = {European Language Resources Association},
author = {Laleye, Fréjus A. A. and de Chalendar, Gaël and Blanié, Antonia and Brouquet, Antoine and Behnamou, Dan},
month = may,
year = {2020},
pages = {574--580}
}