The dataset contains abstracts of 1,680 Russian-language scientific papers on information technology:
- 1,600 unlabeled
- 80 manually labeled.
The dataset is available as a release.
See details here.
Entity annotation was performed in BIO format. Entities are terms represented by nouns or noun phrases.
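As an illustration of how BIO tags map back to entity spans, here is a minimal sketch. It is not the repository's own loader: the tag name `TERM`, the function `bio_to_spans`, and the English example tokens are assumptions made for illustration, and the actual tag names and file layout in the release may differ.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert a BIO tag sequence into (start, end_exclusive, label) spans.

    Hypothetical helper: an I- tag that does not continue an open span
    of the same label is ignored rather than repaired.
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            # A new entity begins; close the previous one if it is still open.
            if start is not None:
                spans.append((start, i, label))
            start, label = i, tag[2:]
        elif tag.startswith("I-") and start is not None and tag[2:] == label:
            # The current entity continues.
            continue
        else:
            # "O" or a stray I- tag: close any open entity.
            if start is not None:
                spans.append((start, i, label))
            start, label = None, None
    if start is not None:
        spans.append((start, len(tags), label))
    return spans


# Illustrative input only; "TERM" is an assumed label name.
tokens = ["The", "neural", "network", "uses", "gradient", "descent", "."]
tags = ["O", "B-TERM", "I-TERM", "O", "B-TERM", "I-TERM", "O"]

for start, end, label in bio_to_spans(tags):
    print(label, " ".join(tokens[start:end]))
```

Each span uses an exclusive end index, so `tokens[start:end]` yields the entity text directly.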
The following types were used to annotate semantic relations:
- CAUSE
- COMPARE
- ISA
- PART_OF
- SYNONYMS
- TOOL
- USAGE
See details here.
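One possible way to represent a relation between two annotated entities in code is sketched below. Only the seven type names come from the list above; the `Relation` class, its fields, and `make_relation` are hypothetical and do not reflect the storage format of the release.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class RelationType(Enum):
    """The seven relation types used in the annotation."""
    CAUSE = "CAUSE"
    COMPARE = "COMPARE"
    ISA = "ISA"
    PART_OF = "PART_OF"
    SYNONYMS = "SYNONYMS"
    TOOL = "TOOL"
    USAGE = "USAGE"


@dataclass(frozen=True)
class Relation:
    """Hypothetical container: two entity spans plus a relation type."""
    head: Tuple[int, int]  # (start, end_exclusive) token span, an assumed convention
    tail: Tuple[int, int]
    type: RelationType


def make_relation(head: Tuple[int, int], tail: Tuple[int, int], label: str) -> Relation:
    # RelationType(label) raises ValueError for labels outside the seven types.
    return Relation(head, tail, RelationType(label))


rel = make_relation((1, 3), (4, 6), "USAGE")
print(rel.type.value)
```

Using an enum means a misspelled label fails immediately instead of silently creating a new relation type.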
If you find this repository useful, feel free to cite our paper: Marshalova A., Bruches E., Batura T. Automatic Aspect Extraction from Scientific Texts. Recent Trends in Analysis of Images, Social Networks and Texts. AIST 2023. Communications in Computer and Information Science. Springer, 2024. V. 1905. pp. 67–80.
@inproceedings{ruserrc-aspect,
title={Automatic Aspect Extraction from Scientific Texts},
author={Marshalova, Anna and Bruches, Elena and Batura, Tatiana},
booktitle={Recent Trends in Analysis of Images, Social Networks and Texts. AIST 2023. Communications in Computer and Information Science},
publisher={Springer Nature Switzerland},
year={2024},
pages={67--80}
}
Bruches E., Pauls A., Batura T., Isachenko V. Entity recognition and relation extraction from scientific and technical texts in Russian. Proceedings of the 2020 Science and Artificial Intelligence conference (S.A.I.ence). IEEE, 2020. pp. 41–45.
@inproceedings{ruserrc-dataset,
title={Entity recognition and relation extraction from scientific and technical texts in Russian},
author={Bruches, Elena and Pauls, Alexey and Batura, Tatiana and Isachenko, Vladimir},
booktitle={2020 Science and Artificial Intelligence conference (S.A.I.ence)},
pages={41--45},
year={2020}
}