
This is the official leaderboard for the RuSentRel-1.1 dataset, originally described in the paper (arXiv:1808.08932).



RuSentRel Leaderboard

📓 Update 01 October 2023: this collection is now available in arekit-ss for quick sampling of contexts with all subject-object relation mentions; a single script exports them into JSONL/CSV/SQLite, with optional language transfer 🔥 [Learn more ...]

Dataset description: the RuSentRel collection consists of analytical articles from an Internet portal: texts in the domain of international politics, translated into Russian from foreign authoritative sources. The collected articles contain both the author's opinion on the subject matter of the article and a large number of attitudes mentioned between the participants of the described situations. In total, 73 large analytical texts were labeled with about 2000 relations.

This repository is the official results benchmark for the automatic sentiment attitude extraction task on the RuSentRel collection. See the task section for further details.

Contributing: please feel free to open pull requests, especially at awesome-sentiment-attitude-extraction!

For more details about RuSentRel, see the related repository.



Given a subset of documents from the RuSentRel collection, each document is presented by a pair: (1) a text, and (2) a list of selected named entities. For each document, it is required to complete a list of entity pairs (es, eo) for which the text conveys the presence of a sentiment relation from es (the subject) towards eo (the object). The assigned label is either neg or pos.

... При этом Москва неоднократно подчеркивала, что ее активность на Балтике является ответом именно на действия НАТО и эскалацию враждебного подхода к России вблизи ее восточных границ ... (... Meanwhile Moscow has repeatedly emphasized that its activity in the Baltic Sea is a response precisely to actions of NATO and the escalation of the hostile approach to Russia near its eastern borders ...)
(NATO->Russia, neg), (Russia->NATO, neg)

Task paper:


The task is treated as a context classification problem, in which a context is a text region containing a mentioned pair (the attitude participants). The classified context-level attitudes are then transferred onto the document level by averaging the context labels of the related pair (a voting method).
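As a minimal sketch of this transfer step (plain Python; the names, the {-1, +1} label encoding, and the tie handling are illustrative, not the AREkit API), context-level labels of each (subject, object) pair are averaged, and the sign of the average yields the document-level label:

```python
from collections import defaultdict

# Context-level predictions: (subject, object, label), label in {-1, +1}.
# The names and the {-1, +1} encoding are illustrative, not the AREkit API.
contexts = [
    ("NATO", "Russia", -1),
    ("NATO", "Russia", -1),
    ("NATO", "Russia", +1),
    ("Russia", "NATO", -1),
]

def to_document_level(context_opinions):
    """Average context labels per pair (voting); the sign decides neg/pos.

    Ties fall to "neg" in this sketch.
    """
    votes = defaultdict(list)
    for subj, obj, label in context_opinions:
        votes[(subj, obj)].append(label)
    return {
        pair: "pos" if sum(labels) / len(labels) > 0 else "neg"
        for pair, labels in votes.items()
    }

print(to_document_level(contexts))
# {('NATO', 'Russia'): 'neg', ('Russia', 'NATO'): 'neg'}
```

With several context mentions per pair, a single noisy context prediction is outvoted by the majority.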

We implement the AREkit toolkit, which serves as a framework for the following applications:

  • BERT-based language models [code];
  • Neural Networks with (and w/o) Attention mechanism [code];
  • Conventional Machine Learning methods [code];

Back to Top

Submission Evaluation

The source code is exported from the AREkit-0.21.0 library and consists of:

  • Evaluation directory for details of the evaluator implementation and the related dependencies;
  • Test directory, which includes test scripts that allow applying the evaluator to the archived results.

Use these scripts to evaluate your submissions. Below is an example of assessing the results of ChatGPT-3.5-0613:

python3 --input data/ --mode classification --split cv3

Back to Top


Results are ordered from the latest to the oldest. We measure F1 (scaled by 100) across the following foldings (see the evaluator section for greater details):

  • F1cv -- the average F1 over a 3-fold cross-validation check; the folds are formed so that each preserves the same number of sentences;
  • F1t -- F1 over the predefined TEST set.
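Footnote ** below states that F1 is the average F-measure of the positive and negative classes; under that assumption, the reported metric can be sketched as follows (the counts and names are illustrative):

```python
def f1(tp, fp, fn):
    """Per-class F1 from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

def macro_f1_pos_neg(counts):
    """Average F1 of the pos and neg classes, scaled by 100."""
    return 100 * (f1(*counts["pos"]) + f1(*counts["neg"])) / 2

# counts[class] = (tp, fp, fn); the numbers are made up for illustration.
counts = {"pos": (30, 10, 20), "neg": (40, 20, 10)}
print(round(macro_f1_pos_neg(counts), 1))  # 69.7
```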

The result assessment is organized into the following experiments:

  • 3l -- extraction of subject-object pairs;
  • 2l -- classification of already given subject-object pairs at the document level.
Methods                                              F1cv (3l)   F1t (3l)   F1cv (2l)   F1t (2l)
Expert Agreement** [1]                               55.0        55.0       -           -
ChatGPT zero-shot with prompting*** [7]
  ChatGPT-3.5-0613, avg [200-word distance]          -           -          37.7        39.6
  ChatGPT-3.5-0613, avg [50-word distance]           -           -          66.19       74.47
  ChatGPT-3.5-0613, first [50-word distance]         -           -          69.23       74.09
Distant Supervision [RA-2.0-large] for Language Models (BERT-based) [6]
(pt -- pretrained, ft -- fine-tuned)
  SentenceRuBERT (NLI-pt + NLI-ft)                   39.0        38.0       70.2        67.7
  SentenceRuBERT (NLI-pt + QA-ft)                    38.4        41.9       69.6        64.2
  SentenceRuBERT (NLI-pt + C-ft)                     37.9        39.8       70.0        69.8
  RuBERT (NLI-pt + NLI-ft)                           36.8        39.9       71.0        68.6
  RuBERT (NLI-pt + QA-ft)                            34.8        37.0       69.6        68.2
  RuBERT (NLI-pt + C-ft)                             35.6        35.4       70.0        69.8
  mBase (NLI-pt + NLI-ft)                            33.6        36.0       69.4        68.2
  mBase (NLI-pt + QA-ft)                             30.1        35.5       69.6        65.2
  mBase (NLI-pt + C-ft)                              30.5        31.1       68.9        67.7
Distant Supervision [RA-2.0-large] for (Attentive) Neural Networks + Frames annotation [Joined Training] ([6] reproduced, [4] original)
  PCNN-ends                                          32.2        39.9       70.2        67.8
  BiLSTM                                             32.0        38.8       71.2        68.4
  PCNN                                               31.6        39.7       69.5        70.5
  LSTM                                               31.6        39.5       68.0        75.4
  Att-BiLSTM [P. Zhou et al.]                        31.0        37.3       66.2        71.2
  AttCNN-ends                                        30.9        39.9       66.8        72.7
  IAN-ends                                           30.7        36.7       69.1        72.6
Distant Supervision [RA-1.0] for Multi-Instance Neural Networks [Joined Training] [5]
  MI-PCNN                                            -           -          -           68.0
  MI-CNN                                             -           -          -           62.0
  PCNN                                               -           -          -           67.0
  CNN                                                -           -          -           63.0
Language Models (BERT-based) [6]
  SentenceRuBERT (NLI)                               33.4        32.7       69.8        67.6
  SentenceRuBERT (QA)                                34.3        38.9       70.2        67.1
  SentenceRuBERT (C)                                 34.0        35.2       69.3        65.5
  RuBERT (NLI)                                       29.4        39.6       68.9        66.4
  RuBERT (QA)                                        32.0        35.3       69.5        66.2
  RuBERT (C)                                         36.8        37.6       67.8        66.2
  mBase (NLI)                                        29.2        37.0       67.8        58.4
  mBase (QA)                                         28.6        33.8       66.5        65.4
  mBase (C)                                          26.9        30.0       67.0        68.9
(Attentive) Neural Networks + Frames annotation ([6] reproduced, [3] original)
  IAN-ends                                           30.8        32.2       60.8        63.5
  AttPCNN-ends                                       29.9        32.6       64.3        63.3
  PCNN                                               29.6        32.5       64.4        63.3
  CNN                                                28.7        31.4       63.6        65.9
  BiLSTM                                             28.6        32.4       62.3        71.2
  LSTM                                               27.9        31.6       61.9        65.3
  AttCNN-ends                                        27.6        29.7       65.0        66.2
  Att-BiLSTM [P. Zhou et al.]                        27.5        32.3       65.7        68.2
Convolutional networks [2]
  PCNN [code]                                        -           31.0       -           -
  CNN                                                -           30.0       -           -
Conventional methods [1] [code]
  Gradient Boosting (Grid search)                    20.3*       28.0       -           -
  Random Forest (Grid search)                        19.1*       27.0       -           -
  Random Forest                                      15.7*       27.0       -           -
  Naive Bayes (Bernoulli)                            15.2*       16.0       -           -
  SVM                                                15.1*       15.0       -           -
  Gradient Boosting                                  14.4*       27.0       -           -
  SVM (Grid search)                                  14.3*       15.0       -           -
  Naive Bayes (Gauss)                                9.2*        11.0       -           -
  KNN                                                7.0*        9.0        -           -
  Baseline (School) [link]                           -           12.0       -           -
  Baseline (Distr)                                   -           8.0        -           -
  Baseline (Random)                                  7.4*        8.0        -           -
  Baseline (Pos)                                     3.9*        4.0        -           -
  Baseline (Neg)                                     5.2*        5.0        -           -

*: Results that were not mentioned in papers.

**: We asked another super-annotator to label the collection and compared her annotation with our gold standard, using the average F-measure of the positive and negative classes, in the same way as for the automatic approaches. This reveals the upper bound for automatic algorithms; the obtained F-measure of human labeling is listed as Expert Agreement in the table. [1]

***: We translate samples into English via arekit-ss: texts are translated first and then wrapped into prompts. We consider a k-word distance (50 by default, counted in English) between entity mentions as an upper bound for pair organization; since translation increases distances in words relative to the original texts, results might be lower.
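The k-word distance bound mentioned above can be sketched as follows (illustrative, not the arekit-ss implementation; the entity names and token indices are made up):

```python
def pairs_within_distance(mentions, k=50):
    """Keep ordered (subject, object) pairs whose mentions are at most k words apart.

    `mentions` maps an entity to the token index of its mention;
    the names and indices are illustrative.
    """
    kept = []
    entities = list(mentions)
    for subj in entities:
        for obj in entities:
            if subj != obj and abs(mentions[subj] - mentions[obj]) <= k:
                kept.append((subj, obj))
    return kept

mentions = {"Moscow": 2, "NATO": 14, "Russia": 120}
print(pairs_within_distance(mentions, k=50))
# [('Moscow', 'NATO'), ('NATO', 'Moscow')]
```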

Back to Top

Neural Networks Optimization

The training process is described in Rusnachenko et al., 2020 (Section 7.1) and relies on the multi-instance learning approach originally proposed in the Zeng et al., 2015 paper (SGD application, bag terminology, instance selection within bags). All context samples of a batch are gathered into bags. The authors propose to select the best instance in every bag as follows: take the maximal value of p(yi|mi,j) across the i-th instances within a particular j-th bag. The latter allows them to adopt the loss function at the bag level.

In our works, we adopt bags for gathering synonymous contexts. Therefore, for gradient calculation within bags, we choose the avg function instead. The assumption here is to consider the other synonymous attitudes during the gradient calculation procedure. We used BagSize > 1 in the earlier work (Rusnachenko, 2018). In the latest experiments, we consider BagSize = 1 and therefore do not exploit averaging of bag values.
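The two bag-aggregation strategies can be contrasted in a minimal sketch (plain Python; the toy probabilities and function names are illustrative, not taken from AREkit): max-based selection (Zeng et al., 2015) lets only the most confident instance of each bag shape the loss, while averaging makes every synonymous context contribute.

```python
from math import log

def bag_loss_max(bag_probs):
    """Zeng et al. (2015): per bag, keep only the most confident instance."""
    return -sum(log(max(probs)) for probs in bag_probs) / len(bag_probs)

def bag_loss_avg(bag_probs):
    """Averaging: every synonymous context contributes to the gradient."""
    return -sum(log(sum(probs) / len(probs)) for probs in bag_probs) / len(bag_probs)

# Probabilities of the gold label for the instances of two bags (toy values).
bags = [[0.9, 0.4, 0.7],
        [0.6, 0.8]]
print(round(bag_loss_max(bags), 3))  # 0.164
print(round(bag_loss_avg(bags), 3))  # 0.381
```

With BagSize = 1 the two strategies coincide, which matches the latest experiments described above.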

Back to Top

Related works


Awesome Sentiment Attitude Extraction

Back to Top


[1] Natalia Loukachevitch, Nicolay Rusnachenko. Extracting Sentiment Attitudes from Analytical Texts. Proceedings of the International Conference on Computational Linguistics and Intellectual Technologies Dialogue-2018 (arXiv:1808.08932) [paper] [code]

[2] Nicolay Rusnachenko, Natalia Loukachevitch. Using Convolutional Neural Networks for Sentiment Attitude Extraction from Analytical Texts. EPiC Series in Language and Linguistics 4, 1-10, 2019 [paper] [code]

[3] Nicolay Rusnachenko, Natalia Loukachevitch. Studying Attention Models in Sentiment Attitude Extraction Task. In: Métais E., Meziane F., Horacek H., Cimiano P. (eds) Natural Language Processing and Information Systems. NLDB 2020. Lecture Notes in Computer Science, vol 12089. Springer, Cham [paper] [code]

[4] Nicolay Rusnachenko, Natalia Loukachevitch. Attention-Based Neural Networks for Sentiment Attitude Extraction using Distant Supervision. The 10th International Conference on Web Intelligence, Mining and Semantics (WIMS 2020), June 30-July 3 (arXiv:2006.13730) [paper] [code]

[5] Nicolay Rusnachenko, Natalia Loukachevitch, Elena Tutubalina. Distant Supervision for Sentiment Attitude Extraction. Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019) [paper] [code]

[6] Nicolay Rusnachenko. Language Models Application in Sentiment Attitude Extraction Task. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2021;33(3):199-222. (In Russ.) [paper] [code-networks] [code-bert]

[7] Bowen Zhang, Daijun Ding, Liwen Jing. How would Stance Detection Techniques Evolve after the Launch of ChatGPT? [paper]

Back to Top