# LREC-Irony-Detection-Ensemble-Classifier

Repository with the tables and code for the 2022 LREC paper "Tackling Irony Detection using Ensemble Classifiers". Task A and Task B refer to the binary and four-way irony classification subtasks of SemEval-2018 Task 3.

## Table 1: Results of fine-tuning ten BERTweet models on Task A with the original training data

| Model | F1 score on Task A |
|---|---|
| 0 | 0.7851 |
| 1 | 0.7448 |
| 2 | 0.7644 |
| 3 | 0.7581 |
| 4 | 0.7865 |
| 5 | 0.777 |
| 6 | 0.7455 |
| 7 | 0.5846 |
| 8 | 0.7585 |
| 9 | 0.7666 |
| Mean | 0.7471 |
| Ensemble | 0.7816 |
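
For orientation, here is a minimal sketch of how a single run like the ones above could be fine-tuned with the Hugging Face `transformers` library. The checkpoint name `vinai/bertweet-base` is the public BERTweet model; the file name, column names, and hyperparameters are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch of one fine-tuning run; paths, columns, and hyperparameters
# are illustrative assumptions. The CSV is assumed to have "text" and
# "label" columns, with label in {0, 1} for Task A.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "vinai/bertweet-base", num_labels=2)  # 2 labels: ironic / not ironic

dataset = load_dataset("csv", data_files={"train": "train_taskA.csv"})
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bertweet_run_0",
                           num_train_epochs=3,
                           per_device_train_batch_size=32),
    train_dataset=dataset["train"],
    tokenizer=tokenizer,  # enables padding via the default data collator
)
trainer.train()
```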

## Table 2: Results of fine-tuning ten BERTweet models on Task B with the original training data

Class labels follow SemEval-2018 Task 3: 0 = not ironic, 1 = ironic by polarity clash, 2 = situational irony, 3 = other irony. "F1 score B" is the macro-averaged F1, i.e. the mean of the four per-class F1 columns.

| Model | F1 score | F1 score B | F1 on class 0 | F1 on class 1 | F1 on class 2 | F1 on class 3 |
|---|---|---|---|---|---|---|
| 0 | 0.7731 | 0.5706 | 0.8067 | 0.7475 | 0.5486 | 0.1798 |
| 1 | 0.794 | 0.6131 | 0.8353 | 0.7717 | 0.5844 | 0.2609 |
| 2 | 0.7957 | 0.5921 | 0.8358 | 0.7692 | 0.6337 | 0.1299 |
| 3 | 0.7815 | 0.5691 | 0.8266 | 0.709 | 0.5588 | 0.1818 |
| 4 | 0.7577 | 0.4986 | 0.821 | 0.7413 | 0.3178 | 0.1143 |
| 5 | 0.7894 | 0.6369 | 0.8388 | 0.7768 | 0.6627 | 0.2692 |
| 6 | 0.7707 | 0.5492 | 0.8386 | 0.7954 | 0.5324 | 0.0303 |
| 7 | 0.7916 | 0.6081 | 0.8328 | 0.7531 | 0.6216 | 0.225 |
| 8 | 0.771 | 0.5649 | 0.8075 | 0.7184 | 0.6071 | 0.1266 |
| 9 | 0.7798 | 0.5703 | 0.8394 | 0.7692 | 0.5152 | 0.1573 |
| Mean | 0.7804 | 0.5773 | 0.8282 | 0.7552 | 0.5582 | 0.1675 |
| Ensemble | 0.7977 | 0.5902 | 0.8475 | 0.7817 | 0.5906 | 0.1408 |
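
The "Ensemble" rows aggregate the predictions of the ten individual runs. The exact combination rule is described in the paper; the sketch below assumes soft voting (averaging softmax probabilities across runs), a common default, and also shows how the macro and per-class F1 columns can be computed with scikit-learn.

```python
# Hedged sketch of a soft-voting ensemble: average the softmax probabilities
# of all runs, then take the argmax. The stacked-logits layout and the
# voting rule are assumptions, not the paper's confirmed method.
import numpy as np
from sklearn.metrics import f1_score

def soft_vote(all_logits):
    """all_logits: array of shape (n_models, n_examples, n_classes)."""
    e = np.exp(all_logits - all_logits.max(axis=-1, keepdims=True))
    probs = e / e.sum(axis=-1, keepdims=True)       # per-model softmax
    return probs.mean(axis=0).argmax(axis=-1)       # average, then decide

rng = np.random.default_rng(0)                      # placeholder data below
all_logits = rng.normal(size=(10, 784, 4))          # 10 runs, 784 test tweets
y_true = rng.integers(0, 4, size=784)
y_pred = soft_vote(all_logits)

print(f1_score(y_true, y_pred, average="macro"))    # the "F1 score B" column
print(f1_score(y_true, y_pred, average=None))       # the four per-class columns
```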

## Table 3: Results of fine-tuning BERTweet models on Task A with the original training data plus back-translated data

Each row is a model trained with data back-translated through one pivot language: es = Spanish, fi = Finnish, ru = Russian, pl = Polish, de = German, cs = Czech, nl = Dutch, fr = French.

| Model (pivot language) | F1 score on Task A |
|---|---|
| es | 0.7708 |
| fi | 0.7636 |
| ru | 0.7663 |
| pl | 0.7846 |
| de | 0.7657 |
| cs | 0.7945 |
| nl | 0.7732 |
| fr | 0.7657 |
| Mean | 0.773 |
| Ensemble | 0.7868 |
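
Back-translation creates paraphrases by translating each tweet into a pivot language and back into English, keeping the original irony label. The translation system used in the paper is not restated here; the sketch below assumes the open Helsinki-NLP MarianMT checkpoints, with German as the pivot.

```python
# Hedged back-translation sketch (en -> de -> en) using MarianMT checkpoints.
# The actual translation system used in the paper may differ.
from transformers import MarianMTModel, MarianTokenizer

def translate(texts, model_name):
    tok = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    batch = tok(texts, return_tensors="pt", padding=True, truncation=True)
    out = model.generate(**batch)
    return tok.batch_decode(out, skip_special_tokens=True)

tweets = ["What a fantastic Monday morning, said no one ever."]
pivoted = translate(tweets, "Helsinki-NLP/opus-mt-en-de")   # en -> de
back = translate(pivoted, "Helsinki-NLP/opus-mt-de-en")     # de -> en
print(back)  # the paraphrase keeps the original tweet's irony label
```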

## Table 4: Results of fine-tuning BERTweet models on Task B with the original training data plus back-translated data

| Model (pivot language) | F1 score | F1 score B | F1 on class 0 | F1 on class 1 | F1 on class 2 | F1 on class 3 |
|---|---|---|---|---|---|---|
| es | 0.7947 | 0.6501 | 0.8269 | 0.7857 | 0.6517 | 0.3361 |
| fi | 0.7253 | 0.5219 | 0.7687 | 0.6522 | 0.48 | 0.1867 |
| ru | 0.7668 | 0.5719 | 0.7959 | 0.7214 | 0.5298 | 0.2406 |
| pl | 0.7499 | 0.5672 | 0.7687 | 0.6593 | 0.6087 | 0.2321 |
| de | 0.7477 | 0.5561 | 0.7908 | 0.6957 | 0.5156 | 0.2222 |
| cs | 0.7832 | 0.5741 | 0.8271 | 0.7526 | 0.4923 | 0.2243 |
| nl | 0.7918 | 0.628 | 0.829 | 0.7611 | 0.6506 | 0.2712 |
| fr | 0.7744 | 0.5845 | 0.8118 | 0.7236 | 0.5479 | 0.2545 |
| Mean | 0.7667 | 0.5817 | 0.8024 | 0.7189 | 0.5596 | 0.246 |
| Ensemble | 0.794 | 0.6106 | 0.8353 | 0.7641 | 0.5986 | 0.2444 |

## Table 5: Results of fine-tuning ten BERTweet models on Task B with the original training data plus antonym- and negation-based not-ironic cases

| Model | F1 score | F1 score B | F1 on class 0 | F1 on class 1 | F1 on class 2 | F1 on class 3 |
|---|---|---|---|---|---|---|
| 0 | 0.7338 | 0.5307 | 0.802 | 0.7297 | 0.5306 | 0.0606 |
| 1 | 0.782 | 0.558 | 0.8259 | 0.7635 | 0.5854 | 0.0571 |
| 2 | 0.7543 | 0.5445 | 0.812 | 0.7421 | 0.5405 | 0.0833 |
| 3 | 0.7549 | 0.5476 | 0.7864 | 0.6918 | 0.5325 | 0.1798 |
| 4 | 0.7384 | 0.592 | 0.7648 | 0.6683 | 0.623 | 0.3119 |
| 5 | 0.7496 | 0.5774 | 0.7979 | 0.719 | 0.5875 | 0.2051 |
| 6 | 0.7265 | 0.5746 | 0.739 | 0.6712 | 0.6111 | 0.2771 |
| 7 | 0.7512 | 0.5594 | 0.8017 | 0.7117 | 0.5306 | 0.1935 |
| 8 | 0.7442 | 0.5611 | 0.7667 | 0.6845 | 0.625 | 0.1684 |
| 9 | 0.7493 | 0.5211 | 0.8046 | 0.7293 | 0.3636 | 0.1871 |
| Mean | 0.7484 | 0.5567 | 0.7901 | 0.7111 | 0.553 | 0.1724 |
| Ensemble | 0.7662 | 0.5775 | 0.816 | 0.7454 | 0.6203 | 0.1282 |

## Table 6: Results of fine-tuning ten BERTweet models on Task B with the original training data plus antonym-based not-ironic cases

| Model | F1 score | F1 score B | F1 on class 0 | F1 on class 1 | F1 on class 2 | F1 on class 3 |
|---|---|---|---|---|---|---|
| 0 | 0.7647 | 0.5934 | 0.8052 | 0.7382 | 0.5828 | 0.2474 |
| 1 | 0.772 | 0.5775 | 0.8194 | 0.7722 | 0.5475 | 0.1707 |
| 2 | 0.7646 | 0.5687 | 0.8123 | 0.7296 | 0.5352 | 0.1978 |
| 3 | 0.7463 | 0.5635 | 0.79 | 0.7128 | 0.5481 | 0.2029 |
| 4 | 0.757 | 0.5876 | 0.7901 | 0.7089 | 0.5806 | 0.2708 |
| 5 | 0.7275 | 0.5317 | 0.7577 | 0.651 | 0.5882 | 0.1299 |
| 6 | 0.7609 | 0.5711 | 0.8017 | 0.7283 | 0.6244 | 0.1299 |
| 7 | 0.7325 | 0.555 | 0.7629 | 0.6874 | 0.5581 | 0.2115 |
| 8 | 0.7446 | 0.5073 | 0.7896 | 0.6819 | 0.5286 | 0.029 |
| 9 | 0.7617 | 0.5881 | 0.7909 | 0.7167 | 0.6228 | 0.2222 |
| Mean | 0.7532 | 0.5644 | 0.792 | 0.7127 | 0.5716 | 0.1812 |
| Ensemble | 0.78 | 0.5996 | 0.8226 | 0.7481 | 0.625 | 0.2025 |
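
Tables 5 and 6 add synthetic not-ironic (class 0) examples built by antonym replacement (and, for Table 5, negation). The paper's exact generation rules are not reproduced here; the following is a loose WordNet-based sketch of the antonym idea only.

```python
# Hedged sketch of antonym-based augmentation: swap one word for a WordNet
# antonym so an ironic tweet's polarity clash is removed, and label the
# result as not ironic (class 0). Illustrative only, not the paper's rules.
import nltk
from nltk.corpus import wordnet

nltk.download("wordnet", quiet=True)

def antonym_swap(tokens):
    """Replace the first token with a WordNet antonym; None if no swap found."""
    for i, word in enumerate(tokens):
        for synset in wordnet.synsets(word):
            for lemma in synset.lemmas():
                if lemma.antonyms():
                    swapped = tokens.copy()
                    swapped[i] = lemma.antonyms()[0].name()
                    return " ".join(swapped)
    return None

print(antonym_swap("i love standing in the rain for hours".split()))
# -> e.g. "i hate standing in the rain for hours", added as a class-0 example
```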

## Table 7: Results of fine-tuning ten BERTweet models on Task B with 1,000 examples per class

| Model | F1 score | F1 score B | F1 on class 0 | F1 on class 1 | F1 on class 2 | F1 on class 3 |
|---|---|---|---|---|---|---|
| 0 | 0.6985 | 0.4987 | 0.7108 | 0.635 | 0.5784 | 0.0706 |
| 1 | 0.7274 | 0.5931 | 0.7518 | 0.7085 | 0.6421 | 0.2698 |
| 2 | 0.7078 | 0.5689 | 0.7134 | 0.6771 | 0.6333 | 0.2517 |
| 3 | 0.7211 | 0.5488 | 0.7367 | 0.697 | 0.55 | 0.2115 |
| 4 | 0.7533 | 0.6011 | 0.786 | 0.7473 | 0.6061 | 0.2649 |
| 5 | 0.7332 | 0.5741 | 0.753 | 0.6878 | 0.6188 | 0.237 |
| 6 | 0.7396 | 0.5712 | 0.7719 | 0.7111 | 0.5667 | 0.2353 |
| 7 | 0.7323 | 0.5911 | 0.7576 | 0.6868 | 0.6395 | 0.2804 |
| 8 | 0.7459 | 0.5762 | 0.766 | 0.7194 | 0.5871 | 0.2326 |
| 9 | 0.719 | 0.5763 | 0.7291 | 0.7224 | 0.6087 | 0.2452 |
| Mean | 0.7278 | 0.57 | 0.7476 | 0.6992 | 0.6031 | 0.2299 |
| Ensemble | 0.7608 | 0.6003 | 0.7886 | 0.7379 | 0.6462 | 0.2286 |

## Table 8: Results of fine-tuning ten BERTweet models on Task B with 3,000 examples each for the not-ironic and ironic-by-polarity-clash classes and 1,000 each for situational irony and other irony

| Model | F1 score | F1 score B | F1 on class 0 | F1 on class 1 | F1 on class 2 | F1 on class 3 |
|---|---|---|---|---|---|---|
| 0 | 0.7669 | 0.5546 | 0.8044 | 0.7287 | 0.5914 | 0.0941 |
| 1 | 0.7633 | 0.5703 | 0.7913 | 0.7611 | 0.5887 | 0.14 |
| 2 | 0.7162 | 0.5598 | 0.7293 | 0.6784 | 0.5921 | 0.2394 |
| 3 | 0.763 | 0.5568 | 0.7923 | 0.6757 | 0.5455 | 0.2136 |
| 4 | 0.7066 | 0.5513 | 0.7089 | 0.6469 | 0.6162 | 0.2333 |
| 5 | 0.7675 | 0.5964 | 0.7987 | 0.7419 | 0.5442 | 0.3008 |
| 6 | 0.7653 | 0.561 | 0.8078 | 0.7067 | 0.6154 | 0.1143 |
| 7 | 0.7564 | 0.5618 | 0.803 | 0.7454 | 0.5466 | 0.1522 |
| 8 | 0.7401 | 0.5398 | 0.7703 | 0.6711 | 0.5697 | 0.1481 |
| 9 | 0.7638 | 0.5711 | 0.8135 | 0.7474 | 0.6211 | 0.1026 |
| Mean | 0.7509 | 0.5623 | 0.7819 | 0.7103 | 0.5831 | 0.1738 |
| Ensemble | 0.7742 | 0.5886 | 0.8122 | 0.7456 | 0.6279 | 0.1687 |
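
Tables 7 and 8 train on class-balanced subsets sampled from the (augmented) training pool. A small pandas sketch of such per-class quota sampling follows; the file name, column names, and the oversampling fallback for classes smaller than their quota are assumptions.

```python
# Hedged sketch of per-class quota sampling for Tables 7 and 8. The file
# name and column names are assumptions; classes smaller than their quota
# are oversampled with replacement as a fallback.
import pandas as pd

def sample_per_class(df, quotas):
    parts = []
    for label, n in quotas.items():
        pool = df[df["label"] == label]
        parts.append(pool.sample(n=n, replace=len(pool) < n, random_state=0))
    return pd.concat(parts).sample(frac=1, random_state=0)  # shuffle rows

train = pd.read_csv("train_taskB_augmented.csv")             # hypothetical file
balanced_t7 = sample_per_class(train, {0: 1000, 1: 1000, 2: 1000, 3: 1000})
balanced_t8 = sample_per_class(train, {0: 3000, 1: 3000, 2: 1000, 3: 1000})
```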

## Table 9: Results of training the proposed combination ensemble model ten times

| Model | F1 score | F1 score B | F1 on class 0 | F1 on class 1 | F1 on class 2 | F1 on class 3 |
|---|---|---|---|---|---|---|
| 0 | 0.7902 | 0.593 | 0.8432 | 0.7855 | 0.6133 | 0.1299 |
| 1 | 0.7797 | 0.5921 | 0.8198 | 0.7531 | 0.6135 | 0.1818 |
| 2 | 0.7807 | 0.5956 | 0.8247 | 0.7649 | 0.6026 | 0.1905 |
| 3 | 0.7733 | 0.5996 | 0.8175 | 0.7468 | 0.6093 | 0.2247 |
| 4 | 0.7919 | 0.5866 | 0.8354 | 0.7624 | 0.5987 | 0.15 |
| 5 | 0.7833 | 0.609 | 0.8237 | 0.7684 | 0.6265 | 0.2174 |
| 6 | 0.7851 | 0.6006 | 0.8303 | 0.7619 | 0.6375 | 0.1728 |
| 7 | 0.7779 | 0.5849 | 0.8229 | 0.7558 | 0.596 | 0.1647 |
| 8 | 0.7771 | 0.6175 | 0.8209 | 0.7579 | 0.6115 | 0.2796 |
| 9 | 0.8001 | 0.6182 | 0.8414 | 0.7807 | 0.6626 | 0.1882 |
| Mean | 0.7839 | 0.5997 | 0.828 | 0.7637 | 0.6171 | 0.19 |
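
The combination ensemble's architecture is defined in the paper itself and is not restated here. Purely as an illustration of one plausible way to combine heterogeneous base models, the sketch below stacks their class probabilities and trains a logistic-regression meta-classifier on held-out data; this is not the paper's confirmed design.

```python
# Generic stacking sketch, shown only as one plausible combination scheme;
# not the paper's confirmed architecture. All data below are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_models, n_dev, n_classes = 5, 600, 4

# Placeholder: class probabilities from each base model on held-out data.
dev_probs = rng.dirichlet(np.ones(n_classes), size=(n_models, n_dev))
dev_labels = rng.integers(0, n_classes, size=n_dev)

# Meta-features: concatenate every base model's probability vector.
X = np.concatenate(list(dev_probs), axis=1)   # shape (n_dev, n_models * n_classes)
meta = LogisticRegression(max_iter=1000).fit(X, dev_labels)
print(meta.predict(X[:5]))                    # meta-classifier's class decisions
```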

## Table 10: Results of ten-fold cross-validation

| Fold | F1 score | F1 score B | F1 on class 0 | F1 on class 1 | F1 on class 2 | F1 on class 3 |
|---|---|---|---|---|---|---|
| 0 | 0.7942 | 0.5951 | 0.851 | 0.7531 | 0.6667 | 0.1096 |
| 1 | 0.7787 | 0.6188 | 0.8202 | 0.7386 | 0.6369 | 0.2796 |
| 2 | 0.8325 | 0.6621 | 0.8733 | 0.8083 | 0.646 | 0.321 |
| 3 | 0.8047 | 0.6278 | 0.8551 | 0.7356 | 0.6241 | 0.2963 |
| 4 | 0.7711 | 0.5854 | 0.8397 | 0.6985 | 0.6061 | 0.1972 |
| 5 | 0.8139 | 0.6227 | 0.8653 | 0.7791 | 0.6434 | 0.2029 |
| 6 | 0.8013 | 0.5889 | 0.8484 | 0.7443 | 0.5921 | 0.1707 |
| 7 | 0.7948 | 0.6315 | 0.8528 | 0.753 | 0.6383 | 0.2821 |
| 8 | 0.8142 | 0.6375 | 0.8544 | 0.7507 | 0.6748 | 0.2703 |
| 9 | 0.8321 | 0.6514 | 0.871 | 0.7619 | 0.6871 | 0.2857 |
| Mean | 0.8038 | 0.6221 | 0.8531 | 0.7523 | 0.6415 | 0.2415 |
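
A sketch of the ten-fold split behind this table, using scikit-learn; the file name and column names are placeholders, and stratifying the folds by label is an assumption about the setup.

```python
# Hedged sketch of the ten-fold cross-validation split. Stratification by
# label and the CSV layout are assumptions, not confirmed details.
import pandas as pd
from sklearn.model_selection import StratifiedKFold

data = pd.read_csv("train_taskB.csv")                  # hypothetical file
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(skf.split(data["text"], data["label"])):
    train_split, test_split = data.iloc[train_idx], data.iloc[test_idx]
    # fine-tune a BERTweet model on train_split, evaluate on test_split
    print(f"fold {fold}: {len(train_split)} train / {len(test_split)} test")
```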
