This work is inspired by the article *The Surprising Performance of Simple Baselines for Misinformation Detection*. In this project we reproduced all the approaches suggested there for automatic fake news detection and tested them on three main datasets: FakeNews, Celebrity, and ReCOVery. In addition, we examined different preprocessing approaches, conducted cross-dataset experiments, and ran experiments on larger datasets (FakeNewsNet, NELA_GT_2018) to verify the quality of the results. Moreover, since the article texts in our datasets are quite long, we provide experiments with the Longformer model, which was designed to accept long token sequences as input (more than 1,000 tokens, whereas all the other models are limited to 512). Finally, we visualize the attention layers that are part of most of the transformers considered, in order to understand which words are important for the model's decision.
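For illustration, here is a minimal sketch (not the project's exact training code) of feeding a long article to Longformer and extracting its attention weights with the Hugging Face `transformers` library; the checkpoint name, the two-label setup, and the placeholder input are assumptions:

```python
import torch
from transformers import LongformerTokenizer, LongformerForSequenceClassification

# Longformer accepts inputs far beyond the 512-token limit of BERT-style models.
tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
model = LongformerForSequenceClassification.from_pretrained(
    "allenai/longformer-base-4096", num_labels=2  # fake vs. real (assumed labels)
)

article = "Some very long article text ..."  # placeholder input
inputs = tokenizer(article, truncation=True, max_length=4096, return_tensors="pt")

with torch.no_grad():
    # output_attentions=True makes the model return per-layer attention weights,
    # which is what the attention visualizations are built from.
    outputs = model(**inputs, output_attentions=True)

prediction = outputs.logits.argmax(dim=-1).item()
print(prediction, len(outputs.attentions))  # predicted label, number of layers
```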
Here you can see our video presentation of the project.
Model | F1 | Recall | Precision |
---|---|---|---|
BERT | 0.756 | 0.784 | 0.784 |
BERT-tiny | 0.768 | 0.842 | 0.707 |
RoBERTa | 0.831 | 0.967 | 0.728 |
ALBERT | 0.712 | 0.703 | 0.737 |
BERTweet | 0.811 | 0.891 | 0.745 |
COVID-Twitter-BERT | 0.829 | 0.989 | 0.713 |
DeCLUTR | 0.820 | 0.929 | 0.735 |
Funnel Transformer | 0.779 | 0.808 | 0.751 |
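For reference, metrics like those in the table can be computed from model predictions with scikit-learn; the `y_true` / `y_pred` arrays below are hypothetical stand-ins for gold and predicted labels (1 = fake, 0 = real):

```python
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1]  # gold labels (illustrative)
y_pred = [1, 0, 1, 0, 0, 1]  # model predictions (illustrative)

print("F1:       ", f1_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
```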