This project classifies speeches as populist or non-populist using fine-tuned pre-trained language models. We evaluated four models (BERT-tiny, BERT-large, GPT-2, and RoBERTa-large) on a dataset of 500 manually labeled speeches. The best-performing model, RoBERTa-large, achieved an accuracy of 88%, well ahead of the other three models.
- BERT-tiny (Google)
- BERT-large (Google)
- GPT-2 (OpenAI)
- RoBERTa-large (Facebook AI)
Each model was fine-tuned on pre-processed speeches, which were tokenized to match that model's input format. The models were then evaluated on accuracy and loss.
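As a rough illustration of the tokenization step, the sketch below pads and truncates speeches to a fixed length with a Hugging Face tokenizer. The checkpoint names, the `max_length` of 512, and the helper names `load_tokenizer` and `encode_speeches` are assumptions made for this example and may differ from what the project scripts use.

```python
from typing import Any, List

# Assumed Hugging Face checkpoint names; the project may use different ones.
CHECKPOINTS = {
    "BERT-tiny": "google/bert_uncased_L-2_H-128_A-2",
    "BERT-large": "bert-large-uncased",
    "GPT-2": "gpt2",
    "RoBERTa-large": "roberta-large",
}


def load_tokenizer(checkpoint: str) -> Any:
    """Load the tokenizer that matches a checkpoint (hypothetical helper)."""
    # Imported lazily so the rest of the sketch works without transformers.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # GPT-2 ships without a padding token, so reuse the end-of-sequence token.
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    return tokenizer


def encode_speeches(tokenizer: Any, speeches: List[str], max_length: int = 512) -> Any:
    """Truncate long speeches and pad short ones to max_length tokens."""
    return tokenizer(
        speeches,
        truncation=True,
        padding="max_length",
        max_length=max_length,
    )
```

For example, `encode_speeches(load_tokenizer(CHECKPOINTS["RoBERTa-large"]), speeches)` would return the input IDs and attention masks for a list of speech strings. Speeches longer than `max_length` tokens are cut off, which is a simplification; the scripts may handle long texts differently.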
The dataset contains 500 speeches, split equally between the populist and non-populist classes (250 each) and manually labeled by the contributors. The speeches were collected via web scraping and include texts translated from various languages, which adds to the diversity of the data.
The project code was primarily run on Google Colab using a high-RAM A100 GPU, which allowed for:
- Batch size of 128 for BERT-tiny
- Batch size of 20 for the larger models
All models can be run in a single pass, provided a suitable GPU is available and the database path is set correctly. Replicating these results on Colab is straightforward, though smaller GPUs or CPUs require a smaller batch size:
- Batch size of 8 for single-model runs
- Batch size of 2-4 for running all models simultaneously
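The batch sizes listed above can be collected in a small helper, sketched here. The function name `pick_batch_size` and its parameters are hypothetical and not part of the project scripts; for the 2-4 range it returns the lower bound as the conservative choice.

```python
def pick_batch_size(
    model_name: str,
    high_memory_gpu: bool,
    all_models_at_once: bool = False,
) -> int:
    """Return a batch size following the guidance in this README."""
    if high_memory_gpu:
        # Values used on the Colab A100 runtime.
        return 128 if model_name == "BERT-tiny" else 20
    # Smaller GPUs or CPUs: 8 for a single model, 2 (lower end of the
    # 2-4 range) when all models are run in the same session.
    return 2 if all_models_at_once else 8
```

If a smaller GPU still runs out of memory with these values, halving the batch size again is a reasonable next step.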
- RoBERTa-large: Best model with an accuracy of 88%.
- BERT-large: Accuracy of 71%.
- GPT-2: Accuracy of 61%.
- BERT-tiny: Accuracy of 59%.
For more details on the performance and methodology, please refer to the Experiments and Results section in the report.
- Clone this repository.
- Ensure that you have PyTorch, transformers, and the necessary libraries installed.
- Set up your environment with a high-memory GPU for best performance, or adjust the batch size as outlined above.
- Run the code using the provided scripts, ensuring you specify the correct database path.
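Before running the scripts, a quick check along the following lines can confirm that the required packages are importable and that the database path exists. This is a minimal sketch: `missing_packages` and `check_setup` are hypothetical helpers, not functions provided by the repository.

```python
import importlib.util
from pathlib import Path
from typing import List


def missing_packages(names: List[str]) -> List[str]:
    """Return the top-level packages that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_setup(database_path: str) -> List[str]:
    """Collect setup problems; an empty list means the checks passed."""
    problems = [
        f"missing package: {name}"
        for name in missing_packages(["torch", "transformers"])
    ]
    if not Path(database_path).exists():
        problems.append(f"database path not found: {database_path}")
    return problems
```

Calling `check_setup` with the path to your copy of the dataset returns a list of human-readable problems to fix. It does not check for a GPU; once PyTorch is installed, `torch.cuda.is_available()` reports whether one is visible.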
For further details on implementation, refer to the documentation within the scripts.
- Alessandro Pala
- Lorenzo Cino
- Greta Grelli
- Alberto Calabrese
- Giacomo Filippin
