Our work includes (1) collecting and constructing fine-grained data, and (2) implementing and evaluating LLM interpretability techniques in a fine-grained way.
We collected the Allsides dataset, which contains 970 news headlines extracted from Allsides across different domain dimensions (based on a fine-grained 8-values scheme). The collected data is released on HuggingFace as Allsides-8Values.
We then let LLMs generate both left- and right-leaning opinions for each event to obtain the rephrased dataset. The code for LLM rephrasing can be found in \data.
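The rephrasing step can be sketched as follows. This is a minimal illustration, not the implementation in \data: the function names, prompt wording, and model name are all assumptions you should adapt.

```python
def build_rephrase_messages(headline: str, stance: str) -> list:
    """Build a chat prompt asking a model to rewrite a neutral news
    headline as a short opinion from the given political stance.

    Illustrative only -- the repo's actual prompts live in \\data.
    """
    assert stance in ("left", "right")
    system = (
        "You are a news commentator. Rewrite the event below as a short "
        f"opinion from a {stance}-leaning political perspective."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": headline},
    ]


def rephrase(client, headline: str, stance: str) -> str:
    """Call an OpenAI-compatible chat endpoint with the built prompt.

    `client` is assumed to be an openai.OpenAI() instance (openai>=1.0);
    the model name below is a placeholder.
    """
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; substitute your own
        messages=build_rephrase_messages(headline, stance),
    )
    return resp.choices[0].message.content
```

Running `rephrase` once with `stance="left"` and once with `stance="right"` for each of the 970 headlines yields the paired rephrased dataset.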
\xllm contains the key modules for the learning, detecting, and intervening tasks on LLMs' internal political opinions.
Resources
- The designed templates are based on the Model Prompting Guides document
Requirements
pip install torch notebook pandas openai transformers seaborn numpy matplotlib scikit-learn datasets
pip install transformer-utils
Please cite our paper if it or this repository inspires or supports your work.
Cheers,
APA Format
Hu, J., Yang, M., Du, M., & Liu, W. (2025). Fine-Grained Interpretation of Political Opinions in Large Language Models. arXiv preprint arXiv:2506.04774.
BibTeX
@article{hu2025fine,
  title={Fine-Grained Interpretation of Political Opinions in Large Language Models},
  author={Hu, Jingyu and Yang, Mengyue and Du, Mengnan and Liu, Weiru},
  journal={arXiv preprint arXiv:2506.04774},
  year={2025}
}