Source code that reproduces the results from the paper "Who Let The Trolls Out? Towards Understanding State-Sponsored Trolls"
Analysis Notebook.html
Analysis Notebook.ipynb

Analysis Tools for State-Sponsored Trolls on Twitter

This repository contains the source code for reproducing the results from the paper "Who Let The Trolls Out? Towards Understanding State-Sponsored Trolls" (see the paper for a detailed description of the results).



The code relies on the following Python packages:

Pandas
Numpy
Matplotlib
tldextract
Gensim
Networkx
twitter-text-python
pigeo
stop-words
Other requirements
sudo apt-get install libgeos-3.5.0
sudo apt-get install libgeos-dev
sudo pip install
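
Before opening the notebook, it can help to confirm that the Python packages listed above are importable. The following is a minimal sketch, not part of the repository. The import names in the mapping are assumptions: they usually match the package names, but twitter-text-python is typically imported as `ttp` and stop-words as `stop_words`, and pigeo may need to be on the Python path rather than installed.

```python
import importlib.util

# Package name (as listed in this README) -> import name.
# The import names are assumptions and may need adjusting for your setup.
REQUIRED_MODULES = {
    "Pandas": "pandas",
    "Numpy": "numpy",
    "Matplotlib": "matplotlib",
    "tldextract": "tldextract",
    "Gensim": "gensim",
    "Networkx": "networkx",
    "twitter-text-python": "ttp",
    "pigeo": "pigeo",
    "stop-words": "stop_words",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the package names whose import module cannot be found."""
    missing = []
    for package, module in modules.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing


if __name__ == "__main__":
    not_found = missing_packages()
    if not_found:
        print("Missing packages: " + ", ".join(not_found))
    else:
        print("All listed packages were found.")
```

The check only looks up each module without importing it, so it does not verify versions or the system libraries installed via apt-get.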

Our data is publicly available. The dataset consists of the data released by Twitter in October 2018 for Russian and Iranian state-sponsored troll accounts, as well as intermediate data that we generated after processing the raw data. For instance, we include trained Word2Vec and LDA models, the output of our influence estimation experiments via Hawkes Processes, and other data needed to reproduce the results in the paper. To use the provided data, download the compressed file and make sure that the uncompressed data folder is in the same directory as the IPython Notebook.
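
The placement requirement can be checked with a few lines of Python before running the notebook. This is a minimal sketch under stated assumptions: the function `locate_data_dir` is hypothetical, and the folder name `data` is a placeholder, since the actual name of the uncompressed folder is not given here.

```python
from pathlib import Path


def locate_data_dir(notebook_dir=".", folder_name="data"):
    """Return the data folder next to the notebook, or raise if it is absent.

    `folder_name` is a placeholder; replace it with the actual name of the
    uncompressed folder.
    """
    candidate = Path(notebook_dir).resolve() / folder_name
    if not candidate.is_dir():
        raise FileNotFoundError(
            f"Expected the uncompressed data folder at {candidate}"
        )
    return candidate
```

Calling `locate_data_dir()` from the directory that holds the notebook either returns the folder path or fails early with a message that shows where the folder was expected.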

Reproducing the results

The plots and tables in the paper can be reproduced using the provided IPython Notebook. The notebook contains the code and the resulting Figures and Tables. Also, it refers to some scripts that need to be run outside the notebook (e.g., location_share/


If you use this source code or dataset, or find it useful, please cite the following work:

@article{zannettou2018let,
  title={{Who Let The Trolls Out? Towards Understanding State-Sponsored Trolls}},
  author={Zannettou, Savvas and Caulfield, Tristan and Setzer, William and Sirivianos, Michael and Stringhini, Gianluca and Blackburn, Jeremy},
  journal={arXiv preprint arXiv:1811.03130},
  year={2018}
}


  • This project has received funding from the European Union’s Horizon 2020 Research and Innovation program under the Marie Skłodowska-Curie ENCASE project (Grant Agreement No. 691025).