2018 Reddit release of suspicious accounts

2018 Reddit Suspicious Accounts Release

On April 11, 2018, Reddit released its 2017 transparency report, along with a list of 944 accounts that the site's administrators suspect belonged to the Russian Internet Research Agency.

To give you more insight into our findings, here is a link to all 944 accounts. We have decided to keep them visible for now, but after a period of time the accounts and their content will be removed from Reddit. We are doing this to allow moderators, investigators, and all of you to see their account histories for yourselves.

-- /u/spez

Harvesting mode

This dataset is an archive of all public comments, submissions, and user data belonging to these accounts, retrieved on April 11, 2018 at ~17:00 CEST (GMT+2) from the Reddit API and stored as CSV. At extraction time, one of the 944 accounts had already been taken down and its profile returned a 404 status. Each sheet contains a selection of relevant fields from the objects returned by the API.

The data was harvested with the excellent PRAW library. The extraction log is available as data/log.txt.
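For readers who want to reproduce a similar extraction, here is a minimal sketch of how comments could be pulled through PRAW and written to CSV. It is not the script used to build this dataset: the column selection in COMMENT_FIELDS and the helper names comment_rows and write_comments are hypothetical, and reddit is assumed to be an already configured praw.Reddit instance.

```python
import csv
from datetime import datetime, timezone

# Hypothetical column selection; the README does not list the actual fields.
COMMENT_FIELDS = ["id", "author", "subreddit", "created_utc", "score", "body"]


def comment_rows(reddit, username, limit=None):
    """Yield one dict per public comment of `username`.

    `reddit` is assumed to be a configured praw.Reddit instance.
    """
    redditor = reddit.redditor(username)
    # limit=None asks PRAW for as many items as the listing will return.
    for comment in redditor.comments.new(limit=limit):
        yield {
            "id": comment.id,
            "author": username,
            "subreddit": str(comment.subreddit),
            "created_utc": datetime.fromtimestamp(
                comment.created_utc, tz=timezone.utc
            ).isoformat(),
            "score": comment.score,
            "body": comment.body,
        }


def write_comments(reddit, usernames, out_file):
    """Write the comments of every listed account to an open text file as CSV."""
    writer = csv.DictWriter(out_file, fieldnames=COMMENT_FIELDS)
    writer.writeheader()
    count = 0
    for name in usernames:
        for row in comment_rows(reddit, name):
            writer.writerow(row)
            count += 1
    return count
```

In practice reddit would be created with praw.Reddit(client_id=..., client_secret=..., user_agent=...) and out_file opened with newline="" so that multi-line comment bodies are quoted correctly. An account that has been taken down would likely surface as a "not found" error from the API, which a real script should catch and log.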

Contents

The following files are made available:

  • seed.csv: the original user list as released by Reddit
  • data/users.csv: user data related to each of the released accounts
  • data/comments.csv: comment history from each of the released accounts
  • data/submissions.csv: submissions from each of the released accounts
  • data/log.txt: data extraction log
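As a quick way to inspect the files listed above, the sketch below reads every CSV file in a directory with Python's standard csv module and reports its row count and column names. The helper names load_rows and summarize are hypothetical, and UTF-8 encoding is an assumption; the column names are read from each file's header rather than hard-coded, since they are not documented here.

```python
import csv
from pathlib import Path


def load_rows(path):
    """Read one CSV file into a list of dicts keyed by its header row."""
    # newline="" lets the csv module handle line breaks inside quoted fields,
    # which matters for multi-line comment bodies. UTF-8 is an assumption.
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def summarize(data_dir):
    """Map each CSV file name in `data_dir` to (row count, column names)."""
    summary = {}
    for path in sorted(Path(data_dir).glob("*.csv")):
        rows = load_rows(path)
        columns = list(rows[0].keys()) if rows else []
        summary[path.name] = (len(rows), columns)
    return summary
```

Calling summarize("data") from the repository root would cover the files under data/, while load_rows("seed.csv") reads the original account list.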

Legal

Comment releases have been common practice for a long time. All the information provided was publicly available and searchable as of the extraction. No data appears to be classifiable as PII. Nothing in the Reddit API TOS explicitly prohibits harvesting publicly available data. Legal inquiries should be directed to inbox [/at/] albertocoscia [/dot/] me.

Where applicable, this data is released under a CC0 license and is public domain.
