Under the Radar: Analyzing Recent Twitter Information Operations to Improve Detection and Removal of Malicious Actors, Part 1
R code and markdown documents used to create a data analytics project examining three information operations removed from Twitter in 2021.
Russian information operations (IOs) targeting the U.S. political system have attracted significant attention in recent years. Yet a recently published report that I co-authored argues that this focus on a single threat actor leaves the West open to being blindsided by other, emerging threats. While analysts in the West focused on Russia, other actors such as Iran, North Korea, and China were actively conducting their own IOs on social media platforms such as Twitter. In fact, Twitter regularly announces IO takedowns and makes datasets of tweets posted by the accounts tied to these IOs available to researchers. This report analyzes key similarities and differences across three tweet datasets collected from recent IOs, in the hope of shedding light on the possible reach and return on investment gained by the malign actors conducting them. Each of the three datasets focuses on a separate country (Russia, China, or Iran), and all three operations were disrupted by Twitter in 2021. The analysis found that Iran’s IO had the widest reach. This was contrary to, or perhaps because of, the growing attention placed on Russian IOs and, to a lesser extent, Chinese IOs. The analysis also revealed that more accounts do not necessarily yield a greater overall reach for an IO, and that a more generalized response mechanism for countering IOs, regardless of country of origin, remains imperative. Lastly, these datasets lend themselves well to further analysis, which I hope will prompt a series of follow-on reports aimed at improving detection and disruption of these operations.
The full written report of my findings is available on my website, https://wonksecurity.com. Here is the direct link to the PDF of the report: https://wonksecurity.com/wp-content/uploads/2022/10/Twitter_info_op_report_v1-1.pdf
How to use this project:
- Go to Twitter's Transparency Center (https://transparency.twitter.com/en/reports/moderation-research.html) and acquire the original datasets used in this project. I will not be making the raw data available while it remains easily accessible via Twitter's own service.
- Acquire the "People’s Republic of China - Xinjiang (December 2021) - 2048 Accounts" dataset released in December 2021, particularly the Tweet Information file.
- Acquire the Tweet Information file from the "Iran (February 2021) - 238 Accounts" dataset released in February 2021.
- Lastly, acquire the Tweet Information file from the "Russia IRA (February 2021) - 31 Accounts" dataset, also released in February 2021.
- Download the R scripts and Quarto markdown file provided here to replicate, build upon, or fork the cleaning and analysis process that I used.
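Once the three Tweet Information files are downloaded, a first step is usually to load them into a single data frame. The snippet below is only a minimal sketch of that step, not the cleaning code shipped in this repository. The helper name `load_io_tweets`, the `operation` label column, and the file paths in the usage example are all hypothetical, and the sketch assumes the files are plain CSVs with a header row.

```r
# Minimal sketch (assumptions: CSV input with a header row; hypothetical
# helper name and label column; not the repository's own cleaning script).
load_io_tweets <- function(paths) {
  # `paths` is a named character vector: names are labels, values are CSV paths.
  stopifnot(is.character(paths), !is.null(names(paths)))

  frames <- lapply(names(paths), function(label) {
    # Read every column as text so long numeric IDs are not rounded.
    df <- read.csv(paths[[label]], stringsAsFactors = FALSE,
                   colClasses = "character")
    df$operation <- label  # tag each row with the dataset it came from
    df
  })

  # Keep only the columns present in every file so rbind() cannot fail
  # on mismatched column sets.
  shared <- Reduce(intersect, lapply(frames, names))
  do.call(rbind, lapply(frames, function(df) df[, shared, drop = FALSE]))
}

# Usage (hypothetical file names; replace with your downloaded files):
# tweets <- load_io_tweets(c(
#   china  = "data/china_tweets.csv",
#   iran   = "data/iran_tweets.csv",
#   russia = "data/russia_tweets.csv"
# ))
# table(tweets$operation)  # number of tweets per operation
```

Reading all columns as character is a deliberately conservative choice; convert counts and timestamps to numeric or date types after checking the actual column names in the files you downloaded.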
These datasets offer a wealth of information, and my goal is to conduct additional analysis in future projects. I list several ideas for how the data could be analyzed further. If you try one of these or have additional ideas, feel free to drop me a line at cody [@] wonksecurity [.] com. I'd love to hear what you think.