Time Series Analysis of Russian IRA Tweets

This repository contains a recipe for bootstrapping a time series analysis of tweets from Internet Research Agency (IRA) accounts, using the dataset open sourced by FiveThirtyEight. The analysis runs on Apache Pinot and Apache Superset.

Warning!

This dataset contains some of the most offensive and toxic text I've ever seen. The tweets contained within the original dataset attempted to hide or obscure the ideological nature of text that the trolls intended to bleed into mainstream media.

The raw text of tweets contained within the dataset will elicit an emotional response, as it was designed to do, and as such, I do not recommend exposing the raw text to any reader without providing this warning.

Usage

The example application in this repository bootstraps an Apache Pinot recipe for importing tweets by fake IRA Twitter accounts for analysis with Apache Superset.

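To start the cluster, run the following commands. The last command follows the container logs; press Ctrl+C to stop following them once the services are up.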

$ docker network create PinotNetwork
$ docker-compose up -d
$ docker-compose logs -f --tail=100
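Rather than watching the logs by eye, a small retry helper can tell you when a service is ready. The function below is a minimal sketch, not part of this repository: `wait_until` is a hypothetical name, and it simply re-runs whatever probe command you give it until that command succeeds or the attempts run out.

```shell
# Hypothetical helper (not part of this repository).
# Retry a probe command until it succeeds or the attempts run out.
# usage: wait_until <attempts> <delay_seconds> <command> [args...]
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Run the probe; stop as soon as it exits with status 0.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep only if another attempt remains.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, `wait_until 30 5 curl -sf http://localhost:9000/health` would poll the Pinot controller's health endpoint every five seconds, up to 30 times. This assumes docker-compose.yml publishes the controller on localhost port 9000 (the Pinot default); adjust the URL if your setup differs.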

After the Docker containers have started and are running, you'll need to bootstrap the cluster with the Twitter data and charts. The following command will download the raw CSV data from this repository and start the Pinot ingestion job.

$ sh ./bootstrap.sh

After the bootstrap script has completed, you should be able to see data in Apache Pinot and log in to the Superset website. After logging in to Superset, navigate to the dashboards to view the time series analysis of the IRA tweets.
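To confirm from the command line that the ingestion job loaded rows, you can send a SQL query to the Pinot broker's `/query/sql` endpoint. The sketch below rests on assumptions: the function names are hypothetical, the broker is assumed to be reachable at `http://localhost:8099` (the Pinot default port, which may not match docker-compose.yml), and the table name in the usage example is a placeholder.

```shell
# Hypothetical helpers (not part of this repository).

# Build the JSON body expected by Pinot's SQL query endpoint.
pinot_query_payload() {
  # Escape backslashes and double quotes so the SQL is a valid JSON string.
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"sql":"%s"}' "$escaped"
}

# POST a query to the broker. The default URL is an assumption;
# override it by setting PINOT_BROKER.
pinot_query() {
  curl -s -X POST -H 'Content-Type: application/json' \
    -d "$(pinot_query_payload "$1")" \
    "${PINOT_BROKER:-http://localhost:8099}/query/sql"
}
```

A row count check would then look like `pinot_query 'SELECT COUNT(*) FROM your_table'`, where `your_table` stands in for whatever table name the bootstrap script creates.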

Example Dashboard

The screenshot below is the default dashboard that comes with the example project.

