Using data from twitter to create analyses about Brazilian Elections

mascalmeida/br-elections-on-twitter

Brazilian Election on Twitter

This project is a data-driven solution that uses Twitter data to analyze the performance of the Brazilian presidential candidates' profiles.

Note: This dashboard has been discontinued since Oct 30th, 2022; see the full last screenshot.

Architecture

Figure: architecture_v0 (architecture diagram)

AWS Resources

| Resource | Function | Description |
| --- | --- | --- |
| EventBridge | Trigger the ETL | Build event-driven applications at scale across AWS, existing systems, or SaaS apps |
| ECR | Store the container with the ETL | Easily store, share, and deploy your container software anywhere |
| Lambda | Run the ETL container stored in ECR | Run code without thinking about servers or clusters |
| RDS | Operate the MySQL database | Set up, operate, and scale a relational database in the cloud with just a few clicks |
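
To illustrate how these resources fit together, here is a minimal sketch of a Lambda entry point that an EventBridge schedule could invoke. The names `handler` and `run_etl` are hypothetical and the ETL body is only a placeholder; the repository's actual entry point may look different.

```python
from datetime import datetime, timezone


def run_etl(run_time: datetime) -> dict:
    """Placeholder for the extract/transform/load steps (hypothetical)."""
    # A real implementation would query the Twitter API and write to MySQL.
    return {"ran_at": run_time.isoformat(), "status": "ok"}


def handler(event: dict, context: object) -> dict:
    """Entry point invoked by Lambda; `event` is the EventBridge payload."""
    # Scheduled EventBridge events carry an ISO 8601 "time" field such as
    # "2022-10-01T12:00:00Z"; fall back to the current time if it is absent.
    raw_time = event.get("time")
    if raw_time:
        run_time = datetime.fromisoformat(raw_time.replace("Z", "+00:00"))
    else:
        run_time = datetime.now(timezone.utc)
    return run_etl(run_time)
```

Using the event's own timestamp rather than the wall clock keeps each run tied to the hour it was scheduled for, even if the function starts a few seconds late.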

ETL

  • Extraction: Tweepy is a Python package that simplifies access to the Twitter API. The functions used here are:

    1. get_recent_tweets_count. Returns the number of Tweets that match the query words.
    2. get_user. Returns information about a user, e.g. followers, posts, and screen name.
  • Transformation: Pandas was the main package used to transform and manipulate data, mainly to convert the data retrieved from the Twitter API into pandas DataFrame format.

  • Loading: SQLAlchemy was used to create the connection (engine) between the Python code and the MySQL database. This connection can be combined with pandas' functions for writing a DataFrame to SQL (such as DataFrame.to_sql).
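
The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's code: the function names, the column names (`profile`, `hour`, `mentions`, `mentions_no_rt`), the `-is:retweet` query, and the `mysql+pymysql` connection URL are all hypothetical. `client` is expected to be a `tweepy.Client`.

```python
import pandas as pd


def extract_hourly_counts(client, query: str) -> list:
    """Extraction: hourly Tweet counts for `query` from a tweepy.Client."""
    response = client.get_recent_tweets_count(query, granularity="hour")
    # Each bucket is a dict with "start", "end" and "tweet_count" keys.
    return response.data or []


def extract_profile(client, username: str) -> dict:
    """Extraction: follower and post counts for one profile."""
    response = client.get_user(username=username, user_fields=["public_metrics"])
    metrics = response.data.public_metrics
    return {
        "profile": username,
        "followers": metrics["followers_count"],
        "posts": metrics["tweet_count"],
    }


def transform_counts(total: list, no_rt: list, profile: str) -> pd.DataFrame:
    """Transformation: merge both count series into one row per hour."""
    cols = ["start", "tweet_count"]
    all_df = pd.DataFrame(total, columns=cols).rename(columns={"tweet_count": "mentions"})
    no_rt_df = pd.DataFrame(no_rt, columns=cols).rename(columns={"tweet_count": "mentions_no_rt"})
    df = all_df.merge(no_rt_df, on="start", how="left").rename(columns={"start": "hour"})
    # Store naive UTC timestamps, which map cleanly onto SQL DATETIME columns.
    df["hour"] = pd.to_datetime(df["hour"], utc=True).dt.tz_localize(None)
    df.insert(0, "profile", profile)
    return df


def make_engine(url: str):
    """Loading: build the SQLAlchemy engine (imported lazily, only when used)."""
    from sqlalchemy import create_engine
    return create_engine(url)


def load(df: pd.DataFrame, table: str, con) -> None:
    """Loading: append the rows to `table` through the given connection."""
    df.to_sql(table, con=con, if_exists="append", index=False)


# Usage (assumed credentials and names, not executed here):
#   import tweepy
#   client = tweepy.Client(bearer_token="...")
#   total = extract_hourly_counts(client, "@some_candidate")
#   no_rt = extract_hourly_counts(client, "@some_candidate -is:retweet")
#   engine = make_engine("mysql+pymysql://user:password@host:3306/dbname")
#   load(transform_counts(total, no_rt, "some_candidate"), "profile_mentions", engine)
```

Keeping extraction, transformation, and loading in separate functions makes the transformation easy to check with hand-written sample data, without calling the Twitter API or a live database.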

MySQL Database

  • Profile Mentions Table: Stores the total number of mentions, the number of mentions excluding retweets, and the corresponding date and time. It is an hourly table.

  • Profile Info Table: Stores key information about the users and the corresponding date. It is a daily table.

  • Last Updated View: Exposes the date and time of the most recent ETL run.
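
As a rough illustration of this layout, the sketch below creates two tables and a view with comparable roles. It is an assumption-laden stand-in: it uses Python's built-in SQLite instead of MySQL so that it runs anywhere, every table and column name is hypothetical, and the view derives the last run from the newest hourly row, which may not be how the project computes it.

```python
import sqlite3

# Hypothetical schema; the real MySQL tables may use different names and types.
SCHEMA = """
CREATE TABLE profile_mentions (
    profile        TEXT    NOT NULL,
    hour           TEXT    NOT NULL,  -- one row per profile per hour
    mentions       INTEGER NOT NULL,
    mentions_no_rt INTEGER NOT NULL,
    PRIMARY KEY (profile, hour)
);
CREATE TABLE profile_info (
    profile   TEXT    NOT NULL,
    day       TEXT    NOT NULL,      -- one row per profile per day
    followers INTEGER,
    posts     INTEGER,
    PRIMARY KEY (profile, day)
);
CREATE VIEW last_updated AS
    SELECT MAX(hour) AS last_run FROM profile_mentions;
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the two tables and the view on the given connection."""
    conn.executescript(SCHEMA)


def last_run(conn: sqlite3.Connection):
    """Return the timestamp exposed by the view, or None if no rows exist."""
    return conn.execute("SELECT last_run FROM last_updated").fetchone()[0]
```

Defining the last-update timestamp as a view means it never needs its own write step: it always reflects whatever the ETL loaded most recently.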

Shiny App

  • Connecting the app to the MySQL Database: Remote databases are an excellent way to keep a Shiny app up to date. The pool package helps establish and manage connections to remote storage. Building these bridges between the app and the storage requires some sensitive information. That is where the dotenv package comes in: it lets the developer keep their credentials in a .env file, upload it to the host service, and access them easily.

  • Leveraging the power of purrr: When building an app UI, one can use HTML tags inside the R code. Just like some ggplot2 layers, these tags are stored in lists. This means that purrr can be used to build such structures, especially if they are repetitive.

  • Interactive dataviz: ggiraph is a ggplot2-friendly package for building interactive plots. It helps create plots that do not overwhelm users with data. Hover events and tooltips help the user focus on particular aspects of a plot.

Support

Give a ⭐️ if you like this project!

React with 👍 to our LinkedIn post!

Interact with ❤️ on our Twitter post!
