A data wrangling and cleaning project that gathers data from different sources, including querying Twitter's API using tweepy.

quzeem91/Project-Wrangle-and-Analyze-Data

Project-Wrangle-and-Analyze-Data

by OLAMIDE QUZEEM O.

Installations

  • NumPy
  • pandas
  • Matplotlib
  • Seaborn
  • tweepy
  • json (Python standard library)
  • timeit (Python standard library)

Dataset

For this project, three datasets were used. Two of them were provided directly, while the third one required querying Twitter's API and writing the data to a .txt file. The datasets used are as follows:

  1. WeRateDogs Twitter Archive Data: This dataset contains information such as tweet ID, timestamp, rating numerator, rating denominator, name, etc.
  2. Tab Separated Values (TSV) file: This file contains image records that need to be filtered to keep only pictures of dogs.
  3. Twitter API data: Additional tweet data obtained by querying Twitter's API with tweepy and written to a .txt file.
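
The third dataset's "query, then write to a .txt file" step can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the function names, the `fetch_tweet` callable, and the field names `id`, `retweet_count`, and `favorite_count` are assumptions. With tweepy, `fetch_tweet` would typically wrap a call such as `api.get_status(tweet_id, tweet_mode="extended")._json`, where `api` is an authenticated `tweepy.API` object.

```python
import json
from timeit import default_timer as timer


def write_tweets_to_file(tweet_ids, fetch_tweet, path):
    """Fetch each tweet and write it as one JSON object per line.

    fetch_tweet is a hypothetical callable: tweet ID in, dict out.
    Returns (failed_ids, elapsed_seconds).
    """
    failed = []
    start = timer()
    with open(path, "w", encoding="utf-8") as outfile:
        for tweet_id in tweet_ids:
            try:
                tweet = fetch_tweet(tweet_id)
            except Exception:
                # For example a deleted tweet; narrow this in real code.
                failed.append(tweet_id)
                continue
            json.dump(tweet, outfile)
            outfile.write("\n")
    return failed, timer() - start


def read_tweets_file(path, fields=("id", "retweet_count", "favorite_count")):
    """Read the line-delimited JSON file back, keeping only chosen fields."""
    rows = []
    with open(path, encoding="utf-8") as infile:
        for line in infile:
            tweet = json.loads(line)
            rows.append({field: tweet.get(field) for field in fields})
    return rows
```

Writing one JSON object per line keeps the file readable line by line, so a failed request for one tweet ID does not affect the rest. The list returned by `read_tweets_file` can be passed straight to `pandas.DataFrame`.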

Insights

  1. The top dog breeds posted on WeRateDogs are:
    • Labrador Retriever
    • French Bulldog
    • Chihuahua
    • Pembroke
    • Eskimo Dog
  2. The device most commonly used to post the tweets is an iPhone.
  3. December has the highest tweet rate, followed by November.
  4. The day has no significant effect on the tweet rate.
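
Counts like the ones behind these insights can be computed with pandas along the following lines. This is a hedged sketch rather than the project's actual analysis: the `summarize` function and the `breed` column name are assumptions, `timestamp` follows the archive description above, and the sample rows are made up purely for illustration.

```python
import pandas as pd


def summarize(tweets: pd.DataFrame) -> dict:
    """Count tweets by breed, month, and weekday.

    Assumes hypothetical columns 'timestamp' and 'breed'.
    """
    timestamps = pd.to_datetime(tweets["timestamp"], utc=True)
    return {
        "top_breeds": tweets["breed"].value_counts().head(5),
        "tweets_per_month": timestamps.dt.month_name().value_counts(),
        "tweets_per_weekday": timestamps.dt.day_name().value_counts(),
    }


# Made-up sample rows, only to show the shape of the input.
sample = pd.DataFrame({
    "timestamp": [
        "2016-12-01 10:00:00 +0000",
        "2016-12-05 11:00:00 +0000",
        "2016-12-09 12:00:00 +0000",
        "2016-11-02 09:00:00 +0000",
        "2016-11-20 08:00:00 +0000",
        "2016-07-15 07:00:00 +0000",
    ],
    "breed": [
        "Labrador Retriever", "Labrador Retriever", "Labrador Retriever",
        "Chihuahua", "Chihuahua", "Pembroke",
    ],
})

summary = summarize(sample)
print(summary["top_breeds"])
print(summary["tweets_per_month"])
```

`value_counts` sorts in descending order, so the first index entry of each result is the most frequent breed, month, or weekday.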

Reports

This project consists of two reports:

  1. Wrangle Report: This report provides detailed information about the data wrangling efforts undertaken during the project. It is framed as an internal document.
  2. Wrangle Act: This report communicates all the insights and visualizations derived from the wrangled data. It is framed as an external document, similar to a blog post or a magazine article.

By presenting the findings in these two reports, the audience will gain a comprehensive understanding of the data wrangling process and the insights obtained from the analysis.
