This is a data analytics project focused on wrangling three separate sources of data for the famous Twitter account We Rate Dogs.

jpadillo/Data-Wrangling

Data Wrangling Project

This is a data analytics project that uses the Python 3 libraries NumPy, pandas, Matplotlib, os, Beautiful Soup, Tweepy, and Requests in Jupyter Notebook to analyze data from the renowned Twitter account We Rate Dogs. The focus of the project is data wrangling: we work systematically through its three steps, which are gathering, assessing, and cleaning the data.

Required Software:

  • Jupyter Notebook
  • NumPy
  • pandas
  • Matplotlib
  • os (part of the Python standard library)
  • Beautiful Soup
  • Tweepy
  • Requests

Data Analysis Outline:

  • The first stage of data analysis is data wrangling, and the first step of data wrangling is gathering data. In this step, pandas, Requests, NumPy, Beautiful Soup, Tweepy, and os were used to gather and read data from three different sources.
  • The second step of data wrangling is assessment. Here we use both visual and programmatic assessment methods to identify the data quality and tidiness issues that must be addressed before analyzing the data.
  • The third and last step of data wrangling is cleaning. Here we use several pandas methods to fix the quality and tidiness issues detected during assessment.
  • The last part of the data analytics process is data visualization and analysis. In this section, I used Matplotlib to create visualizations and show the results of my analysis.
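
The three wrangling steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the three sources are replaced by small in-memory strings (in the project they come from files, a Requests download, and the Twitter API via Tweepy), and every column name, value, and function name here is hypothetical.

```python
import io
import json

import pandas as pd

# Hypothetical stand-ins for the three sources: a CSV archive,
# a tab-separated predictions file, and one JSON object per line of tweet data.
ARCHIVE_CSV = (
    "tweet_id,text,rating_numerator,rating_denominator\n"
    "1,This is Rex. 12/10,12,10\n"
    "2,Meet Bo. 11/10,11,10\n"
    "2,Meet Bo. 11/10,11,10\n"
    "3,Not a dog. 13/10,13,10\n"
)
PREDICTIONS_TSV = (
    "tweet_id\tp1\tp1_dog\n"
    "1\tgolden_retriever\tTrue\n"
    "2\tpug\tTrue\n"
    "3\tseat_belt\tFalse\n"
)
TWEET_JSON_LINES = (
    '{"id": 1, "retweet_count": 50, "favorite_count": 200}\n'
    '{"id": 2, "retweet_count": 5, "favorite_count": 40}\n'
)


def gather():
    """Step 1: read each source into its own DataFrame."""
    archive = pd.read_csv(io.StringIO(ARCHIVE_CSV))
    predictions = pd.read_csv(io.StringIO(PREDICTIONS_TSV), sep="\t")
    records = [json.loads(line) for line in TWEET_JSON_LINES.splitlines() if line.strip()]
    counts = pd.DataFrame(records).rename(columns={"id": "tweet_id"})
    return archive, predictions, counts


def assess(df):
    """Step 2: a small programmatic assessment of one table."""
    return {
        "rows": len(df),
        "duplicate_rows": int(df.duplicated().sum()),
        "missing_values": int(df.isna().sum().sum()),
    }


def clean(archive, predictions, counts):
    """Step 3: fix quality issues, then merge the tables into one tidy table."""
    # Work on copies so the gathered originals stay available for comparison.
    tables = [archive.drop_duplicates(), predictions.copy(), counts.copy()]
    for table in tables:
        # IDs are labels, not quantities, so store them as strings.
        table["tweet_id"] = table["tweet_id"].astype(str)
    archive_clean, predictions_clean, counts_clean = tables
    merged = archive_clean.merge(predictions_clean, on="tweet_id", how="inner")
    merged = merged.merge(counts_clean, on="tweet_id", how="inner")
    # Keep only tweets whose top prediction is a dog.
    return merged[merged["p1_dog"]].reset_index(drop=True)


archive, predictions, counts = gather()
report = assess(archive)
master = clean(archive, predictions, counts)
```

The inner joins keep only tweets present in all three tables, which is one possible design choice; a left join on the archive would instead keep tweets that lack predictions or engagement counts.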