The-Data-Wrangling-Report-

Introduction

Data wrangling is a core skill that everyone who works with data should be familiar with, since so much of the world's data is not clean. The data wrangling process consists of three steps:

  • Gathering data.
  • Assessing data.
  • Cleaning data.

Gathering Data

Enhanced Twitter Archive

The WeRateDogs Twitter archive is a CSV file that contains each tweet's ID, timestamp, text, rating numerator and denominator, dog name, and other fields.
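A minimal sketch of loading the archive with pandas; the file name used here is an assumption for illustration:

```python
import pandas as pd

# Load the WeRateDogs archive CSV (the file name is an assumption).
archive_df = pd.read_csv('twitter-archive-enhanced.csv')

# Peek at the fields described above: tweet ID, timestamp, text, ratings, dog name, etc.
print(archive_df.columns.tolist())
print(archive_df.head())
```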

Image Predictions File

The image predictions file contains predictions of which dog breed (if any) appears in each tweet's image. I downloaded this file programmatically from Udacity's servers using the requests library.
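A sketch of the programmatic download with requests; the URL below is a placeholder, and the tab-separated layout is an assumption:

```python
import requests
import pandas as pd

# Placeholder URL for the image predictions file hosted on Udacity's servers.
url = 'https://example-udacity-server.com/image-predictions.tsv'

# Download the file programmatically and save it to disk.
response = requests.get(url)
response.raise_for_status()
with open('image-predictions.tsv', 'wb') as f:
    f.write(response.content)

# Read it into a DataFrame (assuming a tab-separated file).
predictions_df = pd.read_csv('image-predictions.tsv', sep='\t')
```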

Additional Data via the Twitter API

I collected additional data for each tweet, including its retweet count and favorite count, through the Twitter API using Python's tweepy library.
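A sketch of those API queries, assuming tweepy 3.x and the archive_df DataFrame from the earlier sketch; the credentials are placeholders:

```python
import pandas as pd
import tweepy

# Authenticate with placeholder credentials (real API keys are required).
auth = tweepy.OAuthHandler('CONSUMER_KEY', 'CONSUMER_SECRET')
auth.set_access_token('ACCESS_TOKEN', 'ACCESS_SECRET')
api = tweepy.API(auth, wait_on_rate_limit=True)

# Query each tweet by ID and record its retweet and favorite counts.
extra_rows = []
for tweet_id in archive_df['tweet_id']:
    try:
        status = api.get_status(tweet_id, tweet_mode='extended')
        extra_rows.append({
            'tweet_id': tweet_id,
            'retweet_count': status.retweet_count,
            'favorite_count': status.favorite_count,
        })
    except tweepy.TweepError:
        # Some tweets may have been deleted or made private; skip them.
        continue

api_df = pd.DataFrame(extra_rows)
```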

Assessing Data

After gathering the data and storing it in DataFrames, I assessed it for quality and tidiness.
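A brief sketch of the programmatic part of that assessment, assuming the three DataFrames from the gathering sketches above:

```python
# Programmatic assessment: structure, data types, missing values, and duplicates.
for name, df in [('archive', archive_df),
                 ('predictions', predictions_df),
                 ('api', api_df)]:
    print(f'--- {name} ---')
    df.info()
    print('missing values:\n', df.isnull().sum())
    print('duplicated rows:', df.duplicated().sum())
```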

  • Low-quality data has content issues, such as missing or inaccurate values. To address these, I removed unnecessary columns, converted data types, and removed outliers.
  • Untidy data has structural issues. To address these, I gathered the dog stages from multiple columns into one, created a “prediction” column (dog, not dog, maybe dog), and combined the three datasets into one, as sketched after this list.
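A sketch of the main cleaning and tidying steps, assuming the DataFrames above. The dog-stage column names (doggo, floofer, pupper, puppo) and the prediction-flag columns (p1_dog, p2_dog, p3_dog) are assumptions, and the prediction mapping shown is one plausible interpretation of dog / maybe dog / not dog:

```python
import numpy as np
import pandas as pd

archive_clean = archive_df.copy()

# Quality: convert the timestamp column to a proper datetime type.
archive_clean['timestamp'] = pd.to_datetime(archive_clean['timestamp'])

# Tidiness: gather the dog-stage columns (assumed names) into one 'dog_stage' column.
stage_cols = ['doggo', 'floofer', 'pupper', 'puppo']
archive_clean['dog_stage'] = (
    archive_clean[stage_cols]
    .replace('None', '')
    .fillna('')
    .apply(lambda row: ''.join(row) or np.nan, axis=1)
)
archive_clean = archive_clean.drop(columns=stage_cols)

# Tidiness: combine the three datasets into one master DataFrame keyed on tweet_id.
master_df = (
    archive_clean
    .merge(predictions_df, on='tweet_id', how='inner')
    .merge(api_df, on='tweet_id', how='inner')
)

# Derived 'prediction' column (dog / maybe dog / not dog), assuming boolean
# breed-prediction flags p1_dog, p2_dog, p3_dog in the predictions file.
dog_flags = master_df[['p1_dog', 'p2_dog', 'p3_dog']]
master_df['prediction'] = np.select(
    [dog_flags.all(axis=1), dog_flags.any(axis=1)],
    ['dog', 'maybe dog'],
    default='not dog',
)
```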
