Project 4 of the Udacity Data Analyst Nanodegree Program. Data Wrangling of tweets from three different sources, with different file extensions.

leosimoes/Udacity-Wrangle-and-Analyse-Data-Tweets

Wrangle and Analyse Data - Tweets

Project 4 of the Udacity Data Analyst Nanodegree Program.

Data wrangling of tweets gathered from three different sources, each with a different file extension. The dataset was assessed for structure and data quality issues and then cleaned so that it could be analysed.

Summary of Findings

  • The highest 'rating', calculated as 'rating_numerator' / 'rating_denominator', is 177.6.
  • The number of dogs per stage varies widely: for 84% of the entries the stage is 'none'.
  • Among the dogs that are classified (stage other than 'none'), 66% are 'pupper', 20% are 'doggo', 7.5% are 'puppo', 2.3% are 'floofer', and the rest are classified with two stages.
  • Considering only the unique stages, the 'puppo' stage has the highest average favorite count, followed by 'doggo', 'floofer' and 'pupper'.
  • The p2 algorithm identified the largest number of dogs, 1553, followed by p1 with 1532 and p3 with 1499.
  • The algorithms p1, p2 and p3 identified 111, 113 and 116 different dog breeds, respectively. By this metric, the algorithms do not differ much.
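
The metrics behind these findings can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the project's actual notebook code: the rows are invented toy data, the keys 'stage', 'favorite_count', 'p1' and 'p1_dog' are assumed names, and the helper functions are hypothetical.

```python
from collections import Counter
from statistics import mean

# Toy rows invented for illustration; they are NOT taken from the project data.
# 'stage', 'favorite_count', 'p1' and 'p1_dog' are assumed column names.
tweets = [
    {"rating_numerator": 12, "rating_denominator": 10, "stage": "pupper",
     "favorite_count": 100, "p1": "pug", "p1_dog": True},
    {"rating_numerator": 13, "rating_denominator": 10, "stage": "doggo",
     "favorite_count": 300, "p1": "pug", "p1_dog": True},
    {"rating_numerator": 10, "rating_denominator": 10, "stage": "none",
     "favorite_count": 50, "p1": "seat_belt", "p1_dog": False},
    {"rating_numerator": 11, "rating_denominator": 10, "stage": "pupper",
     "favorite_count": 200, "p1": "chow", "p1_dog": True},
]


def rating(row):
    """Rating as described above: numerator divided by denominator."""
    return row["rating_numerator"] / row["rating_denominator"]


def stage_shares(rows, exclude=()):
    """Percentage of rows per stage, optionally ignoring some stages."""
    counts = Counter(r["stage"] for r in rows if r["stage"] not in exclude)
    total = sum(counts.values())
    return {stage: 100 * n / total for stage, n in counts.items()}


def mean_favorites_by_stage(rows):
    """Average favorite count for each stage."""
    groups = {}
    for r in rows:
        groups.setdefault(r["stage"], []).append(r["favorite_count"])
    return {stage: mean(values) for stage, values in groups.items()}


def dog_predictions(rows, algo="p1"):
    """Return (rows predicted to be a dog, distinct breeds predicted)."""
    breeds = [r[algo] for r in rows if r[f"{algo}_dog"]]
    return len(breeds), len(set(breeds))


highest_rating = max(rating(r) for r in tweets)
shares_all = stage_shares(tweets)
shares_classified = stage_shares(tweets, exclude=("none",))
favorites = mean_favorites_by_stage(tweets)
n_dogs, n_breeds = dog_predictions(tweets, "p1")
```

Passing `exclude=("none",)` corresponds to the "classified only" percentages, and calling `dog_predictions` once per algorithm would give the per-algorithm dog and breed counts, assuming p2 and p3 follow the same naming pattern.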

References

UDACITY - Data Analyst Nanodegree Program: https://www.udacity.com/course/data-analyst-nanodegree--nd002

WeRateDogs, Twitter profile (@dog_rates): https://twitter.com/dog_rates/status/749981277374128128
