MatthewKrey-zz/Harvest_My_Tweets


Harvesting & Geolocating Twitter Data

Data Science Experiment 1

As a first venture into the world of Data Science with Python, I worked through a recipe for Harvesting & Geolocating Twitter Data from a great book by Tony Ojeda, Sean Patrick Murphy, Benjamin Bengfort and Abhijit Dasgupta, “Practical Data Science Cookbook: 89 hands-on recipes to help you complete real-world data science projects in R and Python.”

Project Goals:

  1. Create and configure a Twitter API application

  2. Use Python to determine Twitter followers, friends, and pull Twitter user profiles

  3. Store JSON Twitter data to disk and to MongoDB using PyMongo

  4. Explore the geographic information available in profiles

  5. Visualize geographic information using Python
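Goals 3 and 4 can be sketched with the standard library alone. This is a minimal illustration, not the book's recipe: it assumes each harvested profile is a dict with a free-text `location` field, and the helper names (`save_profiles`, `load_profiles`, `count_locations`) and the sample data are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def save_profiles(profiles, path):
    """Write a list of profile dicts to disk as a single JSON document."""
    Path(path).write_text(json.dumps(profiles, indent=2), encoding="utf-8")


def load_profiles(path):
    """Read the profile list back from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def count_locations(profiles):
    """Tally the free-text `location` field, skipping missing or blank values."""
    cleaned = ((p.get("location") or "").strip() for p in profiles)
    return Counter(loc for loc in cleaned if loc)


# Hypothetical sample data standing in for harvested profiles.
sample_profiles = [
    {"screen_name": "alice", "location": "Washington, DC"},
    {"screen_name": "bob", "location": ""},
    {"screen_name": "carol", "location": "Washington, DC"},
    {"screen_name": "dave", "location": "Berlin"},
]
```

Calling `count_locations(sample_profiles).most_common()` lists the most frequent locations first. Because the field is free text, real profiles would likely need normalization or geocoding before they can be placed on a map.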

Toolkit + Credits

Practical Data Science Cookbook: 89 hands-on recipes to help you complete real-world data science projects in R and Python by Tony Ojeda, Sean Patrick Murphy, Benjamin Bengfort and Abhijit Dasgupta. This is an excellent resource for beginning in the world of Data Science with R and Python and I am very thankful to these guys for putting together a fantastic book!

  1. SciPy - A Python-based ecosystem of open source software for math, science, and engineering that includes a number of useful libraries for machine learning, scientific computing, and modeling.

  2. NumPy - The foundational Python package providing numerical computation in Python. NumPy is the reason that Python can do efficient, large-scale numerical computation that other interpreted or scripting languages cannot do.

  3. Pandas - Provides a robust data frame object and many additional tools to make traditional data and statistical analysis fast and easy.

  4. [Twitter API](https://dev.twitter.com/rest/public) - Provides programmatic access to read and write Twitter data.

  5. Twython - Actively maintained, pure Python wrapper for the Twitter API. Supports both normal and streaming Twitter APIs.

  6. MongoDB - A document-oriented database that stores JSON-like records, which makes working with Twitter data easy, flexible, and scalable.

  7. PyMongo - A Python distribution containing tools for working with MongoDB. It is the recommended way to work with MongoDB from Python.

  8. Folium - Folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js library. Manipulate your data in Python, then visualize it on a Leaflet map via Folium.
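To show how PyMongo fits into the toolkit, here is a minimal sketch of storing harvested profiles in a collection. The helper name `store_profiles` is hypothetical, and the connection string, database name, and collection name in the usage comment are assumptions rather than the project's actual settings. The function only relies on PyMongo's `Collection.insert_many`, so it takes the collection as an argument.

```python
def store_profiles(collection, profiles):
    """Insert profile dicts into a MongoDB collection and return how many were stored.

    `collection` is expected to be a pymongo Collection (or any object
    offering a compatible `insert_many` method).
    """
    # Copy each dict: insert_many adds an `_id` key to the documents it receives.
    docs = [dict(p) for p in profiles]
    if not docs:
        # insert_many rejects an empty list, so return early.
        return 0
    result = collection.insert_many(docs)
    return len(result.inserted_ids)


# Usage sketch (assumes a local MongoDB server and hypothetical names):
# from pymongo import MongoClient
# client = MongoClient("mongodb://localhost:27017")
# stored = store_profiles(client["twitter"]["profiles"], profiles)
```

Passing the collection in keeps the function easy to exercise without a running database, since any stand-in object with an `insert_many` method will do.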
