UWNETLAB/tidyextractors

tidyextractors

Overview

tidyextractors makes extracting data from supported sources as painless as possible, delivering a populated Pandas DataFrame in just a few lines of code. tidyextractors was inspired by Hadley Wickham's (2014) paper (http://vita.had.co.nz/papers/tidy-data.html), which introduces "tidy data" as a conceptual framework for data preparation.

For more information, including code examples, API reference, and general documentation, see the online documentation (http://tidyextractors.readthedocs.io/en/latest/).

Features

  • Extracts data with minimal effort.
  • Creates readable code that requires minimal explanation.
  • Exports Pandas DataFrames to maximize compatibility with the Python data science ecosystem.

Currently Implemented Data Sources

  • Local Git Repositories (http://tidyextractors.readthedocs.io/en/latest/git_overview.html)
  • Twitter User Data (including Tweets) using the Twitter API (http://tidyextractors.readthedocs.io/en/latest/twitter_overview.html)
  • Emails stored in the Mbox file format (http://tidyextractors.readthedocs.io/en/latest/mbox_overview.html)
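
To give a rough sense of what a "tidy" layout means for one of these sources, here is a minimal sketch that reads an Mbox file into one row per email. It uses only Python's standard mailbox module, not tidyextractors itself, so it says nothing about how the library is implemented or what its API looks like. The sample messages and the helper name mbox_to_rows are hypothetical and exist only for illustration.

```python
import mailbox
import os
import tempfile

# Two invented messages in Mbox format (sample data, not from any real source).
SAMPLE_MBOX = (
    "From alice@example.com Mon Jan  1 10:00:00 2018\n"
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Hello\n"
    "\n"
    "First message body.\n"
    "\n"
    "From bob@example.com Mon Jan  1 11:00:00 2018\n"
    "From: bob@example.com\n"
    "To: alice@example.com\n"
    "Subject: Re: Hello\n"
    "\n"
    "Second message body.\n"
)


def mbox_to_rows(path):
    """Hypothetical helper: return one dict per email (one row per observation)."""
    rows = []
    box = mailbox.mbox(path)
    try:
        for message in box:
            # Each header becomes a column; each email becomes a row.
            rows.append({
                "From": message["From"],
                "To": message["To"],
                "Subject": message["Subject"],
            })
    finally:
        box.close()
    return rows


# Write the sample to a temporary file, then read it back as tidy rows.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.mbox")
    with open(path, "w", newline="\n") as handle:
        handle.write(SAMPLE_MBOX)
    rows = mbox_to_rows(path)

for row in rows:
    print(row)
```

A list of dictionaries shaped like this can be passed directly to pandas.DataFrame(rows) to obtain a DataFrame with one column per header and one row per email.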

Installing

Just run pip3 install tidyextractors.