Python for Harvesting Data on the Web

Nicholas Wolf and Vicky Steeves, NYU Data Services

Vicky's ORCID: 0000-0003-4298-168X | Nick's ORCID: 0000-0001-5512-6151

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Overview

This intermediate-to-advanced session offers approaches to the following common data wrangling needs in research:

  • Obtain data, often via a web interface and especially an API, and load it into a suitable data "container" for analysis
  • Parse the data retrieved via an API and turn it into a useful object for manipulation and analysis
  • Perform some basic data integrity checks to ensure data found "in the wild" on the web is ready for analysis
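
The three needs above can be sketched as a small pipeline. This is a minimal illustration, not the code from the session notebook: the function names, the "results" key, and the "id" and "value" fields are hypothetical, and a hard-coded sample stands in for a real API response so the example runs offline.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=10):
    """Step 1: request a URL and decode the JSON body of the response."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def to_records(payload, key="results"):
    """Step 2: pull the list of records out of the parsed payload.

    The "results" key is an assumption; real APIs name this differently.
    """
    records = payload.get(key, [])
    if not isinstance(records, list):
        raise ValueError(f"expected a list under {key!r}")
    return records


def check_records(records, required=("id", "value")):
    """Step 3: list (index, problem) pairs for missing or empty required fields."""
    problems = []
    for i, rec in enumerate(records):
        for field in required:
            if rec.get(field) in (None, ""):
                problems.append((i, f"missing {field}"))
    return problems


# Hypothetical sample standing in for what fetch_json(url) would return
sample = json.loads(
    '{"results": [{"id": 1, "value": 3.2}, {"id": 2, "value": null}]}'
)
records = to_records(sample)
problems = check_records(records)
print(problems)
```

With a live API, `sample` would be replaced by a call such as `fetch_json(url)` for whatever endpoint is being harvested. Keeping the fetch, parse, and check steps in separate functions makes it easier to test the parsing and validation logic without network access.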

Setup

Project Environment

Download the notebook available at https://goo.gl/Pnm7Dx and open it in Jupyter Notebook.