John Little 2022-04-22
These are my slides and supporting materials for DSVIL 2018, the Data Science & Visualization Institute hosted by NCSU Libraries. June 6, 2018
Data Science and Visualization Institute for Librarians. I am teaching modules on Data Cleaning, Web Scraping, HTML and JSON Parsing, and Twitter Stream Gathering.
https://www.lib.ncsu.edu/data-science-and-visualization-institute/schedule
NCSU Libraries
Because data transformations are crazy-fun awesome
The slides are divided into the following sections:
- Index
- Introduction
- Web Scraping
- OpenRefine: Data Cleaning Basics
- OpenRefine: Reconciliation
- Capturing Twitter Data
- APIs & JSON Parsing
- More HTML Parsing
- Data Cleaning – Basic Transformation with OpenRefine (Exercise 1)
- Data Cleaning – GREL (Exercise 2)
- Reconciliation with OpenRefine
- Social Media – Twitter gathering with TAGS app (Exercise 1)
- Social Media – Twitter: TAGS visualization and tools
- APIs & JSON Parsing – OpenRefine (Exercise 1)
- APIs – Using API Keys (Exercise 2)
- Intro HTML Parsing: Steps 1-6 (Exercise 1)
- More OpenRefine – Looping Control: Steps 7-end (Exercise 2). This section introduces more advanced features of OpenRefine, using HTML parsing as the example exercise.
https://github.com/libjohn/openrefine/tree/master/data
Data, presentation, and handouts are shareable under a CC BY-NC license.