title | tags | slideOptions
---|---|---
OpenRefine | presentation |
- Download the data and save it to your desktop
- Download the latest stable release of OpenRefine and unzip it to a convenient location on your computer
Questions:
- What is OpenRefine useful for?
Objectives:
- Describe OpenRefine's uses and applications.
- Differentiate data cleaning from data organization.
- Experiment with OpenRefine's user interface.
- Locate helpful resources to learn more about OpenRefine.
- OpenRefine is a powerful, free, and open source tool that helps you with data wrangling
- OpenRefine started as Google Refine before it was released to the open source community
- It combines a graphical interface like Excel's with the reproducibility of scripting languages like R and Python
- Automatically keeps a log of every change you make
- Works on a copy of your data, so your original file is never modified
- Any operation can be undone
- Can repeat your steps for more than one data set
- Provides a user-friendly interface for complex clustering algorithms
- Get an overview of a data set
- Resolve inconsistencies
- Split data into granular parts
- Match local data with other data sets
- Save a set of data cleaning steps for replay on multiple files
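
To make the clustering bullet above more concrete, here is a minimal Python sketch of a "fingerprint" style key-collision method: values that reduce to the same normalized key are grouped together. This is a simplified approximation written for illustration, not OpenRefine's own code; the function names `fingerprint` and `cluster` and the sample values are assumptions.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a simplified, order- and case-insensitive key for a value."""
    # Trim surrounding whitespace and lowercase
    text = value.strip().lower()
    # Fold accented characters to their closest ASCII letters
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Drop ASCII punctuation
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, de-duplicate the tokens, and sort them
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values by fingerprint; keep groups with more than one spelling."""
    groups = defaultdict(set)
    for v in values:
        groups[fingerprint(v)].add(v)
    return {key: sorted(members)
            for key, members in groups.items()
            if len(members) > 1}


# Made-up sample values, for illustration only
names = ["Acme Corp.", "acme corp", " Corp Acme",
         "Globex", "Café Luna", "cafe luna"]
clusters = cluster(names)

for key, members in clusters.items():
    print(key, "<-", members)
```

Each group in `clusters` is a candidate set of spellings to merge into one value. In OpenRefine that decision stays with you: the tool proposes clusters and you choose which ones to merge.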
Questions:
- How can we import data into OpenRefine?
- How can we sort and summarize data with OpenRefine?
- How can we find and correct errors with OpenRefine?
Objectives:
- Create a new OpenRefine project from a `.csv` file.
- Look at facets and how they sort and summarize data.
- Look at clustering and how to apply it to edit groups of typos.
- Undo/redo steps.
- Split values into multiple columns.
- Remove extra whitespace from cells.
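
As a rough illustration of what three of the objectives above mean for the data itself, the following plain-Python sketch mimics a text facet (counting distinct values), whitespace trimming, and splitting one column into several. It does not use OpenRefine or its expression language; the helper names (`text_facet`, `trim`, `split_column`) and the sample rows are hypothetical.

```python
from collections import Counter

# Made-up rows for illustration; not taken from the lesson's data file
rows = [
    {"city": "Boston ", "name": "Lovelace, Ada"},
    {"city": "Boston", "name": "Hopper, Grace"},
    {"city": " Denver", "name": "Turing, Alan"},
]


def text_facet(rows, column):
    """Count how often each distinct value occurs in a column."""
    return Counter(row[column] for row in rows)


def trim(rows, column):
    """Return new rows with leading/trailing whitespace stripped from a column."""
    return [{**row, column: row[column].strip()} for row in rows]


def split_column(rows, column, separator, new_names):
    """Return new rows with one column split into several on a separator."""
    out = []
    for row in rows:
        parts = [p.strip() for p in row[column].split(separator)]
        # Pad with empty strings so every row gets all the new columns;
        # parts beyond len(new_names) are dropped by zip
        parts += [""] * (len(new_names) - len(parts))
        new_row = {k: v for k, v in row.items() if k != column}
        new_row.update(zip(new_names, parts))
        out.append(new_row)
    return out


before = text_facet(rows, "city")      # stray spaces make "Boston" look like two values
cleaned = trim(rows, "city")
after = text_facet(cleaned, "city")    # the two "Boston" variants now count together
split_rows = split_column(cleaned, "name", ",", ["last_name", "first_name"])

print(before)
print(after)
print(split_rows[0])
```

Every helper returns new rows and leaves `rows` unchanged, which loosely mirrors how OpenRefine leaves the original file alone.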