The lesson and source files for Dan Nguyen's NICAR 2012 lesson on Google Refine

# Google Refine for Investigative Journalism

*An introduction to one of the best data tools for any reporter of **any** technical level*


This is a hands-on walkthrough for [NICAR 2012](http://www.ire.org/conferences/nicar-2012/). It will take place on Friday, from 2-2:50PM in the **Jeffersonian/Knickerbocker** room. It will be led by Dan Nguyen ([@dancow](http://twitter.com/dancow)) with help from Joe Kokenge ([@josephkokenge](http://twitter.com/josephkokenge)) of ProPublica.
 

## A tool for cleaning and investigations
You may have heard how [Google Refine](http://code.google.com/p/google-refine/) – a "power tool for working with messy data" – is a great data cleaning tool. But if you haven't tried it out yet, then you're missing out on the potential stories and insights that Refine can easily (and sometimes exclusively) find in data.

It doesn't matter what skill level you have. Refine is one of those rare tools that is as useful to those who have never left their click-and-drag interfaces for the command-line world as it is to the most anal-retentive, detail-oriented data analysts and power programmers.

In this lesson, I'll start with the very basics: opening a file with Refine and doing basic sorting and searching (things that you can do in Excel, of course). From there I'll move on to Refine's easy-to-use data cleaning methods, and then to how Refine can help you probe unfamiliar datasets to scout out a story.


The two datasets I will be working with are:

* [FEC Individual Contribution Data](http://www.fec.gov/finance/disclosure/ftpdet.shtml#a2011_2012) – the list of citizens who have individually contributed $200+ to the political process

* The [White House Visitor Logs](http://www.whitehouse.gov/briefing-room/disclosures/visitor-records) – everyone who has visited the White House during the Obama Administration (in the time period it chose to disclose), who they visited, and (sometimes) why.

(Hopefully we'll have time to do both.)


## Contacts

* Dan Nguyen ([@dancow](http://twitter.com/dancow))
* Joseph Kokenge ([@josephkokenge](http://twitter.com/josephkokenge))

## The FEC Data
**This tip sheet is a work in progress. I will be filling it out through today and it should be done by the time of my hands-on session.**


## The White House Visitor Logs
**This tip sheet is a work in progress. I will be filling it out through today and it should be done by the time of my hands-on session.**