Amanda Hickman edited this page Mar 3, 2015 · 1 revision

Notes from Data Skills on Feb. 11, 2015

Homework - what we learned

If you spend 15 or 20 minutes on the assignment and you're not getting anywhere, send a note asking about the instructions. The assignments will reinforce what we did in class, not introduce something completely new.

That led into guidance on writing good questions when you need help:

  • Go through all the steps you've taken.
  • Get out of the habit of using shorthand; try to use full sentences.
  • If I follow your steps, I should be able to reproduce your problem.

Reflecting on the 311 assignment

Are there points of comparison that you need in order to understand the data better? To work out how many noise complaints there are per kind of dwelling, it would be useful to get the number of each kind of dwelling from the Department of Buildings. Michael Heimbinder, the AirCasting guy, also has an app that measures the sound volume of an area and files it with the latitude-longitude coordinates.
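As a rough sketch of why the dwelling counts matter: raw complaint totals can be divided by the number of dwellings of each kind to get a comparable rate. The function name and every number below are made-up placeholders for illustration, not figures from the 311 data or the Department of Buildings.

```python
def complaints_per_thousand(complaints, dwellings):
    """Return noise complaints per 1,000 dwellings for each dwelling type."""
    rates = {}
    for kind, count in complaints.items():
        total = dwellings.get(kind)
        if not total:
            # No denominator for this dwelling type, so no rate can be computed.
            continue
        rates[kind] = round(count * 1000 / total, 1)
    return rates


# Hypothetical counts, invented only to show the arithmetic.
complaints = {"apartment building": 1200, "single-family house": 300}
dwellings = {"apartment building": 60000, "single-family house": 5000}

rates = complaints_per_thousand(complaints, dwellings)
print(rates)
```

With these invented numbers, the dwelling type with fewer total complaints ends up with the higher rate, which is the kind of thing a point of comparison can reveal.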

Introduction by Jue Yang, technologist in residence

Jue is a resource to go to when you need help. Past projects: http://open.undp.org/#2014

She can help us with the technical side of cleaning data, storytelling, some analyses, finding more datasets, and lots more. It doesn't matter what kind of question you have; she's happy to talk or meet.

To reach her: https://github.com/jueyang/call-me-maybe

Turning to look at our pitches on Gist

https://gist.github.com/amandabee/1f97f9639a1b7128559f
http://daringfireball.net/projects/markdown/dingus

  • Take 5 minutes to add formatting to our pitches using Markdown.
  • Take 5-10 minutes to read another group's pitch, then give feedback, particularly about potential points of comparison and what datasets they could throw out.
  • Look at each other's pitches, see what could be useful for you, and give feedback to each other.

Learning OpenRefine

We downloaded the Louisiana Elected Officials Excel file: http://www.sos.la.gov/ElectionsAndVoting/FindPublicOfficials/Pages/default.aspx

Those who hadn't already done so downloaded OpenRefine (formerly Google Refine). We then opened the Louisiana Elected Officials file in OpenRefine, following this walkthrough: http://amandabee.github.io/CUNY-data-skills/hands-on/refine.html

We went over:

  • Clustering
  • Transform: Trim leading and trailing whitespace
  • Reading the Undo/Redo tab, which records each action you took on the dataset in a form you can read (JSON)
  • Splitting columns (like splitting zip codes so the first five digits stay in one column and the plus-four goes into a separate column)
  • Custom numeric facet
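To make the steps above more concrete, here is a minimal Python sketch of roughly what trimming, clustering, and splitting do. It is not OpenRefine's own code: the `fingerprint` function only approximates the idea behind key-collision clustering (normalize each value, then group values that share a key), and the function names and sample strings are invented for illustration.

```python
import re
import string
from collections import defaultdict


def fingerprint(value):
    """Approximate a clustering key: trim, lowercase, drop punctuation, sort unique words."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group values that share a fingerprint; keep only groups with more than one spelling."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


def split_zip(zip_code):
    """Split a ZIP+4 into (first five digits, plus four); plus four is None if absent."""
    match = re.fullmatch(r"\s*(\d{5})(?:-(\d{4}))?\s*", zip_code)
    if match is None:
        return None, None  # not a recognizable zip code
    return match.group(1), match.group(2)


# Made-up sample values, not rows from the Louisiana file.
names = ["Democrat", " democrat ", "Democrat.", "Republican"]
clusters = cluster(names)
print(clusters)
print(split_zip("70802-1234"))
```

The three spellings of "Democrat" land in one cluster because they reduce to the same key, while "Republican" has no variants and is left out.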

OpenRefine is particularly useful in campaign finance reporting. More on faceting: https://github.com/OpenRefine/OpenRefine/wiki/Faceting

If you want to export the dataset from Refine to Excel or something else, export it as a CSV.

Going over assignment for next week:

http://amandabee.github.io/CUNY-data-skills/assignments/week03.html

  • Pick a council member and look at their contributions.
  • Clean up the data and say what you found in there.
  • Post your findings to the Tumblr. What did you cluster?

Getting gists ready to publish

  • Use Mou or Dingus to take something from Markdown to HTML.
  • Take the HTML and paste it into a new file in your Gist called index.html.
  • Switch out the https://gist.github.com/ in the URL with bl.ocks.org/
  • You should be able to view it as a page now, with all the pretty formatting.
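The URL swap described above can be sketched in a few lines of Python. The function name `blocks_url` is hypothetical, and the sketch follows the notes literally, so the result starts with bl.ocks.org/ and carries no scheme.

```python
GIST_PREFIX = "https://gist.github.com/"
BLOCKS_PREFIX = "bl.ocks.org/"


def blocks_url(gist_url):
    """Swap the gist.github.com prefix for bl.ocks.org, keeping the user and gist ID."""
    if not gist_url.startswith(GIST_PREFIX):
        raise ValueError("expected a URL starting with " + GIST_PREFIX)
    return BLOCKS_PREFIX + gist_url[len(GIST_PREFIX):]


page = blocks_url("https://gist.github.com/amandabee/1f97f9639a1b7128559f")
print(page)
```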

Why Markdown

  • So you know it'll look the same for everyone, regardless of browser/device
  • So you stay focused on your content, not on the design
  • Potentially less frustrating than FTP-ing

Feedback from Jue on our pitches

  • Think about the final format of your report.
  • Think about what kind of product and what kinds of charts or maps you want, and what kind of raw data you will need to build them.
  • If you see a chart in a PDF (or a Pew report), try to get to the original raw data that went into the report so that you can work with it directly.
  • WNYC has a project called SchoolBook that will be helpful in doing school profiles.
  • Footnotes on good visualizations (like maps) can lead you to the raw data
  • The American Community Survey from the Census Bureau, and lots of other great data available from the Census