
# Parole Hearing Data

What is the Parole Hearing Data Project?

The Parole Hearing Data Project pulls records from the New York State Parole Board’s Interview Calendar and writes them to a spreadsheet, which grows every month as the parole board updates the calendar with newly scheduled hearings and newly issued determinations. The goal of this project is to enable researchers to analyze this data and better understand patterns in parole hearing determinations in New York State.

Data from a sample run is included in data.csv. This project is in development.

Why are we working on this?

In New York, over 10,000 parole-eligible prisoners are denied release every year based on the discretion of a small number of parole commissioners. These denials have very real social and financial costs: families remain separated for periods that are arguably longer than necessary; incarcerated individuals who have changed their lives lose hope in a better future; and millions of dollars are spent to imprison men and women who have been determined to pose a low risk to society if released (in New York it costs $60,000 annually to incarcerate one individual, and more to incarcerate older individuals with illnesses).

When we began this project, we wanted to better understand the parole board's decision-making patterns. Since then, we've learned through feedback from academic and nonprofit researchers just how hard it has been to analyze the data that the parole board publishes unless it is reformatted in this way. We're excited for this work to be used, and for new insights to result from researchers and criminal justice experts using this project as a resource.

Setup and run

Install app requirements

$ pip install -r requirements.txt

Running the scraper

To run the scraper, run the following command from the base directory of this repo.

python data.csv > output.csv 2>log.txt &

The results of the scraper will be in output.csv. Scraper logs will be in log.txt. The scraper will run in the background, and use the existing data.csv as a source of historical data.
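
Once output.csv exists, a quick sanity check is to count how often each value appears in one of its columns. The following is a minimal sketch, not part of this repo: the helper name `tally_column` is hypothetical, and the column you pass in is an assumption that has to match a header actually present in your CSV.

```python
import csv
from collections import Counter


def tally_column(csv_path, column):
    """Count how often each value appears in one column of a CSV file.

    Hypothetical helper for illustration; it is not part of the scraper.
    """
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Fail early if the requested header is not in the file.
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(f"column {column!r} not found in {csv_path}")
        for row in reader:
            # Group empty cells under "(blank)" so they stay visible in the tally.
            value = (row[column] or "").strip() or "(blank)"
            counts[value] += 1
    return counts
```

Calling `tally_column("output.csv", name).most_common()`, where `name` is one of the headers in the file, returns the values of that column ordered from most to least frequent. Empty cells are grouped under "(blank)" rather than dropped.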

If you're OK with automatically updating the existing data, there is a convenience script.


You can also automate this by customizing the crontab file and installing it on your system: adapt the line and paste it into crontab -e.


Nikki Zeichner, Code for America

Rebecca Ackerman, self

John Krauss, CartoDB

Jane Adams, self

And a special thanks to R. Luke DuBois of NYU and Annie Waldman of ProPublica.

Errors / Bugs

If something is not behaving intuitively, it is a bug, and should be reported. Report it here:

Note on Patches/Pull Requests

  • Fork the project.
  • Make your feature addition or bug fix.
  • Send a pull request. Bonus points for topic branches.