nzherald/crashes-2018

Crashes 2018

This is the data, analysis, and source code for the interactive "Our most fatal morning".

Caveat: This code has been made public as is, after the article's publication. We (the New Zealand Herald data journalism team) are trying out the approach of releasing data, analysis, and source code as soon as an article is published, rather than planning to tidy up the code first and then release it. Hopefully this is helpful and makes our reporting more transparent. If you would like something to be better documented, please contact me (Chris) on Twitter (@vizowl) or by email at chris.knox@nzherald.co.nz.

License

License: CC BY-NC-SA 4.0

The data, analysis, and visualisations are released under a Creative Commons Attribution-NonCommercial-ShareAlike license (CC BY-NC-SA 4.0). You can use them, but please attribute the New Zealand Herald. We would also prefer it if you got in touch and let us know how you are using them.

The data analysis and processing code is MIT licensed.

The data was released to the New Zealand Herald by NZTA under the Official Information Act (OIA), and the original data file is data/crashes by severity and hour.csv. Data released under the OIA generally has no explicit license, so it is treated as falling under the general NZGOAL Framework.

General approach

The source code is split into four directories: data, analysis, preparation, and interactive.

Haskell's Shake build system is used to marshal all the data into a database. The analysis directory is where open-ended analysis is carried out. These analysis scripts are often incomplete, but they show some of the directions we looked at. The build process converts the data (usually into JSON) and drops it into the interactive directory.

All the data build products are checked into git, so you do not need to run the Haskell, R, and PostgreSQL portions.

The actual interactive is an Elm app.

There are a lot of reasons for working in Haskell and Elm - which I won't document here - but the primary motivation is to be confident in the structure of all the data we have marshaled - both now and in a year when we revisit this article.

Building the code

This article uses a lot of tools so getting it running may be frustrating - and it may not be possible on a Windows computer.

Prerequisites

The build will drop a local PostgreSQL database called 'crashes-18', so DO NOT run it if this will cause you problems.
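Before running the build, it may be worth checking whether a database with that name already exists. The following is a minimal sketch in Python, not part of this repository: the helper names `database_names` and `database_exists` are hypothetical, and it assumes that `psql` is on your PATH and can reach the local server without a password.

```python
import subprocess

DB_NAME = "crashes-18"  # database name taken from the warning above


def database_names(listing: str) -> set:
    """Extract database names from the text printed by `psql -lqt`."""
    names = set()
    for line in listing.splitlines():
        # Rows are pipe-separated and the first column is the database name.
        # Continuation rows (wrapped privileges) have an empty first column.
        name = line.split("|", 1)[0].strip()
        if name:
            names.add(name)
    return names


def database_exists(name: str = DB_NAME) -> bool:
    """Return True if the local PostgreSQL server already has `name`."""
    # -l lists databases, -q is quiet, -t prints rows without headers.
    result = subprocess.run(
        ["psql", "-lqt"], capture_output=True, text=True, check=True
    )
    return name in database_names(result.stdout)
```

If `database_exists()` returns True, the build would drop that database, so back it up or rename it first.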

R will need to have the following libraries installed.

library(tidyverse)
library(ggthemes)
library(RPostgreSQL)
library(zoo)
library(lubridate)
library(here)
library(knitr)
library(stats)

You will need to run the build as a user that can access a PostgreSQL server without a password and create new databases.

Then run stack build --exec build. This should create the database crashes-18, prepare all the data, and produce an interactive in interactive/dist.