# Forecasting Emergency Events in Seattle

Welcome to my capstone project for [The Data Incubator](https://www.thedataincubator.com/)!

This project aims to use [real-time data](https://data.seattle.gov/Public-Safety/Seattle-Real-Time-Fire-911-Calls/kzjm-xkqj) provided by the [Seattle Fire Department](http://www.seattle.gov/fire/) to forecast upcoming "hotspots" (areas of high spatial intensity) for emergency calls (separately for fire and medical).

Under the hood, I use random Fourier feature expansions of [Gaussian process priors](http://www.gaussianprocess.org/gpml/) to model the observed spatiotemporal point process as arising from a latent [Poisson intensity surface](https://en.wikipedia.org/wiki/Poisson_point_process); the bulk of the work is in finding good hyperparameters for this meta-model.
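As a rough illustration of the random Fourier feature idea, the sketch below approximates a squared-exponential (Gaussian) kernel in base R and compares it with the exact kernel. The function name `rff_features`, the lengthscale, the number of features, and the simulated `(x, y, t)` inputs are all assumptions made for illustration; they are not the project's actual code or hyperparameters.

```r
set.seed(1)

# Map inputs X (n x d) to n_features random Fourier features whose inner
# products approximate the kernel exp(-||u - v||^2 / (2 * lengthscale^2)).
rff_features = function(X, n_features, lengthscale) {
  d = ncol(X)
  # Random frequencies drawn from the kernel's spectral density
  W = matrix(rnorm(d * n_features, sd = 1 / lengthscale), nrow = d)
  # Random phase shifts, one per feature
  b = runif(n_features, 0, 2 * pi)
  proj = sweep(X %*% W, 2, b, '+')
  sqrt(2 / n_features) * cos(proj)
}

# Hypothetical inputs: 200 points with (x, y, t) coordinates in [0, 1]
X = matrix(runif(200 * 3), ncol = 3)
lengthscale = 0.5

Z = rff_features(X, n_features = 2000, lengthscale = lengthscale)

# Compare the approximate kernel matrix with the exact one
K_approx = tcrossprod(Z)
K_exact = exp(-as.matrix(dist(X))^2 / (2 * lengthscale^2))
max(abs(K_approx - K_exact))
```

Because the kernel is replaced by an explicit feature matrix `Z`, a linear model on `Z` (for example, a Poisson regression) can stand in for the Gaussian process, which is what makes the approach practical at scale.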

My tool set is [R](https://cran.r-project.org/), especially [`data.table`](https://github.com/Rdatatable/data.table) for high-performance data manipulation, [`splancs`](https://cran.r-project.org/web/packages/splancs/index.html) for two-dimensional kernel density estimation, and [`spatstat`](http://spatstat.org/) for geospatial aggregation. Alongside R, I use [Vowpal Wabbit (VW)](http://hunch.net/~vw/) for very fast, large-scale Poisson regression.
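To make the aggregate-then-regress workflow concrete, here is a minimal base-R sketch that bins simulated event locations into grid cells and fits a Poisson regression to the cell counts. It deliberately swaps in `cut()`/`table()` for `data.table` and `spatstat`, and `glm()` for Vowpal Wabbit, so that it runs without extra packages. The event data, grid size, and covariates are all hypothetical.

```r
set.seed(42)

# Hypothetical event locations on the unit square (not Seattle data)
n_events = 500
events = data.frame(x = rbeta(n_events, 2, 5), y = rbeta(n_events, 2, 2))

# Assign each event to a cell of an n_side x n_side grid
n_side = 10
breaks = seq(0, 1, length.out = n_side + 1)
ix = cut(events$x, breaks, include.lowest = TRUE, labels = FALSE)
iy = cut(events$y, breaks, include.lowest = TRUE, labels = FALSE)

# Count events per cell, keeping empty cells as zeros
counts = as.data.frame(
  table(ix = factor(ix, levels = 1:n_side), iy = factor(iy, levels = 1:n_side)),
  responseName = 'n'
)

# Cell-center coordinates serve as simple covariates
counts$cx = (as.integer(counts$ix) - 0.5) / n_side
counts$cy = (as.integer(counts$iy) - 0.5) / n_side

# Poisson regression of counts on location (a stand-in for the VW model)
fit = glm(n ~ cx + cy, family = poisson, data = counts)
counts$intensity = fitted(fit)

# Cells with the highest fitted intensity are the candidate hotspots
head(counts[order(-counts$intensity), ])
```

In a fuller version, the raw coordinates would be replaced by a richer feature expansion so that the fitted intensity surface is not restricted to a log-linear trend.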

This project is inspired by work forecasting crime hotspots in Portland for the [NIJ Real-Time Forecasting Challenge](https://www.nij.gov/funding/Pages/fy16-crime-forecasting-challenge.aspx), code for which is stored [here](https://github.com/MichaelChirico/portland).

In [1]:
library(rgdal)
from_rff = readOGR('.', 'rff_plot')
from_kde = readOGR('.', 'kde_plot')

Loading required package: sp
rgdal: version: 1.2-7, (SVN revision 660)
 Geospatial Data Abstraction Library extensions to R successfully loaded
 Loaded GDAL runtime: GDAL 1.10.1, released 2013/08/26
 Path to GDAL shared files: /usr/share/gdal/1.10
 Loaded PROJ.4 runtime: Rel. 4.8.0, 6 March 2012, [PJ_VERSION: 480]
 Path to PROJ.4 shared files: (autodetected)
 Linking to sp version: 1.2-4 


OGR data source with driver: ESRI Shapefile 
Source: ".", layer: "rff_plot"
with 8417 features
It has 3 fields
OGR data source with driver: ESRI Shapefile 
Source: ".", layer: "kde_plot"
with 7229 features
It has 3 fields


In [4]:
names(from_rff@data)

In [7]:
with(from_rff@data, table(htspt_p, htspt_c))

       htspt_c
htspt_p    0    1
      0 8333   41
      1   41    2

In [8]:
with(from_kde@data, table(htspt_p, htspt_c))

       htspt_c
htspt_p    0    1
      0 7136   44
      1   44    5
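
The two cross-tabulations above can be compared more directly by computing how often a cell flagged by one field is also flagged by the other. The notebook does not define `htspt_p` and `htspt_c`, so this sketch only assumes that both are 0/1 hotspot flags. The counts are copied from the tables; the helper name `overlap_share` is hypothetical.

```r
# Rebuild the two 2x2 tables from the printed output (rows: htspt_p, cols: htspt_c)
flags = list(htspt_p = c('0', '1'), htspt_c = c('0', '1'))
rff = matrix(c(8333, 41, 41, 2), nrow = 2, dimnames = flags)
kde = matrix(c(7136, 44, 44, 5), nrow = 2, dimnames = flags)

# Share of cells with htspt_p == 1 that also have htspt_c == 1
overlap_share = function(tab) {
  tab['1', '1'] / sum(tab['1', ])
}

# Sanity check: totals should match the feature counts reported by readOGR
c(rff = sum(rff), kde = sum(kde))

# Overlap shares for the two methods
c(rff = overlap_share(rff), kde = overlap_share(kde))
```

Under that assumption, the overlap is 2 of 43 flagged cells for the RFF layer and 5 of 49 for the KDE layer.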