Scrapes RUPD Crime Alerts into JSON.
.gitignore
README.md
crime.html
json_assembler.py
ugly_parser.py


Rutgers Crime Alert API

A very rough API that scrapes the gross RUPD Crime Alert page, strips out the HTML and bureaucratic boilerplate with regex, and saves a JSON array to file.

Extracts time and location with regex.
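
A minimal sketch of what the strip-and-extract step could look like, using only the standard library. The patterns, the `strip_html` and `extract` names, and the output keys are all assumptions for illustration; the actual regexes in `ugly_parser.py` may differ, and real alert wording will need looser patterns.

```python
import re

# Hypothetical patterns: the real alert wording may need something looser.
TAG_RE = re.compile(r"<[^>]+>")
TIME_RE = re.compile(r"\b(\d{1,2}:\d{2}\s*[ap]\.?m\.?)", re.IGNORECASE)
STREET_RE = re.compile(r"\b((?:[A-Z][a-z]+\s)+(?:Street|Avenue|Road|Lane))\b")


def strip_html(raw):
    """Replace tags with spaces, then collapse runs of whitespace."""
    text = TAG_RE.sub(" ", raw)
    return re.sub(r"\s+", " ", text).strip()


def extract(raw):
    """Return the cleaned text plus the first time and all street names found."""
    text = strip_html(raw)
    time_match = TIME_RE.search(text)
    return {
        "text": text,
        "time": time_match.group(1) if time_match else None,
        "locations": STREET_RE.findall(text),
    }


alert = extract("<p>The incident occurred at 11:45 p.m. on Robinson Street.</p>")
print(alert["time"], alert["locations"])
```

Note that `STREET_RE` treats any capitalized word before a street suffix as part of the name, so a sentence starting with "On Robinson Street" would be captured whole. It is a sketch, not a robust parser.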

Map demo here!

TODO

Since the structure of these alerts is so repetitive, it should be easy to extract structured data from them. Might require some level of NLP and isn't strictly necessary, I suppose.

Notably: the type of crime and the perpetrator description.

Location extraction doesn't currently catch cases like "Mason and Pine Streets" or "on Robinson Street between Hamilton Street and Central Avenue", reporting just "Pine Street" or "Robinson Street, Hamilton Street" respectively.
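
One possible way to handle those two cases is a pair of extra patterns tried before the single-street one. This is only a sketch: `extract_compound`, the suffix list, and the returned dictionary shape are hypothetical, and single-word street names are assumed.

```python
import re

# Assumed suffix list and single-word street names; both are simplifications.
SUFFIX = r"(?:Street|Avenue|Road|Lane)"
NAME = r"[A-Z][a-z]+"

# "Mason and Pine Streets": two names sharing one plural suffix.
SHARED_RE = re.compile(rf"\b({NAME}) and ({NAME}) ({SUFFIX})s\b")

# "Robinson Street between Hamilton Street and Central Avenue"
BETWEEN_RE = re.compile(
    rf"\b({NAME} {SUFFIX}) between ({NAME} {SUFFIX}) and ({NAME} {SUFFIX})\b"
)


def extract_compound(text):
    """Return a dict for compound locations, or None if neither pattern matches."""
    m = BETWEEN_RE.search(text)
    if m:
        return {"street": m.group(1), "between": [m.group(2), m.group(3)]}
    m = SHARED_RE.search(text)
    if m:
        suffix = m.group(3)  # singular form, e.g. "Street"
        return {"intersection": [f"{m.group(1)} {suffix}", f"{m.group(2)} {suffix}"]}
    return None


print(extract_compound("near Mason and Pine Streets"))
print(extract_compound("on Robinson Street between Hamilton Street and Central Avenue"))
```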

Also needs to load the local database and merge it with newly scraped alerts instead of clobbering it, in case RUPD takes down old alerts.
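
A rough sketch of the merge-instead-of-clobber idea, assuming the local database is the JSON array on disk and that each alert carries some unique field. The `merge_alerts` name and the `"id"` key are hypothetical; any stable field, such as the alert's date or URL, could serve as the key.

```python
import json
import os


def merge_alerts(path, new_alerts, key="id"):
    """Merge new_alerts into the JSON array at path, keeping old entries.

    `key` is an assumed unique field on each alert; it is not taken from
    the real data format.
    """
    existing = []
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            existing = json.load(f)

    # Index old alerts first so anything RUPD has taken down survives.
    merged = {alert[key]: alert for alert in existing}
    for alert in new_alerts:
        merged[alert[key]] = alert  # a freshly scraped copy replaces the old one

    result = list(merged.values())
    with open(path, "w", encoding="utf-8") as f:
        json.dump(result, f, indent=2)
    return result
```

Calling `merge_alerts("alerts.json", scraped)` after each scrape would then grow the file over time rather than overwrite it, where `alerts.json` and `scraped` are placeholder names.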

Python Requirements:

requests
BeautifulSoup
dateutils