Rutgers Crime Alert API
A very rough API that scrapes the gross RUPD Crime Alert page, strips it of HTML and bureaucratic boilerplate with regex, saving a json array to file.
Extracts time and location with regex.
Since the structure of these alerts is so repetitive, it should be easy to extract structured data from them. Might require some level of NLP and isn't strictly necessary, I suppose.
Notably: crime, perpetrator description.
Location extraction doesn't currently catch cases like
Mason and Pine Streets or
on Robinson Street between Hamilton Street and Central Avenue, reporting just
Pine Street or
Robinson Street, Hamilton Street respectively.
Also needs to load the local database and merge with new requests instead of clobbering in case RUPD takes down old alerts.
requests BeautifulSoup dateutils