Here are useful modules to import from the outset.

In [3]:
# Some fairly standard modules
import os, csv, lzma
import numpy as np
import matplotlib.pyplot as plt

# The geopandas module is not included with Anaconda by default,
# so you'll need to run the Anaconda Prompt as an administrator
# and install it via "conda install -c conda-forge geopandas".
# That installation includes pyproj and shapely automatically.
# These modules are useful for plotting geospatial data.
import geopandas as gpd
import pyproj
import shapely.geometry

# These modules are useful for tracking where modules are
# imported from, e.g., to check we're using our local edited
# versions of open_cp scripts.
import sys
import inspect
import importlib

# To use our local edited versions of the open_cp scripts,
# we insert the parent of the current working directory
# ("..") at the start of sys.path.
sys.path.insert(0, os.path.abspath(".."))

# Elements from PredictCode's custom "open_cp" package
import open_cp.sources.chicago as chicago
import open_cp.sources.ukpolice as ukpolice

# Confirm we're using local versions of code
print(inspect.getfile(chicago))
print(inspect.getfile(ukpolice))


C:\Users\Dustin\Documents\GitHub\PredictCode\open_cp\sources\chicago.py
C:\Users\Dustin\Documents\GitHub\PredictCode\open_cp\sources\ukpolice.py
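
The printed paths above are checked by eye. As a minimal sketch, the same check can be automated with a small helper; `is_loaded_from` is a hypothetical name introduced here for illustration and is not part of open_cp.

```python
import inspect
import os


def is_loaded_from(module, root_dir):
    """Return True if the module's source file lies inside root_dir.

    Hypothetical helper for illustration; not part of open_cp.
    """
    # Resolve symlinks and normalise case so the comparison is reliable.
    module_path = os.path.normcase(os.path.realpath(inspect.getfile(module)))
    root_path = os.path.normcase(os.path.realpath(root_dir))
    try:
        return os.path.commonpath([module_path, root_path]) == root_path
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

In this notebook it could be called as `is_loaded_from(chicago, os.path.abspath(".."))`, assuming the notebook is run from a subdirectory of the local PredictCode checkout.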


# Chicago data

Download the data from:
* Past year: [https://catalog.data.gov/dataset/crimes-one-year-prior-to-present-e171f](https://catalog.data.gov/dataset/crimes-one-year-prior-to-present-e171f)
* Everything since 2001: [https://catalog.data.gov/dataset/crimes-2001-to-present-398a4](https://catalog.data.gov/dataset/crimes-2001-to-present-398a4)

The open_cp code (open_cp.sources.chicago) expects data files with these names:
* chicago.csv
* uk_police.csv

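Some functions also expect a file named `Chicago_areas.geojson` in the same data directory as the CSV files. It can be downloaded from [https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Community-Areas-current-/cauq-8yn6](https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Community-Areas-current-/cauq-8yn6)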

# UK data

Download the data from:
* [https://data.police.uk/data/](https://data.police.uk/data/)

The available data range from January 2016 to December 2018. However, the files are split by month and must be merged manually. Also, the finest time resolution available is the month of the crime. An example in the "Example Data Sets" notebook uses only the January 2017 data for West Yorkshire, which contain 2358 instances of burglary.
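
The manual merge of monthly files can be scripted. The following is a minimal sketch that assumes every monthly file shares the same header row; `merge_monthly_csvs` and the paths in the usage note are hypothetical and not part of open_cp.

```python
import csv
import glob
import os


def merge_monthly_csvs(input_pattern, output_path):
    """Concatenate CSV files that share one header into a single file.

    Hypothetical helper for illustration. Returns the number of data
    rows written (the header row is not counted).
    """
    header = None
    rows_written = 0
    output_abs = os.path.abspath(output_path)
    with open(output_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        # Sorting keeps the months in order if names begin with YYYY-MM.
        for path in sorted(glob.glob(input_pattern, recursive=True)):
            if os.path.abspath(path) == output_abs:
                continue  # never read the file we are writing
            with open(path, newline="", encoding="utf-8") as in_file:
                reader = csv.reader(in_file)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"Header mismatch in {path}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

A call might look like `merge_monthly_csvs("downloads/**/*-street.csv", "uk_police.csv")`, where the `downloads` folder and the filename pattern are assumptions about how the monthly archives were unpacked.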

The open_cp code (open_cp.sources.ukpolice) expects a data file in the module's own directory (open_cp/sources) with this name:
* uk_police.csv

NOTE: The original code in open_cp.sources.ukpolice did not handle Windows file paths correctly. This line:

`_default_filename = os.path.join(os.path.split(__file__)[0],"uk_police.csv")`

needed to be replaced with this line:

`_default_filename = os.path.join(os.path.split(os.path.abspath(__file__))[0],"uk_police.csv")`
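
To illustrate the difference between the two lines, here is a small sketch. It assumes the problem arises when `__file__` holds a relative path; that is one plausible cause, not something confirmed above. The function names and the sample value are hypothetical.

```python
import os


def data_path_original(module_file, name="uk_police.csv"):
    # Mirrors the original line: the directory is taken as-is from module_file.
    return os.path.join(os.path.split(module_file)[0], name)


def data_path_fixed(module_file, name="uk_police.csv"):
    # Mirrors the replacement line: module_file is made absolute first.
    return os.path.join(os.path.split(os.path.abspath(module_file))[0], name)


relative_file = "ukpolice.py"  # hypothetical relative value of __file__

# The original form yields a path that depends on the working directory.
print(os.path.isabs(data_path_original(relative_file)))

# The fixed form always yields an absolute path.
print(os.path.isabs(data_path_fixed(relative_file)))
```

With a bare filename, the original form returns just `uk_police.csv`, so the file is looked up relative to wherever the notebook happens to be running.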