## Analysing VPD Crime Data

We've uploaded a tiny portion of the crime data shared by the [Vancouver Police Department](https://vancouver.ca/police/)'s [Open Data initiative](https://geodash.vpd.ca/opendata/). The complete file has well over half a million rows. The portion we uploaded contains every crime from 2018 labelled "break and enter" (in two variants: commercial and residential) or "theft of" (in two variants: vehicle and bicycle).

Our information file is in this directory, named `crimedata_subset_bne_theft_of_bike_veh_2018.csv`. The directory also contains the license for this information and a PDF file from VPD describing the information source.

Let's see if we can answer the question: At what time of day does crime of various types peak in Vancouver?

We'll **start from the project final submission template** to practise both using HtDAP and preparing for the project! (We've edited this slightly to note places where we'll deviate from the project.)

### Step 1a: Planning 
#### Identify the information in the file your program will read

For each crime, the information is:
* type of crime
* date
* time
* hundred block (address)
* neighbourhood
* latitude
* longitude

There are an arbitrary number of crimes in the file.

### Step 1b: Planning 
#### Write a description of what your program will produce

You must brainstorm at least three ideas for graphs or charts that your program could produce and choose the one that you'd like to work on. You can choose between a line chart, histogram, bar chart, scatterplot, or pie chart. *Note: we might focus on non-graphs for now, since we're really studying HtDAP rather than the project.*

* Most common type of crime
* The neighbourhood that had the most crimes of these types
* The month with the most bike thefts
* What time is most common for a business to be broken into?
* How are crimes of each type distributed over the hours of the day?

### Step 1c: Planning 
#### Write or draw examples of what your program will produce

You must include an image that shows what your chart or plot will look like. You can insert an image using the Insert Image command near the bottom of the Edit menu. 


![crime-line-chart.jpg](attachment:crime-line-chart.jpg)


### Step 2a: Building
#### Design data definitions


Before you design data definitions in the code cell below, you must explicitly document here which information in the file you chose to represent and why that information is crucial to the chart or graph that you'll produce when you complete step 2c. *Note: we'll skip the "chart or graph" part!*

In [6]:
from cs103 import *
from typing import NamedTuple, List
from enum import Enum
import csv

##################
# Data Definitions

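# CrimeType
# interp. the type of a crime: break and enter commercial (bec),
# break and enter residential (ber), theft of vehicle (tv), or
# theft of bicycle (tb)
CrimeType = Enum('CrimeType', ['bec', 'ber', 'tv', 'tb'])

# interp. a crime with its type and the hour of the day at which it occurred
CrimeData = NamedTuple('CrimeData', [('type', CrimeType),
                                     ('time', int)]) # in range [0, 23]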



# List[CrimeData]
# interp. a list of crime data

LOC0 = []

@typecheck
def fn_for_loc(loc: List[CrimeData]) -> ...:
    ... # choose which template body to use for List[CrimeData]
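
To make the data definitions above concrete, here is a minimal sketch of a few example values. It is written in plain Python (no `cs103` import, so no `@typecheck`) so that it runs outside the course environment. The names `CD1`, `CD2`, `LOC1` and the helper `is_valid_hour` are illustrative assumptions and are not part of the original notebook.

```python
from enum import Enum
from typing import List, NamedTuple

CrimeType = Enum('CrimeType', ['bec', 'ber', 'tv', 'tb'])

CrimeData = NamedTuple('CrimeData', [('type', CrimeType),
                                     ('time', int)])  # in range [0, 23]

# Hypothetical examples (made up for illustration, not taken from the file)
CD1 = CrimeData(CrimeType.bec, 6)   # commercial break and enter at 6:00
CD2 = CrimeData(CrimeType.tb, 23)   # bicycle theft at 23:00

LOC1 = [CD1, CD2]  # type: List[CrimeData]


def is_valid_hour(cd: CrimeData) -> bool:
    """Return True if cd.time respects the documented range [0, 23]."""
    return 0 <= cd.time <= 23
```

A helper like `is_valid_hour` is one way to check the range restriction that the `# in range [0, 23]` comment only documents.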


In [7]:
# Here are some definitions we'll need later on that aren't particularly interesting to work on in class!

# List[str]
# interp. a list of strings
LOS0 = []
LOS1 = ['hello', 'world']

# template based on arbitrary-sized data
@typecheck
def fn_for_los(los: List[str]) -> ...:
    # description of accumulator
    acc = ... # type: ...
    
    for s in los:
        acc = ...(s, acc)
        
    return ...(acc)


# List[int]
# interp. a list of integers
LOI0 = []
LOI1 = [1, -12]

# template based on arbitrary-sized data
@typecheck
def fn_for_loi(loi: List[int]) -> ...:
    # description of accumulator
    acc = ... # type: ...
    
    for i in loi:
        acc = ...(i, acc)
        
    return ...(acc)
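
As an illustration of how the accumulator template above gets filled in, here is a minimal sketch of a function over `List[int]`. The function name `total` is a hypothetical example rather than one of the course's helpers, and `@typecheck` is omitted so the sketch runs without the `cs103` library.

```python
from typing import List


def total(loi: List[int]) -> int:
    """Return the sum of all the integers in loi."""
    # acc is the sum of the integers seen so far
    acc = 0  # type: int

    for i in loi:
        acc = acc + i

    return acc
```

With the examples from the data definition, `total([])` is `0` and `total([1, -12])` is `-11`.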

### Step 2b: Building
#### Design a function to read the information and store it as data in your program

We've split this off into a separate cell so we can finish this in our first class in Module 7!

In [24]:
@typecheck
def read(filename: str) -> List[CrimeData]:
    """    
    reads information from the specified file and returns a list
    of crime data
    """
    #return []  #stub
    # Template from HtDAP
    # loc contains the result so far
    loc = [] # type: List[CrimeData]

    with open(filename) as csvfile:
        
        reader = csv.reader(csvfile)
        next(reader) # skip header line

        for row in reader:
            # you may not need to store all the rows, and you may need
            # to convert some of the strings to other types
            c = CrimeData(parse_crime_type(row[0]), parse_int(row[4]))
            loc.append(c)
    
    return loc

@typecheck
def parse_crime_type(ct: str) -> CrimeType:
    """
    convert ct to a CrimeType; ct must be one of the crime type
    labels handled below
    """
    if ct == "Break and Enter Commercial":
        return CrimeType.bec
    elif ct == "Break and Enter Residential":
        return CrimeType.ber
    elif ct == "Break and Enter Residential/Other":
        return CrimeType.ber
    elif ct == "Theft of Vehicle":
        return CrimeType.tv
    elif ct == "Theft of Bicycle":
        return CrimeType.tb

start_testing()

# Examples and tests for read
expect(read("crimedata_subset_bne_theft_of_bike_veh_2018_test1.csv"), 
            [CrimeData(CrimeType.bec, 6),
             CrimeData(CrimeType.bec, 18),
             CrimeData(CrimeType.bec, 0)])
expect(read("crimedata_subset_bne_theft_of_bike_veh_2018_test2.csv"), 
            [CrimeData(CrimeType.ber, 12),
             CrimeData(CrimeType.tb, 8),
             CrimeData(CrimeType.tv, 15)])

expect(parse_crime_type("Break and Enter Commercial"), CrimeType.bec)
expect(parse_crime_type("Break and Enter Residential"), CrimeType.ber)
expect(parse_crime_type("Theft of Vehicle"), CrimeType.tv)
expect(parse_crime_type("Theft of Bicycle"), CrimeType.tb)

summary()

6 of 6 tests passed
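
The `read` function above can be tried without any file on disk. The sketch below is a plain-Python variant (no `cs103`) that reads from an in-memory string instead. The header row and both data rows in `SAMPLE_CSV` are made-up assumptions; the only things taken from the code above are that the crime type is in column 0 and the hour is in column 4. The names `read_from`, `parse_crime_type_strict`, and `CRIME_TYPE_BY_LABEL` are hypothetical. Unlike `parse_crime_type`, which falls through without a `return` for any other string, the strict variant raises a `ValueError` for an unrecognized label.

```python
import csv
import io
from enum import Enum
from typing import List, NamedTuple

CrimeType = Enum('CrimeType', ['bec', 'ber', 'tv', 'tb'])
CrimeData = NamedTuple('CrimeData', [('type', CrimeType), ('time', int)])

# Assumed layout: type in column 0, hour in column 4. Rows are invented.
SAMPLE_CSV = """TYPE,YEAR,MONTH,DAY,HOUR,MINUTE,HUNDRED_BLOCK,NEIGHBOURHOOD,X,Y
Break and Enter Commercial,2018,1,5,6,30,10XX EXAMPLE ST,Example,0,0
Theft of Bicycle,2018,7,14,18,5,20XX EXAMPLE AVE,Example,0,0
"""

# Same label-to-type mapping as parse_crime_type, expressed as a dictionary
CRIME_TYPE_BY_LABEL = {
    "Break and Enter Commercial": CrimeType.bec,
    "Break and Enter Residential": CrimeType.ber,
    "Break and Enter Residential/Other": CrimeType.ber,
    "Theft of Vehicle": CrimeType.tv,
    "Theft of Bicycle": CrimeType.tb,
}


def parse_crime_type_strict(ct: str) -> CrimeType:
    """Convert ct to a CrimeType, raising ValueError for unknown labels."""
    try:
        return CRIME_TYPE_BY_LABEL[ct]
    except KeyError:
        raise ValueError(f"unrecognized crime type: {ct!r}")


def read_from(csvfile) -> List[CrimeData]:
    """Read crime data from an open file-like object of CSV text."""
    loc = []  # type: List[CrimeData]
    reader = csv.reader(csvfile)
    next(reader)  # skip header line
    for row in reader:
        loc.append(CrimeData(parse_crime_type_strict(row[0]), int(row[4])))
    return loc


SAMPLE_RESULT = read_from(io.StringIO(SAMPLE_CSV))
```

Reading from a file-like object rather than a filename is a design choice for this sketch only; it keeps the example self-contained, while the course's `read` takes a filename as HtDAP prescribes.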


### Step 2c: Building
#### Design functions to analyze the data

Complete these steps in the code cell below. You will likely want to rename the analyze function so that the function name describes what your analysis function does.


**NOTE:** To make this manageable in class, we might provide some finished helper functions.

In [9]:
###########
# Functions

@typecheck
def main(filename: str) -> ...:
    """
    Reads the file from given filename, analyzes the data, returns the result 
    """
    # Template from HtDAP, based on function composition 
    return analyze(read(filename)) 
    
    


@typecheck
def analyze(loc: List[Consumed]) -> Produced: 
    """ 
    ... 
    """ 

    return ...


start_testing()

# Examples and tests for main
expect(..., ...)

summary()

start_testing()

# Examples and tests for analyze 
expect(..., ...) 

summary()

NameError: name 'Consumed' is not defined
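
The template cell above fails because `Consumed` and `Produced` are placeholders that still need to be replaced. As one possible direction for the question "at what time of day does crime peak?", here is a minimal sketch in plain Python (no `cs103`). It is not the course's solution: the function names `count_by_hour` and `peak_hour`, the example list `LOC_EXAMPLE`, and the choice to break ties by taking the earliest hour are all assumptions made for illustration.

```python
from enum import Enum
from typing import List, NamedTuple

CrimeType = Enum('CrimeType', ['bec', 'ber', 'tv', 'tb'])
CrimeData = NamedTuple('CrimeData', [('type', CrimeType), ('time', int)])


def count_by_hour(loc: List[CrimeData], ct: CrimeType) -> List[int]:
    """Return 24 counts; index h holds the number of ct crimes in hour h."""
    # counts[h] is the number of matching crimes seen so far in hour h
    counts = [0] * 24
    for cd in loc:
        if cd.type == ct:
            counts[cd.time] += 1
    return counts


def peak_hour(counts: List[int]) -> int:
    """Return the earliest hour with the highest count (0 if all are zero)."""
    # peak is the index of the largest count seen so far
    peak = 0
    for hour in range(len(counts)):
        if counts[hour] > counts[peak]:
            peak = hour
    return peak


# Hypothetical example data, not taken from the VPD file
LOC_EXAMPLE = [CrimeData(CrimeType.bec, 6),
               CrimeData(CrimeType.bec, 18),
               CrimeData(CrimeType.bec, 18),
               CrimeData(CrimeType.tb, 8)]

BEC_PEAK = peak_hour(count_by_hour(LOC_EXAMPLE, CrimeType.bec))
```

In this example `BEC_PEAK` is `18`, because two of the three commercial break-ins fall in hour 18. Note that a crime type with no records yields a peak of `0`, which a fuller design might want to distinguish from a genuine midnight peak.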