# Average Crime Location in VPD Data

Let's practice implementing a composition plan. We'll find the average location of crimes (of the four types we selected, from 2018) in our data.

This is loosely an HtDAP design, but we've skipped the planning stages to keep it short! See last class's exercise for the relevant planning stages, including, critically, a description of the file.

To help get us started, we assume the following functions are available:

```python
def get_x_locations(locd: List[CrimeData]) -> List[float]:
    """
    return the x locations from locd
    """

def get_y_locations(locd: List[CrimeData]) -> List[float]:
    """
    return the y locations from locd
    """

# These are actually built-in functions with somewhat different signatures and purposes
# than we give them, but we'll run with this!
def sum(lof: List[float]) -> float:
    """
    return the sum of all the values in lof (or 0.0 if lof is empty)
    """

def len(lof: List[float]) -> int:
    """
    return the number of elements in lof
    """
```
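As a quick illustration of why the plan needs to think about empty data, here is a minimal sketch using Python's actual built-in `sum` and `len` on plain lists of floats. The variable names (`xs`, `empty`, `mean_x`, `empty_mean`) are made up for this example, and falling back to `0.0` is just one way to handle the empty case.

```python
from typing import List

xs = [0.0, -3.5]  # type: List[float]

total = sum(xs)          # adds up every value in xs
count = len(xs)          # number of values in xs
mean_x = total / count   # the average of the two values

# With no data, len is 0, so dividing the sum by the length fails.
empty = []  # type: List[float]
try:
    empty_mean = sum(empty) / len(empty)
except ZeroDivisionError:
    # One possible choice for this sketch: treat "no data" as an average of 0.0
    empty_mean = 0.0
```

Any function that divides by `len(...)` needs some decision about the empty list; the exercise's purpose statements settle on returning `CrimeData(0, 0)` when there are no data.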

In [None]:
# COMPLETE, CORRECT DATA DEFINITIONS

from cs103 import *
from typing import NamedTuple, List
from enum import Enum
import csv

##################
# Data Definitions

CrimeData = NamedTuple('CrimeData', [('x', float),
                                     ('y', float)])
# interp. data about a single crime in Vancouver with its x and y location.
# Locations are in metres offset from a somewhat arbitrary point on the surface of
# the earth. (Caution: locations of (0, 0) are sometimes placeholders
# or intentionally inaccurate reports. Fortunately, that doesn't occur in the
# subset of the data we're looking at.)
CD1 = CrimeData(0, 0)
CD2 = CrimeData(-3.5, 2.0)
CD3 = CrimeData(490258.683, 5458154.503)  # sample location actually pulled from our data

# template based on compound (2 fields) and reference rule (once)
@typecheck
def fn_for_crime_data(cd: CrimeData) -> ...:
    return ...(cd.x,
               cd.y)
    

# List[CrimeData]
# interp. a list of crime data
LOCD0 = []
LOCD1 = [CD1, CD2]

# template based on arbitrary-sized data and reference rule
@typecheck
def fn_for_locd(locd: List[CrimeData]) -> ...:
    # description of accumulator
    acc = ... # type: ...
    
    for cd in locd:
        acc = ...(fn_for_crime_data(cd), acc)
        
    return ...(acc)


# List[float]
# interp. a list of floats
LOF0 = []
LOF1 = [0, -3.5]

# template based on arbitrary-sized data
@typecheck
def fn_for_lof(lof: List[float]) -> ...:
    # description of accumulator
    acc = ... # type: ...
    
    for f in lof:
        acc = ...(f, acc)
        
    return ...(acc)

In [None]:
# COMPLETE, CORRECT READ FUNCTION

@typecheck
def read(filename: str) -> List[CrimeData]:
    """    
    reads information from the specified file and returns a list of crime data
    
    the file must be in the VPD crime format, and the x and y entries must be valid 
    floats.
    """
    # Note: in future, we might want to skip (0, 0) entries, but we won't now.
    
    #return []  #stub
    # Template from HtDAP

    # locd contains the result so far
    locd = [] # type: List[CrimeData]

    with open(filename) as csvfile:
        
        reader = csv.reader(csvfile)
        next(reader) # skip header line

        for row in reader:
            cd = CrimeData(parse_float(row[8]), parse_float(row[9]))
            locd.append(cd)
    
    return locd



start_testing()
expect(read("testfile_empty.csv"), []) 
expect(read("testfile_small.csv"), [CrimeData(0, 0),
                                    CrimeData(-3.5, 2.0)]) 

summary()
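
To see the column indexing in `read` without needing a file on disk, here is a rough sketch that parses CSV text held in memory. Everything in `SAMPLE` is hypothetical: the header names and the non-location values are assumptions for illustration, and only the positions of x (index 8) and y (index 9) come from the read function above. It uses plain `float` instead of `parse_float` and a hypothetical name, `read_from_text`, so that it runs without the `cs103` library.

```python
import csv
import io
from typing import List, NamedTuple

CrimeData = NamedTuple('CrimeData', [('x', float),
                                     ('y', float)])

# Hypothetical file contents: a header line, then one row per crime.
# Column names and the first eight values of each row are assumptions.
SAMPLE = """TYPE,YEAR,MONTH,DAY,HOUR,MINUTE,HUNDRED_BLOCK,NEIGHBOURHOOD,X,Y
Example Type,2018,1,1,0,0,EXAMPLE BLOCK,Example Area,0,0
Example Type,2018,1,2,0,0,EXAMPLE BLOCK,Example Area,-3.5,2.0
"""

def read_from_text(text: str) -> List[CrimeData]:
    """
    returns a list of crime data parsed from CSV text, taking x from
    column index 8 and y from column index 9 of each row
    """
    # locd contains the result so far
    locd = []  # type: List[CrimeData]

    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip header line (if any)

    for row in reader:
        locd.append(CrimeData(float(row[8]), float(row[9])))

    return locd

small = read_from_text(SAMPLE)
```

Here `small` holds the same two values the `testfile_small.csv` test expects, which suggests that test file has two data rows with those x and y entries.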


In [None]:
# COMPLETE, CORRECT main WITH A START ON ANALYZE THAT WE SHOULD COMPLETE!

@typecheck
def main(filename: str) -> CrimeData:
    """
    Reads the crime data from given filename and finds and returns the average
    location of all crimes in the file as a new (fictitious) CrimeData.
    
    Returns CrimeData(0, 0) if there are no data.
    """
    # We might want to rename main, but I left it as is just to emphasize that
    # that is OK here as well. Our file should have a good name, however!
    
    # Template from HtDAP, based on function composition 
    return analyze(read(filename))     


@typecheck
def analyze(locd: List[CrimeData]) -> CrimeData: 
    """ 
    Finds and returns the average location of all crimes in locd as a 
    new (fictitious) CrimeData.
    
    Returns CrimeData(0, 0) if there are no data.
    """ 
    #return CrimeData(0, 0)  #stub
    # template based on composition

    # Plan:
    # 1) get only the x values
    # 2) find the average of the x values
    # 3) get only the y values
    # 4) find the average of the y values
    # 5) return a new CrimeData constructed from the averages
    
@typecheck
def get_x_locations(locd: List[CrimeData]) -> List[float]:
    """
    return the x locations from locd
    """
    return []  #stub
    # NO NEED TO COMPLETE THIS; the solution released at end of day has a complete version.

@typecheck
def get_y_locations(locd: List[CrimeData]) -> List[float]:
    """
    return the y locations from locd
    """
    return []  #stub
    # NO NEED TO COMPLETE THIS; the solution released at end of day has a complete version.

start_testing()
expect(main("testfile_empty.csv"), CrimeData(0, 0)) 
expect(main("testfile_small.csv"), CrimeData((0 + -3.5) / 2, (0 + 2.0) / 2))
summary()

start_testing()
expect(analyze([]), CrimeData(0, 0)) 
expect(analyze([CrimeData(0, 0), CrimeData(-3.5, 2.0)]), CrimeData((0 + -3.5) / 2, (0 + 2.0) / 2))
summary()

# The get_x/y_locations tests will fail, but that's OK. We'll release a complete version at end of day.
start_testing()
expect(get_x_locations([]), []) 
expect(get_x_locations([CrimeData(0, 0), CrimeData(-3.5, 2.0)]), [0, -3.5])
summary()

start_testing()
expect(get_y_locations([]), []) 
expect(get_y_locations([CrimeData(0, 0), CrimeData(-3.5, 2.0)]), [0, 2.0])
summary()
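
For reference, the five-step plan in `analyze` could be carried out along the following lines. This is only a sketch of one possible completion, not the released solution: it drops `@typecheck` so it runs without `cs103`, and `average` is a hypothetical helper introduced here (it is not one of the functions listed at the top).

```python
from typing import List, NamedTuple

CrimeData = NamedTuple('CrimeData', [('x', float),
                                     ('y', float)])

def get_x_locations(locd: List[CrimeData]) -> List[float]:
    """return the x locations from locd"""
    # xs contains the x locations seen so far
    xs = []  # type: List[float]
    for cd in locd:
        xs.append(cd.x)
    return xs

def get_y_locations(locd: List[CrimeData]) -> List[float]:
    """return the y locations from locd"""
    # ys contains the y locations seen so far
    ys = []  # type: List[float]
    for cd in locd:
        ys.append(cd.y)
    return ys

def average(lof: List[float]) -> float:
    """
    return the average of the values in lof (or 0.0 if lof is empty)
    (hypothetical helper, assumed for this sketch)
    """
    if len(lof) == 0:
        return 0.0
    return sum(lof) / len(lof)

def analyze(locd: List[CrimeData]) -> CrimeData:
    """
    return the average location of all crimes in locd as a new
    (fictitious) CrimeData, or CrimeData(0, 0) if there are no data
    """
    # 1) get only the x values
    x_locations = get_x_locations(locd)
    # 2) find the average of the x values
    avg_x = average(x_locations)
    # 3) get only the y values
    y_locations = get_y_locations(locd)
    # 4) find the average of the y values
    avg_y = average(y_locations)
    # 5) return a new CrimeData constructed from the averages
    return CrimeData(avg_x, avg_y)
```

Naming each intermediate result keeps one line of code per step of the plan, which makes it easy to check the code against the plan. Putting the empty-list decision inside `average` means `analyze` needs no special case of its own.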


