# Admin Notes

## Project
1. Reminder: Project final submissions are due on June 17 at 9AM (**we do not accept late submissions for this assignment**). 
1. Your TAs are aiming to give you feedback on your milestones as soon as possible. In the meantime, if you have any questions, let them know.
1. Reminder: Be sure to schedule your project demo time by tonight (refer to Piazza for more details).

## Final Exam
1. Reminder: the final exam is on June 22, 2020 at 7PM (Vancouver time). Everyone takes it at the same time, so if you are in a different time zone, please adjust accordingly.
1. Reminder: we will only test on line graphs for the final. The other types of graphs are provided for your interest and in case you need them for the final.

## What's coming up?
1. We will talk about the solution to test 2 on Tuesday. The rest of the time will be a free-form office hour where you can get help with finishing the project. 
1. In Thursday's class (the last one of the semester!), we will look at the final exam from 2019W2. We'll release the file and give you some time to work on the question (try to simulate exam conditions if you can), and then we'll cover the answer and approaches together.

# Mapping Crime in Vancouver

Let's just have some plotting fun!

This is loosely an HtDAP design, but we've skipped the planning stages to keep it short! We're also working from the Module 7 VPD location project for fun :)

So, edit and refactor this into something to map crime! (We've already updated the data definitions and `read` function, but not `main` or `analyze`!)

In [5]:
from cs103 import *
from typing import NamedTuple, List
from enum import Enum
import csv

##################
# Data Definitions

CrimeData = NamedTuple('CrimeData', [('x', float),
                                     ('y', float)])
# interp. data about a single crime in Vancouver with its x and y location.
# (Locations are in metres offset from a somewhat arbitrary point on the surface of
# the earth.) Caution: locations of (0, 0) are sometimes placeholders
# or intentionally inaccurate reports. Fortunately, that doesn't occur in the
# subset of the data we're looking at.
CD1 = CrimeData(0, 0)
CD2 = CrimeData(-3.5, 2.0)
CD3 = CrimeData(490258.683, 5458154.503)  # sample location actually pulled from our data

# template based on compound (2 fields)
@typecheck
def fn_for_crime_data(cd: CrimeData) -> ...:
    return ...(cd.x,
               cd.y)
    

# List[CrimeData]
# interp. a list of crime data
LOCD0 = []
LOCD1 = [CD1, CD2]

# template based on arbitrary-sized data and reference rule
@typecheck
def fn_for_locd(locd: List[CrimeData]) -> ...:
    # description of accumulator
    acc = ... # type: ...
    
    for cd in locd:
        acc = ...(fn_for_crime_data(cd), acc)
        
    return ...(acc)


# List[float]
# interp. a list of floats
LOF0 = []
LOF1 = [0, -3.5]

# template based on arbitrary-sized data
@typecheck
def fn_for_lof(lof: List[float]) -> ...:
    # description of accumulator
    acc = ... # type: ...
    
    for f in lof:
        acc = ...(f, acc)
        
    return ...(acc)
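
The `List[float]` definition is there because a scatterplot wants its x and y values as two separate lists. As a minimal sketch of how the `fn_for_locd` template might be filled in to produce them, the block below defines two helpers. The names `get_xs` and `get_ys` are assumptions for illustration, not part of the original design, and `@typecheck` is left off so the sketch runs without the `cs103` library.

```python
from typing import List, NamedTuple

# Same shape as the CrimeData definition above, repeated so this sketch runs on its own.
CrimeData = NamedTuple('CrimeData', [('x', float),
                                     ('y', float)])


def get_xs(locd: List[CrimeData]) -> List[float]:
    """
    returns the x coordinates of the crimes in locd, in the same order
    """
    # xs contains the x coordinates seen so far
    xs = []  # type: List[float]

    for cd in locd:
        xs.append(cd.x)

    return xs


def get_ys(locd: List[CrimeData]) -> List[float]:
    """
    returns the y coordinates of the crimes in locd, in the same order
    """
    # ys contains the y coordinates seen so far
    ys = []  # type: List[float]

    for cd in locd:
        ys.append(cd.y)

    return ys


# Plain asserts stand in for the course's expect(...) calls.
assert get_xs([]) == []
assert get_xs([CrimeData(0, 0), CrimeData(-3.5, 2.0)]) == [0, -3.5]
assert get_ys([CrimeData(0, 0), CrimeData(-3.5, 2.0)]) == [0, 2.0]
```

Keeping the two helpers separate mirrors the one-task-per-function style of the templates, at the cost of walking the list twice.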

In [6]:
@typecheck
def read(filename: str) -> List[CrimeData]:
    """    
    reads information from the specified file and returns a list of crime data
    
    the file must be in the VPD crime format, and the x and y entries must be valid 
    floats.
    """
    # Note: in future, we might want to skip (0, 0) entries, but we won't now.
    
    #return []  #stub
    # Template from HtDAP

    # locd contains the result so far
    locd = [] # type: List[CrimeData]

    with open(filename) as csvfile:
        
        reader = csv.reader(csvfile)
        next(reader) # skip header line

        for row in reader:
            cd = CrimeData(parse_float(row[8]), parse_float(row[9]))
            locd.append(cd)
    
    return locd



start_testing()
expect(read("testfile_empty.csv"), []) 
expect(read("testfile_small.csv"), [CrimeData(0, 0),
                                    CrimeData(-3.5, 2.0)]) 

summary()


2 of 2 tests passed
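
The note inside `read` says we might later want to skip (0, 0) entries. One possible way to do that, sketched here as a separate filtering step rather than a change to `read`, is shown below. The function names `is_placeholder` and `drop_placeholders` are hypothetical, and `@typecheck` is omitted so the sketch runs outside the course environment.

```python
from typing import List, NamedTuple

# Same shape as the CrimeData definition above, repeated so this sketch runs on its own.
CrimeData = NamedTuple('CrimeData', [('x', float),
                                     ('y', float)])


def is_placeholder(cd: CrimeData) -> bool:
    """
    returns True if cd sits exactly at (0, 0), the location used for placeholder reports
    """
    return cd.x == 0 and cd.y == 0


def drop_placeholders(locd: List[CrimeData]) -> List[CrimeData]:
    """
    returns the crimes in locd that are not at the (0, 0) placeholder location
    """
    # kept contains the non-placeholder crimes seen so far
    kept = []  # type: List[CrimeData]

    for cd in locd:
        if not is_placeholder(cd):
            kept.append(cd)

    return kept


# Plain asserts stand in for the course's expect(...) calls.
assert drop_placeholders([]) == []
assert drop_placeholders([CrimeData(0, 0), CrimeData(-3.5, 2.0)]) == [CrimeData(-3.5, 2.0)]
```

Filtering after reading leaves `read` and its existing tests untouched, which is why this sketch does not edit the loop in `read` itself.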


## Scatterplot solution from the worksheet

Our "template" in the viz module is just to copy-and-paste from a sample of the kind of plot we want. That's not so unrealistic as a starting point as long as we understand what we're using!

Here's the scatterplot worked example body as a starting point for our template:

```python
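import matplotlib.pyplot as plt  # in case plt is not already in scope
# convert_counts_to_areas is a helper defined elsewhere (not shown in this excerpt)

@typecheck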
def show_scatterplot(ages: List[int], salaries: List[int], counts: List[int]) -> None:
    """
    display a scatterplot of salaries vs. ages. salaries are given in 1000s
    
    Assumes that the lengths of ages, salaries, and counts are all equal
    """
    #return None #stub
    # Template based on visualization
    
    areas = convert_counts_to_areas(counts)

    # set the labels for the axes
    plt.xlabel('Age')
    plt.ylabel('Salary (in 1000s)')
    plt.title('Salaries by age')

    # range for the axes
    # [x-min, x-max, y-min, y-max]
    plt.axis([0,65,0,105])

    # create the scatterplot, with markers that are red (c='r') and triangular (marker='^')
    plt.scatter(ages,salaries,marker='^', c='r', s=areas)

    # show the plot
    plt.show()
    
    return None
```
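
To turn that sample into a crime map, the main changes are likely the labels, the marker style, and the axis range, since the hard-coded `[0,65,0,105]` would hide points with coordinates like (490258.683, 5458154.503). Below is a rough sketch under a few assumptions: `axis_range` and `show_crime_map` are hypothetical names, the 100-metre margin and the marker settings are arbitrary choices, and `@typecheck` is omitted so the block runs without the `cs103` library.

```python
from typing import List


def axis_range(xs: List[float], ys: List[float], margin: float) -> List[float]:
    """
    returns [x-min, x-max, y-min, y-max] covering every point, padded by margin
    on each side

    Assumes xs and ys are non-empty
    """
    return [min(xs) - margin, max(xs) + margin,
            min(ys) - margin, max(ys) + margin]


def show_crime_map(xs: List[float], ys: List[float]) -> None:
    """
    display a scatterplot of crime locations, with x and y in metres

    Assumes xs and ys have equal lengths and are non-empty
    """
    # Imported here so axis_range can be tried out even without matplotlib installed.
    import matplotlib.pyplot as plt

    # set the labels for the axes
    plt.xlabel('x (metres)')
    plt.ylabel('y (metres)')
    plt.title('Crime locations in Vancouver')

    # range for the axes, computed from the data instead of hard-coded
    # (the 100-metre margin is an arbitrary choice for this sketch)
    plt.axis(axis_range(xs, ys, 100))

    # small red dots, since there may be many overlapping points
    plt.scatter(xs, ys, marker='.', c='r', s=4)

    # show the plot
    plt.show()

    return None


# Plain assert stands in for the course's expect(...) call.
assert axis_range([0, -3.5], [0, 2.0], 1) == [-4.5, 1, -1, 3.0]
```

A `main` function could then pass the x and y lists built from `read(filename)` to `show_crime_map`; how those lists are built is left to the `analyze` step.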