### Announcements
- See your midterm grading feedback on PrairieLearn, and ask me if you have any questions. The Piazza post has more details, including what to do if you think we made a grading error.
- Your project milestone is due on June 23. Please see the Canvas assignment for more details. After today's class, you are ready to start on your project milestone.

## Analysing Air Quality Data

We've uploaded some [air quality data shared by the Canadian government's Open Government initiative](https://open.canada.ca/data/en/dataset/e6cc3ae2-92b1-4df6-87ff-698a1cd5a7bd).

Our information file, `average_fine_particulate_matter.csv`, is in this directory. You can also find the [license](https://open.canada.ca/en/open-government-licence-canada) for this information online.

Let's start by looking at the information that we have available.

We'll **start from the project final submission template** to get good practice both on using HtDAP and preparing for the project! (We've edited this slightly to note places where we'll deviate from the project.)

### Step 1a: Planning 
#### Identify the information in the file your program will read

Each row of the file is one air quality reading from a station somewhere in Canada. The columns are Station number, Station name, Address, City, Province, Latitude, Longitude, Local land use, Station type, Concentration, Units, Station details, and Report year.
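
To see which index each column has before writing `read`, a short sketch like the one below can help. It assumes the header row matches the column names listed above, in that order, and it reads an in-memory sample instead of the real file. `SAMPLE_HEADER` and `column_indexes` are names made up for this illustration.

```python
import csv
import io
from typing import Dict, TextIO

# Assumption: the header row lists the columns in the order given above.
SAMPLE_HEADER = ("Station number,Station name,Address,City,Province,Latitude,"
                 "Longitude,Local land use,Station type,Concentration,Units,"
                 "Station details,Report year\n")

def column_indexes(csvfile: TextIO) -> Dict[str, int]:
    """
    return a dictionary from each column name in the header row to its index
    """
    reader = csv.reader(csvfile)
    header = next(reader)  # the first row holds the column names
    return {name: i for i, name in enumerate(header)}

indexes = column_indexes(io.StringIO(SAMPLE_HEADER))
print(indexes["Station name"], indexes["Province"], indexes["Concentration"])
```

With the real file, you would pass the file object from `open(...)` inside a `with` block instead of the `io.StringIO` sample.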


### Step 1b: Planning 
#### Write a description of what your program will produce

You must brainstorm at least three ideas for graphs or charts that your program could produce and choose the one that you'd like to work on. You can choose between a line chart, histogram, bar chart, scatterplot, or pie chart. *Note: we might focus on non-graphs for now, since we're really studying HtDAP rather than the project.*

Ideas:
- What is the average latitude for the air quality centers in a province?
- What province had the best air quality in 2019?
- What province had the worst air quality in 2019?
- What is the average air quality for a given province or territory?

We will design a program to answer this question:
- What is the average air quality for a given province or territory?

### Step 1c: Planning 
#### Write or draw examples of what your program will produce

You must include an image that shows what your chart or plot will look like. You can insert an image using the Insert Image command near the bottom of the Edit menu. *Note: we'll practice using the "insert image" command just for the fun of it, but we are still not focusing on graphs/charts.*

We won't draw a picture because we're not doing a visualization, so instead we'll write an example:

expect(main('average_fine_particulate_matter.csv', 'British Columbia'), 6.3)



### Step 2a: Building
#### Design data definitions

Before you design data definitions in the code cell below, you must explicitly document here which information in the file you chose to represent and why that information is crucial to the chart or graph that you'll produce when you complete step 2c. *Note: we'll skip the "chart or graph" part!*

We need to store the province and concentration to answer our question. For this example, we're choosing to include the station name, too.

In [1]:
from cs103 import *
from typing import NamedTuple, List, Optional
import csv
from enum import Enum

##################
# Data Definitions

Province = Enum('Province', ['BC', 'AB', 'SK', 'MB', 'ON', 'QC', 
                             'PE', 'NL', 'NS', 'NB', 'YT', 'NT', 'NU'])
# interp. a province or territory of Canada, which is one of British Columbia (BC), 
#         Alberta (AB), Saskatchewan (SK), Manitoba (MB), Ontario (ON), Quebec (QC), 
#         Prince Edward Island (PE), Newfoundland and Labrador (NL), Nova Scotia (NS),
#         New Brunswick (NB), Yukon (YT), Northwest Territories (NT), Nunavut (NU)

# examples are redundant for enumerations

@typecheck
# template based on one-of (13 cases), and atomic distinct (13 times)
def fn_for_province(p: Province) -> ...:
    if p == Province.BC:
        return ...
    elif p == Province.AB:
        return ...    
    elif p == Province.SK:
        return ...  
    elif p == Province.MB:
        return ...  
    elif p == Province.ON:
        return ...  
    elif p == Province.QC:
        return ...  
    elif p == Province.PE:
        return ...
    elif p == Province.NL:
        return ...  
    elif p == Province.NS:
        return ...  
    elif p == Province.NB:
        return ...  
    elif p == Province.YT:
        return ...  
    elif p == Province.NT:
        return ...  
    elif p == Province.NU:
        return ...   
    
Concentration = Optional[float] # in range [0, ...]
# interp. the concentration of fine particulate matter in micrograms per cubic metre (µg/m3)

C_MISSING = None
C1 = 1.0
C8 = 8.0

@typecheck
# template based on one of (two cases), atomic distinct, and atomic non-distinct
def fn_for_concentration(c: Concentration) -> ...:
    if c is None:
        return ...
    else:
        return ...(c)

AirQuality = NamedTuple('AirQuality', [('name', str),
                                       ('prov', Province),
                                       ('conc', Concentration)])
# interp. an air quality reading from the station named name, in some province (prov),
#         with the concentration in micrograms per cubic metre (µg/m3)

AQ1 = AirQuality("NEW WESTMINSTER", Province.BC, 6.3)
AQ2 = AirQuality("KENSINGTON PARK", Province.BC, 4.5)
AQ3 = AirQuality("MOUNT PEARL", Province.NL, 4.7)
AQ4 = AirQuality("FREDERICTON-ABERDEEN", Province.NB, None)

@typecheck
# template based on compound and the reference rule (twice)
def fn_for_air_quality(aq: AirQuality) -> ...:
    return ...(aq.name,
               fn_for_province(aq.prov),
               fn_for_concentration(aq.conc))



# List[AirQuality]
# interp. a list of air quality readings

L0 = []
L1 = [AQ1, AQ2, AQ3, AQ4]

@typecheck
# template based on arbitrary-sized and the reference rule
def fn_for_loaq(loaq: List[AirQuality]) -> ...:
    # description of the acc
    acc = ... # type: ...
    
    for aq in loaq:
        acc = ... (fn_for_air_quality(aq), acc)
    return ...(acc)


In [2]:
# Here are some definitions we might need later on that aren't particularly interesting to 
# work on in class!

# List[str]
# interp. a list of strings
LOS0 = []
LOS1 = ['hello', 'world']

# template based on arbitrary-sized data
@typecheck
def fn_for_los(los: List[str]) -> ...:
    # description of accumulator
    acc = ... # type: ...
    
    for s in los:
        acc = ...(s, acc)
        
    return ...(acc)


# List[int]
# interp. a list of integers
LOI0 = []
LOI1 = [1, -12]

# template based on arbitrary-sized data
@typecheck
def fn_for_loi(loi: List[int]) -> ...:
    # description of accumulator
    acc = ... # type: ...
    
    for i in loi:
        acc = ...(i, acc)
        
    return ...(acc)

### Step 2b: Building
#### Design a function to read the information and store it as data in your program

We've split this off into a separate cell so we can finish this in our first class in Module 7!

In [4]:
import csv

@typecheck
def read(filename: str) -> List[AirQuality]:
    """    
    reads information from the specified file and returns a list of air quality readings
    """
    # Template from HtDAP
    # loaq contains the result so far
    loaq = [] # type: List[AirQuality]

    with open(filename) as csvfile:
        
        reader = csv.reader(csvfile)
        next(reader) # skip header line

        for row in reader:
            # you may not need to store all the rows, and you may need
            # to convert some of the strings to other types
            aq = AirQuality(row[1], 
                            parse_province(row[4]),
                            parse_float(row[9]))
            loaq.append(aq)
    
    return loaq

@typecheck
def parse_province(p: str) -> Province:
    """
    return the Province represented by p
    """
    #return Province.BC
    # template inspired by Province
    if p == "British Columbia":
        return Province.BC
    elif p == "Alberta":
        return Province.AB   
    elif p == "Saskatchewan":
        return Province.SK  
    elif p == "Manitoba":
        return Province.MB 
    elif p == "Ontario":
        return Province.ON  
    elif p == "Quebec":
        return Province.QC  
    elif p == "Prince Edward Island":
        return Province.PE
    elif p == "Newfoundland and Labrador":
        return Province.NL 
    elif p == "Nova Scotia":
        return Province.NS  
    elif p == "New Brunswick":
        return Province.NB 
    elif p == "Yukon":
        return Province.YT  
    elif p == "Northwest Territories":
        return Province.NT 
    elif p == "Nunavut":
        return Province.NU   

start_testing()

# Examples and tests for read
expect(read('average_fine_particulate_matter-test1.csv'), [AQ1, AQ2])
expect(read('average_fine_particulate_matter-test2.csv'), 
       [AirQuality("NAPS HOUSE", Province.YT, None), 
        AirQuality("STEELE STREET", Province.YT, None)])

# Examples and tests for parse_province
expect(parse_province("British Columbia"), Province.BC)
expect(parse_province("Alberta"), Province.AB)
expect(parse_province("Saskatchewan"), Province.SK)
expect(parse_province("Manitoba"), Province.MB)
expect(parse_province("Ontario"), Province.ON)
expect(parse_province("Quebec"), Province.QC)
expect(parse_province("Prince Edward Island"), Province.PE)
expect(parse_province("Nova Scotia"), Province.NS)
expect(parse_province("New Brunswick"), Province.NB)
expect(parse_province("Newfoundland and Labrador"), Province.NL)
expect(parse_province("Yukon"), Province.YT)
expect(parse_province("Northwest Territories"), Province.NT)
expect(parse_province("Nunavut"), Province.NU)

summary()



15 of 15 tests passed

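The second `read` test expects `None` for the concentrations of the Yukon stations, so the conversion from text has to cope with readings that are missing. The sketch below shows one way such a conversion could behave. `to_concentration` is a hypothetical stand-in written for this illustration, not the `parse_float` used in `read`, and it assumes a missing reading shows up as text that isn't a number.

```python
from typing import Optional

def to_concentration(s: str) -> Optional[float]:
    """
    return s as a float, or None if s does not represent a number
    (hypothetical helper for illustration only)
    """
    try:
        return float(s)
    except ValueError:
        # Assumption: a missing reading is any text that isn't a number
        return None

print(to_concentration("6.3"))
print(to_concentration(""))
```

Representing a missing reading as `None` matches the `Concentration` data definition, which is an `Optional[float]`.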

### Step 2c: Building
#### Design functions to analyze the data

Complete these steps in the code cell below. You will likely want to rename the analyze function so that the function name describes what your analysis function does.


**NOTE:** To make this manageable in class, we might provide some finished helper functions.

In [None]:
###########
# Functions

@typecheck
def main(filename: str) -> ...:
    """
    Reads the file from given filename, analyzes the data, returns the result 
    """
    # Template from HtDAP, based on function composition 
    return analyze(read(filename)) 
    
    


@typecheck
def analyze(loc: List[Consumed]) -> Produced: 
    """ 
    ... 
    """ 

    return ...


start_testing()

# Examples and tests for main
expect(..., ...)

summary()

start_testing()

# Examples and tests for analyze 
expect(..., ...) 

summary()
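
One way the `analyze` template might be filled in for our chosen question is sketched below. It is a minimal sketch, not the version we'll necessarily write in class. The name `average_concentration`, the choice to skip missing concentrations, and the choice to return `None` when a province has no usable readings are all assumptions made for this illustration. The data definitions are trimmed-down copies so that the sketch runs on its own, without the cs103 library.

```python
from enum import Enum
from typing import List, NamedTuple, Optional

# Trimmed-down copies of the data definitions so this sketch runs on its own
Province = Enum('Province', ['BC', 'NL', 'NB'])

AirQuality = NamedTuple('AirQuality', [('name', str),
                                       ('prov', Province),
                                       ('conc', Optional[float])])

def average_concentration(loaq: List[AirQuality], p: Province) -> Optional[float]:
    """
    return the average concentration of the readings in loaq taken in
    province p, ignoring missing concentrations; return None if p has
    no readings with a concentration
    """
    # total is the sum of the concentrations seen so far for p
    total = 0.0  # type: float
    # count is the number of concentrations seen so far for p
    count = 0  # type: int

    for aq in loaq:
        if aq.prov == p and aq.conc is not None:
            total = total + aq.conc
            count = count + 1

    if count == 0:
        return None
    return total / count

AQ1 = AirQuality("NEW WESTMINSTER", Province.BC, 6.3)
AQ2 = AirQuality("KENSINGTON PARK", Province.BC, 4.5)
AQ3 = AirQuality("MOUNT PEARL", Province.NL, 4.7)
AQ4 = AirQuality("FREDERICTON-ABERDEEN", Province.NB, None)
L1 = [AQ1, AQ2, AQ3, AQ4]

print(average_concentration(L1, Province.NL))
```

Two accumulators are used because an average needs both the running sum and the number of readings that contributed to it.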