In [None]:
from cs103 import *

# Tutorial Solution - Analysis Programs

## Pre-Tutorial Work

None this week

## Overview

We are continuing to use the file containing information about government grants.

Take a look at the included file called `report-government-grant-year-to-date.csv` to see how it is structured. Two files containing small subsets of this information have also been provided for testing purposes (`report-government-grant-year-to-date-test1.csv` and `report-government-grant-year-to-date-test2.csv`). You can find the original information [here](https://catalogue.data.gov.bc.ca/dataset/gaming-grants-paid-to-community-organizations/resource/7281e8ca-b649-4af9-b812-2a3e0bf8e4be).

Now that you have looked at the file, we'll complete the planning steps of the HtDAP recipe. 

#### Step 1a
The file contains information about BC community gaming grants between April 1, 2017 and July 13, 2018. For each grant, the city, organization name, grant type, grant area, grant subarea and amount (in CAD) are included.

#### Step 1b

Now, here are some ideas of what a program operating on this information might produce.

We might find the city that received the most grant money.

We might find the largest grant.

We might find the area that received the largest number of grants.

We might find the area that received the largest single grant.

We might find the total amount of grant money allocated to the city of Vancouver.

We might find the average amount of grant money allocated to the grant area Sport.

**We are going to focus on the last idea and find the average amount of grant money allocated to the grant area Sport.** We will round this average to 2 decimal digits.
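
To make the averaging and rounding concrete, here is a small worked example in plain Python. The three amounts are the Sport grants that appear as examples in the data definitions of this solution; the variable names are illustrative only.

```python
# Three sample Sport grant amounts in CAD (illustrative values)
sport_amounts = [29500, 22000, 51900]

# average = total / count; guard against an empty list to avoid dividing by zero
total = sum(sport_amounts)
count = len(sport_amounts)
average = round(total / count, 2) if count > 0 else 0.0

print(average)  # 34466.67
```

Here the total is 103400 and the count is 3, so the unrounded average is 34466.666..., which rounds to 34466.67.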

#### Step 1c
Here's an example that shows the kind of output we expect from this program:
```python
expect(main('report-government-grant-year-to-date.csv'), 12450.55)
```


## Problem 1

Now it is time to start building the program. Using the planning steps completed above, determine the information that you will need to represent in your program as data. 

You must clearly state which pieces of information you will choose to represent. You should refer to the previous data definitions that we've used, but only store the information that you'll need to solve this particular problem. **However**, we want it to be easy to switch to averaging grant money from another grant area **without changing the `read` function or data definitions**. (This may impact what you store as data!)

Then complete the design of data definition(s) to represent that information. 
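
One way to meet the requirement above is to keep the grant area as a string field, so that the choice of area is made by the analysis code rather than by `read`. The following is a minimal sketch in plain Python. `Grant` and `grants_in_area` are hypothetical names used only for illustration; they are not the data definitions or functions of this solution.

```python
from typing import NamedTuple, List


class Grant(NamedTuple):
    """Hypothetical stand-in for a grant: its area and its amount in CAD."""
    area: str
    amt: int


def grants_in_area(grants: List[Grant], area: str) -> List[Grant]:
    """Return only the grants whose area equals the given area name."""
    # acc stores the matching grants seen so far
    acc = []  # type: List[Grant]
    for g in grants:
        if g.area == area:
            acc.append(g)
    return acc


sample = [Grant("Sport", 29500),
          Grant("Arts and Culture", 80000),
          Grant("Sport", 22000)]

# Because the area is stored with every grant, switching areas only
# changes the argument, not the data or the code that reads it.
sport = grants_in_area(sample, "Sport")
arts = grants_in_area(sample, "Arts and Culture")
```

If only Sport grants were stored at read time, answering the same question for another area would require changing `read`, which is exactly what the problem statement asks us to avoid.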

In [None]:
from typing import NamedTuple, List

GovernmentGrant = NamedTuple('GovernmentGrant', [('area', str),
                                                 ('amt', int)])     # in range [0, ...)
# interp. government grant data from BC. includes the grant area ('area')
#         and payment amount ('amt') in CAD

AGRIFAIR = GovernmentGrant("Arts and Culture", 80000)
BMX = GovernmentGrant("Sport", 29500)
AIRSHOW = GovernmentGrant("Arts and Culture", 60000)
FALCONS = GovernmentGrant("Sport", 22000)
ARTS_COUNCIL = GovernmentGrant("Arts and Culture", 15100)
BARRACUDAS = GovernmentGrant("Sport", 51900)
AFRICAN_DESCENT_SOCIETY = GovernmentGrant("Arts and Culture", 25000)
ALL_BODIES = GovernmentGrant("Arts and Culture", 12000)
RAINBOW = GovernmentGrant("Arts and Culture", 5000)
YOUTH_MUSIC = GovernmentGrant("Arts and Culture", 31500)
CHINESE_ORCHESTRA = GovernmentGrant("Arts and Culture", 5000)

# template based on compound
def fn_for_government_grant(gg: GovernmentGrant) -> ...:
    return ...(gg.area,
               gg.amt)


# List[GovernmentGrant]
# interp. a list of community gaming grants

L0 = []
L1 = [AGRIFAIR, 
      BMX, 
      AIRSHOW, 
      FALCONS,
      ARTS_COUNCIL, 
      BARRACUDAS, 
      AFRICAN_DESCENT_SOCIETY, 
      ALL_BODIES, 
      RAINBOW,
      YOUTH_MUSIC, 
      CHINESE_ORCHESTRA]

# template based on arbitrary-sized and the reference rule
def fn_for_logg(logg: List[GovernmentGrant]) -> ...:
    # description of the acc
    acc = ... # type: ...
    for gg in logg:
        acc = ...(acc, fn_for_government_grant(gg))
    return ...(acc)


## Problem 2a

Once you have your data definition(s) from Problem 1, design a function that reads
the information from the file and stores it as data in your program. 

You should begin by copying the template from the HtDAP page, then complete the 
design of the `main` and `read` functions (but not the analysis helper function for `main`). When testing your functions, you may use the testing files called `report-government-grant-year-to-date-test1.csv` and `report-government-grant-year-to-date-test2.csv`.
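
As a rough illustration of the reading step, the sketch below parses CSV text with the standard `csv` module. It assumes the amount is at column index 2 and the area at column index 4, as in the `read` function of this solution. The header names and the two data rows are made up, `read_rows` is a hypothetical helper, and the built-in `int()` stands in for `parse_int` from `cs103`.

```python
import csv
import io
from typing import List, Tuple

# Made-up file contents. Only the positions of the amount (index 2)
# and the area (index 4) are taken from this solution's read function.
SAMPLE_CSV = (
    "City,Organization,Amount,Type,Area,Subarea\n"
    "Sampleville,Example BMX Club,29500,Example Grant,Sport,Cycling\n"
    "Sampleville,Example Fair,80000,Example Grant,Arts and Culture,Fairs\n"
)


def read_rows(text: str) -> List[Tuple[str, int]]:
    """Return (area, amount) pairs parsed from CSV text, skipping the header."""
    # pairs stores the (area, amount) pairs read so far
    pairs = []  # type: List[Tuple[str, int]]
    reader = csv.reader(io.StringIO(text), delimiter=',')
    next(reader)  # skip header line
    for row in reader:
        # int() is a stand-in for cs103's parse_int
        pairs.append((row[4], int(row[2])))
    return pairs


rows = read_rows(SAMPLE_CSV)
print(rows)  # [('Sport', 29500), ('Arts and Culture', 80000)]
```

Using `io.StringIO` keeps the sketch runnable without any file on disk; with a real file, `open(fn)` would supply the lines instead.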

In [None]:
import csv

def main(fn: str) -> float:
    """
    Reads the file from given filename and returns the average grant money allocated to the area Sport,
    rounded to 2 decimal digits
    """
    #return 0.0  #stub
    # template as a function composition
    return grants_to_sport(read(fn))


def read(fn: str) -> List[GovernmentGrant]:
    """    
    Reads the file from given filename and returns a list of the government grants
    """
    #return []   #stub
    #template from HtDAP
    
    # logg contains the result so far
    logg = []   # type: List[GovernmentGrant]
    with open(fn) as csvfile:
        reader = csv.reader(csvfile, delimiter=',')
        next(reader) # skip header line
        
        for row in reader:     
            gg = GovernmentGrant(row[4], parse_int(row[2]))
            # SOLUTION COMMENT: Note that the order in which the different elements of row appear in 
            # GovernmentGrant depends on the structure of GovernmentGrant, which has area first, then 
            # amount, and not on the columns order in the csv file. 
            logg.append(gg)  
                             
    return logg

@typecheck
def grants_to_sport(logg: List[GovernmentGrant]) -> float:
    """
    returns the average grant money allocated to the area Sport
    """
    #return 0 #stub
    # template based on function composition
    
    return average_grants(sport_only(logg))

@typecheck
def sport_only(logg: List[GovernmentGrant]) -> List[GovernmentGrant]:
    """
    return a list of grants from logg that have been assigned to the area Sport
    """
    # return [] #stub
    # template from List[GovernmentGrant]
    
    # acc stores the result so far
    acc = [] # type: List[GovernmentGrant]
    
    for gg in logg:
        if is_sport_grant(gg):
            acc.append(gg)
    return acc

@typecheck
def is_sport_grant(gg: GovernmentGrant) -> bool:
    """
    return True if gg has been assigned to the area Sport and False otherwise
    """
    #return False #stub
    # template from GovernmentGrant 
    return gg.area == "Sport"


@typecheck
def average_grants(logg: List[GovernmentGrant]) -> float:
    """
    return the average amount of the grants in logg, rounded to 2 decimal digits;
    returns 0.0 if logg is empty
    """
    # return 0 #stub
    # template from List[GovernmentGrant]
    # total stores the total grants so far
    total = 0 # type: int
    
    # count stores the number of grants seen so far
    count = 0 # type: int
    for gg in logg:
        total = total + gg.amt
        count = count + 1
        
    if count == 0:
        return 0.0
    
    return round(total/count, 2)

start_testing()

# examples and tests for main
expect(main('report-government-grant-year-to-date-test1.csv'), 25750.0)
expect(main('report-government-grant-year-to-date-test2.csv'), 51900.0)

summary()

start_testing()

# examples and tests for read
expect(read('report-government-grant-year-to-date-test1.csv'), [AGRIFAIR, BMX, FALCONS])
expect(read('report-government-grant-year-to-date-test2.csv'), [ARTS_COUNCIL, 
                                                                      BARRACUDAS, 
                                                                      YOUTH_MUSIC, 
                                                                      CHINESE_ORCHESTRA,
                                                                      AFRICAN_DESCENT_SOCIETY, 
                                                                      ALL_BODIES])
summary()

start_testing()

# examples and tests for sport_only
expect(sport_only([]), [])
expect(sport_only(L1), [BMX, FALCONS, BARRACUDAS])

summary()

start_testing()

# examples and tests for is_sport_grant
expect(is_sport_grant(ALL_BODIES), False)
expect(is_sport_grant(BARRACUDAS), True)

summary()

start_testing()

# examples and tests for average_grants
expect(average_grants([]), 0.0)
expect(average_grants([ALL_BODIES, RAINBOW]), 8500)
expect(average_grants([AGRIFAIR, BMX, ALL_BODIES, RAINBOW]), 31625)

summary()

## Problem 2b

To finish your program, complete the design of the analysis function(s). For this particular problem, find the average amount of grant money allocated to the area Sport.

Think about your data definitions and the helper rules to determine how many helper functions you will need to write when designing this function. 

In [None]:
# RETURN to the cell above to complete your design of the analysis functions.
# Do not design them here.

In [None]:
main('report-government-grant-year-to-date.csv')