In [1]:
from cs103 import *

# Tutorial Solution - Analysis Programs

## Pre-Tutorial Work

None this week

## Overview

We continue to use the file containing information about government grants.

Take a look at the file [report-government-grant-year-to-date.csv](report-government-grant-year-to-date.csv) in this notebook's folder to see how it is structured. Two files containing small subsets of this information have also been provided for testing purposes:
* [report-government-grant-year-to-date-test1.csv](report-government-grant-year-to-date-test1.csv) and
* [report-government-grant-year-to-date-test2.csv](report-government-grant-year-to-date-test2.csv). 

(You may find that the file downloads to your computer instead of opening in the browser.  In that case, locate the downloaded `.csv` file and open it with a spreadsheet program, such as Microsoft Excel or Apple Numbers.  You can also find the original information [here](https://catalogue.data.gov.bc.ca/dataset/gaming-grants-paid-to-community-organizations/).)

Now that you have looked at the complete file, we'll complete the planning steps of the HtDAP recipe. 

## Step 1: Planning
#### Step 1a: Identify the information in the file your program will read
The file contains information about BC community gaming grants between April 1, 2017 and July 13, 2018. For each grant, the city, organization name, payment amount (in CAD), grant type, and grant sector (area) are included.
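
To make the column layout concrete, here is a minimal sketch of how one row maps to the fields listed above. The header names and row values are made up for illustration; only the column order (city, organization name, payment amount, grant type, grant sector) follows the description, so check the real file before relying on the indices.

```python
import csv
import io

# Hypothetical sample data: the header names and values are invented,
# only the column order follows the description of the file.
SAMPLE = """City,Organization Name,Payment Amount,Grant Type,Grant Sector
Exampleville,Example Curling Club,15000,Community Gaming Grant,Sport
Sampletown,Sample Arts Council,16500,Community Gaming Grant,Arts and Culture
"""

reader = csv.reader(io.StringIO(SAMPLE))
header = next(reader)   # first line holds the column names
rows = list(reader)     # remaining lines hold one grant each

for row in rows:
    # index 2 is the payment amount, index 4 is the grant sector (area)
    print(row[4], int(row[2]))
```

Under this assumed layout, the payment amount sits at index 2 and the grant sector at index 4.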

#### Step 1b: Write a description of what your program will produce

Here are some ideas of what a program operating on this information might produce:
* We might find the city that received the most grant money.
* We might find the area that received the largest number of grants.
* We might find the area that received the largest single grant.
* We might find the total amount of grant money allocated to the city of Vancouver.
* We might find the average amount of grant money allocated to the grant area Arts and Culture.
* We might find the amount of the largest grant allocated to the grant area Sport.

We are going to focus on the last idea and find the amount of the largest grant allocated to the grant area Sport.

#### Step 1c: Write or draw examples of what your program will produce
Here's an example that shows the kind of output we expect from this program:
```python
expect(main('report-government-grant-year-to-date.csv'), 12450)
```

The value shown here is just an example, demonstrating the expected format.  The value `12450` was chosen arbitrarily.

## Problem 1

Now it's time to start building the program. Using the planning steps completed above, determine the information that you will need to represent in your program as data. 

You must clearly state which pieces of information you will choose to represent. You should refer to the previous data definitions that we've used, but only store the information that you'll need to solve this particular problem. **However**, we want it to be easy – **without changing the `read` function or data definitions** – to switch to finding the largest grant amount from another grant area. (This may impact what you store as data!)

Then complete the design of data definition(s) to represent that information. 
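
One way to see why the area needs to be stored as data: if each grant keeps its area, the analysis can switch areas without touching how the file is read. The sketch below illustrates that idea only. `Grant` and `largest_grant_in_area` are hypothetical names, not part of the course materials, and the sketch does not follow the HtDAP helper rules.

```python
from typing import List, NamedTuple


class Grant(NamedTuple):
    """Hypothetical stand-in for a grant data definition."""
    area: str
    amt: int  # payment amount in CAD, in range [0, ...)


def largest_grant_in_area(grants: List[Grant], area: str) -> int:
    """Return the largest amount among grants in `area`, or 0 if there are none."""
    largest = 0
    for g in grants:
        if g.area == area and g.amt > largest:
            largest = g.amt
    return largest


grants = [Grant("Sport", 28500),
          Grant("Arts and Culture", 3000),
          Grant("Sport", 15000)]

print(largest_grant_in_area(grants, "Sport"))
print(largest_grant_in_area(grants, "Arts and Culture"))
```

Because the area is kept on each grant, changing the question only means changing the `area` argument.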

In [2]:
# your solution goes here

from typing import NamedTuple, List

GovernmentGrant = NamedTuple('GovernmentGrant', [('area', str),        
                                                 ('amt', int)])     # in range[0, ...)
# interp. government grant data from BC. includes the grant area ('area')
#         and payment amount ('amt') in CAD

SPECIAL_OLYMPICS = GovernmentGrant("Sport", 28500)
HERITAGE_SOCIETY = GovernmentGrant("Arts and Culture", 3000)
LOW_ENTROPY = GovernmentGrant("Human and Social Services", 5000)
CURLING_CLUB = GovernmentGrant("Sport", 15000)
HOSPICE_SOCIETY = GovernmentGrant("Human and Social Services", 36000)
ARTS_COUNCIL = GovernmentGrant("Arts and Culture", 16500)
THEATRE_SOCIETY = GovernmentGrant("Arts and Culture", 13000)
AGRIFAIR = GovernmentGrant("Arts and Culture", 80000)
BMX = GovernmentGrant("Sport", 29500)
AIRSHOW = GovernmentGrant("Arts and Culture", 60000)
FALCONS = GovernmentGrant("Sport", 22000)
BARRACUDAS = GovernmentGrant("Sport", 51900)
AFRICAN_DESCENT_SOCIETY = GovernmentGrant("Arts and Culture", 25000)
ALL_BODIES = GovernmentGrant("Arts and Culture", 12000)
RAINBOW = GovernmentGrant("Arts and Culture", 5000)
YOUTH_MUSIC = GovernmentGrant("Arts and Culture", 31500)
CHINESE_ORCHESTRA = GovernmentGrant("Arts and Culture", 5000)

# template based on compound
def fn_for_government_grant(gg: GovernmentGrant) -> ...:
    return ...(gg.area,
               gg.amt)


# List[GovernmentGrant]
# interp. a list of community gaming grants

L0 = []
L1 = [SPECIAL_OLYMPICS,
      HERITAGE_SOCIETY,
      LOW_ENTROPY]
L2 = [CURLING_CLUB,
      HOSPICE_SOCIETY,
      ARTS_COUNCIL,
      THEATRE_SOCIETY]
L3 = [AGRIFAIR, 
      BMX, 
      AIRSHOW, 
      FALCONS,
      BARRACUDAS, 
      AFRICAN_DESCENT_SOCIETY, 
      ALL_BODIES, 
      RAINBOW,
      YOUTH_MUSIC, 
      CHINESE_ORCHESTRA]

# template based on arbitrary-sized and the reference rule
def fn_for_logg(logg: List[GovernmentGrant]) -> ...:
    # description of the acc
    acc = ... # type: ...
    for gg in logg:
        acc = ...(acc, fn_for_government_grant(gg))
    return ...(acc)


## Problem 2a

Once you have your data definition(s) from Problem 1, design a function that reads
the information from the file and stores it as data in your program. 

You should begin by copying the template from the HtDAP page, then complete the 
design of the `main` and `read` functions (but not the analysis helper function for `main`). When testing your functions, you may use the provided testing files:
* [report-government-grant-year-to-date-test1.csv](report-government-grant-year-to-date-test1.csv) and
* [report-government-grant-year-to-date-test2.csv](report-government-grant-year-to-date-test2.csv). 
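
If you want to check a reading function against a file whose contents you fully control, you can write a tiny CSV to a temporary folder first. The sketch below assumes the column order described in Step 1a and uses plain `int` in place of the course's `parse_int`. The function name `read_area_and_amount`, the file name, and the row values are hypothetical.

```python
import csv
import os
import tempfile
from typing import List, Tuple


def read_area_and_amount(filename: str) -> List[Tuple[str, int]]:
    """Read (area, amount) pairs from a CSV file, skipping the header line."""
    result = []  # type: List[Tuple[str, int]]
    with open(filename, newline="") as csvfile:
        reader = csv.reader(csvfile)
        next(reader)  # skip header line
        for row in reader:
            # assumed layout: amount at index 2, area at index 4
            result.append((row[4], int(row[2])))
    return result


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tiny-grants.csv")  # hypothetical file name
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["City", "Organization Name", "Payment Amount",
                         "Grant Type", "Grant Sector"])
        writer.writerow(["Exampleville", "Example Club", "15000",
                         "Community Gaming Grant", "Sport"])
    parsed = read_area_and_amount(path)

print(parsed)
```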


In [3]:
import csv

# your solution goes here

def main(fn: str) -> float:
    """
    Reads the file from the given filename and returns the largest grant amount allocated to the area Sport
    """
    #return 0  #stub
    # template as a function composition
    return largest_grant_to_sport(read(fn))


def read(fn: str) -> List[GovernmentGrant]:
    """    
    Reads the file from given filename and returns a list of the government grants
    """
    #return []   #stub
    #template from HtDAP
    
    # logg contains the result so far
    logg = []   # type: List[GovernmentGrant]
    with open(fn) as csvfile:
        reader = csv.reader(csvfile, delimiter=',')
        next(reader) # skip header line
        
        for row_data in reader:     
            gg = GovernmentGrant(row_data[4], parse_int(row_data[2]))
            # SOLUTION COMMENT: Note that the order in which the different elements of row appear in 
            # GovernmentGrant depends on the structure of GovernmentGrant, which has area first, then 
            # amount, and not on the columns order in the csv file. 
            logg.append(gg)  
                             
    return logg


@typecheck
def largest_grant_to_sport(logg: List[GovernmentGrant]) -> float:
    """
    Returns the largest grant amount allocated to the area Sport.
    
    Returns 0 if there are no Sport grants in the list.
    """
    #return 0 #stub
    # template based on function composition
    
    return largest_grant(sport_only(logg))


@typecheck
def sport_only(logg: List[GovernmentGrant]) -> List[GovernmentGrant]:
    """
    return a list of grants from logg that have been assigned to the area Sport
    """
    # return [] #stub
    # template from List[GovernmentGrant]
    
    # acc stores the result so far
    acc = [] # type: List[GovernmentGrant]
    
    for gg in logg:
        if is_sport_grant(gg):
            acc.append(gg)
    return acc


@typecheck
def is_sport_grant(gg: GovernmentGrant) -> bool:
    """
    return True if gg has been assigned to the area Sport and False otherwise
    """
    #return False #stub
    # template from GovernmentGrant 
    return gg.area == "Sport"


@typecheck
def largest_grant(logg: List[GovernmentGrant]) -> float:
    """
    Returns the largest grant amount from logg, or 0 if there are no grants in logg.
    """
    # return 0 #stub
    # template from List[GovernmentGrant]
    # largest stores the largest grant amount so far
    largest = 0.0 # type: float
    
    for gg in logg:
        if is_larger_amount(gg, largest):
            largest = gg.amt
        
    return largest


@typecheck
def is_larger_amount(gg: GovernmentGrant, amount: float) -> bool:
    """
    return True if gg has an amount larger than `amount`, otherwise False.
    """
    #return False #stub
    # template from GovernmentGrant 
    return gg.amt > amount


start_testing()

# examples and tests for main
expect(main('report-government-grant-year-to-date-test1.csv'), 28500.0)
expect(main('report-government-grant-year-to-date-test2.csv'), 15000.0)

summary()


start_testing()

# examples and tests for read
expect(read('report-government-grant-year-to-date-test1.csv'), L1)
expect(read('report-government-grant-year-to-date-test2.csv'), L2)

summary()


start_testing()

# examples and tests for sport_only
expect(sport_only(L0), [])
expect(sport_only(L1), [SPECIAL_OLYMPICS])
expect(sport_only(L2), [CURLING_CLUB])
expect(sport_only(L3), [BMX, FALCONS, BARRACUDAS])

summary()


start_testing()

# examples and tests for is_sport_grant
expect(is_sport_grant(ALL_BODIES), False)
expect(is_sport_grant(BARRACUDAS), True)

summary()


start_testing()

# examples and tests for largest_grant
expect(largest_grant([]), 0)
expect(largest_grant([ALL_BODIES, RAINBOW]), 12000)
expect(largest_grant([AGRIFAIR, BMX, ALL_BODIES, RAINBOW]), 80000)

summary()


start_testing()

# examples and tests for is_larger_amount
expect(is_larger_amount(ALL_BODIES, 11999), True)
expect(is_larger_amount(ALL_BODIES, 12000), False)
expect(is_larger_amount(ALL_BODIES, 12001), False)

summary()


2 of 2 tests passed
2 of 2 tests passed
4 of 4 tests passed
2 of 2 tests passed
3 of 3 tests passed
3 of 3 tests passed


## Problem 2b

To finish your program, complete the design of the analysis function(s). For this particular problem, find the largest amount of grant money allocated to the area Sport.

Think about your data definitions and the helper rules to determine how many helper functions you will need to write when designing this function. 
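
For comparison only, the same result can be computed compactly with Python's built-in `max` and its `default` argument. This is not the course's design recipe, which asks for separate helper functions. `Grant` and `largest_sport_amount` are hypothetical names used just for this sketch, and the amounts are borrowed from the examples in the data definition.

```python
from typing import List, NamedTuple


class Grant(NamedTuple):
    """Hypothetical stand-in for the grant data definition."""
    area: str
    amt: int


def largest_sport_amount(grants: List[Grant]) -> int:
    """Return the largest amount among Sport grants, or 0 if there are none."""
    # `default=0` is returned when no grant passes the filter
    return max((g.amt for g in grants if g.area == "Sport"), default=0)


sample = [Grant("Arts and Culture", 80000),
          Grant("Sport", 29500),
          Grant("Sport", 22000),
          Grant("Sport", 51900)]

print(largest_sport_amount(sample))
```

The filter and the maximum are fused into one expression here, which is shorter but hides the two steps (select Sport grants, then find the largest) that the helper-function design makes explicit.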

In [4]:
# RETURN to the cell above to complete your design of the analysis functions.
# Do not design them here.

# Call your program with the full dataset to determine the largest grant amount allocated to the area Sport.
main('report-government-grant-year-to-date.csv')


250000

## Submit your solution