The City of Madison has many different
agencies providing a variety
of services. In this project, you'll analyze real spending data from
2015 to 2018 for five of the largest agencies: police, fire, streets,
library, and parks. You'll get practice calling functions from a
project module, which we'll provide, and practice writing your own
functions.
Start by downloading the project files.
Double check that these files don't get renamed by your browser (by
running ls in the terminal from your
p3 project directory).
You'll do all your work in a new
main.ipynb notebook that you'll
create and hand in when you're done (please do not write your
functions in a separate .py file). You'll test as usual by running
python test.py (or similar, depending on your laptop setup). Before
handing in, please put the project, submitter, and partner info in a
comment in the first cell, in the same format you used for previous
projects (please continue doing so for all projects this semester).
We won't explain how to use the
project module here (the code is in the
project.py file). The lab this week is designed to teach you how it
works, so be sure to do the lab from home (if you missed it) before
starting the project.
This project consists of writing code to answer 20 questions. If
you're answering a particular question in a cell in your notebook, you
need to put a comment in the cell so we know what you're answering.
For example, if you're answering question 13, the first line of your
cell should contain a comment identifying question 13.
The data looks like this:
The dataset is in the
madison.csv file. We'll learn about CSV files
later in the semester. For now, you should know this about them:
- it's easy to create them by exporting from Excel
- it's easy to use them in Python programs
- we'll give you a project.py module to help you extract data from CSV files until we teach you to do it directly yourself
All the numbers in the dataset are in millions of dollars. Answer questions in millions of dollars unless we specify otherwise.
You may not hardcode agency IDs in your code. For example, if we ask
how much was spent on streets in 2015, you could obtain the answer
with this code:
get_spending(get_id("streets"), 2015). If you don't
use get_id and instead call
get_spending(15, 2015), we'll deduct points.
For some of the questions, we'll ask you to write (then use) a
function to compute the answer. If you compute the answer without
creating the function we ask you to, we'll manually deduct points from
your test.py score when recording your final grade, even if the way
you did it produced the correct answer.
Questions and Functions
Q1: What is the agency ID of the parks agency?
Q2: How much did the agency with ID 6 spend in 2018?
It is OK to hardcode
6 in this case since we asked directly about
agency 6 (instead of about "fire").
Q3: How much did "streets" spend in 2017?
Hint: instead of repeatedly calling
project.get_id("streets") (or similar) for each function, you may wish to make these calls once at
the beginning of your notebook and save the results in variables,
something like this:
    streets_id = project.get_id("streets")
    police_id = project.get_id("police")
    fire_id = project.get_id("fire")
    ...
year_max(year)
This function will compute the maximum spending of any one agency in a given year. We'll give this one to you directly (you'll have to write the code for the subsequent functions yourself). Copy/paste this into a cell in your notebook:
    def year_max(year):
        # grab the spending by each agency in the given year
        police_spending = project.get_spending(project.get_id("police"), year)
        fire_spending = project.get_spending(project.get_id("fire"), year)
        library_spending = project.get_spending(project.get_id("library"), year)
        parks_spending = project.get_spending(project.get_id("parks"), year)
        streets_spending = project.get_spending(project.get_id("streets"), year)

        # use builtin max function to get the largest of the five values
        return max(police_spending, fire_spending, library_spending, parks_spending, streets_spending)
Q4: What was the most spent by a single agency in 2015?
Use year_max to answer this.
Q5: What was the most spent by a single agency in 2018?
agency_min(agency)
We'll help you start this one, but you need to fill in the rest yourself.
    def agency_min(agency):
        agency_id = project.get_id(agency)
        y15 = project.get_spending(agency_id, 2015)
        y16 = project.get_spending(agency_id, 2016)
        # grab the other years

        # use the min function (similar to the max function)
        # to get the minimum across the four years, and return
        # that value
This function will compute the minimum the given agency ever spent over the course of a year.
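As a quick illustration of the builtin min function mentioned in the starter code, here is a minimal sketch using made-up spending figures (the values below are hypothetical and are not taken from madison.csv). Like max, min accepts any number of arguments; it returns the smallest one.

```python
# Hypothetical yearly spending values, in millions of dollars.
# These numbers are made up for illustration only.
y15 = 4.2
y16 = 3.9
y17 = 4.5
y18 = 4.1

smallest = min(y15, y16, y17, y18)  # smallest of the four values
largest = max(y15, y16, y17, y18)   # largest of the four values

print(smallest)  # 3.9
print(largest)   # 4.5
```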
Q6: What was the least the police ever spent in a year?
Q7: What was the least that library ever spent in a year?
Q8: What was the least that parks ever spent in a year?
agency_avg(agency)
This function will compute the average (over the four datapoints) that the given agency spends per year.
Hint: start by copy/pasting
agency_min and renaming your copy to
agency_avg. Instead of computing the minimum of the yearly values (y15, y16, and so on),
compute the average of these by adding, then dividing by 4.
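The add-then-divide step can be sketched with made-up numbers (the four values below are hypothetical, not real agency data). The parentheses matter: without them, only the last value would be divided by 4.

```python
# Hypothetical yearly values, in millions of dollars (made up for illustration).
y15 = 10.0
y16 = 12.0
y17 = 11.0
y18 = 15.0

# Add all four values first, then divide the total by the number of values.
average = (y15 + y16 + y17 + y18) / 4

print(average)  # 12.0
```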
Q9: How much is spent per year on streets, on average?
Q10: How much is spent per year on fire, on average?
Q11: How much did the police spend above their average in 2018?
You should answer by giving a percent between 0 and 100, with no
percent sign. In this case, your code should produce a number close
change_per_year(agency, start_year, end_year)
This function returns the average increase in spending (could be
negative if there's a decrease) over the period from start_year to
end_year for the specified agency.
You can start from the following code:
    def change_per_year(agency, start_year=2015, end_year=2018):
        pass # TODO: replace this line with your code
Python requires all functions to have at least one line of code. When
you haven't written that code yet, it's common for that line to be
pass, which does nothing. Note the default arguments above.
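To see how a placeholder body and default arguments behave, here is a small sketch. Both functions are hypothetical and exist only for illustration: not_written_yet has an empty placeholder body, and describe_period simply reports which values its parameters received.

```python
def not_written_yet():
    pass  # placeholder so the function body is not empty


def describe_period(agency, start_year=2015, end_year=2018):
    # Hypothetical helper: shows which values the parameters ended up with.
    return agency + ": " + str(start_year) + " to " + str(end_year)


print(not_written_yet())                      # None
print(describe_period("police"))              # police: 2015 to 2018
print(describe_period("police", 2017))        # police: 2017 to 2018
print(describe_period("police", 2016, 2017))  # police: 2016 to 2017
```

A function whose body is only pass returns None when called. Any argument you leave out takes the default value from the def line, and arguments you do pass fill the parameters from left to right.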
We're not asking you to assume exponential growth or do anything fancy here; you just need to compute the difference between spending in the last year and the first year, then divide by the number of elapsed years.
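The arithmetic described above can be sketched with made-up numbers (the spending values here are hypothetical). Note that 2015 to 2018 covers four data points but only three elapsed years.

```python
# Hypothetical values for one made-up agency, in millions of dollars.
start_year = 2015
end_year = 2018
first_year_spending = 50.0  # made-up spending in start_year
last_year_spending = 59.0   # made-up spending in end_year

difference = last_year_spending - first_year_spending  # 9.0
elapsed_years = end_year - start_year                  # 3, not 4
average_change = difference / elapsed_years

print(average_change)  # 3.0
```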
Q12: How much has spending increased per year (on average) for police from 2015 to 2018?
Use the default arguments (your call to
change_per_year should only
pass one argument explicitly).
Q13: How much has spending increased per year (on average) for police from 2017 to 2018?
Use the default argument for the
end_year parameter (your call to
change_per_year should only pass two arguments explicitly).
Q14: How much has spending increased per year (on average) for streets from 2016 to 2018?
Note that a negative answer is expected (a negative increase represents a decrease).
extrapolate(agency, year1, year2, year3)
This function should compute the average change per year from the data
from year1 to year2 for agency, using your previous function for
finding average change. It then returns the predicted spending in
year3, assuming spending continues increasing (or decreasing) by the
same constant amount each year. We don't have anything for you to
copy for this one (you need to write it from scratch).
As an example, suppose spending in 2016 (year1) is 100 and spending in 2018 (year2) is 120. The average increase is 10 per year. So we would extrapolate to 130 for 2019, 140 for 2020, etc. This kind of prediction is a simple linear extrapolation.
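The worked example above can be checked with a few lines of Python. This is only a sketch of the arithmetic using the example's numbers; the predict helper and the variable names are hypothetical and are not part of the project module.

```python
year1 = 2016
year2 = 2018
spending_year1 = 100  # spending in year1, from the example above
spending_year2 = 120  # spending in year2, from the example above

# Average change per year between year1 and year2.
yearly_change = (spending_year2 - spending_year1) / (year2 - year1)  # 10.0


def predict(year3):
    # Hypothetical helper: start from the year2 value and add the
    # yearly change once for every year between year2 and year3.
    return spending_year2 + yearly_change * (year3 - year2)


print(predict(2019))  # 130.0
print(predict(2020))  # 140.0
```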
Q15: How much will library spend in 2019?
Extrapolate to 2019 from the data between 2015 and 2018.
Q16: How much will library spend in 2100?
Extrapolate from the data between 2015 and 2018.
Q17: How much will library spend in 2100?
Extrapolate from the data between 2017 and 2018.
We can't know how well our simple extrapolations will perform in the future (unless we wait 80 years), but we can do shorter extrapolations to years for which we DO know the result. For example, we can extrapolate to 2018 from the 2015-to-2017 data, then compare our extrapolation to the actual spending in 2018.
Write a function named
extrapolate_error that does an extrapolation using the
extrapolate function and compares the extrapolation to the
actual result, returning the error (i.e., how much the extrapolation
overestimated). For example, if the extrapolation is 105 and the
actual is 110, then the function should return -5.
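The sign convention in that example (extrapolation minus actual) can be sketched as follows. The signed_error helper is hypothetical and only shows the subtraction; it is not the function you are asked to write.

```python
def signed_error(extrapolated, actual):
    # Hypothetical helper: positive means the extrapolation was too high,
    # negative means it was too low.
    return extrapolated - actual


print(signed_error(105, 110))  # -5 (underestimate)
print(signed_error(110, 105))  # 5 (overestimate)
```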
What parameters should
extrapolate_error have? That's your decision.
Q18: What is the error if we extrapolate to 2018 from the 2015-to-2017 data for police?
Q19: What is the error if we extrapolate to 2018 from the 2015-to-2016 data for streets?
Q20: What is the standard deviation for library spending over the 4 years?
Compute the population standard deviation, as in this example.
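For reference, here is a minimal sketch of a population standard deviation calculation on four made-up values (hypothetical numbers, not the library data). It divides by the number of values (4), which is what distinguishes the population version from the sample version (which divides by one less). The result is compared against statistics.pstdev from the Python standard library as a sanity check.

```python
import math
import statistics

# Hypothetical values, made up for illustration only.
v1 = 2.0
v2 = 4.0
v3 = 4.0
v4 = 6.0

mean = (v1 + v2 + v3 + v4) / 4  # 4.0

# Average of the squared distances from the mean (population variance).
variance = ((v1 - mean) ** 2 + (v2 - mean) ** 2 +
            (v3 - mean) ** 2 + (v4 - mean) ** 2) / 4  # 2.0

std_dev = math.sqrt(variance)  # square root of 2.0, about 1.414

print(std_dev)
print(statistics.pstdev([v1, v2, v3, v4]))  # should agree with std_dev
```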