In [1]:
# Initialize Otter
import otter
grader = otter.Notebook("p3.ipynb")

In [2]:
# DO NOT MODIFY the code in this cell
# You must run this cell before you start working on the project
import test

# Project 3: Electric Vehicle Sales

## Learning Objectives:

In this project you will demonstrate your ability to:
- import a module and use its functions,
- write functions,
- use default arguments when calling functions,
- use positional and keyword arguments when calling functions,
- avoid hardcoding, and
- work with the index of a row of data.

## Testing your code:

Along with this notebook, you must have downloaded the file `test.py`. If you are curious about how we test your code, you can explore this file, and specifically the value of the variable `expected_json`, to understand the expected answers to the questions. You can have a look at [p2](https://git.doit.wisc.edu/cdis/cs/courses/cs220/cs220-f22-projects/-/tree/main/p2) if you have forgotten how to read the outputs of the `grader.check(...)` function calls.

**Please go through [lab-p3](https://git.doit.wisc.edu/cdis/cs/courses/cs220/cs220-f22-projects/-/tree/main/lab-p3) before starting this project.** The lab introduces some useful techniques necessary for this project.

## Project Description:

In this project, you'll analyze data on six different electric vehicle models sold in the United States between 2015 and 2019. The dataset we will analyze is truncated and modified from [the Alternative Fuels Data Center](https://afdc.energy.gov/data/) published by the U.S. Department of Energy.

You'll get practice calling functions from the `project` module, which we've provided, and practice writing your own functions.

If you haven't already downloaded `project.py`, `test.py`, and `car_sales_data.csv` (you can verify by running `ls` in a new terminal tab from your `p3` project directory), please terminate the current `jupyter notebook` session, download all the required files, launch a `jupyter notebook` session again, and click on *Kernel* > *Restart and Clear Output*. Start by executing all the cells (including the ones containing `import` statements).


We won't explain how to use the `project` module here (i.e., the code in the `project.py` file).  Refer to [lab-p3](https://git.doit.wisc.edu/cdis/cs/courses/cs220/cs220-f22-projects/-/tree/main/lab-p3) to understand how the inspection process works and use the `help(...)` function to learn about the various functions inside `project.py`. Feel free to take a look at the `project.py` code, if you are curious about how it works.

This project consists of writing code to answer 20 questions.

## Dataset:

The dataset you will be working with for this project is reproduced here:

id|vehicle|2015|2016|2017|2018|2019
------|------|------|------|------|------|------|
958|Tesla Model S|26200|30200|26500|25745|15090
10|Chevy Volt|15393|24739|20349|18306|4915
64|Nissan Leaf|17269|14006|11230|14715|12365
977|Toyota Prius PHEV|4191|2474|20936|27595|23630
332|Ford Fusion Energi|9750|15938|9632|8074|7476
951|Tesla Model X|208|19600|21700|26100|19425


This table lists 6 different electric vehicle models, and how many cars of each model were sold each year between 2015 and 2019 (inclusive of both years).

The dataset is in the `car_sales_data.csv` file which you downloaded. You can also open that file directly to look at the same data and verify answers to simple questions.

## Project Requirements:

You **may not** hardcode indices in your code. For example, if we ask how many Nissan Leaf cars were sold in 2016, you could obtain the answer with this code: `get_sales(get_id("Nissan Leaf"), 2016)`.  If you don't use `get_id` and instead use `get_sales(64, 2016)`, we'll **manually deduct** points from your autograder score on Gradescope during code review.
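To illustrate the difference, here is a minimal sketch. The `IDS` and `SALES` dictionaries and the two helper functions are hypothetical stand-ins for the `project` module (which actually reads `car_sales_data.csv`), included only so that the snippet runs on its own. In your notebook you would call `project.get_id` and `project.get_sales` instead.

```python
# Hypothetical stand-ins for the project module, built from two rows of the table above.
IDS = {"Nissan Leaf": 64, "Chevy Volt": 10}
SALES = {(64, 2016): 14006, (10, 2016): 24739}

def get_id(vehicle):
    # look up the id of a vehicle by its name
    return IDS[vehicle]

def get_sales(car_id, year):
    # look up the sales of the vehicle with the given id in the given year
    return SALES[(car_id, year)]

# Not hardcoded: the id is looked up by name.
leaf_sales_2016 = get_sales(get_id("Nissan Leaf"), 2016)

# Hardcoded (would lose points): the id 64 is typed in directly.
leaf_sales_2016_hardcoded = get_sales(64, 2016)

print(leaf_sales_2016)
```

Both calls return the same number, but only the first keeps working if the ids in the dataset change.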

For some of the questions, we'll ask you to write (then use) a function to compute the answer.  If you compute the answer **without** creating the function we ask you to write, we'll **manually deduct** points from your autograder score on Gradescope, even if the way you did it produced the correct answer.

Students are only allowed to use Python commands and concepts that have been taught in the course before the release of p3. In particular, you are **NOT** allowed to use Conditionals or Iteration on this project. We will **manually deduct** points from your autograder score on Gradescope otherwise.

For more details on what will cause you to lose points during code review, please take a look at the [Grading rubric](https://git.doit.wisc.edu/cdis/cs/courses/cs220/cs220-f22-projects/-/blob/main/p3/rubric.md).

## Incremental Coding and Testing:

You should always strive to do incremental coding. Incremental coding enables you to avoid challenging bugs. Always write a few lines of code and then test those lines of code, before proceeding to write further code. You can call the `print` function to test intermediate step outputs. Store your final answer for each question in the variable recommended for each question. This step is important because Otter grades your work by comparing the value of this variable against the correct answer. So, if you store your answer in a different variable, you will not get points for it.

We also recommend you do incremental testing: make sure to run the local tests as soon as you are done with a question. This will ensure that you haven't made a big mistake that might potentially impact the rest of your project solution. Please refrain from making multiple submissions on Gradescope for testing individual questions' answers. Instead, use the local tests to check your solution on your laptop.

That said, it is very **important** that you check the *Gradescope* test results as soon as you submit your project on Gradescope. Test results on *Gradescope* are typically available between 2 and 10 minutes after the submission.

## Project Questions and Functions:

In [11]:
# Include the relevant import statements in this cell
import project 

In [13]:
# Call the init function to load the dataset
project.init("car_sales_data.csv")
# You may call the dump function here to test if you have loaded the dataset correctly.

**Question 1:** What is the `id` of the *Toyota Prius PHEV*?

In [14]:
# INCORRECT ANSWER prius_id = 977 => this is considered hardcoding
prius_id = project.get_id("Toyota Prius PHEV")
prius_id

977

In [9]:
grader.check("q1")

Instead of repeatedly calling the `project.get_id` function for each question, make these calls once at the beginning of your notebook and save the results in variables. Recall that calling the same function multiple times with the same argument(s) is a waste of computation. Complete the code in the cell below and make sure to use the relevant ID variables for the rest of the project questions.

In [15]:
model_s_id = project.get_id('Tesla Model S') # we have done this for you
# replace the ... in the line below with code to get the id of 'Chevy Volt'
volt_id = project.get_id("Chevy Volt")
# invoke get_id for the other car models and store the result into similar variable names
leaf_id=project.get_id("Nissan Leaf")
phev_id=project.get_id("Toyota Prius PHEV")
energi_id=project.get_id("Ford Fusion Energi")
model_x_id=project.get_id("Tesla Model X")
# considering that you already invoked get_id for Toyota Prius PHEV, you need to 
# make 3 more function calls to store the ID for the rest of the vehicles


**Question 2:** How many *Nissan Leaf* cars were sold in *2017*?

Your answer should just be a number (without any units at the end). You **should not** hardcode the ID of the car. You **must** use the variable that you used to store the ID of Nissan Leaf (assuming you already invoked `get_id` for all the car models in the cell right below Question 1).

In [16]:
num_leaf = project.get_sales(leaf_id, 2017)
num_leaf

11230

In [17]:
grader.check("q2")

### Function 1: `year_max(year)`

This function will compute the highest number of sales for any model in the given `year`.

It has already been written for you, so you do not have to modify it. You can directly call this function to answer the following questions. 

In [18]:
def year_max(year):
    """
    computes the highest number of sales for any model in the given year
    """
    # get the sales of each model in the given year
    model_s_sales = project.get_sales(project.get_id('Tesla Model S'), year)
    volt_sales = project.get_sales(project.get_id('Chevy Volt'), year)
    leaf_sales = project.get_sales(project.get_id('Nissan Leaf'), year)
    prius_sales = project.get_sales(project.get_id('Toyota Prius PHEV'), year)
    fusion_sales = project.get_sales(project.get_id('Ford Fusion Energi'), year)
    model_x_sales = project.get_sales(project.get_id('Tesla Model X'), year)

    # use the built-in max function to get the maximum of the six values
    return max(model_s_sales, volt_sales, leaf_sales, prius_sales, fusion_sales, model_x_sales)

**Question 3:** What was the highest number of sales for *any* model in the year *2017*?

You **must** call the `year_max` function to answer this question.

In [19]:

max_sales_2017 = year_max(2017)
max_sales_2017

26500

In [20]:
grader.check("q3")

**Question 4:** What was the highest number of sales for *any* model in a single year in the period *2016-2018*?

Recall that we can use the `max` function to compute the maximum of some values. Look at the lab examples where you used the `max` function or the `year_max` function definition. To be clear, the answer to this question is a single integer whose value is the highest sales number achieved by any model in a single year during these three years. You **must** invoke the `year_max` function in your answer to this question.

In [21]:
max_sales_2016_to_2018 = max(year_max(2016),year_max(2017),year_max(2018))
max_sales_2016_to_2018

30200

In [22]:
grader.check("q4")

### Function 2: `sales_min(model)`

This function should compute the lowest number of sales in a year for the given `model` considering every year in the dataset.

We'll help you get started with this function, but you need to fill in the rest of the function yourself.

In [23]:
def sales_min(model):
    """
    computes the lowest number of sales in a year for the given model
    """
    model_id = project.get_id(model)    
    sales_2015 = project.get_sales(model_id, 2015)
    sales_2016 = project.get_sales(model_id, 2016)
    sales_2017 = project.get_sales(model_id, 2017)
    sales_2018 = project.get_sales(model_id, 2018)
    sales_2019 = project.get_sales(model_id, 2019)
    
    # use the built-in min function (similar to the max function) to get the minimum across the 
    # five years and return that value
    
    min_sales_2015_to_2019 = min(sales_2015,sales_2016,sales_2017,sales_2018,sales_2019)
    return min_sales_2015_to_2019

**Question 5:** What was the lowest number of sales for the *Tesla Model S* in a *single* year?

You **must** call the `sales_min` function to answer this question.

In [24]:
min_sales_model_S = sales_min("Tesla Model S")
min_sales_model_S

15090

In [25]:
grader.check("q5")

**Question 6:** What was the lowest sales number in a *single* year between the *Chevy Volt*, *Ford Fusion Energi*, and the *Nissan Leaf*?

Recall that we can use the `min` function to compute the minimum of some values. To be clear, the answer to this question is a single integer whose value is the lowest sales number achieved in a single year during this entire period between 2015-2019 by any of the 3 models mentioned. You **must** invoke the `sales_min` function in your answer to this question.

In [29]:
# compute and store the answer in the variable 'min_sales_CV_FFE_NL'
min_sales_CV_FFE_NL=min(sales_min("Chevy Volt"),sales_min("Ford Fusion Energi"),sales_min("Nissan Leaf"))
min_sales_CV_FFE_NL
# display the variable 'min_sales_CV_FFE_NL' here

4915

In [30]:
grader.check("q6")

### Function 3: `sales_avg(model)`

This function should compute the average yearly sales number for the given `model` across the five years in the dataset (i.e. *2015 - 2019*).

**Hint:** start by copy/pasting the `sales_min` function definition, and renaming your copy to `sales_avg` (this is not necessary, but it will save you time).  
Instead of returning the minimum of `sales_2015`, `sales_2016`, etc., return the average of these by adding them together, then dividing by five. 
**You may hardcode the number 5 for this computation**.

The type of the *return value* should be `float`.
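As a quick check of the arithmetic described above, the sketch below averages five yearly values by hand. The numbers are the *Ford Fusion Energi* row copied from the dataset table purely for illustration; inside your own `sales_avg` function the values must come from `project.get_sales`, not from typed-in numbers.

```python
# Ford Fusion Energi sales for 2015-2019, copied from the dataset table (illustration only)
sales_2015 = 9750
sales_2016 = 15938
sales_2017 = 9632
sales_2018 = 8074
sales_2019 = 7476

# add the five values together, then divide by five
average = (sales_2015 + sales_2016 + sales_2017 + sales_2018 + sales_2019) / 5

print(average)        # 10174.0
print(type(average))  # <class 'float'>
```

Note that the `/` operator always produces a `float` in Python 3, even when the division is exact, so no extra conversion is needed to satisfy the return-type requirement.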

In [106]:
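# define the function sales_avg(model) here
def sales_avg(model):
    """
    computes the average yearly sales for the given model across 2015-2019
    """
    model_id = project.get_id(model)
    # add up the five yearly sales without a loop (iteration is not allowed in p3)
    total = (project.get_sales(model_id, 2015) + project.get_sales(model_id, 2016)
             + project.get_sales(model_id, 2017) + project.get_sales(model_id, 2018)
             + project.get_sales(model_id, 2019))
    return total / 5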

**Question 7:** What was the average number of *Toyota Prius PHEV* cars sold per year between *2015* and *2019*?

You **must** call the `sales_avg` function to answer this question.

In [107]:
# compute and store the answer in the variable 'sales_avg_prius_2015_to_2019'
sales_avg_prius_2015_to_2019=sales_avg("Toyota Prius PHEV")
sales_avg_prius_2015_to_2019
# display the variable 'sales_avg_prius_2015_to_2019' here

15765.2

In [108]:
grader.check("q7")

**Question 8:** What was the average number of *Chevy Volt* cars sold per year between *2015* and *2019*?

You **must** call the `sales_avg` function to answer this question.

In [45]:
# compute and store the answer in the variable 'sales_avg_volt_2015_to_2019'
sales_avg_volt_2015_to_2019=sales_avg("Chevy Volt")
sales_avg_volt_2015_to_2019
# display the variable 'sales_avg_volt_2015_to_2019' here

16740.4

In [46]:
grader.check("q8")

**Question 9:** Relative to its 5-year average, how many more or fewer *Nissan Leaf* cars were sold in *2018*?

**Hint:** Call the `sales_avg` function, to compare the *Nissan Leaf* average sales to the *Nissan Leaf* sales in *2018*. 
Your answer will be a positive number if more Nissan Leafs were sold in 2018 than on average. Your answer will be a negative number if fewer Nissan Leafs were sold in 2018 than on average.

In [52]:
# compute and store the answer in the variable 'diff_leaf_2018_to_average'.
# it is recommended that you create more intermediary variables to make your code easier to write and read.
# some useful intermediary variables you could create are: 'leaf_id', 'num_sales_leaf_2018'.
num_sales_leaf_2018 = project.get_sales(leaf_id, 2018)
diff_leaf_2018_to_average = num_sales_leaf_2018 - sales_avg("Nissan Leaf")
diff_leaf_2018_to_average
# display the variable 'diff_leaf_2018_to_average' here

798.0

In [53]:
grader.check("q9")

### Function 4: `year_sum(year)`

This function should compute the total number of sales across every model for the given `year`.

You can start from the following code snippet:

In [55]:
def year_sum(year=2019): # DO NOT EDIT THIS LINE
    """
    computes the total number of sales across every model for the given year
    """
    # add up the sales of each of the six models in the given year
    total = 0
    total += project.get_sales(project.get_id("Tesla Model S"), year)
    total += project.get_sales(project.get_id("Chevy Volt"), year)
    total += project.get_sales(project.get_id("Nissan Leaf"), year)
    total += project.get_sales(project.get_id("Toyota Prius PHEV"), year)
    total += project.get_sales(project.get_id("Ford Fusion Energi"), year)
    total += project.get_sales(project.get_id("Tesla Model X"), year)
    return total
    

**Question 10:** What was the *total* number of vehicles sold in *2019*?

You **must** call the `year_sum` function to answer this question. Use the default argument (your call to `year_sum` function **should not** pass any arguments).

In [57]:
# compute and store the answer in the variable 'sales_sum_2019'
sales_sum_2019 = year_sum()  # rely on the default argument year=2019
sales_sum_2019
# display the variable 'sales_sum_2019' here

82901

In [58]:
grader.check("q10")

**Question 11:** What was the *total* number of vehicles sold between *2017* and *2019*?

You **must** invoke the `year_sum` function in your answer to this question. To be clear, the answer to this question is a single integer whose value is the total sales number achieved by all six models during these three years.

In [61]:
# compute and store the answer in the variable 'sales_sum_2017_to_2019'

# add the three yearly totals without a loop (iteration is not allowed in p3)
sales_sum_2017_to_2019 = year_sum(2017) + year_sum(2018) + year_sum()
sales_sum_2017_to_2019
# display the variable 'sales_sum_2017_to_2019' here

313783

In [62]:
grader.check("q11")

### Function 5: `change_per_year(model, start_year, end_year)`

This function should return the average increase/decrease in sales (could be positive if there's an increase, negative if there’s a decrease) over the period from `start_year` to `end_year` for the given `model`.

The type of the *return value* should be `float`.

We're not asking you to do anything complicated here; you just need to compute the difference in sales between the last year and the first year, then divide by the number of elapsed years. Recall that you created a similar function in the lab. You can start with the following code snippet (with the default arguments):

In [96]:
def change_per_year(model, start_year=2015, end_year=2019): # DO NOT EDIT THIS LINE
    """
    computes the average increase/decrease in sales (could be positive if there's an increase, 
    negative if there’s a decrease) over the period from start_year to end_year for the given model
    """
    model_id = project.get_id(model)
    sales_start_year = project.get_sales(model_id, start_year)
    sales_end_year = project.get_sales(model_id, end_year)
    sales_difference = sales_end_year - sales_start_year
    # divide by the number of elapsed years; the / operator always returns a float
    return sales_difference / (end_year - start_year)
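
The computation described for this function can be checked by hand. The sketch below uses *Nissan Leaf* figures copied from the dataset table purely for illustration (the variable names are hypothetical); the point to notice is that the period *2016* to *2018* spans two elapsed years, not three.

```python
# Nissan Leaf sales copied from the dataset table (illustration only)
sales_2016 = 14006
sales_2018 = 14715

start_year = 2016
end_year = 2018

# difference between the last and first year, divided by the elapsed years
elapsed_years = end_year - start_year
average_change = (sales_2018 - sales_2016) / elapsed_years

print(elapsed_years)   # 2
print(average_change)  # 354.5
```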
    

**Question 12:** How much have the sales of the *Ford Fusion Energi* changed per year (on average) from *2015* to *2019*?

You **must** call the `change_per_year` function to answer this question. Use the default arguments (your call to `change_per_year` function **should not** pass any more arguments than is absolutely necessary).
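As a reminder of how default arguments interact with positional and keyword arguments, here is a small sketch. The function `describe_period` is hypothetical and not part of the project; it only mirrors the shape of the `change_per_year` signature.

```python
def describe_period(model, start_year=2015, end_year=2019):
    # returns a short description of the period being analyzed
    return model + ": " + str(start_year) + "-" + str(end_year)

# only the required argument is passed; both defaults are used
all_defaults = describe_period("Ford Fusion Energi")

# a keyword argument overrides one default and leaves the other untouched
only_end = describe_period("Ford Fusion Energi", end_year=2018)

# positional arguments are matched to parameters from left to right
positional = describe_period("Ford Fusion Energi", 2016, 2018)

print(all_defaults)  # Ford Fusion Energi: 2015-2019
print(only_end)      # Ford Fusion Energi: 2015-2018
print(positional)    # Ford Fusion Energi: 2016-2018
```

Using a keyword argument is what lets a call skip `start_year` while still changing `end_year`, which keeps the number of arguments passed to a minimum.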

In [97]:
# compute and store the answer in the variable 'fusion_average_change'
fusion_average_change=change_per_year("Ford Fusion Energi")
fusion_average_change
# display the variable 'fusion_average_change' here

-568.5

In [98]:
grader.check("q12")

**Question 13:** How much have the sales of the *Chevy Volt* changed per year (on average) from *2016* to *2019*?

You **must** call the `change_per_year` function to answer this question. Use the default arguments (your call to `change_per_year` function **should not** pass any more arguments than is absolutely necessary).

In [112]:
# compute and store the answer in the variable 'volt_average_change'
volt_average_change = change_per_year("Chevy Volt", start_year=2016)
volt_average_change
# display the variable 'volt_average_change' here

-6608.0

In [113]:
grader.check("q13")

**Question 14:** How much have the sales of the *Tesla Model X* changed per year (on average) from *2015* to *2018*?

You **must** call the `change_per_year` function to answer this question. Use the default arguments (your call to `change_per_year` function **should not** pass any more arguments than is absolutely necessary).

In [115]:
# compute and store the answer in the variable 'model_x_average_change'
model_x_average_change = change_per_year("Tesla Model X", end_year=2018)
model_x_average_change
# display the variable 'model_x_average_change' here

8630.666666666666

In [116]:
grader.check("q14")

### Function 6: `estimate_sales(model, target_year, start_year, end_year)`

This function should estimate what the sales would be for the given `model` in the given `target_year` assuming that there is a constant rate of change in the sales after `end_year` that is equal to the average change per year in the period between `start_year` and `end_year`.

The type of the *return value* should be `float`.

You **must** define `estimate_sales` so that the parameter `start_year` has the default argument `2015` and `end_year` has the default argument `2019`.

You **must** call the `change_per_year` function in the definition of `estimate_sales`. **Do not** manually compute the average change in sales.
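The estimate described above amounts to starting from the sales in `end_year` and adding the average yearly change once for every year between `end_year` and `target_year`. The sketch below checks that arithmetic by hand for the *Tesla Model S*. The numbers are copied from the dataset table purely for illustration and the variable names are hypothetical; in your own function the values must come from `project.get_sales` and `change_per_year`.

```python
# Tesla Model S sales copied from the dataset table (illustration only)
sales_2015 = 26200
sales_2019 = 15090

# average change per year over 2015-2019 (4 elapsed years)
average_change = (sales_2019 - sales_2015) / (2019 - 2015)

# extrapolate 2 years past 2019 at that constant rate
estimate_2021 = sales_2019 + average_change * (2021 - 2019)

print(average_change)  # -2777.5
print(estimate_2021)   # 9535.0
```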

In [121]:
# define the function estimate_sales(model, target_year, start_year, end_year) here.
# it should return the estimated sales of the model in target_year based on the change in 
# sales between start_year and end_year.
# it is recommended that you create intermediary variables to make your code easier to write and read.
def estimate_sales(model, target_year, start_year=2015, end_year=2019):
    """
    estimates the sales of the given model in target_year, assuming the average
    change per year between start_year and end_year continues after end_year
    """
    cpy = change_per_year(model, start_year, end_year)
    sales_end_year = project.get_sales(project.get_id(model), end_year)
    return sales_end_year + cpy * (target_year - end_year)

**Question 15:** What are the estimated sales for the *Nissan Leaf* in *2021* based on the average change in sales per year for it between *2015* and *2019*?

You **must** call the `estimate_sales` function to answer this question. Use the default arguments if possible (your call to `estimate_sales` function **should not** pass any more arguments than is absolutely necessary).

In [122]:
# compute and store the answer in the variable 'leaf_sales_in_2021'
leaf_sales_in_2021 = estimate_sales("Nissan Leaf", 2021)
leaf_sales_in_2021
# display the variable 'leaf_sales_in_2021' here

9913.0

In [123]:
grader.check("q15")

**Question 16:** What are the estimated sales for the *Toyota Prius PHEV* in *2022* based on the average change in sales per year for it between *2016* and *2018*?

You **must** call the `estimate_sales` function to answer this question. Use the default arguments if possible (your call to `estimate_sales` function **should not** pass any more arguments than is absolutely necessary).

In [124]:
# compute and store the answer in the variable 'prius_sales_in_2022'
prius_sales_in_2022=estimate_sales("Toyota Prius PHEV",2022,2016,2018)
prius_sales_in_2022
# display the variable 'prius_sales_in_2022' here

77837.0

In [125]:
grader.check("q16")

**Question 17:** What is the difference between estimated sales for the *Tesla Model X* in *2030* based on the average change per year between *2015* and *2018* and between *2015* and *2019*?

You **must** invoke the `estimate_sales` function in your answer to this question. Use the default arguments if possible (your call to `estimate_sales` function **should not** pass any more arguments than is absolutely necessary). A positive answer implies that the estimate based on the sales between *2015* and *2018* is higher, while a negative answer implies that it is lower.

In [129]:
# compute and store the answer in the variable 'diff_sales_model_x_2030'
diff_sales_model_x_2030=estimate_sales("Tesla Model X", 2030,2015,2018)-estimate_sales("Tesla Model X", 2030,2015,2019)

diff_sales_model_x_2030
# display the variable 'diff_sales_model_x_2030' here

57396.25

In [130]:
grader.check("q17")

**Question 18:** What is the difference between estimated sales for the *Nissan Leaf* in *2030* based on the average change per year between *2015* and *2017* and between *2016* and *2017*?

You **must** invoke the `estimate_sales` function in your answer to this question. Use the default arguments if possible (your call to `estimate_sales` function **should not** pass any more arguments than is absolutely necessary). A positive answer implies that the estimate based on the sales between *2015* and *2017* is higher, while a negative answer implies that it is lower.

In [132]:
# compute and store the answer in the variable 'diff_sales_leaf_2030'
diff_sales_leaf_2030=estimate_sales("Nissan Leaf",2030,2015,2017)-estimate_sales("Nissan Leaf",2030,2016,2017)
diff_sales_leaf_2030
# display the variable 'diff_sales_leaf_2030' here

-3165.5

In [133]:
grader.check("q18")

The wild answers we get to **Question 17** and **Question 18** suggest that our function `estimate_sales` is not very good at estimating the sales of any model in the future. This is not surprising since it is clearly not reasonable to assume that the rate of change in sales will remain constant over a period of time. We will now try to see how much the change per year in the sales of the models varies over time.
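To see how fragile the constant-rate assumption is, the sketch below extrapolates *Nissan Leaf* sales to *2030* from two different base periods. The helper `extrapolate` is hypothetical (it is not part of `project.py`), and the sales figures are copied from the dataset table purely for illustration.

```python
def extrapolate(start_sales, end_sales, start_year, end_year, target_year):
    # hypothetical helper: extend the straight line through two data points
    rate = (end_sales - start_sales) / (end_year - start_year)
    return end_sales + rate * (target_year - end_year)

# Nissan Leaf sales copied from the dataset table (illustration only)
leaf_2015 = 17269
leaf_2016 = 14006
leaf_2017 = 11230

estimate_from_2015 = extrapolate(leaf_2015, leaf_2017, 2015, 2017, 2030)
estimate_from_2016 = extrapolate(leaf_2016, leaf_2017, 2016, 2017, 2030)

print(estimate_from_2015)  # -28023.5
print(estimate_from_2016)  # -24858.0
```

Both estimates are negative, which is impossible for a sales count, and they differ by thousands of cars depending only on which base period is chosen.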

**Question 19:** What is the difference in change per year of *Chevy Volt* sales between the time periods of *2017* to *2019* and *2015* to *2016*?

You **must** invoke the `change_per_year` function in your answer to this question. Use the default arguments if possible (your call to `change_per_year` function **should not** pass any more arguments than is absolutely necessary). A positive answer would imply that yearly sales grew faster (or shrank more slowly) on average during the period *2017-2019* than during the period *2015-2016*, while a negative answer would imply the opposite.

In [134]:
# compute and store the answer in the variable 'volt_diff_change_per_year'
volt_diff_change_per_year = change_per_year("Chevy Volt", start_year=2017) - change_per_year("Chevy Volt", end_year=2016)
volt_diff_change_per_year
# display the variable 'volt_diff_change_per_year' here

-17063.0

In [135]:
grader.check("q19")

**Question 20:** What is, for the *Toyota Prius PHEV* sales, the ratio of the change per year between *2017* and *2018* to the change per year between *2015* and *2019*?

You **must** invoke the `change_per_year` function in your answer to this question. Use the default arguments if possible (your call to `change_per_year` function **should not** pass any more arguments than is absolutely necessary). A value *greater than 1* here would imply that, on average, sales changed by more per year during the period *2017-2018* than over the full period *2015-2019*, while a value *less than 1* would imply that they changed by less.

In [136]:
# compute and store the answer in the variable 'prius_change_per_year_ratio'
prius_change_per_year_ratio = change_per_year("Toyota Prius PHEV", 2017, 2018) / change_per_year("Toyota Prius PHEV")
prius_change_per_year_ratio
# display the variable 'prius_change_per_year_ratio' here

1.37023509439786

In [137]:
grader.check("q20")

## Submission

Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output. The cell below will generate a zip file for you to submit. **Please save before exporting!**

**SUBMISSION INSTRUCTIONS**:

1. **Save** the notebook file **now (before you run the next cell of code)**.
2. **Upload** the zipfile to Gradescope.
3. Check **Gradescope otter** results as soon as the auto-grader execution gets completed. Don't worry about the score showing up as -/100.0. You only need to check that the test cases passed.

In [None]:
# Save your notebook first, then run this cell to export your submission.
grader.export(pdf=False, run_tests=True)