### Importing Libraries and Magic Commands

In CSCI 3022, we will be using common Python libraries to help us process data. By convention, we import all libraries at the very top of the notebook. There is also a set of standard aliases used to shorten the library names. Below are some of the libraries that you may encounter throughout the course, along with their respective aliases.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')


<a id='verytop'></a>

# Homework 2
## Prerequisites and Python
## Due Date: Thursday, Sept 7th, 11:59 PM on Gradescope


### Detailed Submission Instructions Are Provided at the end of this Notebook


## Assignment Scope:

* Slicing DataFrames (i.e. selecting rows and columns)
* Filtering data (using boolean arrays)
* Concepts from Discrete Math used in CSCI 3022
* Concepts from Calculus used in CSCI 3022


## Collaboration Policy

Data science is a collaborative activity.  However, a key step in learning and retention is **creating solutions on your own.**  

Below are examples of acceptable vs unacceptable use of resources and collaboration when doing HW assignments in CSCI 3022.


The following would be some **examples of cheating** when working on HW assignments in CSCI 3022.  Any of these constitutes a **violation of the course's collaboration policy and will result in an F in the course and a trip to the honor council**.   


 - Consulting web pages that may contain a solution to a given homework problem, or to a similar one.  Consulting the class notes, and web pages that explain the material taught in class but do NOT show a solution to the homework problem in question, is permissible.  There is clearly a fuzzy line here between a valid use of resources and cheating. To stay clear of that line, consult only the course notes, the course textbook, and references that contain syntax and/or formulas.
 - Copying a segment of code or math solution of three lines or more from another student, whether from a printout, a handwritten copy, or by looking at their computer screen.
 - Allowing another student to copy a segment of your code or math solution of three lines or more.
 - Taking a copy of another student's work (or a solution found online) and then editing that copy.
 - Reading someone else’s solution to a problem on the HW before writing your own.
 - Asking someone to write all or part of a program or solution for you.
 - Asking someone else for the code necessary to fix the error for you, other than for simple syntactical errors
 


On the other hand, the following are some **examples of things which would NOT usually be
considered to be cheating**:
 - Working on a HW problem on your own first and then discussing with a classmate a particular part in the problem solution where you are stuck.  After clarifying any questions you should then continue to write your solution independently.
 - Asking someone (or searching online) how a particular construct in the language works.
 - Asking someone (or searching online) how to formulate a particular construct in the language.
 - Asking someone for help in finding an error in your program.  
 - Asking someone why a particular construct does not work as you expected in a given program.
   

To test whether you are truly doing your own work and retaining what you've learned you should be able to easily reproduce from scratch and explain a HW solution that was your own when asked in office hours by a TA/Instructor or on a quiz/exam.   


If you have difficulty in formulating the general solution to a problem on your own, or
you have difficulty in translating that general solution into a program, it is advisable to see
your instructor or teaching assistant rather than another student as this situation can easily
lead to a, possibly inadvertent, cheating situation.

We are here to help!  Visit HW Hours and/or post questions on Piazza!



If, while completing this assignment, you reference any websites other than those linked in this assignment or provided on Canvas, please list those references here:

**External references**:  *list any websites you referenced*

**Collaborators**: *list collaborators here*

## Grading
Grading is broken down into autograded answers and manually graded answers. 

For autograded answers, the results of your code are compared to provided and/or hidden tests.

For manually graded answers you must show and explain all steps.  Graders will evaluate how well you answered the question and/or fulfilled the requirements of the question.


### Score breakdown



Question | Points | Grading Type
--- | --- | ---
Question 1a | 3 | autograded
Question 1b | 3 | autograded
Question 1c | 2 | manual
Question 1d | 4 | autograded
Question 1e | 4 | autograded
Question 2a | 3 | manual
Question 2b | 4 | manual
Question 2c | 3 | manual
Question 3a | 4 | manual
Question 3b | 6 | manual
Question 4a | 3 | autograded
Question 4b | 3 | manual
Question 4c | 2 | manual
Question 4d | 2 | manual
Question 4e | 4 | manual
Total | 50 |

<a id='top'></a>
---
**Shortcuts:**  [Problem 1](#p1) | [Problem 2](#p2) | [Problem 3](#p3) | [Problem 4](#p4) 

---

<hr style="border: 5px solid #003262;" />
<hr style="border: 1px solid #fdb515;" />



### Jupyter Shortcuts ###

Here are some useful Jupyter notebook keyboard shortcuts.  To learn more keyboard shortcuts, go to **Help -> Keyboard Shortcuts** in the menu above. 

Here are a few we like:
1. `ctrl`+`return` : *Evaluate the current cell*
1. `shift`+`return`: *Evaluate the current cell and move to the next*
1. `esc` : *command mode* (may need to press before using any of the commands below)
1. `esc`+`a` : *create a cell above*
1. `esc`+`b` : *create a cell below*
1. `esc`+`dd` : *delete a cell*
1. `esc`+`m` : *convert a cell to markdown*
1. `esc`+`y` : *convert a cell to code*

## How to Answer Math Questions in this Assignment

$\text{Below is an example question to demonstrate the type of work and justification}$
$\text{that is required for full points when answering math questions using LaTeX}$.
## $\color{red}{\text{EXAMPLE QUESTION:}}  $

Write a full solution to the following question using LaTeX (i.e. do NOT use code to calculate the answer).

What values of $x$ solve the equation: $\phantom{xx}x^2-5x=-6$?

### $\color{red}{\text{This correct answer would be worth zero points}}:$

$x=2$ and $x=3$

### $\color{red}{\text{This answer would be worth full points}}:$


Solution to Example Question:

$x^2-5x=-6$ 

$\implies x^2-5x+6=0$

$\implies (x-2)(x-3)=0$

$\implies \boxed{x=2}$ or $\boxed{x=3}$.

### Preliminary: LaTeX ###
You should use LaTeX to format math in your answers. If you aren't familiar with LaTeX, not to worry. It's not hard to use in a Jupyter notebook. Just place your math in between dollar signs:

\\$ f(x) = 2x \\$ becomes $ f(x) = 2x $.

If you have a longer equation, use double dollar signs to place it on a line by itself:

\\$\\$ \sum_{i=0}^n i^2 \\$\\$ becomes:

$$ \sum_{i=0}^n i^2 $$.

Here is some handy LaTeX:

| Output | LaTeX   |
|:--|:--|
| $$x^{a + b}$$  | `x^{a + b}` |
| $$x_{a + b}$$ | `x_{a + b}` |
| $$\frac{a}{b}$$ | `\frac{a}{b}` |
| $$\sqrt{a + b}$$ | `\sqrt{a + b}` |
| $$\{ \alpha, \beta, \gamma, \pi, \mu, \sigma^2  \}$$ | `\{ \alpha, \beta, \gamma, \pi, \mu, \sigma^2  \}` |
| $$\int_{x=2}^{\infty} \frac{1}{x} \,dx $$ | `\int_{x=2}^{\infty} \frac{1}{x} \,dx` |
| $$\sum_{x=1}^{100} x$$ | `\sum_{x=1}^{100} x` |
| $$\frac{d}{d x} $$ | `\frac{d}{dx} ` |



[For more about basic LaTeX formatting, you can read this article.](https://www.sharelatex.com/learn/Mathematical_expressions)


[Back to top](#top)

<a id='p1'></a>

## Question 1:  Practice with the  Babynames Dataset
We'll be using the babynames dataset from Lecture 3. The babynames dataset contains a record of the given names of babies born in the United States each year.

First let's run the following cells to build the DataFrame `baby_names`.
The cells below download the data from the web and extract the data into a DataFrame. 

### `fetch_and_cache` Helper

The following function downloads and caches data in the `data/` directory and returns the `Path` to the downloaded file. The cell below the function describes how it works. You are not expected to understand this code, but you may find it useful as a reference as a practitioner of data science after the course. 

In [None]:
import requests
from pathlib import Path

def fetch_and_cache(data_url, file, data_dir="data", force=False):
    """
    Download and cache a url and return the path to the cached file.
    
    data_url: the web address to download
    file: the file in which to save the results.
    data_dir: (default="data") the location to save the data
    force: if true the file is always re-downloaded 
    
    return: The pathlib.Path to the file.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(exist_ok=True)
    file_path = data_dir/Path(file)
    if force and file_path.exists():
        file_path.unlink()
    if force or not file_path.exists():
        print('Downloading...', end=' ')
        resp = requests.get(data_url)
        with file_path.open('wb') as f:
            f.write(resp.content)
        print('Done!')
    else:
        import time 
        created = time.ctime(file_path.stat().st_ctime)
        print("Using cached version downloaded at", created)
    return file_path

In Python, a `Path` object represents a filesystem path to a file (or other resource). The `pathlib` module is useful for writing code that works across different operating systems and filesystems. 

To check if a file exists at a path, use `.exists()`. To create a directory for a path, use `.mkdir()`. To remove a file that might be a [symbolic link](https://en.wikipedia.org/wiki/Symbolic_link), use `.unlink()`. 
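
The three `Path` methods mentioned above can be tried out safely with a minimal sketch like the one below. It works inside a temporary directory, and the names `demo_dir` and `example.txt` are made up for illustration; they are not part of `fetch_and_cache`.

```python
import tempfile
from pathlib import Path

# Work inside a throwaway directory so none of your real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp) / "data"           # hypothetical directory name
    demo_dir.mkdir(exist_ok=True)           # no error if it already exists
    demo_file = demo_dir / "example.txt"    # "/" joins path components

    existed_before = demo_file.exists()         # nothing has been written yet
    demo_file.write_text("hello")
    existed_after_write = demo_file.exists()    # the file is now on disk

    demo_file.unlink()                          # remove the file
    existed_after_unlink = demo_file.exists()

print(existed_before, existed_after_write, existed_after_unlink)
```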

This function creates a path to a directory that will contain data files. It ensures that the directory exists (which is required to write files in that directory), then proceeds to download the file based on its URL.

The benefit of this function is that not only can you force when you want a new file to be downloaded using the `force` parameter, but in cases when you don't need the file to be re-downloaded, you can use the cached version and save download time.

Below we use `fetch_and_cache` to download the `namesbystate.zip` zip file, which is a compressed directory of CSV files. 

**This might take a little while! Consider stretching.**

In [None]:
data_url = 'https://www.ssa.gov/oact/babynames/state/namesbystate.zip'
namesbystate_path = fetch_and_cache(data_url, 'namesbystate.zip')

The following cell builds the final full `baby_names` DataFrame. It first builds one DataFrame per state, because that's how the data are stored in the zip file. Here is documentation for [pd.concat](https://pandas.pydata.org/pandas-docs/version/1.2/reference/api/pandas.concat.html) if you want to know more about its functionality. As before, you are not expected to understand this code. 

In [None]:
import zipfile
zf = zipfile.ZipFile(namesbystate_path, 'r')

column_labels = ['State', 'Sex', 'Year', 'Name', 'Count']

def load_dataframe_from_zip(zf, f):
    with zf.open(f) as fh: 
        return pd.read_csv(fh, header=None, names=column_labels)

states = [
    load_dataframe_from_zip(zf, f)
    for f in sorted(zf.filelist, key=lambda x:x.filename) 
    if f.filename.endswith('.TXT')
]

# Stack all per-state DataFrames at once and renumber the rows 0..n-1
baby_names = pd.concat(states, ignore_index=True)
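
To see why the row index is rebuilt after concatenation, here is a small sketch using two tiny tables with made-up counts standing in for the per-state files. It only assumes the same `column_labels` as above.

```python
import pandas as pd

column_labels = ['State', 'Sex', 'Year', 'Name', 'Count']

# Two tiny made-up "state" tables (the counts are invented for illustration).
ak = pd.DataFrame([['AK', 'F', 1910, 'Mary', 14]], columns=column_labels)
al = pd.DataFrame([['AL', 'F', 1910, 'Mary', 875],
                   ['AL', 'F', 1910, 'Annie', 482]], columns=column_labels)

stacked = pd.concat([ak, al])                        # keeps each frame's own index: 0, 0, 1
renumbered = pd.concat([ak, al], ignore_index=True)  # fresh index: 0, 1, 2

print(list(stacked.index))
print(list(renumbered.index))
```

Without a fresh index, several rows would share the same label, which makes label-based selection with `loc` ambiguous.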

#### Question 1a) #### 
How many rows are in the dataframe baby_names?  (i.e. replace the ellipses (`...`) with one line of code that will calculate this).

In [1]:
baby_names_len = len(baby_names) #SOLUTION


In [None]:
assert baby_names_len==6408041

#### Question 1b) ####  
Select the first 5 rows from `baby_names`.

In [None]:
baby_names_first5 = baby_names.head(5) #SOLUTION
baby_names_first5

In [None]:
# TEST
assert baby_names_first5.iloc[4,3] == "Helen"

In [None]:
# TEST
assert baby_names_first5.shape[0] == 5

### Selection Examples on Baby Names

We can use `loc` and `iloc` to select rows and columns of interest from our dataset.

In [None]:
baby_names.loc[2:5, 'Name']

Notice the difference between the following cell and the previous one: passing in just `'Name'` returns a Series, while `['Name']` returns a DataFrame.

In [None]:
baby_names.loc[2:5, ['Name']]

The code below collects the rows in positions 1 through 3, and the column in position 3 ("Name").

In [None]:
baby_names.iloc[1:4, [3]]
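
The same distinctions can be seen on a tiny made-up table. In this sketch the index is deliberately shifted to start at 10, so that labels (used by `loc`) and positions (used by `iloc`) no longer coincide; the names and counts are invented.

```python
import pandas as pd

# Made-up rows; the index starts at 10 so labels and positions differ.
toy = pd.DataFrame(
    {'Name': ['Ana', 'Ben', 'Cy', 'Di'], 'Count': [5, 7, 9, 11]},
    index=[10, 11, 12, 13],
)

by_label = toy.loc[11:12, 'Name']       # labels 11 and 12, both ends included -> Series
by_label_df = toy.loc[11:12, ['Name']]  # same rows, but a one-column DataFrame
by_position = toy.iloc[1:3, [0]]        # positions 1 and 2 (3 is excluded), column 0

print(type(by_label).__name__, type(by_label_df).__name__)
print(by_position)
```

Note that a `loc` slice includes its end label, while an `iloc` slice excludes its end position, like ordinary Python slicing.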

**REVIEW: Selecting Rows and Columns in Pandas:** 

`[]` and `loc` are quite similar. 

Because it yields more concise code, you'll find that our code and your code both tend to feature `[]`. However, there are some subtle pitfalls of using `[]`. If you're ever having performance issues, weird behavior, or you see a `SettingWithCopyWarning` in pandas, switch from `[]` to `loc` and this may help.

To avoid getting too bogged down in indexing syntax, we'll avoid a more thorough discussion of `[]` and `loc`.

For more on `[]` vs `loc`, you may optionally try reading:
1. https://stackoverflow.com/questions/48409128/what-is-the-difference-between-using-loc-and-using-just-square-brackets-to-filte
2. https://stackoverflow.com/questions/38886080/python-pandas-series-why-use-loc/65875826#65875826
3. https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas/53954986#53954986


#### Question 1c) ####  

Use `[ ]` to select `Name` and `Year` **in that order** from the `baby_names` table.

Then repeat the same selection using the  `.loc` notation instead.




In [None]:
name_and_year1=baby_names[['Name', 'Year']] #SOLUTION
name_and_year1.head()

In [None]:
name_and_year2 = baby_names.loc[:, ['Name', 'Year']] #SOLUTION

name_and_year2.head()

## Filtering Data

### **REVIEW**: Filtering with boolean arrays

Filtering is the process of removing unwanted material.  In your quest for cleaner data, you will undoubtedly filter your data at some point: whether it be for clearing up cases with missing values, for culling out fishy outliers, or for analyzing subgroups of your data set.  Example usage looks like `df[df['column name'] < 5]`.

For your reference, some commonly used comparison operators are given below.

Symbol | Usage      | Meaning 
------ | ---------- | -------------------------------------
==   | a == b   | Does a equal b?
<=   | a <= b   | Is a less than or equal to b?
&gt;=   | a >= b   | Is a greater than or equal to b?
<    | a < b    | Is a less than b?
&#62;    | a &#62; b    | Is a greater than b?
~    | ~p       | Returns negation of p
&#124; | p &#124; q | p OR q
&    | p & q    | p AND q
^  | p ^ q | p XOR q (exclusive or)
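
As a quick illustration of the operators above, the sketch below filters a small made-up table. The column names mirror `baby_names`, but the rows are invented.

```python
import pandas as pd

toy = pd.DataFrame({
    'Name': ['Ana', 'Ben', 'Cy', 'Di'],
    'Year': [1999, 2000, 2000, 2001],
    'Count': [50, 5, 80, 20],
})

is_2000 = toy['Year'] == 2000   # boolean Series, one True/False per row
is_big = toy['Count'] > 10

both = toy[is_2000 & is_big]    # AND: rows where both conditions hold
either = toy[is_2000 | is_big]  # OR: rows where at least one holds
not_2000 = toy[~is_2000]        # NOT: rows where the condition fails

print(list(both['Name']), list(either['Name']), list(not_2000['Name']))
```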

#### Question 1d) #### 

Use one line of code to construct a new DataFrame containing the first 10 names in `baby_names` registered in Colorado and the columns `Name` and `Count` **in that order**.

In [4]:
co = baby_names.loc[baby_names["State"]=="CO",["Name","Count"]].head(10) #SOLUTION
co


In [None]:
#TEST
assert co.iloc[6,1] == 46

In [None]:
#TEST 
assert len(co) == 10

#### Question 1e) #### 

Using a boolean array, select the rows for the year 2000 (from `baby_names`) that have a count greater than 3000. Keep all columns from the original `baby_names` DataFrame.

Note: Compound expressions have to be grouped with parentheses, because `&` binds more tightly than comparison operators such as `==` and `>`. That is, any time you combine two conditions `p` and `q` to filter the DataFrame, make sure to use `df[(p) & (q)]` or `df.loc[(p) & (q)]`. 

You may use either `[]` or `loc`. Both will achieve the same result. For more on `[]` vs. `loc`, see the Stack Overflow links in the review section above.


In [None]:
result = baby_names[(baby_names["Year"] == 2000) & (baby_names["Count"] > 3000)] # SOLUTION
result.head()

In [None]:
# TEST
assert len(result) == 11

In [None]:
# TEST
assert result["Count"].sum() == 39001

In [None]:
# TEST
assert result["Count"].iloc[0] == 4342

[Back to top](#top)

<a id='p2'></a>

## Question 2: Discrete Structures Review ##

You will need a solid understanding of the following concepts from Discrete Structures to succeed in this course. See the Prerequisite Review Resources posted on Canvas if you need to review any of these key concepts.


#### Question 2a) ####  

A coin is flipped 10 times.  How many possible outcomes have exactly 2 heads?  Use LaTeX (not code) in the cell below to show all of your steps and fully justify your answer.  

**Note: In this class, you must always put your answer in the cell that immediately follows the question. DO NOT create any cells between this one and the one that says** _Write your answer here, replacing this text._

**Solution**

This is a combination, 10 choose 2 (because once we've chosen the two spots for the heads, the other 8 spots will automatically be tails).  $C(10,2) = \frac{10!}{2!8!} = \frac{10*9}{2} = \boxed{45}$

2a Answer Check).  To check your final answer to part 2a above, enter the answer you came up with (just the number) in the cell below. Note that this is just a built-in public test so you can check your work and determine if you are on the right track.  To receive credit on this problem you must show all steps in part 2a above using LaTeX and fully justifying your answer using correct mathematical notation.  

In [None]:
q2a_answer = 45 # SOLUTION

In [None]:
assert q2a_answer == 45
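
For intuition only (the written answer must still be justified in LaTeX), the count can be cross-checked by brute force. This sketch lists every sequence of 10 flips and counts those with exactly two heads; it assumes Python 3.8+ for `math.comb`.

```python
from itertools import product
from math import comb

# Enumerate all 2**10 sequences of H/T and count those with exactly 2 heads.
outcomes = list(product('HT', repeat=10))
exactly_two_heads = sum(1 for seq in outcomes if seq.count('H') == 2)

print(len(outcomes), exactly_two_heads, comb(10, 2))
```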

#### Question 2b) ####  

What is the probability that if I roll two 6-sided dice they add up to **at most** $9$? Use LaTeX (not code) in the cell directly below to show all of your steps and fully justify your answer.


**Solution**

To find the probability of at most adding to $9$ we will calculate:

$1-P(\text{sum of 10})- P(\text{sum of 11})- P(\text{sum of 12}) $

$P(\text{sum of 12}) = \frac{1}{36}$ (since there are (6)(6)=36 possible outcomes and only the outcome with 2 sixes adds to 12)

$P(\text{sum of 11}) = \frac{2}{36}$ (i.e. either 5 and 6 or 6 and 5)

$P(\text{sum of 10}) = \frac{3}{36}$ (either (6, 4), (4, 6), (5,5))

Thus $P(\text{sum of at most 9}) = 1-\frac{3}{36}-\frac{2}{36}-\frac{1}{36} = \frac{30}{36} =\boxed{ \frac{5}{6}}$


2b Answer Check).  To check your final answer to 2b, enter the answer you came up with (just the number) in the cell below. Note that this is just a built-in public test so you can check your work and determine if you are on the right track.  To receive credit on this problem you must show all steps in part 2b above using LaTeX and fully justifying your answer using correct mathematical notation.  

In [None]:
q2b_answer = 5/6  # SOLUTION

In [None]:
assert np.isclose(q2b_answer,0.8333333333333334)
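
The complement argument can also be double-checked by listing all 36 equally likely outcomes directly. This is only a sanity check, not a substitute for the written justification.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # all (die1, die2) pairs
at_most_9 = [r for r in rolls if sum(r) <= 9]  # outcomes whose sum is 9 or less

prob = Fraction(len(at_most_9), len(rolls))    # exact fraction, reduced automatically
print(prob)
```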

#### Question 2c) #### 

Suppose you show up to a quiz completely unprepared.  The quiz has 10 questions, each with 5 multiple choice options. You decide to guess each answer in a completely random way.  What is the probability that you get exactly 3 questions correct?  Use LaTeX (not code) in the cell directly below to show all of your steps and fully justify your answer. 


**Solution**

On any given question, you have a $1/5$ probability of guessing the correct answer and a $4/5$ probability of guessing incorrectly. 

There are $C(10,3) = \frac{10!}{7!3!}=\frac{10*9*8}{3*2} = 120$ different ways to select the exact 3 questions out of 10 you get correct.  

The probability of any one of these is $(\frac{1}{5})^3(\frac{4}{5})^7$.

Thus the total probability is $C(10,3)(\frac{1}{5})^3(\frac{4}{5})^7 =\boxed{120\left(\frac{4^7}{5^{10}}\right)}$


2c Final Answer Check).  To check your final answer to 2c, enter the final answer you came up with in the cell below. Note that this is just a built-in public test so you can check your work and determine if you are on the right track.  To receive credit on this problem you must show all steps in part 2c above using LaTeX and fully justifying your answer using correct mathematical notation.  

In [None]:
q2c_answer = 120*(4**7)/(5**10) # SOLUTION

In [None]:
assert np.isclose(q2c_answer,0.201326592)
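
The same counting pattern (choose which questions are correct, then multiply the per-question probabilities) can be written as a short exact computation. The variable names below are illustrative, and `math.comb` needs Python 3.8+.

```python
from fractions import Fraction
from math import comb

n, k = 10, 3                 # 10 questions, exactly 3 correct
p_correct = Fraction(1, 5)   # chance of guessing one question correctly

# (ways to pick the k correct questions) * P(those k right) * P(the rest wrong)
prob = comb(n, k) * p_correct**k * (1 - p_correct)**(n - k)

# Sanity check: the probabilities of 0, 1, ..., 10 correct should add up to 1.
total = sum(comb(n, j) * p_correct**j * (1 - p_correct)**(n - j) for j in range(n + 1))

print(prob, float(prob), total)
```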

[Back to top](#top)

<a id='p3'></a>

## Question 3: Calculus Review ##

You will need a solid understanding of the following concepts from Calculus 1 and 2 to succeed in this course. See the Prerequisite Review Resources posted on Canvas if you need to review any of these key concepts.


### Preliminary: Sums ###

Here's a recap of some basic algebra written in sigma notation. The facts are all just applications of the ordinary associative and distributive properties of addition and multiplication, written compactly and without the possibly ambiguous "...". But if you are ever unsure of whether you're working correctly with a sum, you can always try writing $\sum_{i=1}^n a_i$ as $a_1 + a_2 + \cdots + a_n$ and see if that helps.

- You can use any reasonable notation for the index over which you are summing, just as in Python you can use any reasonable name in `for name in list`. Thus $\sum_{i=1}^n a_i = \sum_{k=1}^n a_k$.
- $\sum_{i=1}^n (a_i + b_i) = \sum_{i=1}^n a_i + \sum_{i=1}^n b_i$
- $\sum_{i=1}^n d = nd$
- $\sum_{i=1}^n (ca_i + d) = c\sum_{i=1}^n a_i + nd$
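
If you want to convince yourself of these facts numerically, a minimal check on small made-up arrays looks like this (the values of `a`, `b`, `c`, and `d` are arbitrary):

```python
import numpy as np

a = np.array([2, 5, 1, 7])
b = np.array([4, 0, 3, 6])
n = len(a)
c, d = 3, 10   # arbitrary constants

# sum of (a_i + b_i) versus sum of a_i plus sum of b_i
lhs_add, rhs_add = np.sum(a + b), np.sum(a) + np.sum(b)

# sum of the constant d, taken n times, versus n*d
lhs_const, rhs_const = np.sum(np.full(n, d)), n * d

# sum of (c*a_i + d) versus c * (sum of a_i) + n*d
lhs_lin, rhs_lin = np.sum(c * a + d), c * np.sum(a) + n * d

print(lhs_add == rhs_add, lhs_const == rhs_const, lhs_lin == rhs_lin)
```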


#### Question 3a) #### 

We commonly use sigma notation to compactly write the definition of the arithmetic mean (commonly known as the average):

$$\bar{x} = \frac{1}{n}\left(x_1+x_2+ ... + x_n \right) = \frac{1}{n}\sum_{i=1}^n x_i$$



The $i$th *deviation from average* is the difference $x_i - \bar{x}$. Prove that the sum of all these deviations is 0; that is, prove that $\sum_{i=1}^n (x_i - \bar{x}) = 0$ (write your full solution in the box directly below showing all steps and using LaTeX).


**Solution**
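
Before writing the proof, it can help to see the claim hold on a concrete example. The sketch below uses made-up data and is only a numerical sanity check; it is not the proof the question asks for.

```python
import numpy as np

x = np.array([3.0, 7.0, 8.0, 10.0])   # made-up data
x_bar = x.mean()                      # arithmetic mean
deviations = x - x_bar                # each x_i minus the mean

print(x_bar, deviations, deviations.sum())
```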




#### Question 3b) ####  

Let $x_1, x_2, \ldots, x_n$ be a list of numbers. You can think of each index $i$ as the label of a household, and the entry $x_i$ as the annual income of Household $i$. 

Consider the function  $$f(c) = \frac{1}{n} \sum_{i=1}^n (x_i-c)^2$$


In this scenario, suppose that our data points $x_1, x_2, \ldots, x_n$ are fixed and that $c$ is the only variable.

Using calculus, determine the value of $c$ that minimizes $f(c)$.  You must use calculus to justify that this is indeed a minimum, and not a maximum.


**Solution**
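
A rough numerical exploration can suggest where the minimum lies before you do the calculus. This sketch evaluates $f(c)$ on a grid for made-up data; the grid range and step are arbitrary choices, and a grid search does not replace the derivative argument required above.

```python
import numpy as np

x = np.array([3.0, 7.0, 8.0, 10.0])   # made-up "incomes"

def f(c):
    """Mean squared deviation of the data from the candidate value c."""
    return np.mean((x - c) ** 2)

grid = np.linspace(0, 12, 1201)            # candidate values of c, step 0.01
values = np.array([f(c) for c in grid])
c_best = grid[np.argmin(values)]           # grid point with the smallest f(c)

print(c_best, x.mean())
```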




[Back to top](#top)

<a id='p4'></a>

## Question 4: Applying Those Prereqs: A Maximum Likelihood Estimate ##

In this problem we're going to apply your calculus and discrete math prerequisite knowledge to introduce a data science concept called a maximum likelihood estimate.

Data scientists use coin tossing as a visual image for sampling at random with replacement from a binary population.

#### Question 4a) ####
A coin that lands heads with chance 0.7 is tossed six times. What is the chance of the sequence HHHTHT? Assign your answer to the variable `p_HHHTHT`.

In [None]:
p_HHHTHT = 0.7**4*(0.3**2)  #SOLUTION
p_HHHTHT

In [None]:
assert np.isclose(p_HHHTHT, 0.021608999999999996)
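
Because the tosses are independent, the chance of a specific sequence is a product with one factor per toss. A small sketch of that idea, assuming Python 3.8+ for `math.prod` (the variable names are illustrative):

```python
import math

p_heads = 0.7
sequence = "HHHTHT"

# Independent tosses: multiply one factor per toss,
# p_heads for each H and (1 - p_heads) for each T.
prob = math.prod(p_heads if toss == 'H' else 1 - p_heads for toss in sequence)

print(prob)
```

The same loop works for any sequence and any value of `p_heads`, which is the idea behind the likelihood function in the next part.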

#### Question 4b) ####
I have a coin that lands heads with an unknown probability $p$. 

Suppose I toss it 10 times and get the sequence TTTHTHHTTH.

If you toss this coin 10 times, the chance that you get the sequence above is a function of $p$. That function is called the *likelihood* of the sequence TTTHTHHTTH, so we will call it $L(p)$.

What is $L(p)$ for the sequence TTTHTHHTTH?

Write your answer using LaTeX below (i.e. your answer should be of the form:  $L(p)$=some function of p)



**Solution**

#### Question 4c) ####

Below is a section of code that will help you plot the function $L(p)$ that you defined above.
Replace the ellipses with your function of $p$


In [None]:
p = np.linspace(0, 1, 100) 
#This creates an array of 100 values equally spaced between 0 and 1

likelihood = (p**4) * (1-p)**6 #SOLUTION

plt.plot(p, likelihood, lw=2, color='darkblue') 
#This plots the likelihood function  

plt.plot([0, 1], [0, 0], lw=1, color='grey')  
#This plots a horizontal axis

plt.xlabel('$p$')
#This labels the x axis
plt.ylabel('$L(p)$', rotation=0)
#This labels the y-axis

plt.title('Likelihood of TTTHTHHTTH');
#This titles the plot

#### Question 4d) ####

The value $\hat{p}$ at which the likelihood function attains its maximum is called the *maximum likelihood estimate* (MLE) of $p$. Among all values of $p$, it is the one that makes the observed data most likely.

Using your plot above, what is the value of $\hat{p}$?   

Provide a simple interpretation of that value in terms of the data TTTHTHHTTH.


**SOLUTION:**




#### Question 4e) ####

Let's prove what you observed graphically above.  That is, let's use calculus to find $\hat{p}$.  

But wait! Before you start trying to find the value of $p$ where $L'(p)=0$, read the tip below (trust us, the algebra is not pretty...).

TIP:  
The value $\hat{p}$ at which the function $L$ attains its maximum is the same as the value at which the function $\log(L)$ attains its maximum. To clarify, $\log(L)$ is the composition of $\log$ and $L$: $\log(L)$ at $p$ is $\log(L(p))$. Even though it doesn't make a difference for this problem, $\log$ is now and forevermore the $\log$ to the base $e$, not to the base 10.


This tip is hugely important in data science because many probabilities are products and the $\log$ function turns products into sums. It's much simpler to work with a sum than with a product.
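
The tip can be illustrated numerically. This sketch evaluates both the likelihood of TTTHTHHTTH (4 heads, 6 tails) and its logarithm on a grid and compares where each is largest. The grid is an arbitrary choice that avoids $p=0$ and $p=1$, where the logarithm is undefined; this is a check, not the calculus derivation the question asks for.

```python
import numpy as np

p = np.linspace(0.01, 0.99, 99)   # grid of candidate values, step 0.01

likelihood = p**4 * (1 - p)**6                       # a product of 10 factors
log_likelihood = 4 * np.log(p) + 6 * np.log(1 - p)   # log turns the product into a sum

p_max_L = p[np.argmax(likelihood)]
p_max_logL = p[np.argmax(log_likelihood)]

print(p_max_L, p_max_logL)
```

Since $\log$ is an increasing function, it preserves the ordering of the likelihood values, so both curves peak at the same grid point.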


Armed with that tip, use calculus to find $\hat{p}$. You don't have to check that the value you've found produces a max and not a min – we'll spare you that step.







*SOLUTION*

<br/><br/>
<hr style="border: 5px solid #003262;" />
<hr style="border: 1px solid #fdb515;" />

## Congratulations! You have finished Homework 2!

If you discussed this assignment with any other students in the class (in a manner that is acceptable as described by the Collaboration policy above) please **include their names** here:

**Collaborators**: *list collaborators here*

### Submission Instructions

Before proceeding any further, **save this notebook.**

After running the `grader.export()` cell provided below, **2 files will be created**: a zip file and a PDF file.  You can download them using the links provided below OR by finding them in the same folder where this Jupyter notebook resides in your JupyterHub.

To receive credit on this assignment, **you must submit BOTH of these files
to their respective Gradescope portals:** 

* **Homework 2 Autograded**: Submit the zip file that is output by the `grader.export()` cell below to the Homework 2 Autograded assignment in Gradescope.

* **Homework 2 Manually Graded**: Submit your hw01.PDF to the Homework 2 Manually Graded assignment in Gradescope.  


**You are responsible for ensuring your submission follows our requirements. We will not be granting regrade requests nor extensions to submissions that don't follow instructions.** If you encounter any difficulties with submission, please don't hesitate to reach out to staff prior to the deadline.