# Lesson 8: Functions & Loops

Today:
1. Functions
    + Why define our own functions
    + How to define your own functions in Python
    + Application to classification
2. Loops
    + Understanding the `for` loop
        + Tracing how variables change values during loops
    + Accessing entries of a data frame using loops
3. Application to classification

In [None]:
import pandas as pd
import numpy as np
import seaborn as sns

## 1. Defining your own functions

**Example**

You decided to check out a very popular East Village ramen restaurant for dinner on Friday night.  After waiting in line for two hours, you are finally seated.  As you are reading the menu, you realize that this restaurant is cash-only.  You have $\$$28.75 with you and need to make sure that you have enough cash to pay for dinner, including the 8.875% tax and the tip.

You are considering ordering a $\$$15 dish and a $\$$6 beverage.  How much would you have to pay if you are giving an 18% tip?

**Example, continued**

With $\$$28.75 in your pocket, and knowing that the first order would leave you some cash left over, you wonder if you could afford a $\$$17 dish and a $\$$6 beverage, with the same 8.875\% tax and 18\% tip.

**Example, continued**

Since you only have $\$$28.75 but really want that $\$$17 dish and the $\$$6 beverage, you wonder if you can afford this meal if you only give a 15\% tip. 

Remarks:
+ The above examples are all similar (and **repetitive**!)
+ Same method of computation, just different numbers
+ Wouldn't it be nice if there were a Python function that let us repeat the above calculation easily?  Something like

        calculate_bill( LISTOFPRICESOFITEMS, TIPPERCENTAGE )
  that calculates the total bill, given a list of item prices and the tip percentage we want to give
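
One way such a function might look is sketched below. The default tax rate of 8.875% comes from the example; the parameter names and the choice to compute the tip on the pre-tax subtotal are assumptions made for this sketch.

```python
def calculate_bill(prices, tip_percentage, tax_percentage=8.875):
    """Return the total bill: subtotal + tax + tip.

    Assumption: the tip is computed on the pre-tax subtotal.
    """
    subtotal = sum(prices)                    # add up the item prices
    tax = subtotal * tax_percentage / 100     # sales tax on the subtotal
    tip = subtotal * tip_percentage / 100     # tip on the subtotal
    return subtotal + tax + tip

# The three scenarios from the ramen example
print(round(calculate_bill([15, 6], 18), 2))  # about 26.64
print(round(calculate_bill([17, 6], 18), 2))  # about 29.18
print(round(calculate_bill([17, 6], 15), 2))  # about 28.49
```

Under these assumptions, only the second scenario costs more than the $\$$28.75 on hand.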

**Example**

### Activity 1

Define a new function called `my_function2`, which
+ takes three numbers as inputs: `a`, `b`, `c`, and
+ if `a` is strictly greater than zero, then it returns as an output `b + c`;
+ otherwise, if `a` is zero or negative, then it returns as an output `b - c`.
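
After you have tried the activity yourself, one possible solution is sketched below; other correct versions exist.

```python
def my_function2(a, b, c):
    """Return b + c when a is strictly positive, otherwise b - c."""
    if a > 0:
        return b + c
    else:
        # a is zero or negative
        return b - c

print(my_function2(3, 1, 3))    # 4
print(my_function2(-2, 10, 3))  # 7
print(my_function2(0, 4, 8))    # -4
```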

### Activity 2

Define a new function called `compound_interest()` which takes three inputs
+ `initial_deposit`: the amount you deposited in the savings account
+ `interest_rate`: the annual interest rate of the account (in decimal)
+ `num_years`: the number of years the initial deposit stays in the account

and outputs/returns `account_total`, the total amount in the savings account after the specified number of years.

Then, check by using the function to compute the total amount in the account if:
+ we initially deposited 1000 dollars, the annual interest rate is 0.02, and we keep the account for 2 years 
+ we initially deposited 1250 dollars, the annual interest rate is 0.03, and we keep the account for 30 years 
+ try other inputs!
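
A possible solution is sketched below. It assumes that interest is compounded once per year, so the total is `initial_deposit * (1 + interest_rate) ** num_years`.

```python
def compound_interest(initial_deposit, interest_rate, num_years):
    """Return the account total after num_years.

    Assumption: interest is compounded once per year.
    """
    account_total = initial_deposit * (1 + interest_rate) ** num_years
    return account_total

print(round(compound_interest(1000, 0.02, 2), 2))   # 1040.4
print(round(compound_interest(1250, 0.03, 30), 2))
```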

## 2. Loops

### 2.1. Understanding `for` loops

To repeat TASKS for each VALUE in the list LIST:

    for VALUE in LIST:
        TASKS
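
As a small illustration of this pattern, the sketch below loops over a short list. The names `values` and `n` are chosen just for this example. In Python the loop header is written `for VALUE in LIST:` with no parentheses, and the indented lines are the tasks that get repeated.

```python
values = [1, 2, 3, 4, 5]

for n in values:
    # this indented line runs once for each entry of the list
    print(n, "squared is", n ** 2)
```

To loop over a longer run of whole numbers, `range(1, 21)` produces 1, 2, ..., 20 (the stop value is not included).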

**Example**

Suppose we want to display the text:  

"1 squared is 1"

"2 squared is 4"

... up to

"20 squared is 400"

### Activity

Suppose we want to display the text:  

"1 cubed is 1"

"2 cubed is 8"

"3 cubed is 27"

... up to

"20 cubed is 8000"

**Write a for loop that accomplishes this task.**

### Activity

Suppose we want to display the text:  

"1 cubed is 1"

"3 cubed is 27"

"5 cubed is 125"

... up to

"25 cubed is 15625"

**Write a for loop that accomplishes this task.**
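
One possible approach, sketched here, uses the third argument of `range()` as the step size, so that the loop visits only the odd numbers. The name `odd_numbers` is introduced just for this example.

```python
# range(start, stop, step): start at 1, stop before 26, go up by 2
odd_numbers = list(range(1, 26, 2))

for n in odd_numbers:
    print(n, "cubed is", n ** 3)
```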

### Activity: Tracing how variables change values during loops

**Trace what's going on with the following for-loop.**

In [None]:
mylist = [-3, 5, 0, 7, 10]
y = 1
for x in mylist:
    y = x+y
    z = y ** 2
    print(z)
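
To check a hand trace, it can help to print every variable at each pass. The sketch below runs the same loop but also records `x`, `y`, and `z` in a list called `trace` (a name introduced here for illustration).

```python
mylist = [-3, 5, 0, 7, 10]
y = 1
trace = []   # will hold one (x, y, z) tuple per pass through the loop

for x in mylist:
    y = x + y        # y accumulates a running sum, starting from 1
    z = y ** 2       # z is recomputed from the new y each time
    trace.append((x, y, z))
    print("x =", x, " y =", y, " z =", z)
```

Note that `y` carries its value from one pass to the next, while `z` depends only on the current `y`.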

**Example**

Recall our `calculate_bill()` function from above, reproduced below.

In [None]:
# copy and paste function below


Suppose that we would like to compute possible bills for a few different tip percentages, from 10\%, 11\%, 12\%, ..., to 25\%, if we order the following items:
+ a \$7 appetizer
+ a \$15 entree
+ a \$17 entree
+ two \$6 beverages.
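
A minimal sketch of this loop follows. It repeats a version of `calculate_bill()` so that the cell runs on its own; that version assumes the 8.875% tax from the earlier example and a tip computed on the pre-tax subtotal.

```python
def calculate_bill(prices, tip_percentage, tax_percentage=8.875):
    """Total bill; assumes the tip is computed on the pre-tax subtotal."""
    subtotal = sum(prices)
    tax = subtotal * tax_percentage / 100
    tip = subtotal * tip_percentage / 100
    return subtotal + tax + tip

# one appetizer, two entrees, two beverages
prices = [7, 15, 17, 6, 6]

# range(10, 26) gives 10, 11, ..., 25
for tip in range(10, 26):
    print(tip, "% tip: total is", round(calculate_bill(prices, tip), 2))
```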

### 2.2. Accessing entries of a data frame during loops

Suppose that we would like to store the values that we computed during loops into a table.

**Example**

Create a data frame called `squares` which has 20 rows and 2 columns:

<table>
    <tr>
        <th>n</th>
        <th>n_squared</th>
    </tr>    
    <tr>
        <td>1</td>
        <td>1</td>
    </tr>    
    <tr>
        <td>2</td>
        <td>4</td>
    </tr>    
    <tr>
        <td>3</td>
        <td>9</td>
    </tr>    
    <tr>
        <td>...</td>
        <td>...</td>
    </tr>    
    <tr>
        <td>19</td>
        <td>361</td>
    </tr>
    <tr>
        <td>20</td>
        <td>400</td>
    </tr>
</table>

In [None]:
squares

# fill in the empty data frame row by row





In [None]:
# note: the above can be done without a for loop
#  using a method we learned earlier in the semester







# so why did we use a for loop?  
#  There are similar tasks that cannot be done using this more straightforward method
#   See the next example
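
The sketch below shows both approaches side by side. It assumes that the "more straightforward method" is column arithmetic on the whole data frame; the name `squares_direct` is introduced here only to keep the two results separate.

```python
import pandas as pd

# Approach 1: start with placeholder zeros, then fill in row by row
squares = pd.DataFrame({'n': list(range(1, 21)),
                        'n_squared': [0] * 20})

for i in range(len(squares)):
    # .loc[row_label, column_name] reads or writes a single entry
    squares.loc[i, 'n_squared'] = squares.loc[i, 'n'] ** 2

# Approach 2: compute the whole column at once, no loop needed
squares_direct = pd.DataFrame({'n': list(range(1, 21))})
squares_direct['n_squared'] = squares_direct['n'] ** 2

print(squares.head())
```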

**Activity**

Earlier, we defined a new function called `my_function2`, which
+ takes three numbers as inputs: `a`, `b`, `c`, and
+ if `a` is strictly greater than zero, then it returns as an output `b + c`;
+ otherwise, if `a` is zero or negative, then it returns as an output `b - c`.

We want to record the outputs of `my_function2` for various values of a, b, and c in the data frame called `records` below:

<table>
    <tr>
        <th>a</th>
        <th>b</th>
        <th>c</th>
        <th>output</th>
    </tr>
    <tr>
        <td>3</td>
        <td>1</td>
        <td>3</td>
        <td> </td>
    </tr>
    <tr>
        <td>-2</td>
        <td>10</td>
        <td>3</td>
        <td> </td>
    </tr>
    <tr>
        <td>1</td>
        <td>4</td>
        <td>9</td>
        <td> </td>
    </tr>
    <tr>
        <td>0</td>
        <td>4</td>
        <td>8</td>
        <td> </td>
    </tr>
</table>


In [None]:
# do not modify this cell

records = pd.DataFrame( {'a': [3, -2, 1, 0],
                         'b': [1, 10, 4, 4],
                         'c': [3, 3, 9, 8],
                         'output': [0, 0, 0, 0]} )
records

In [None]:
# filling in the output column in the records data frame "by hand" / row by row
## this is fine because we have only four rows!
## we probably don't want to do this if we have hundreds of rows











In [None]:
# do not modify this cell

records = pd.DataFrame( {'a': [3, -2, 1, 0],
                         'b': [1, 10, 4, 4],
                         'c': [3, 3, 9, 8],
                         'output': [0, 0, 0, 0]} )
records
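
With the data frame reset, the same column can be filled in with a loop instead of by hand. The sketch below repeats one possible definition of `my_function2` and the `records` data frame so that it runs on its own.

```python
import pandas as pd

def my_function2(a, b, c):
    """Return b + c when a is strictly positive, otherwise b - c."""
    if a > 0:
        return b + c
    else:
        return b - c

records = pd.DataFrame({'a': [3, -2, 1, 0],
                        'b': [1, 10, 4, 4],
                        'c': [3, 3, 9, 8],
                        'output': [0, 0, 0, 0]})

for i in range(len(records)):
    # read the three inputs from row i, then store the result in row i
    records.loc[i, 'output'] = my_function2(records.loc[i, 'a'],
                                            records.loc[i, 'b'],
                                            records.loc[i, 'c'])

print(records)
```

The loop body does not change if the data frame has hundreds of rows, which is the advantage over filling in each row by hand.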

**Example**

Suppose that we would like to compute possible bills for a few different tip percentages, from 10\%, 11\%, 12\%, ..., to 25\%, if we order the following items:
+ a \$7 appetizer
+ a \$15 entree
+ a \$17 entree
+ two \$6 beverages.

We would like to create a data frame with 2 columns and one row for each possible tip percentage.  The first column is the tip percentage itself and the second column is the total bill:

<table>
    <tr>
        <th>tip_percentage</th>
        <th>total</th>
    </tr>    
    <tr>
        <td>10</td>
        <td>60.62625</td>
    </tr>    
    <tr>
        <td>11</td>
        <td>61.13625</td>
    </tr>    
    <tr>
        <td>12</td>
        <td>61.64625</td>
    </tr>    
    <tr>
        <td>...</td>
        <td>...</td>
    </tr>
    <tr>
        <td>24</td>
        <td>67.76625</td>
    </tr>
    <tr>
        <td>25</td>
        <td>68.27625</td>
    </tr>
</table>
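
A sketch of one way to build this table is shown below. The data frame name `tips` is an assumption, since the example does not name it, and `calculate_bill()` is repeated here with the tip computed on the pre-tax subtotal and the 8.875% tax from the earlier example.

```python
import pandas as pd

def calculate_bill(prices, tip_percentage, tax_percentage=8.875):
    """Total bill; assumes the tip is computed on the pre-tax subtotal."""
    subtotal = sum(prices)
    tax = subtotal * tax_percentage / 100
    tip = subtotal * tip_percentage / 100
    return subtotal + tax + tip

prices = [7, 15, 17, 6, 6]

# one row per tip percentage, with placeholder totals to be filled in
tips = pd.DataFrame({'tip_percentage': list(range(10, 26)),
                     'total': [0.0] * 16})

for i in range(len(tips)):
    tips.loc[i, 'total'] = calculate_bill(prices, tips.loc[i, 'tip_percentage'])

print(tips)
```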



**Exercise**

We defined a new function called `compound_interest()` which takes three inputs
+ `initial_deposit`: the amount you deposited in the savings account
+ `interest_rate`: the annual interest rate of the account (in decimal)
+ `num_years`: the number of years the initial deposit stays in the account

and outputs/returns the total amount in the savings account after the specified number of years.  (The function definition is included in the code cell below.)

Suppose that you invested 1000 dollars in a savings account that has a 2% annual interest rate, with the interest added once a year.

Create a data frame called `account` which has two columns, `num_years` and `amount`, and one row for each year from year 0 to year 30.  The `amount` column stores the amount in the account after the given number of years.

<table>
    <tr>
        <th>num_years</th>
        <th>amount</th>
    </tr>    
    <tr>
        <td>0</td>
        <td>1000</td>
    </tr>    
    <tr>
        <td>1</td>
        <td>1020</td>
    </tr>    
    <tr>
        <td>2</td>
        <td>1040.4</td>
    </tr>    
    <tr>
        <td>...</td>
        <td>...</td>
    </tr>
    <tr>
        <td>29</td>
        <td>1775.84469029741 </td>
    </tr>
    <tr>
        <td>30</td>
        <td>1811.36158410335</td>
    </tr>
</table>
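
One possible way to fill in this table with a loop is sketched below. The version of `compound_interest()` shown assumes that interest is compounded once per year.

```python
import pandas as pd

def compound_interest(initial_deposit, interest_rate, num_years):
    """Account total after num_years, compounding once per year."""
    return initial_deposit * (1 + interest_rate) ** num_years

# years 0 through 30 give 31 rows; amounts start as placeholders
account = pd.DataFrame({'num_years': list(range(0, 31)),
                        'amount': [0.0] * 31})

for i in range(len(account)):
    account.loc[i, 'amount'] = compound_interest(1000, 0.02,
                                                 account.loc[i, 'num_years'])

print(account)
```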

### Application to Classification

Consider the second simple classifier that we built last class.


**Example: Encoding a simple classifier (version 2)**

<table>
    <tr>
        <td><img src="images/lec20-knn-illustration2_wline2.jpg" width="600"></td>
        <td><img src="images/dec_tree1b.jpg" width="600"></td>
    </tr>
</table>  

In [None]:
cancerdata = pd.read_csv('../../shared/datasets/cancer.csv')





In [None]:
# this is a simple classifier that we constructed in the notebook for Lesson07










**We will "wrap" our classifier as a task done by a new function** which we will name `predict_tumor_class()`

+ inputs: two numbers: `marginal_adhesion` and `clump_thickness`
+ output: one number: 0 if we predict the tumor to be benign, 1 otherwise

`Z = predict_tumor_class( X, Y )`

where
+ X = marginal adhesion value
+ Y = clump thickness value
+ Z = the prediction that your decision tree classifier makes for the given values of X and Y
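
The shape of such a function is sketched below. The actual cutoff values come from the decision tree figure and are not reproduced in this text, so `ADHESION_CUTOFF` and `THICKNESS_CUTOFF` are hypothetical placeholders, and the order of the two checks is also an assumption. Replace them with the rules from your own Lesson 07 classifier.

```python
# HYPOTHETICAL cutoffs for illustration only; use the values from your decision tree
ADHESION_CUTOFF = 3.5
THICKNESS_CUTOFF = 6.5

def predict_tumor_class(marginal_adhesion, clump_thickness):
    """Return 0 if the tumor is predicted benign, 1 otherwise."""
    if marginal_adhesion > ADHESION_CUTOFF:
        return 1
    elif clump_thickness > THICKNESS_CUTOFF:
        return 1
    else:
        return 0

Z = predict_tumor_class(2, 4)
print(Z)
```

Wrapping the rules in a function means the same classifier can later be called once per row of a data frame inside a `for` loop.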

#### Miscellaneous Jupyter Notebook Tips

To increase the indentation of an entire block of code: highlight the code, then
+ Ctrl + ]

To decrease indentation:
+ Ctrl + [

To comment/uncomment an entire block of code:
+ Ctrl + /