# Week 4 in class


#### Learning goals
- Applying and extending your skills in control flow: `for` and `while` loops, and `break` and `continue`.
- Applying your (simulation) skills to economic problems, in particular to investment and risk management.
- Introducing you to vectorization and concerns about speed.

In [None]:
import numpy as np

This is the last week in which we study mathematical economics and finance, before we move on to data analysis in the next two weeks. You will notice that this class is hard, and that you need many of the techniques you have learned so far, plus some creativity and insight. This is what programming is about! However, you may find it comforting to know that the exam will be slightly easier than this class notebook.

### Break and Continue

#### `break` out of a loop

Sometimes we want to stop a loop early if some condition is met.

Let’s look at the example of finding the smallest `N` such that
$ \sum_{i=0}^N i > 1000 $.

Clearly `N` must be less than 1000, so we know we will find the answer
if we start with a `for` loop over all items in `range(1001)`.

Then, we can keep a running total as we proceed and tell Python to stop
iterating through our range once total goes above 1000.

In [4]:
total = 0
for i in range(1001):
    total = total + i
    if total > 1000:
        break

print("The answer is", i)

The answer is 45
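
The same search can also be written with a `while` loop, which avoids having to guess an upper bound for the range. This is only a minimal sketch; the names `n` and `running_total` are chosen here for illustration and do not appear elsewhere in the notebook.

```python
# Find the smallest n such that 0 + 1 + ... + n > 1000 using a while loop.
n = 0
running_total = 0  # holds the sum 0 + 1 + ... + n

while running_total <= 1000:
    n = n + 1                          # move on to the next integer
    running_total = running_total + n  # add it to the running sum

print("The answer is", n)
```

Here the loop condition does the job that `if ... break` does in the `for` version: the loop ends as soon as the running total exceeds 1000.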


**Exercise 1**

Consider the code below that draws 10000 random numbers between 0 and 1. Try to find the index of the first value in `x`
that is greater than 0.999 using a for loop and `break`. 
*Hint*: try iterating over `range(len(x))`.

In [91]:
x = np.random.rand(10_000)
# YOUR CODE HERE
#raise NotImplementedError()

for i in range(len(x)):
    y = x[i]
    if y > 0.999:
        break  # stop at the first value above 0.999

print(i)
print(x[i])

1220
0.999851916590159


#### `continue` to the next iteration

Sometimes we might want to stop the *body of a loop* early if a condition is met.

To do this we can use the `continue` keyword.

The basic syntax for doing this is:

```python
for item in iterable:
    # always do these operations
    if condition:
        continue

    # only do these operations if condition is False
```

When Python encounters the `continue` statement inside the loop body, it stops the current iteration and moves directly to the next one.

For example, suppose I ask you to loop over the numbers 1 to 10 and print out
the message “{i} is an odd number!” whenever the number `i` is odd, and do
nothing otherwise.

You can use `continue` to do this as follows:

In [36]:
for i in range(1, 11):
    if i % 2 == 0: # % is the modulo operator: a remainder of 0 means i is even
        continue   # skip the print below and move on to the next value of i
    print(i, "is an odd number!")

1 is an odd number!
3 is an odd number!
5 is an odd number!
7 is an odd number!
9 is an odd number!


**Exercise 2**


Again consider the code below. Write a for loop that adds up all values in `x` that are greater than
or equal to 0.5.

Use the `continue` keyword to end the body of the loop early for all values
of `x` that are less than 0.5.

*Hint*: Try starting your loop with `for value in x:` instead of
iterating over the indices of `x`.

In [97]:
x = np.random.rand(10_000)
# YOUR CODE HERE
#raise NotImplementedError()
total = 0
for value in x:
    if value < 0.5:
        continue
    total = total + value



print(total)

3778.884788871776


### `for` and `while` and Investment

**Exercise 3**

In economics, when an individual has some knowledge, skills, or education
which provides them with a source of future income, we call it [human
capital](https://en.wikipedia.org/wiki/Human_capital).

When a student graduating from high school is considering whether to
continue with post-secondary education, they may consider that it gives
them higher paying jobs in the future, but requires that they don't begin
working until after graduation.

Consider the simplified example where a student has perfectly forecastable
employment and is given two choices:
1. Begin working immediately and make 40,000 a year until they retire 40
years later.
2. Pay 5,000 a year for the next 4 years to attend university, then
get a job paying 50,000 a year until they retire 40 years after making
the college attendance decision.

Should the student enroll in school if the discount rate is r = 0.05? Assume that costs and benefits occur at the end of a year.

In [None]:
# Discount rate
r = 0.05


# High school wage
w_hs = 40_000

# College wage and cost of college
c_college = 5_000
w_college = 50_000


def netpresentvalue(income, period, rate=r):
    """Present value of `income` received at the end of each of `period` years."""
    npv = 0
    for i in range(period):
        npv = npv + income/((1+rate)**(i+1))
    return npv


# Compute npv of being a hs worker
hs = netpresentvalue(w_hs, 40)
print(hs)

# Compute npv of being a college worker: 36 years of wages that start after 4 years of study
c = netpresentvalue(w_college, 36)/((1+r)**4)
print(c)

# Compute npv of the cost of attending college
cost = netpresentvalue(c_college, 4)
print(cost)

# Is npv_collegeworker - npv_collegecost > npv_hsworker?
c - cost > hs
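
Because a constant payment discounted year by year forms a finite geometric series, the loop-based present value can be cross-checked against the closed-form annuity formula $ PV = \text{income} \cdot \frac{1 - (1+r)^{-n}}{r} $. The sketch below assumes end-of-year payments, as in the exercise; the names `npv_loop`, `npv_closed_form` and `rate` are introduced here for illustration only.

```python
def npv_loop(income, periods, rate):
    """Sum the discounted end-of-year payments one at a time."""
    total = 0.0
    for t in range(1, periods + 1):
        total = total + income / (1 + rate) ** t
    return total


def npv_closed_form(income, periods, rate):
    """Closed-form present value of a constant payment at the end of each period."""
    return income * (1 - (1 + rate) ** (-periods)) / rate


rate = 0.05

npv_hs = npv_closed_form(40_000, 40, rate)
# College wages start in year 5, so the 36-year annuity is discounted 4 more years
npv_college_wages = npv_closed_form(50_000, 36, rate) / (1 + rate) ** 4
npv_college_cost = npv_closed_form(5_000, 4, rate)

print("Loop vs closed form:", npv_loop(40_000, 40, rate), npv_hs)
print("High school:", npv_hs)
print("College (net of cost):", npv_college_wages - npv_college_cost)
print("Enroll?", npv_college_wages - npv_college_cost > npv_hs)
```

If the two functions disagree by more than rounding error, the indexing of the loop (which year the first payment arrives) is the first thing to check.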

**Exercise 4**

Companies often invest in training their employees to raise their
productivity. Economists sometimes wonder why companies spend this money:
because training increases an employee's human capital, it gives other
companies an incentive to poach trained employees with higher salaries.

Let's say that it costs a company 25,000 dollars to teach an
employee Python, and that this raises the employee's output by 2,500 dollars
per month. How many months would the employee need to stay for the company
to find it profitable to pay for the training if its monthly discount
rate is r = 0.01?

Assume that the cost is immediate, but that the extra output occurs at the end of a month.

In [None]:
# Define cost of teaching python
cost = 25_000
r = 0.01

# Per month value
added_value = 2500

n_months = 0
total_npv = 0.0

# Keep adding months until the discounted extra output covers the cost
while cost > total_npv:
    n_months = n_months + 1  # Increment how many months they've worked

    # Increase total_npv by this month's output, discounted n_months periods
    total_npv = total_npv + added_value / (1 + r) ** n_months

print(n_months)
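
The break-even month can also be obtained in closed form, which is a useful check on a loop-based answer. Setting the annuity value $ \text{added\_value} \cdot \frac{1 - (1+r)^{-n}}{r} $ equal to the cost and solving for $n$ gives $ n = -\ln(1 - \text{cost} \cdot r / \text{added\_value}) / \ln(1+r) $, rounded up to a whole month. The sketch below assumes end-of-month output and that `cost * r / added_value` is below 1 (otherwise the training never pays off); the helper `annuity_pv` is a name introduced here.

```python
import math

cost = 25_000
r = 0.01
added_value = 2_500


def annuity_pv(payment, periods, rate):
    """Present value of `payment` received at the end of each of `periods` periods."""
    return payment * (1 - (1 + rate) ** (-periods)) / rate


# Smallest whole number of months whose discounted output covers the cost
n_closed_form = math.ceil(-math.log(1 - cost * r / added_value) / math.log(1 + r))

print("Months needed:", n_closed_form)
print("PV one month earlier:", annuity_pv(added_value, n_closed_form - 1, r))
print("PV at break-even month:", annuity_pv(added_value, n_closed_form, r))
```

The two printed present values should bracket the training cost: the first falls short of 25,000 and the second covers it.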




## Loan performance, and vectorization

### Loan performance

Consider a bank offering loans to small businesses. Each loan requires a repayment of $25,000, due 1 year after the loan was made. The bank discounts the future at 5%.

However, loans are repaid in full with only 75% probability. With 20% probability only $12,500 is repaid, and with 5% probability no repayment is made at all.

In this simple case, you can compute the net present value of a loan by hand. The amount repaid, on average, is: $ 0.75(25000) + 0.2(12500) + 0.05(0) = 21250 $.

Since we’ll receive that amount in one year, we have to discount it:
$ \frac{21250}{1+0.05} \approx 20238 $.
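
The hand calculation can be reproduced in a few lines of numpy, which makes it easy to redo with other probabilities or repayment amounts. This is a small sketch; the variable names (`probabilities`, `repayments`, `discount_rate`) are chosen here for illustration.

```python
import numpy as np

# Outcomes of a single loan and their probabilities
probabilities = np.array([0.75, 0.20, 0.05])
repayments = np.array([25_000.0, 12_500.0, 0.0])
discount_rate = 0.05

# Expected repayment: probability-weighted sum of the outcomes
expected_repayment = probabilities @ repayments

# Discount one year, because the repayment arrives a year from now
present_value = expected_repayment / (1 + discount_rate)

print("Expected repayment:", expected_repayment)
print("Present value:", round(present_value, 2))
```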

However, we can also verify this amount by simulating the performance of many loans.

### Why Do We Need Randomness?

As economists and data scientists, we study complex systems. These systems have inherent randomness, but they do not always readily reveal their underlying distribution to us.

In cases where we face this difficulty, we turn to a set of tools known as Monte Carlo
methods. These methods effectively boil down to repeatedly simulating some event (or events) and looking at
the outcome distribution. This tool is used to inform decisions in search and rescue missions, election predictions, sports,
and even by central banks.

The reason that Monte Carlo methods work is the *Law of Large Numbers* that we saw in the second week.
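
To see the Law of Large Numbers at work, the sketch below draws loan repayments for increasingly large samples and compares the sample mean with the expected repayment of 21250 computed by hand. It uses numpy's `default_rng` with a fixed seed so that the run is repeatable; the seed value and the sample sizes are arbitrary choices made here.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded generator so results are repeatable

outcomes = np.array([25_000.0, 12_500.0, 0.0])
probabilities = np.array([0.75, 0.20, 0.05])
true_mean = probabilities @ outcomes  # expected repayment of one loan

errors = []
for n_draws in [100, 10_000, 1_000_000]:
    # Draw n_draws repayments according to the given probabilities
    draws = rng.choice(outcomes, size=n_draws, p=probabilities)
    error = abs(draws.mean() - true_mean)
    errors.append(error)
    print(n_draws, draws.mean(), error)
```

The gap between the sample mean and the true mean tends to shrink as the number of draws grows, although for any single run it need not shrink at every step.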

Let's have a look at the code below. It defines a function that simulates the amount repaid on N loans. By taking the average over a large number of simulations, we can (roughly) check our analytical result.

In [None]:
def simulate_loan_repayments(N, r=0.05, repayment_full=25_000.0, repayment_part=12_500.0):
    """
    Simulate present value of N loans given values for discount rate and
    repayment values
    """
    repayment_sims = np.zeros(N)
    for i in range(N):
        x = np.random.rand()  # Draw a random number

        # Full repayment 75% of time
        if x < 0.75:
            repaid = repayment_full
        elif x < 0.95:
            repaid = repayment_part
        else:
            repaid = 0.0

        repayment_sims[i] = (1 / (1 + r)) * repaid

    return repayment_sims

print(np.mean(simulate_loan_repayments(250_000)))

### Vectorization

You can see that the code results in an approximation of the expectation. However, this simulation is much slower than necessary. The cell below shows how much time it takes Python to compute 250,000 simulations.

In [None]:
%timeit simulate_loan_repayments(250_000)

This function is simple enough that its speed is acceptable, but it is important to learn how to speed up your code for more complicated operations.

One important technique to speed up your code is *vectorization*, which is when computations operate on an entire array at a time. In general, numpy code that is vectorized will perform better than numpy code that operates on one element at a time. The idea is to use numpy arrays to perform computations instead of only storing the values.
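
As an illustration of the idea on a simpler problem, the sketch below sums all values of an array that are at least 0.5, first element by element and then with a Boolean array (a "mask"). The names `sample`, `mask`, `total_loop` and `total_vectorized` are introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
sample = rng.random(10_000)  # uniform draws between 0 and 1

# Element-by-element version: inspect one value at a time
total_loop = 0.0
for value in sample:
    if value >= 0.5:
        total_loop = total_loop + value

# Vectorized version: one comparison over the whole array gives a Boolean mask,
# and indexing with the mask keeps only the values where it is True
mask = sample >= 0.5
total_vectorized = sample[mask].sum()

print(total_loop, total_vectorized)
```

Both versions should give the same total up to floating-point rounding. Timing each version in its own cell with `%timeit` is a quick way to compare their speed on your machine.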

----------

**Exercise 5**

Complete the code below using vectorization to speed up your simulations. Time your new function. How much faster is your vectorized code?

*Hint:* Get rid of the `for` loop, and the `if`, `elif` and `else` statements, and create an array of Booleans instead.

In [None]:
def simulate_loan_repayments_fast(N, r=0.05, repayment_full=25_000.0, repayment_part=12_500.0):
    """
    Simulate present value of N loans given values for discount rate and
    repayment values using vectorization
    """
    random_numbers = np.random.rand(N)

    # YOUR CODE HERE
    raise NotImplementedError()

YOUR ANSWER HERE